It's been more than two months since I completed the first version of my hand gesture recognition system, but I didn't post an update on the blog because I got busy with placement season at college and then the mid-semester exams.
In my last post I tried background subtraction, but I could not resolve the problems it caused: background noise, an inconsistent convex hull and missing frames.
So I decided to start with something simpler: static image filtering, which I found here.
I followed these steps (a rough sketch of the pipeline follows the list):
- Pre-process the video frame by converting it to the HSV color space.
- Adjust the Hue, Saturation and Value ranges to segment the hand.
- Post-process the frame to remove background noise by doing a bitwise AND of consecutive frames and applying the erode and dilate functions.
- Find the contours and the convex hull of the hand segment.
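A minimal sketch of this pipeline, assuming OpenCV in Python; the HSV bounds and kernel size below are illustrative placeholders (in my code they come from trackbars), not the exact values I used:

```python
import cv2
import numpy as np

def segment_hand(frame, prev_mask=None):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Illustrative skin-tone range; tune Hue/Saturation/Value for your setup.
    lower_skin = np.array([0, 30, 60], dtype=np.uint8)
    upper_skin = np.array([20, 150, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower_skin, upper_skin)

    # Suppress flickering background pixels by AND-ing with the previous frame's mask.
    stable = cv2.bitwise_and(mask, prev_mask) if prev_mask is not None else mask

    # Erode then dilate to remove small noise blobs.
    kernel = np.ones((3, 3), np.uint8)
    stable = cv2.erode(stable, kernel, iterations=2)
    stable = cv2.dilate(stable, kernel, iterations=2)

    # The largest contour is assumed to be the hand; take its convex hull.
    # (OpenCV 4.x return signature for findContours.)
    contours, _ = cv2.findContours(stable, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    hull = None
    if contours:
        hand = max(contours, key=cv2.contourArea)
        hull = cv2.convexHull(hand)
    return stable, mask, hull
```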
To identify the gestures for playing Pacman, I divided the screen into 9 parts and assigned one part to each gesture (LEFT, RIGHT, UP and DOWN), leaving the rest as blank spaces, as shown in the figure below.
Then I used the topmost point of the convex hull with the largest area (let's call this point T) as a pointer to identify the gesture: if T is in the LEFT region the Pacman moves left, if T is in the UP region the Pacman moves up, and so on.
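A rough sketch of that grid lookup; the 3x3 cell layout below is an assumption based on the figure, and blank cells simply mean "do nothing":

```python
def gesture_from_hull(hull, frame_w, frame_h):
    # T = topmost point of the hull (smallest y in image coordinates).
    x, y = tuple(hull[hull[:, :, 1].argmin()][0])

    col = min(x * 3 // frame_w, 2)   # column index: 0, 1 or 2
    row = min(y * 3 // frame_h, 2)   # row index: 0, 1 or 2

    # Assumed layout of the 9 regions; None marks a blank cell.
    grid = [
        [None,   "UP",   None],
        ["LEFT", None,   "RIGHT"],
        [None,   "DOWN", None],
    ]
    return grid[row][col]
```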
I mapped the point T to the respective keys on the keyboard, and it worked quite well; the game was playable, as shown in the video below.
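Something along these lines for the key presses; pyautogui is used here only as an example key-event library, any way of injecting key events would work just as well:

```python
import pyautogui

KEY_FOR_GESTURE = {"LEFT": "left", "RIGHT": "right", "UP": "up", "DOWN": "down"}

def press_key(gesture, last_gesture=None):
    # Only press when the gesture changes, so the game isn't spammed with events.
    if gesture and gesture != last_gesture:
        pyautogui.press(KEY_FOR_GESTURE[gesture])
    return gesture
```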
I realize that this method is a little simplistic, but it is a good starting point for further improvements.
To improve this model further or propose a new one, I first need to read more about existing methods.
These would be a good starting point: