Saturday 17 October 2015

Hand Gesture Recognition Update - 1


It's been more than two months since I completed the first version of my hand gesture recognition system, but I didn't update the blog because I got busy with placement season at college and then the mid-semester exams.

In my last post I tried background subtraction, but I could not resolve the problems it caused: background noise, an inconsistent convex hull and missing frames.
So, I decided to go with something simpler first, like the static image filtering which I found here.
I followed the following steps:

  1. Pre-process the video frame by converting it to the HSV color space.
  2. Adjust the Hue, Saturation and Value ranges to segment the hand.
  3. Post-process the frame to remove background noise by doing a bitwise AND of consecutive frames and using the erode and dilate functions.
  4. Find the contours and convex hull of the hand segment.
In order to identify the gestures to play Pacman, I divided the screen into 9 parts and assigned one part to each gesture, i.e. LEFT, RIGHT, UP and DOWN, and left the rest as blank spaces, as shown in the figure below.


Then I used the topmost point of the convex hull with the largest area (let's call this point T) as a pointer to identify the gesture. If T is in the LEFT region the Pacman moves left, if T is in the TOP region the Pacman moves up, and so on.
I mapped the point T to the respective keys on the keyboard and it worked quite well; the game was playable, as shown in the video below.
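The mapping from T to a gesture is just a lookup into the 3x3 grid. A minimal sketch, assuming (since the figure isn't reproduced here) that the four direction cells are the middle cells of each edge and the rest are blank:

```python
def gesture_for_point(x, y, width, height):
    """Map the pointer T at pixel (x, y) to a gesture label, or None
    if T falls in one of the blank cells of the 3x3 grid."""
    col = min(x * 3 // width, 2)   # 0, 1 or 2
    row = min(y * 3 // height, 2)
    grid = [[None,   'UP',   None],
            ['LEFT', None,   'RIGHT'],
            [None,   'DOWN', None]]
    return grid[row][col]
```

The returned label is then forwarded as the corresponding arrow-key press to the game.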






I realize that this method is a little simplistic, but it is a good starting point for further improvements.
In order to make further improvements to this model or propose a new one, I first need to read some more about existing methods.
These would be a good starting point:

Thursday 16 July 2015

Pacman agent using Minimax and AlphaBeta


Last week I worked on creating a Pacman agent using basic Artificial Intelligence techniques.

I started by watching the first 7 lectures of the MIT 6.034F Artificial Intelligence OCW. They covered topics like DFS, BFS, Hill Climbing, Branch and Bound, Minimax, Alpha-Beta Pruning and A* search.
Now, I wanted to put my newly learnt knowledge to use, so I searched a bit online and stumbled upon Stanford's Pacman assignment, which seemed really interesting.

In this assignment I was given the code for the Pacman game and had to implement the Pacman agent to win the game and score as many points as possible.
So, I implemented the minimax algorithm with alpha-beta pruning and currentScore as the evaluation function, which gave better results than the default reflex agent. This Pacman agent was able to win almost 5/10 times on mediumClassic with random ghosts and depth=5. As can be seen in the video below, with currentScore as the evaluation function the Pacman gets stuck in places where it has no food around it: it cannot decide where to go because it has no sense of where the food/capsules are. So, although it manages not to die, it keeps getting stuck. A better evaluation function should solve this problem.
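Since I'm not posting the assignment solution, here is only the generic shape of minimax with alpha-beta pruning, written against an abstract game interface. `children`, `evaluate` and `is_terminal` are placeholders for the game-specific code; `evaluate` plays the role of currentScore:

```python
def alphabeta(state, depth, alpha, beta, maximizing,
              children, evaluate, is_terminal):
    """Minimax with alpha-beta pruning over an abstract game tree."""
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    if maximizing:  # Pacman's turn: pick the move with the best score
        value = float('-inf')
        for child in children(state):
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate, is_terminal))
            alpha = max(alpha, value)
            if alpha >= beta:   # prune: MIN would never allow this branch
                break
        return value
    else:           # ghost's turn: pick the worst score for Pacman
        value = float('inf')
        for child in children(state):
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate, is_terminal))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value
```

In the real assignment the minimizing layer repeats once per ghost before the depth counter decreases, but the pruning logic is the same.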

A better evaluation function would be a linear combination of the minimum distance to food, number of food pellets left, number of capsules left, minimum distance from ghosts, etc.
But the coefficients of these parameters, which indicate their weight relative to each other, need to be decided. For my evaluation function I arrived at the coefficients by intuition and a little trial and error. I did come across this blog where someone used linear regression to arrive at the values of the coefficients, which I think is awesome and would like to explore some other time.
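The shape of such a linear combination is simple; the weights below are purely illustrative stand-ins, not the coefficients I actually used:

```python
def evaluation(score, min_food_dist, food_left, capsules_left, min_ghost_dist,
               weights=(1.0, -1.5, -4.0, -20.0, 2.0)):
    """Linear combination of game-state features. The default weights are
    made-up example values; signs encode the preferences."""
    w_score, w_food_dist, w_food, w_caps, w_ghost = weights
    return (w_score * score
            + w_food_dist * min_food_dist        # prefer being close to food
            + w_food * food_left                 # prefer having eaten food
            + w_caps * capsules_left             # prefer having eaten capsules
            + w_ghost * min(min_ghost_dist, 5))  # prefer ghost distance, capped
```

Negative weights on food and capsule counts push the agent to eat them; capping the ghost distance stops the agent from fleeing across the whole maze.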
The new evaluation function gives much better results as can be seen in the video below. It does not get stuck in empty spaces or near uneaten food. With this evaluation function the pacman agent was able to win almost 8/10 times.
I won't post the code I wrote because this assignment is still used in many colleges and it would be unethical to post the solution online (and because arriving at the solution yourself and watching your Pacman kick those ghosts' asses is fun too).
The pacman simulations are shown in the video below:



Future work (for me) for Pacman:
1) ML (linear regression, SVM) to arrive at the coefficients.
2) Reinforcement learning.
3) Genetic algorithms.
4) Learning to Play Pac-Man: An Evolutionary, Rule-based Approach
5) Learning to Play Using Low-Complexity Rule-Based Policies: Illustrations through Ms. Pac-Man




Wednesday 1 July 2015

Getting started with OpenCV and Hand Gesture Recognition


For the past few days I have been working on OpenCV and trying to build a hand gesture recognition system.
Even though it is far from complete and needs many more improvements, I wanted to write about the progress that I have made.
So, I have detailed the steps that I followed and the links that helped me.

Setting up the environment:

  1. Install and configure OpenCV: I used the first method and installed it using the pre-built libraries in C:\
  2. Install Visual Studio 2013.
  3. Open Visual Studio and start a new project. Choose Win32 console application and name it appropriately. (Yes, even for 64-bit applications. Apparently the 32 in Win32 does not refer to its bitness; for more details refer to this.)
  4. In the project creation wizard, select Empty project in the second window and keep everything else as it is.
  5. Right click on the project name and open Properties->Configuration Manager->Platform->New->x64->OK->Close (this will make the application run on 64-bit systems).
  6. Including the C/C++ OpenCV libraries:
    C/C++ -> General -> Additional Include Directories -> edit -> double click the empty area -> click .. -> browse to C:\opencv\build\include
    Similarly include:
    C:\opencv\build\include\opencv
    and
    C:\opencv\build\include\opencv2
    Click OK
  7. Including the Linker libraries:
    Linker->General->Additional Library Directories->edit->browse and select C:\opencv\build\x64\lib
    Linker->Input->Additional Dependencies->edit->copy the names of the files ending in d. In my case they were:
    opencv_ts300d.lib
    opencv_world300d.lib
    Click OK
  8. Add OpenCV path to system environment: C:\opencv\build\x64\vc12\bin
  9. Right click Source Files under the project name->Add->New Item->C++ File
Most of the steps I followed are the same as in this excellent YouTube video, so if you do not understand any of the above steps, you can refer to the linked video.

Where to start?

To get started with OpenCV I followed some of the online tutorials to understand how the library works, what the common functions are, and how to read images, videos and the webcam (which is also a video, btw).
These are some of the links that helped me with it:
https://www.youtube.com/user/MsDarkCello/videos
The official OpenCV tutorial seems pretty good, and whenever I want to see how a function is used I look at its sample code from this tutorial. Unfortunately I got to know about it pretty late, so I couldn't do it from start to finish, but it is definitely on my to-do list.

For the actual hand gesture recognition, we first need to segment our target, i.e. separate the hand from the rest of the frame; this is called image segmentation. The three common techniques of image segmentation are:

  1. Static Segmentation using RGB or HSV values 
  2. Edge Detection
  3. Background Subtraction
Static segmentation seems to give good results, like in this tutorial, but it requires manually segmenting your hand using RGB/HSV values, which I don't think will work in the general case with different skin colors, background colors and lighting conditions. So, this method is out of the question.
Edge detection somehow doesn't seem right either; I don't really know how I would separate the edges of my hand from the rest of the edges. Background subtraction seems much more intuitive and feasible right now, so I'll go with background subtraction. (If it doesn't work out, maybe I'll explore edge detection a little more.)

Alright, so background subtraction it is!

Background Subtraction

As the name suggests, in this technique we try to subtract the background of the image to get the segment of the image that we are interested in.
How do we decide what is the background and what is the target segment?
Well, the object in the frame that is moving is generally the target and the rest is the background.
The most basic way of doing this seems to be to just subtract two consecutive frames.
I got this idea of subtracting two frames from here. So if you want to know more about it, I would recommend watching the video; it is pretty good and it also helped me with the code.
There are also other methods like cv::BackgroundSubtractorMOG2 and cv::BackgroundSubtractorKNN, but they don't seem to give good results; there is too much noise in the output. Maybe I'm missing something and might want to try tweaking some parameters.
So, I might give these other methods a try later on.
The code for these methods is well explained here.

So, the steps I followed were:
  1. Read the video from the webcam.
  2. Capture two frames, convert them to grayscale and flip them.
  3. Filter the image:
        Convert the image to binary using the threshold function.
        Blur the image to filter some noise using GaussianBlur (the bilateral filter didn't seem to work because it requires a different type of image; maybe if I could convert the input frames to the required format it might work, but I have left it for now).
        Use erode/dilate as required. I used the morphologyEx library function, which is a combination of erode and dilate, to cancel out some noise and make the contour of my hand better connected.
  4. Find the contour and convex hull of the image and display them.
So, this was the final result:



Scope of improvement:
  1. The hand contour is not exact and there is some extra noise. Try tweaking some of the filter parameters and one of the denoising functions from OpenCV to remove the noise.
  2. The convex hull seems pretty good in this picture, but it is still not exact or consistent. I think the convex hull and convexity defects all depend on how good my contour is, so improving the contour using filtering and other techniques should automatically improve them.
  3. There is a slight flicker in the output of this frame-subtraction method which is not present when I use the other background subtraction methods like MOG2 or KNN. So, that needs to be resolved.
Once these problems are resolved, I can work on recognizing the hand gesture, maybe by counting the number of convexity defects or by some other method.

Saturday 2 May 2015

Getting Started with Graph API


In order to get familiar with Facebook's Graph API, I'll write a simple Python program that lets the user search for a particular friend in his/her friends list and displays some basic details about that friend.

So, Let's get started.

First, install the Facebook Graph API package for Python from here.
Now, before making any calls to the Graph API, I need an access token. In my case I think I need a user access token, because my program needs to read my Facebook data on my behalf.
So, I will generate the token manually from the Facebook Graph API GUI and select the permissions that I need.


Now, following this excellent tutorial, I make the graph object using my access token:
graph = facebook.GraphAPI(access_token='my_token')

Replace my_token with the token generated in the previous step.
In the tutorial they also passed a version argument to the facebook.GraphAPI call, but when I did that it gave me an error: unexpected keyword argument 'version'. So I decided to remove it, and now it seems to work fine.

Let's try to get a list of all my friends:

friends = graph.get_connections(id='me', connection_name='friends')
for friend in friends['data']:
    print friend['name']
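On top of that friends list, the search part of the program is a simple filter. The payload below is made-up sample data in the shape the Graph API returns; in the real program the dict comes from the get_connections call above:

```python
def find_friend(friends, query):
    """Case-insensitive substring search over a Graph API friends payload."""
    query = query.lower()
    return [f for f in friends['data'] if query in f['name'].lower()]

# Hypothetical sample payload -- names and ids are invented for illustration.
sample = {'data': [{'name': 'Alice Kumar', 'id': '101'},
                   {'name': 'Bob Singh', 'id': '102'}]}
matches = find_friend(sample, 'alice')
```

Each match's id could then be passed to graph.get_object() to pull the basic details to display.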

So, this only gives the names of my friends who have used the Graph API before.
But how can I get a list of all my friends?
Searching around a bit [1][2][3] led me to the conclusion that I can't get the list of all my friends using Graph API v2.0.
Well, this was disappointing.
If I can't even get a list of my friends using the Graph API, I wonder how useful it really is.

