After seeing that I haven’t updated my blog in a while (I’ve been so busy…), I realized that I need to generate some content. What follows is a blog post I wrote in January 2011 for another blog, describing a project I did for reinforcement learning. This little project later spawned my research and has since been developed much further. What I’m doing now is well beyond this little project, but I decided to post it for posterity (and to produce some content). Hopefully I’ll write a future post describing what I’m doing now.
Original Post follows
I have been asked to write about one project in particular, so I have decided to talk about something I worked on more recently for my Reinforcement Learning with AI class (ECE 517). For the final project we had to apply reinforcement learning techniques to some problem. My project was related to visual attention mechanisms.
There’s a lot of theory behind reinforcement learning, and I’m debating how detailed I want to make this description. Even skimming over many of the details, there is more than enough to cover. If you want the full details, I will link the PDF of our project report at the end. There is also an excellent free online book that covers the basics of reinforcement learning, linked here. (It was the textbook for our class.)
Basically, the idea is that most pattern recognition is easy if the center of focus is the target. We can make computers recognize faces, text, or cats if the image consists mostly of that object. The trouble starts when we need a computer to find the object in question in a very large image. Say we want to know whether a cat is present in a large image containing many things. This is not a trivial task, primarily because of the curse of dimensionality, as Richard Bellman termed it.
Anyway, we as humans can do this almost effortlessly as we scan a scene with our eyes. In the image above, our eyes naturally spot the cat immediately. Inspired by how people and animals do this effortlessly, we attempted to model it by creating a small image that represented the focal point of the eye (i.e., where the gaze is directed). Using reinforcement learning techniques, we attempted to train an agent to shift this focal image around and locate a target. The project was implemented in Java.
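The details are in the report, but as a rough sketch, extracting a focal image is just cropping a small window around the gaze point (the class and parameter names here are my own for illustration, not from the project):

```java
import java.awt.image.BufferedImage;

/** Extracts a small focal window centered on the gaze point, clamped to image bounds. */
public class FocalWindow {
    public static BufferedImage crop(BufferedImage scene, int gazeX, int gazeY, int size) {
        int half = size / 2;
        // Clamp so the window never falls outside the scene.
        int x = Math.max(0, Math.min(gazeX - half, scene.getWidth() - size));
        int y = Math.max(0, Math.min(gazeY - half, scene.getHeight() - size));
        return scene.getSubimage(x, y, size, size);
    }

    public static void main(String[] args) {
        BufferedImage scene = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        BufferedImage focus = crop(scene, 10, 10, 32);  // gaze near the corner
        System.out.println(focus.getWidth() + "x" + focus.getHeight());  // 32x32
    }
}
```

The agent’s actions then amount to moving the gaze point and re-cropping.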
We did not have a good pattern classifier, so we used histograms. We were well aware of the limitations of histograms when starting the project, so that was a given. We wanted our agent to be able to “learn” the scene as it directed its gaze around the image. Its state space would essentially grow as it encountered new patterns. To test that the state space was indeed working, I wrote a program to scan across an image and generate a map of all of the states it encountered. The image below shows how the state space might appear for the above image of a cat in a field. The colors themselves are random and mean nothing; they are just used to represent unique states.
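To give a flavor of the idea, here is a minimal sketch (my own reconstruction, not the project’s actual code) of a growing histogram-based state space: each focal patch is reduced to a coarse gray-level histogram, and unseen histograms get fresh state IDs.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Maps a focal image to a discrete state ID via a coarse gray-level histogram. */
public class HistogramStates {
    private final Map<String, Integer> stateIds = new HashMap<>();

    // 8-bin histogram over gray levels; deliberately coarse so similar patches collide.
    static int[] histogram(BufferedImage patch) {
        int[] bins = new int[8];
        for (int y = 0; y < patch.getHeight(); y++) {
            for (int x = 0; x < patch.getWidth(); x++) {
                int rgb = patch.getRGB(x, y);
                int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
                bins[gray / 32]++;
            }
        }
        return bins;
    }

    /** Returns the ID for this patch's histogram, allocating a new one for unseen patterns. */
    int stateFor(BufferedImage patch) {
        String key = Arrays.toString(histogram(patch));
        Integer id = stateIds.get(key);
        if (id == null) {               // new pattern: the state space grows
            id = stateIds.size();
            stateIds.put(key, id);
        }
        return id;
    }

    int numStates() { return stateIds.size(); }
}
```

Scanning a window across the image and coloring each pixel by its state ID yields exactly the kind of state map shown below.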
Anyway, the goal was to train the agent to find the cat given the above state space. It used a common reinforcement learning algorithm called SARSA(λ) to learn how to find the cat. After running for many iterations, it would learn to head directly for the cat, as the images below show. (The green line represents the path it took.)
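For readers unfamiliar with SARSA(λ), the core of the algorithm is a tabular update with eligibility traces. A minimal sketch (parameter values here are typical defaults, not necessarily the ones we used):

```java
import java.util.Random;

/** Tabular SARSA(lambda) with accumulating eligibility traces. */
public class SarsaLambda {
    final double[][] q;   // Q(s, a): action-value estimates
    final double[][] e;   // eligibility traces
    final double alpha = 0.1, gamma = 0.95, lambda = 0.9, epsilon = 0.1;
    final Random rng = new Random(0);

    SarsaLambda(int numStates, int numActions) {
        q = new double[numStates][numActions];
        e = new double[numStates][numActions];
    }

    int selectAction(int s) {  // epsilon-greedy over Q
        if (rng.nextDouble() < epsilon) return rng.nextInt(q[s].length);
        int best = 0;
        for (int a = 1; a < q[s].length; a++) if (q[s][a] > q[s][best]) best = a;
        return best;
    }

    /** One SARSA(lambda) backup after observing (s, a, reward, s', a'). */
    void update(int s, int a, double reward, int s2, int a2) {
        double delta = reward + gamma * q[s2][a2] - q[s][a];  // TD error
        e[s][a] += 1.0;                                       // accumulating trace
        for (int i = 0; i < q.length; i++) {
            for (int j = 0; j < q[i].length; j++) {
                q[i][j] += alpha * delta * e[i][j];
                e[i][j] *= gamma * lambda;                    // decay all traces
            }
        }
    }
}
```

The traces are what let a reward at the cat propagate back along the whole gaze path, rather than only to the last step.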
Even after I saw that the program seemed to be learning, it still wasn’t enough for me. We wanted to be able to visualize what was happening across all of the states, so I wrote a program that does this from the learned state space. It generated the beautiful image below, where the color hue represents the direction the agent learned to travel, and the brightness represents the value of that state. You may want to click on the image to view it enlarged. The legend for the colors is in the upper left corner. As you can see, the brightest part is where the cat is located, and the colors tend to correspond to directions that point “toward” the cat. I was really excited when I got this image from my program. It makes an excellent visual confirmation that everything was working as it should.
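The mapping from a learned state to a pixel color is simple in principle. A sketch of the idea (the action ordering here is an assumption for illustration; the project’s actual action set is described in the report):

```java
import java.awt.Color;

/** Colors a state: hue encodes the greedy action's direction, brightness its value. */
public class PolicyColor {
    // Assumed action order: 0=right, 1=up, 2=left, 3=down -> hues at 90-degree steps.
    static Color colorFor(double[] qValues, double maxValue) {
        int best = 0;
        for (int a = 1; a < qValues.length; a++) if (qValues[a] > qValues[best]) best = a;
        float hue = best / (float) qValues.length;                         // direction -> hue
        float brightness = (float) Math.max(0.0, Math.min(1.0, qValues[best] / maxValue));
        return Color.getHSBColor(hue, 1.0f, brightness);                   // full saturation
    }
}
```

Running this over every state and painting the result produces a map where bright regions are high-value and the hue field flows toward the target.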
For more information, see the PDF file of our report here. It contains even more cool images and talks about other tools that were developed for this project. I want to mention that I was working with another student, Mike Franklin, on this. He really helped with ideas and inspiration, and also wrote most of the above paper. I did most of the coding and the design.