A simple reinforcement learning project

After seeing that I haven’t updated my blog in a while (I’ve been so busy…), I realized that I need to generate some content. What follows is a blog post I wrote in January 2011 for another blog that described a project I did for reinforcement learning. This little project later spawned my research and has been developed many times further. What I’m doing now is way beyond this little project, however I decided to post this for posterity (and to produce some content). Hopefully I’ll make a post in the future describing what I’m doing now.

Original Post follows

have been asked to write about one project in particular, so I have decided to talk about something I worked on more recently for my Reinforcement Learning with AI class (ECE 517). As part of the final project we had to apply reinforcement learning techniques to some problem. I had a project related to visual attention mechanisms.

There’s a lot of theory behind reinforcement learning, and I am debating over how detailed I want to make this description. I think if I just skim over many of the details there is more than enough to cover. If you want a lot of details then I will link the PDF for our project report at the end. There is also an excellent free online book that covers basic reinforcement learning stuff linked here. (It was the textbook for our class.)

Basically, the idea is that most pattern recognition can easily be done if the center of focus is the target. We can make computers to recognize faces, or text, or cats if the image consists mostly of that object. The trouble starts when we need a computer to find the object in question in a very large image. Say we want to know if a cat is present in a large image with many things present. This is not a trivial task primarily because of the curse of dimensionality as Richard Bellman framed it.

Anyway, we as humans can do this almost effortlessly as we scan a scene with our eyes. In the image above, our eyes naturally spot the cat immediately. Being inspired by how people and animals do this effortlessly, we attempted to model it by creating a small image that represented the focal point of the eye (i.e. where the gaze is directed). Through reinforcement learning techniques we attempted to train an agent to be able to shift the focal image around and locate a target. This project was implemented in Java.

We did not have a good pattern classifier, so we used histograms. We were well aware of the limitations of using histograms when starting the project, so that was already a given. We wanted our agent to be able to “learn” the scene as it directed it’s gaze around the image. It’s state space would essentially grow as it encountered new patterns. In order to test that the state space we had was indeed working, I wrote a program to scan across an image and generate a map of all of the states it encountered. The below image shows a sample of how the state space might appear for the above image of a cat in a field. The colors themselves are random and mean nothing, they just used to represent unique states.

Anyway, the goal was to train the agent to find the cat given the above state space. It used a common reinforcement learning algorithm called sarsa(lamda) to learn how to find the cat. After running for many iterations it would learn to head directly for the cat as the images below show. (The green line represents the path it took.)

Before Training

After Training

Even after I saw that the program seemed to be learning, it still wasn’t enough for me. we wanted to be able to visualize what was happening with all of the states. I wrote a program to take the learned state space and be able to do this. It generated the beautiful image below where the color hue represents the direction it learned to travel, and the brightness refers to the value of that state. You might want to click on the image to view it enlarged in order to see it better. The legend for the colors is in the upper left corner. As you can see, the brightest part is where the cat is located, and the colors tend to correspond to directions that point “toward” the cat. I was really excited when I got this image from my program. It makes an excellent visual confirmation that everything was working as it should.

For more information see the PDF file of our report Here. It even contains more cool images, and talks about other tools that were developed for this. I want to mention that I was working with another student, Mike Franklin, on this. He really helped with ideas and inspiration, and also wrote most of the above paper. I did most of the coding and the design.

Posted in Uncategorized

Google sparse_hash_map goodies

So, lately I’ve been playing with google’s sparse_hash_map class. This class is incredibly useful if you need an easy to use, and functional hash table in C++, but don’t want to use the mess that is boost. Of course there is std::map, but that has it’s own issues (namely, it isn’t really a hash table).

google::sparse_hash_map is actually very easy to set up and use. The only problem is, it isn’t documented very well at all. Hopefully this post will remedy that by showing code samples for a few really common operations that I always find myself doing with hash tables.

One common operation that one might want to do, is loop through and enumerate all elements in a hash table. This is actually difficult to figure out with sparse_hash_map’s poor documentation. What isn’t obvious is that each element returns a std::pair. The code below should do it. It operates on a google::sparse_hash_map<string,int> string_hash; that was defined earlier.

pair<string,int> keyval;
google::sparse_hash_map<string,int>::iterator hash_it;

for(hash_it = string_hash.begin();hash_it != string_hash.end();++hash_it){
    keyval = *hash_it;
    cout << keyval.first << " : " << keyval.second << endl;
}

Another common operation, is retrieving a value from the hash, while returning a default value if that value wasn’t found. This turns up in many places where hashes are used. For instance, you might want to count everything with a given name. You can do this by creating a hash table with the name as the key. If you had a way of getting a default count of “0” if the item wasn’t found, you could start counting from 0. Anyway, I digress. A simple naive function to return a default value is given below (bewarned that it isn’t optimized).

int get_sparse_hash_default_unoptimized(google::sparse_hash_map<string,int>
        &hash,string &key,int d=0){
    google::sparse_hash_map<string,int>::iterator hash_it;
    if((hash_it = hash.find(key)) != hash.end()){
        return hash[key];
    } else {return d;}
}

The thing that wasn’t obvious from the docs to me is that hash.find(key) returns hash.end() if the element wasn’t found. This allows us to return a default value in that case. The problem with the above version is that does 2 lookups, one to determine if the item was found, and another to actually do the lookup. Presented below is a better version that only does one lookup. I haven’t actually tested and compared performance, but intuitively one would think the below version would be much faster.

int get_sparse_hash_default(google::sparse_hash_map<string,int>
        &hash,string &key,int d=0){
    google::sparse_hash_map<string,int>::iterator hash_it;
    if((hash_it = hash.find(key)) != hash.end()){
        return (*hash_it).second;
    } else {return d;}
}
Posted in Uncategorized

First Post.

So, I plan to blog about various technical topics. This is just a first post to test my new wordpress install (it may be deleted later).

Posted in Uncategorized