Jan 17 - 30 Journal

Before going to Caltech, we had narrowed our ideas down to one ambitious project and one simpler one. The ambitious one was the e-cigarette image recognition system, and the simpler one was going to be a game like tic-tac-toe. Once we got to Caltech, though, our selection was more or less set aside: because Dr. Hassibi had to be in a meeting, we went through all of our ideas with the graduate students instead, and since they gave further commentary on the other ideas, the visit really put us back at square one.


Some things I learned at Caltech exposed possible weaknesses in our project ideas. The grad students pointed out that the e-cigarette idea would need a lot of data, meaning many examples of someone vaping in front of the thermal sensor. Robert explained that he had seen someone train an image recognition algorithm with just 25 images; while the confidence wasn't great, the answer was typically correct. Because we are focusing on the game right now, we will explore this issue later. On the haptic glove idea, the grad students suggested maybe using the gloves for typing in addition to translation, but we still believed that project would require much more hardware handling, so we decided not to pursue it.





We decided to look into the simple game first. Robert, Connie, and I would focus on the reinforcement learning side of training the algorithm, while Puja, Will, and Edmond were coding the game. Today I watched and heard about the planned code for the game, having explored the different reinforcement learning approaches beforehand. The game we decided on was dots and boxes. It is illustrated above, and we chose it because, like tic-tac-toe, it has a limited number of possibilities (allowing for brute-force learning), yet it is not overly complex like chess, so it struck the right balance. Our plan is to have Mr. Lee's classes play the game to create data to train the algorithm.
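To get a feel for why dots and boxes is tractable, here is a minimal sketch (my own illustration, not the team's actual game code) of how a board and its legal moves might be represented. Even a 2×2-box board has only 12 edges, so the state space stays small enough for brute-force learning:

```python
# Hypothetical dots-and-boxes board sketch; class and method names are my own.
class DotsAndBoxes:
    def __init__(self, size=2):
        self.size = size  # boxes per side
        # Horizontal edges: (size+1) rows of `size` edges each.
        self.h = [[False] * size for _ in range(size + 1)]
        # Vertical edges: `size` rows of (size+1) edges each.
        self.v = [[False] * (size + 1) for _ in range(size)]

    def legal_moves(self):
        """Every edge not yet drawn, as ('h'|'v', row, col) tuples."""
        moves = []
        for r, row in enumerate(self.h):
            for c, taken in enumerate(row):
                if not taken:
                    moves.append(('h', r, c))
        for r, row in enumerate(self.v):
            for c, taken in enumerate(row):
                if not taken:
                    moves.append(('v', r, c))
        return moves

    def completed_boxes(self):
        """Count boxes whose four surrounding edges are all drawn."""
        count = 0
        for r in range(self.size):
            for c in range(self.size):
                if (self.h[r][c] and self.h[r + 1][c]
                        and self.v[r][c] and self.v[r][c + 1]):
                    count += 1
        return count

    def play(self, move):
        """Draw an edge; return how many new boxes this move completed."""
        before = self.completed_boxes()
        kind, r, c = move
        (self.h if kind == 'h' else self.v)[r][c] = True
        return self.completed_boxes() - before
```

A 2×2 board starts with 12 legal moves, and `play` reports when a move closes a box, which is exactly the signal a learning algorithm would use as a reward.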


While looking into reinforcement learning, I felt that some of the material I got into was too complicated. An example was the Stanford lecture linked above, which brought up equations I had never heard of without establishing much background. I did learn that there are many different reinforcement learning algorithms, though, so I hope to discuss with the group which one is best; each has its own strengths and drawbacks, as some require inputting values while others may be more time-consuming. A simple diagram of what reinforcement learning looks like is depicted below. It is usually considered its own category alongside supervised and unsupervised learning, and it is basically trial and error: the agent runs and learns which actions best achieve its end goal.
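To make the trial-and-error idea concrete, here is a toy Q-learning example I put together (one of the many reinforcement learning algorithms; this is my own illustration, not our project code). An agent on a five-cell line gets a reward only at the rightmost cell, and purely by trying actions and updating value estimates it learns to always walk right:

```python
# Toy Q-learning sketch: all names and parameter values are my own choices.
import random

N_STATES = 5          # cells 0..4; the reward sits at cell 4
ACTIONS = [-1, +1]    # step left or step right
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

# Q-table: estimated value of taking each action in each state.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # Trial and error: occasionally explore a random action,
        # otherwise act greedily on current estimates.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: nudge the estimate toward reward plus
        # the discounted value of the best next action.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# The learned greedy policy: best action in each non-terminal cell.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
```

After 200 episodes the greedy policy is "go right" in every cell, even though the agent was never told the goal directly; it discovered it from the reward alone. The `alpha` and `epsilon` values here are the kind of hand-picked inputs that make some algorithms fiddlier than others.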

