April 24 Technical Journal
Triumphs:
One area of progress has been our work with Mr. Lee and one of his students to develop a UI (user interface) for our dots and boxes game. Until now we had been playing through the PyCharm terminal, so this is a step forward. The first version of the UI is depicted below. We also made progress integrating our Q-learning work with the Monte Carlo tree search: the Q-table would be used in place of the random rollout in the Monte Carlo code. Finally, we completed the Problem Statement and Background sections of our official report.
[Figure: Preliminary user interface for the dots and boxes game]
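To illustrate what using the Q-table instead of a random rollout could look like, here is a minimal sketch. It is not our actual code: the function name `choose_rollout_action`, the `(state, action)` dictionary layout of `q_table`, and the `epsilon` exploration parameter are all assumptions made for the example.

```python
import random


def choose_rollout_action(state, legal_actions, q_table, epsilon=0.1, rng=random):
    """Pick a rollout move from a Q-table instead of purely at random.

    Assumed (hypothetical) layout: q_table maps (state, action) -> value.
    """
    # Still explore occasionally so rollouts do not become fully deterministic.
    if rng.random() < epsilon:
        return rng.choice(legal_actions)

    # Unseen (state, action) pairs default to 0.0.
    scored = [(q_table.get((state, action), 0.0), action) for action in legal_actions]
    best_value = max(value for value, _ in scored)

    # Break ties at random so equally rated moves are all tried.
    best_actions = [action for value, action in scored if value == best_value]
    return rng.choice(best_actions)


# Tiny usage example with a made-up state label and made-up values.
example_q = {("s", 2): 1.0, ("s", 0): -0.5}
print(choose_rollout_action("s", [0, 1, 2], example_q, epsilon=0.0))  # prints 2
```

With `epsilon=0.0` the rollout always follows the table; with `epsilon=1.0` it behaves like the original random rollout, which would make it easy to compare the two.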
Struggles:
Because some people were out due to illness and college visits, progress slowed. Our plan had been to transfer our Q-table into the Monte Carlo tree search, but formatting differences made the two incompatible. As a result, a new Q-table had to be made for the Monte Carlo code. Generating the Q-table (part of which is depicted below) was also an issue, as running the generation did not bring much progress. The "Kolo" ratios printed as the games were played fluctuated, moving toward 50/50 and then sometimes dropping. Another struggle is an unclear sense of direction: now that the Q-table is being integrated into the Monte Carlo code, we are not sure what to do next. Robert was out sick for all of last week, which also made things more complicated.
[Figure: Part of the Q-table generated]
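One way to tell whether a fluctuating ratio is actually trending is to smooth it with a rolling average. The sketch below assumes the printed ratios are collected into a plain list of numbers; the function name `rolling_mean` and the `window` size are made up for the example and are not part of our code.

```python
from collections import deque


def rolling_mean(values, window=50):
    """Return the running average of the last `window` values at each step."""
    recent = deque(maxlen=window)  # holds at most `window` recent values
    running_sum = 0.0
    smoothed = []
    for value in values:
        # When the window is full, the oldest value is about to be dropped.
        if len(recent) == window:
            running_sum -= recent[0]
        recent.append(value)
        running_sum += value
        smoothed.append(running_sum / len(recent))
    return smoothed


# Made-up numbers, only to show the call.
print(rolling_mean([1, 0, 1, 0], window=2))  # prints [1.0, 0.5, 0.5, 0.5]
```

If the smoothed curve keeps moving in one direction while the raw numbers bounce around, the drops are probably noise from individual games.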
Code:
Much of the push these past two weeks has been to get the Q-tables to work with the Monte Carlo code (depicted below). The code Will, Puja, and Edmond were working on (Wozzy) would use the Q-table to generate the reward and therefore became MontesQ. The Monte Carlo code stores actions in an array of possible actions, while the Q-table condenses the same information into number sets corresponding to boxes and moves. We are unsure how to complete the integration given this format difference, so we hope our visit to Caltech can answer how to translate the number sets into the array format the Monte Carlo code expects.
[Figure: Code on integrating Q-learning into Monte Carlo]
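As a starting point for the translation question, here is one possible sketch. It rests on assumptions that may not match our real formats: that each Q-table number set is a `(box, side)` pair with boxes numbered row by row, and that the Monte Carlo array holds flat line indices with horizontal lines numbered first and vertical lines second. All names below are hypothetical.

```python
TOP, RIGHT, BOTTOM, LEFT = range(4)  # assumed side codes


def box_side_to_edge(box, side, rows, cols):
    """Translate an assumed (box, side) pair into a flat line index."""
    if not 0 <= box < rows * cols:
        raise ValueError("box index out of range")
    r, c = divmod(box, cols)            # box numbering: row by row
    horizontal_count = (rows + 1) * cols  # horizontal lines come first

    if side == TOP:
        return r * cols + c
    if side == BOTTOM:
        return (r + 1) * cols + c
    if side == LEFT:
        return horizontal_count + r * (cols + 1) + c
    if side == RIGHT:
        return horizontal_count + r * (cols + 1) + c + 1
    raise ValueError("side must be TOP, RIGHT, BOTTOM, or LEFT")


def legal_action_array(taken_edges, rows, cols):
    """List every line index that has not been drawn yet."""
    total = (rows + 1) * cols + rows * (cols + 1)
    return [edge for edge in range(total) if edge not in taken_edges]


# On a 2x2 board, box 0's right side and box 1's left side are the same line.
print(box_side_to_edge(0, RIGHT, 2, 2) == box_side_to_edge(1, LEFT, 2, 2))  # prints True
```

The useful property of a mapping like this is that a line shared by two boxes gets a single index, so the Monte Carlo array never lists the same move twice.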
Accomplishments and Plans:
For the user interface, we plan to link it to the game by assigning coordinate points to the lines users can click on. After meeting at Caltech tomorrow, we will polish the game, and I think I will reintegrate with the code section. Because our learning algorithms are being meshed, I think it would be best if we reconvened as one group. We also plan to make a presentation to accompany our report and have already begun the basic slides. We look forward to continuing to complete sections of our report.
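The plan to assign coordinate points to clickable lines could be sketched roughly as follows. This assumes the dots sit on an evenly spaced pixel grid; the function name `click_to_line`, the `spacing`, `margin`, and `tolerance` parameters, and the `("h", row, col)` / `("v", row, col)` return format are all invented for the example and do not come from the actual UI.

```python
import math


def click_to_line(x, y, rows, cols, spacing=100, margin=50, tolerance=0.25):
    """Map a pixel click to the nearest line, or None if the click is too far away.

    rows and cols count boxes; tolerance is a fraction of the dot spacing.
    """
    # Convert pixels to grid units, where dots sit at whole numbers.
    gx = (x - margin) / spacing
    gy = (y - margin) / spacing

    candidates = []

    # Nearest horizontal line: snap y to a dot row, x falls inside a column.
    row, col = round(gy), math.floor(gx)
    if 0 <= row <= rows and 0 <= col < cols:
        candidates.append((abs(gy - row), ("h", row, col)))

    # Nearest vertical line: snap x to a dot column, y falls inside a row.
    row, col = math.floor(gy), round(gx)
    if 0 <= row < rows and 0 <= col <= cols:
        candidates.append((abs(gx - col), ("v", row, col)))

    if not candidates:
        return None
    distance, line = min(candidates)
    return line if distance <= tolerance else None


# A click just below the top-left horizontal line on a 2x2 board.
print(click_to_line(100, 55, rows=2, cols=2))  # prints ('h', 0, 0)
```

Clicks in the middle of a box return `None`, so the UI can ignore them instead of guessing which line the player meant.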