Nov 1 - 14 Journal
On our visit to Caltech last Thursday, the group and I met with graduate student Ethan and discussed recommender systems. Recommender systems are all around us, from YouTube to Spotify, but we focused on the Netflix recommender system because Netflix has actually published datasets. Ethan told us that this dataset is out there because Netflix is challenging people to use it to develop an even better recommender system than its current one. He added that there was a $1 million prize, which provided some extra incentive.
The dataset used to build the recommender system is similar to the one pictured above: a matrix of users and their ratings of the movies they have watched. The empty spaces are movies that a user has not watched yet, and we are trying to work out which of those movies should be recommended to that user. After discussing this, we explored three ways to break down the matrix and make recommendations. The first is similarity comparison, where a user is compared with other users; the movies watched by users with similar tastes are then recommended to that user. The second, more complicated process is feature extraction, where an algorithm breaks the matrix into two matrices: the first matches users with their feature preferences, and the second matches movies with the features they contain. The final method we learned was simple clustering, which supplements the other two methods. Because this topic has so many applications in everyday life, I am very excited to explore recommender systems further in the coming weeks.
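To make the similarity comparison idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the user names, the movie titles, the ratings, the choice of cosine similarity as the way to compare tastes, and the cutoff of 4 stars for "liked". It is not how Netflix does it, just one simple way to recommend an unseen movie from the most similar user.

```python
from math import sqrt

# Hypothetical toy ratings (1-5 stars); None marks a movie the user has not watched.
ratings = {
    "Ann":  {"Alien": 5, "Up": 1, "Heat": None, "Coco": 2},
    "Ben":  {"Alien": 4, "Up": 2, "Heat": 5,    "Coco": 1},
    "Cara": {"Alien": 1, "Up": 5, "Heat": 2,    "Coco": 5},
}

def similarity(a, b):
    """Cosine similarity computed only over movies both users have rated."""
    shared = [m for m in ratings[a]
              if ratings[a][m] is not None and ratings[b].get(m) is not None]
    if not shared:
        return 0.0
    dot = sum(ratings[a][m] * ratings[b][m] for m in shared)
    norm_a = sqrt(sum(ratings[a][m] ** 2 for m in shared))
    norm_b = sqrt(sum(ratings[b][m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def recommend(user, liked=4):
    """Suggest unseen movies that the most similar other user rated `liked` or higher."""
    others = [u for u in ratings if u != user]
    nearest = max(others, key=lambda u: similarity(user, u))
    return [m for m, r in ratings[nearest].items()
            if ratings[user][m] is None and r is not None and r >= liked]

print(recommend("Ann"))
```

In this made-up data, Ann's ratings line up more closely with Ben's than with Cara's, so the blank in Ann's row is filled using Ben's opinion of that movie.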
In the days after the Caltech visit, the group and I tested out matrix multiplication to better understand how the feature extraction process works. Edmond and I tried to figure out how the two matrices would multiply to give back the original matrix, since matrix multiplication involves summing the products of the numbers. We deduced that instead of actually recreating the original matrix, the user vs. feature and movie vs. feature matrices are multiplied and a threshold value is applied to convert the results into favorable (1) or unfavorable (0). This seemed to make sense: if a movie's features match only a select few of a user's preferences, then that movie should be disregarded or given lower priority as a recommendation.
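The multiply-then-threshold idea we deduced can be sketched in a few lines of Python. The two features (action and comedy), all of the numbers, and the threshold of 0.55 are made up for illustration, and this only shows our own reading of the process, not a real factorization algorithm. Each score is the sum of products of a user's feature preferences with a movie's features, which is the same as multiplying the user matrix by the transpose of the movie matrix.

```python
# Hypothetical user-by-feature preferences (columns: action, comedy).
user_features = [
    [1.0, 0.0],  # user 0 only cares about action
    [0.5, 0.5],  # user 1 likes both equally
]

# Hypothetical movie-by-feature content (same two feature columns).
movie_features = [
    [0.9, 0.1],  # movie 0: mostly action
    [0.2, 0.8],  # movie 1: mostly comedy
    [0.6, 0.6],  # movie 2: a mix of both
]

def predict(users, movies, threshold=0.55):
    """Return (scores, labels): scores[u][m] is the sum of products of features,
    labels[u][m] is 1 (favorable) if the score reaches the threshold, else 0."""
    scores = [[sum(u * m for u, m in zip(user, movie)) for movie in movies]
              for user in users]
    labels = [[1 if s >= threshold else 0 for s in row] for row in scores]
    return scores, labels

scores, labels = predict(user_features, movie_features)
for row in labels:
    print(row)
```

With these numbers, the action fan gets a 1 for the action movie and the mixed movie, while the user who likes both features only gets a 1 for the mixed movie, since the other two each match just half of that user's preferences.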
I also enjoyed finishing up my partner concept map with Connie on the Friendship Paradox. While it was kind of sad to figure out why your friends, on average, have more friends than you do, it was fun to explore the proof and its applications. Connie did the video on the proof (above) and I did the video on the paradox's applications and its comparison to everyday events (below). In it, I compared the friendship group to a gym, where people are right when they think they are less fit than the average person there. Because the gym membership (the social network) appeals to the health-conscious, that group is going to be more fit, just as the friendly people who make up your friend group are going to be more social and outgoing.
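The paradox is easy to check on a small example. The sketch below uses a made-up network of five people (the names and friendships are hypothetical) and compares the average number of friends per person with the average number of friends that a friend has. Popular people show up on more friend lists, so they get counted more often in the second average.

```python
# Hypothetical small friendship network: each pair is a mutual friendship.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

# Build each person's set of friends.
friends = {}
for a, b in edges:
    friends.setdefault(a, set()).add(b)
    friends.setdefault(b, set()).add(a)

# Average number of friends per person.
avg_friends = sum(len(f) for f in friends.values()) / len(friends)

# For every person, look at each of their friends and record that friend's
# friend count. Someone with many friends appears in this list many times.
friend_counts = [len(friends[f]) for person in friends for f in friends[person]]
avg_friends_of_friends = sum(friend_counts) / len(friend_counts)

print(avg_friends, avg_friends_of_friends)
```

In this network the average person has 2 friends, but the average friend has about 2.2, because A (with three friends) is counted three times while E (with one friend) is counted only once.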