Big Data MovieLens

Figure: Data split

Figure: Alternating least squares

Recommendation systems are all around us. Whenever we watch movies, listen to music, or order takeout, we reveal personal information that lets companies analyze our preferences and recommend similar items to improve promotion and user engagement. In this project we build a recommendation system on the MovieLens dataset using several models, including a popularity baseline model and a latent factor model trained with Spark’s alternating least squares (ALS) method, and we tune the hyperparameters to find the best-performing recommendations.
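To make the popularity baseline concrete, here is a minimal sketch in plain Python. It is not the project's actual implementation: the function name `popularity_baseline`, the `damping` term, and the toy ratings are all assumptions made for illustration. The idea is to score each movie by a damped mean rating, so that movies with very few ratings do not dominate the ranking, and to recommend the same top-k list to every user.

```python
from collections import defaultdict


def popularity_baseline(ratings, damping=5.0, top_k=3):
    """Rank movies by a damped mean rating (illustrative sketch).

    ratings: iterable of (user_id, movie_id, rating) tuples.
    damping: hypothetical shrinkage term added to the rating count;
             larger values penalize movies with few ratings.
    top_k:   number of movies to return.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, movie, rating in ratings:
        totals[movie] += rating
        counts[movie] += 1

    # Damped mean: sum of ratings / (number of ratings + damping).
    scores = {m: totals[m] / (counts[m] + damping) for m in totals}

    # Sort by score, highest first; break ties on movie id for determinism.
    ranked = sorted(scores, key=lambda m: (-scores[m], m))
    return ranked[:top_k]


# Toy data (made up): movie "B" has one perfect rating, "A" has three high ones.
toy_ratings = [
    (1, "A", 5.0), (2, "A", 4.0), (3, "A", 5.0),
    (1, "B", 5.0),
    (2, "C", 3.0), (3, "C", 3.5),
]

top_damped = popularity_baseline(toy_ratings, damping=1.0, top_k=2)
top_undamped = popularity_baseline(toy_ratings, damping=0.0, top_k=2)
```

With no damping, the single five-star rating puts "B" ahead of "A"; with damping, the better-supported "A" ranks first. In a hyperparameter search, the damping value would be one of the quantities tuned on the validation split.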

My GitHub project

My detailed report

Updated: