MovieLens_DataSet

The main objective of this project is to analyse the MovieLens data and build a movie recommender system.
In detail, the project walks through importing Python libraries, loading the data into a dataframe, optimising the dataframe, and manipulating the data.

The work is divided into the following categories:

  1. Data Analysis
    • Descriptive statistics: provide ground knowledge about the features and relations within the dataset
    • Visualization: gives an overview of the data and the underlying relations within it, using plots built with plotly and seaborn and a word cloud

  2. Building Movie Recommendation System
    • Loading the raw data in a separate notebook
    • Creating a pivot table in batches and appending each batch to the dataframe for optimisation, and analysing the possible errors one can encounter when running huge batches
    • Computing the correlation between columns of the data
    • Cleaning up the final movie suggestions
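The recommendation steps above can be sketched roughly as follows. This is a minimal illustration on a tiny made-up ratings table, not the code from the project notebooks. The helper names `pivot_in_batches` and `similar_movies`, the batch size, and the `min_common` threshold are assumptions introduced for the example, and "cleaning up" is interpreted here as dropping the query movie itself and any movie with too few ratings in common.

```python
import pandas as pd

# Tiny made-up stand-in for the ratings table (the real one has millions of rows).
ratings = pd.DataFrame({
    "userId":  [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "movieId": [10, 20, 30, 10, 20, 30, 10, 20, 30, 10, 20, 30],
    "rating":  [5.0, 4.0, 1.0, 4.0, 5.0, 2.0, 1.0, 2.0, 5.0, 2.0, 1.0, 4.0],
})


def pivot_in_batches(df, users_per_batch):
    """Build a user x movie rating matrix one group of users at a time."""
    user_ids = df["userId"].unique()
    pieces = []
    for start in range(0, len(user_ids), users_per_batch):
        batch_users = user_ids[start:start + users_per_batch]
        batch = df[df["userId"].isin(batch_users)]
        pieces.append(
            batch.pivot_table(index="userId", columns="movieId", values="rating")
        )
    # concat aligns on the movieId columns; movies missing from a batch become NaN.
    return pd.concat(pieces).sort_index()


def similar_movies(matrix, movie_id, min_common=2):
    """Rank other movies by Pearson correlation with the ratings of movie_id."""
    target = matrix[movie_id]
    correlation = matrix.corrwith(target)
    # Number of users who rated both the target movie and each other movie.
    common = matrix[target.notna()].notna().sum()
    result = pd.DataFrame({"correlation": correlation, "common_ratings": common})
    # Clean-up: drop thinly supported movies, the movie itself, and undefined values.
    result = result[result["common_ratings"] >= min_common]
    result = result.drop(index=movie_id).dropna()
    return result.sort_values("correlation", ascending=False)


matrix = pivot_in_batches(ratings, users_per_batch=2)
suggestions = similar_movies(matrix, movie_id=10)
print(suggestions)
```

Batching by user keeps every user's ratings inside a single batch, so the pieces can be stacked without duplicate rows. On the full dataset a dense user-by-movie matrix is very large, so running out of memory is a likely failure when batches are too big.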

GroupLens Research has collected and made available rating datasets from the MovieLens website (https://movielens.org). The dataset I use is the “MovieLens 25M Dataset”, which includes 25 million ratings. The datasets were collected over various periods of time, with the most recent data from 2019.
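As a rough sketch of loading the ratings into a memory-friendly dataframe, the example below reads a few made-up rows from an in-memory string instead of the real file. The column names follow the header of the MovieLens `ratings.csv` file. The choice of 32-bit dtypes and dropping the `timestamp` column are assumptions made for illustration, not necessarily what the project notebooks do.

```python
import io

import pandas as pd

# Made-up sample rows standing in for ratings.csv.
sample_csv = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,296,5.0,1147880044\n"
    "1,306,3.5,1147868817\n"
    "2,296,4.0,1141415820\n"
)

# Assumed compact dtypes: 32-bit values take half the space of the 64-bit defaults.
compact_dtypes = {"userId": "int32", "movieId": "int32", "rating": "float32"}

# usecols skips the timestamp column, which the sketch does not need.
ratings = pd.read_csv(sample_csv, usecols=list(compact_dtypes), dtype=compact_dtypes)

print(ratings.dtypes)
print(ratings.memory_usage(deep=True).sum(), "bytes")
```

To run this against the downloaded dataset, replace `sample_csv` with the path to the ratings file.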