The goal of this project is to investigate a dataset of movie records from 1980 to 2020. The data includes several attributes of each movie. The analysis focuses on finding which attributes correlate most strongly with a movie's gross revenue.
The original dataset can be found here: https://www.kaggle.com/danielgrijalvas/movies
The most important findings are:
- The movies are recorded from 1980 to 2020.
- Budget and votes have the highest correlation with gross revenue, with values of 0.711 and 0.629 respectively.
- The least correlated features are runtime and score, with values of 0.24 and 0.18 respectively.
The analysis covers the following steps:
- Data Wrangling
- Exploratory Data Analysis (EDA)
- Examination of central tendency and spread
- Data visualizations
The tools used are:
- Python with the libraries numpy, pandas, matplotlib, seaborn, and datetime.
- Jupyter Lab
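
The correlation ranking reported in the findings could be computed along the following lines. This is only a minimal sketch, not the project's actual notebook code: the column names `budget`, `votes`, `runtime`, `score` and `gross` are taken from the feature names mentioned above, the helper `gross_correlations` is hypothetical, and the data frame is a small synthetic stand-in, so the numbers it prints will not match the reported values. Dropping rows with missing values is an assumed wrangling step.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the movies table. All values are made up for
# illustration; in the real project the frame would be loaded from the
# Kaggle CSV with pd.read_csv instead.
rng = np.random.default_rng(0)
n = 200
budget = rng.uniform(1e6, 2e8, n)
votes = rng.uniform(1e3, 1e6, n)
df = pd.DataFrame({
    "budget": budget,
    "votes": votes,
    "runtime": rng.uniform(80, 180, n),
    "score": rng.uniform(3, 9, n),
    "gross": 2.5 * budget + 100 * votes + rng.normal(0, 5e7, n),
})
df.loc[[3, 17, 42], "budget"] = np.nan  # simulate a few missing entries


def gross_correlations(frame, target="gross"):
    """Pearson correlation of each numeric column with `target`, highest first.

    Hypothetical helper: rows with any missing value are dropped first.
    """
    numeric = frame.dropna().select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    return corr.sort_values(ascending=False)


# Data wrangling (assumed): discard incomplete rows.
clean = df.dropna()

# Central tendency and spread: mean, std, quartiles per column.
summary = clean.describe()

# Correlation of every feature with gross revenue.
ranking = gross_correlations(df)
print(summary.loc[["mean", "std"]])
print(ranking)
```

On the real dataset the same ranking could be passed to `seaborn.heatmap` on the full `corr()` matrix to visualize it.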