Full Data Analysis and Visualization using Python on a 10,000-row Top Movies Database
This project is a full data analysis and visualization of a top movies database, using the Python libraries pandas, NumPy, and Matplotlib.
The goal of the analysis is to answer three questions and draw conclusions from the data, using Python visualization tools along the way.
Questions:
What are the most popular genres from year to year?
What are the properties of the highest-revenue films?
How has the inflation rate changed over time?
I answered these three questions in three phases:
Data Wrangling phase:
In this phase, I checked the dataset for duplicated rows and null values and handled them carefully so as not to distort the data.
I fixed column data types to get accurate results.
I removed unneeded columns from the dataset to keep the process clean.
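The wrangling steps above can be sketched roughly as follows. This is a minimal illustration on toy data, not the project's actual notebook code: the column names (`title`, `genres`, `release_date`, `homepage`) and the helper `wrangle` are assumptions made for the example, and dropping rows with a missing genre is just one possible way to handle nulls.

```python
import pandas as pd


def wrangle(df, drop_cols):
    """Clean a raw movies table. Column names are hypothetical."""
    # Remove exact duplicate rows, keeping the first occurrence.
    df = df.drop_duplicates()
    # Drop columns not needed for the analysis; skip any that are absent.
    df = df.drop(columns=drop_cols, errors="ignore")
    # Fix the data type: parse dates, turning unparseable values into NaT.
    df = df.assign(
        release_date=pd.to_datetime(df["release_date"], errors="coerce")
    )
    # Handle nulls: here, drop rows that have no genre information.
    df = df.dropna(subset=["genres"])
    return df.reset_index(drop=True)


# Toy data standing in for the real dataset.
raw = pd.DataFrame({
    "title": ["A", "A", "B", "C"],
    "genres": ["Action|Drama", "Action|Drama", None, "Comedy"],
    "release_date": ["2001-05-01", "2001-05-01", "2003-07-04", "not a date"],
    "homepage": ["x", "x", "y", "z"],
})

clean = wrangle(raw, drop_cols=["homepage"])
print(clean)
```

Whether to drop or fill missing values depends on the column and on the question being asked, so the `dropna` step is the part most likely to differ in a real analysis.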
Exploratory Data Analysis phase:
In this phase, I answered the three questions stated above using statistical calculations.
I supported those calculations with visuals to make them easy to understand.
I drew conclusions from each result.
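As an illustration of the kind of calculation this phase involves, here is one possible way to approach the first question (most popular genre per year). It assumes hypothetical columns `release_year`, `genres` (pipe-separated), and `popularity`, and it assumes that "most popular" means the highest mean popularity score; the original analysis may define it differently. The function names are made up for this sketch.

```python
import pandas as pd


def top_genre_per_year(df):
    """Return the genre with the highest mean popularity for each year."""
    # Split the pipe-separated genre string into one row per (movie, genre).
    exploded = df.assign(genre=df["genres"].str.split("|")).explode("genre")
    # Mean popularity for every (year, genre) pair.
    mean_pop = exploded.groupby(
        ["release_year", "genre"], as_index=False
    )["popularity"].mean()
    # For each year, keep the row whose mean popularity is largest.
    best_rows = mean_pop.groupby("release_year")["popularity"].idxmax()
    return mean_pop.loc[best_rows].reset_index(drop=True)


def plot_top_genres(top, path):
    """Save a bar chart of the winning genre per year to `path`."""
    # Imported here so the calculation above runs without Matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, writes to file only
    import matplotlib.pyplot as plt

    labels = top["release_year"].astype(str) + "\n" + top["genre"]
    fig, ax = plt.subplots()
    ax.bar(labels, top["popularity"])
    ax.set_xlabel("Release year and top genre")
    ax.set_ylabel("Mean popularity")
    ax.set_title("Most popular genre per year")
    fig.savefig(path)
    plt.close(fig)


# Toy data standing in for the cleaned dataset.
movies = pd.DataFrame({
    "release_year": [2000, 2000, 2001, 2001],
    "genres": ["Action|Drama", "Drama", "Comedy", "Action|Comedy"],
    "popularity": [4.0, 2.0, 6.0, 2.0],
})

top = top_genre_per_year(movies)
print(top)
```

Calling `plot_top_genres(top, "top_genres.png")` would write the chart to a file. Exploding the genre column means a film with several genres counts once toward each of them, which is a design choice worth stating alongside the conclusions.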
Conclusions phase:
In this phase, I explained my findings and insights from the data in terms suited to non-technical clients.
I described the limitations I faced during the process.