Investigating a dataset using pandas.
- In this project, a dataset is analyzed and the findings are communicated. The Python libraries NumPy, pandas, and Matplotlib are used to make the analysis easier.
- This dataset collects information from 100k medical appointments in Brazil and is focused on the question of whether or not patients show up for their appointment. A number of characteristics about the patient are included in each row.
- The project uses NumPy arrays and pandas Series and DataFrames where appropriate rather than Python lists and dictionaries. Where possible, vectorized operations and built-in functions are used instead of loops.
- The code makes use of functions to avoid repetitive code. The code contains good comments and variable names, making it easy to read.
- The project investigates the stated question(s) from multiple angles. Three variables are investigated using both single-variable (1d) and multiple-variable (2d) explorations.
- The project's visualizations are varied and show multiple comparisons and trends. Relevant statistics are computed throughout the analysis when an inference is made about the data.
- The results of the analysis are presented such that any limitations are clear. The analysis does not state or imply that one change causes another based solely on a correlation.
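
The points above about vectorized operations, reusable functions, and single- versus multiple-variable exploration can be illustrated with a minimal sketch. The column names (`age`, `sms_received`, `no_show`), the age bins, the helper `show_rate_by`, and the six rows of data are all made up for illustration; they are not taken from the project's code or from the real dataset.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample. Column names and values are assumptions for
# illustration only, not the project's actual data.
df = pd.DataFrame({
    "age": [5, 34, 62, 19, 47, 71],
    "sms_received": [0, 1, 1, 0, 0, 1],
    "no_show": ["No", "Yes", "No", "Yes", "No", "No"],
})

# Vectorized comparison on the whole column instead of a Python loop.
df["showed_up"] = df["no_show"].eq("No")


def show_rate_by(frame, column):
    """Return the share of kept appointments for each value of `column`."""
    return frame.groupby(column, observed=True)["showed_up"].mean()


# Single-variable (1d) view: overall share of kept appointments.
overall_rate = df["showed_up"].mean()

# Multiple-variable (2d) views: the same helper is reused for each grouping.
rate_by_sms = show_rate_by(df, "sms_received")

# Hypothetical age bins: (0, 18], (18, 60], (60, inf).
age_group = pd.cut(
    df["age"],
    bins=[0, 18, 60, np.inf],
    labels=["child", "adult", "senior"],
)
rate_by_age = show_rate_by(df.assign(age_group=age_group), "age_group")

print(f"Overall show-up rate: {overall_rate:.2f}")
print(rate_by_sms)
print(rate_by_age)
```

A difference in show-up rate between groups in output like this describes an association only; on its own it would not show that, say, receiving an SMS causes a patient to attend.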