This repository contains Python code for predicting flight passenger satisfaction with a decision tree algorithm. The model is trained on a dataset of flight parameters and passenger satisfaction levels, and the code also evaluates the model's accuracy and visualizes key insights such as information gain and the correlation matrix.
- Clone the repository to your local machine.
- Install the required dependencies using `pip install -r requirements.txt`.
- Run the `main.py` script to execute the flight satisfaction prediction experiment: `python main.py`
The experiment is run on two subsets of the data (2000 and 5000 rows), using a random state of 42 so that results are reproducible.
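As a rough illustration of what drawing these subsets could look like, here is a minimal sketch using pandas. It is not the code from `main.py`: the synthetic DataFrame, its column names, and the `draw_subsets` helper are all hypothetical, and only the row counts (2000 and 5000) and the seed (42) come from this README.

```python
import pandas as pd

# Hypothetical stand-in for flight_satisfaction.csv; the real script would
# presumably load the file with pd.read_csv instead. Column names are made up.
data = pd.DataFrame({
    "flight_distance": range(6000),
    "satisfied": [i % 2 for i in range(6000)],
})

SUBSET_SIZES = (2000, 5000)  # row counts named in this README
RANDOM_STATE = 42            # seed named in this README


def draw_subsets(frame, sizes=SUBSET_SIZES, seed=RANDOM_STATE):
    """Return {size: sampled DataFrame}; the same seed yields the same rows."""
    return {n: frame.sample(n=n, random_state=seed) for n in sizes}


subsets = draw_subsets(data)
for n, subset in subsets.items():
    print(n, subset.shape)
```

Fixing the seed means that repeated runs sample the same rows, so accuracy figures for each subset size can be compared across runs.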
- decision_tree.py: Contains the implementation of the DecisionTree class for building and using a decision tree model.
- flight_satisfaction.csv: Dataset containing information about flight parameters and passenger satisfaction.
- main.py: The main script to run the flight satisfaction prediction experiment.
- results/: Directory to store experiment results, including correlation matrix heatmap, distribution plots, and information gain analysis.
- The accuracy of the decision tree model for different subsets of data is printed to the console.
- Correlation matrix heatmap and distribution plots are saved in the `results/` directory.
- Information gain analysis results, including a plot, are saved in the `results/rows_{number}/` directory.
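For readers unfamiliar with information gain, the sketch below shows the standard entropy-based definition for a categorical feature. It is an illustration only and may differ from how `decision_tree.py` computes it; the function names and the toy data are hypothetical and are not taken from `flight_satisfaction.csv`.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (hypothetical): one feature separates the classes perfectly,
# the other carries no information about the label.
travel_class = ["business", "business", "economy", "economy"]
wifi_rating = ["high", "low", "high", "low"]
satisfied = ["yes", "yes", "no", "no"]

print(information_gain(travel_class, satisfied))  # 1.0
print(information_gain(wifi_rating, satisfied))   # 0.0
```

A feature with higher information gain reduces uncertainty about the satisfaction label more, which is why a decision tree would typically prefer to split on it first.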
- Clone the repository: `git clone https://github.com/your-username/Decision-Tree-Flight-Satisfaction-Predictor.git`
- Navigate to the repository directory: `cd Decision-Tree-Flight-Satisfaction-Predictor`
- Install dependencies: `pip install -r requirements.txt`
- Run the experiment: `python main.py`
- Python 3.x
- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
- tqdm
Feel free to modify the code and experiment with different parameters or datasets. Contributions are welcome!
This project is licensed under the MIT License - see the LICENSE file for details.