This project is a part 6 months Data Science bootcamp - Sages Kodołamacz Data Science Bootcamp, which I took part September 2019 - March 2020.
This machine learning project predicts price of apartment for sales in Kraków.
All data sets come from otodom.pl and were restricted only to Kraków location (end of Feb'20).
- Python 3.6
- Pandas
- Numpy
- Matplotlib
- Seaborn
- Sklearn
- String
- Re
- Warnings
- Selenium
- IPython
After scrapping data, all is cleared and some addtional features are added to manage to estimation in the correct way. Then several estimator are validated to find the best aproximation.
- '01. Webscrapping_apartments for sale'
- '02. Price_estimation'
As expected the best estimator is Gradient Boosting with defined parameters max_depth: 100, min_samples_leaf: 15, n_estimators: 200. The result is not an amazing achievement with R^2 score = 0.63 and Mean Absolute Percentage Error = 14 but could be expected considering numerous of different parameters and lack of data.