This repo was created to make it easier for machine learning practitioners to find great sources of datasets. The goal is to list all sites that share datasets.
Feel free to send a pull request if you have a dataset you'd like to add, or simply let me know by opening an issue.
- Amazon Public Data Sets
- Windows Azure Marketplace
- Yahoo Datasets
- Yelp Academic Datasets
- NYT Linked Open Data
- Google Public Data
- Deeplearning Datasets
- Stanford Large Network Dataset Collection
- UCI Machine Learning Repository
- ImageNet
- Million Song Dataset