- Python3 as language
- Scrapy as framework
- Docker as tool
- Docker-compose as tool
To run this crawler you will need:
- 🐳 Docker v19.03.8+ https://www.docker.com/products/docker-desktop
- 🐳 Docker-compose v1.25.4+ https://docs.docker.com/compose/install/
P.S.: If you are using Windows or macOS, docker-compose already comes with the default Docker installation.
What about Python3 and Scrapy?
Do not worry, Docker will take care of installing them.
$ docker-compose --version; docker --version
The command above should print the version of each tool.
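If you prefer to script the version check, the following is a minimal sketch in Python. It assumes both tools print an `x.y.z` version somewhere in their `--version` output; the helper names (`parse_version`, `meets_minimum`, `check_tools`) are hypothetical and not part of this repository. The minimum versions are the ones listed in the requirements.

```python
import re
import shutil
import subprocess

# Minimum versions taken from the requirements listed in this README.
MINIMUMS = {"docker": (19, 3, 8), "docker-compose": (1, 25, 4)}


def parse_version(text):
    """Return the first x.y.z version found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def meets_minimum(text, minimum):
    """True when the version found in text is at least the required minimum."""
    version = parse_version(text)
    return version is not None and version >= minimum


def check_tools():
    """Run `<tool> --version` for each tool and report whether it is recent enough."""
    results = {}
    for tool, minimum in MINIMUMS.items():
        # A tool that is not on PATH is reported as failing the check.
        if shutil.which(tool) is None:
            results[tool] = False
            continue
        completed = subprocess.run(
            [tool, "--version"], capture_output=True, text=True
        )
        results[tool] = meets_minimum(completed.stdout, minimum)
    return results


if __name__ == "__main__":
    for tool, ok in check_tools().items():
        print(f"{tool}: {'OK' if ok else 'missing or too old'}")
```

Versions are compared as integer tuples rather than strings, so `19.03.8` is treated as `(19, 3, 8)` and a release such as `19.10.0` correctly counts as newer.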
$ git clone https://github.com/gustavobordinho/data-pirates-challenge.git
$ cd data-pirates-challenge
$ git checkout correios-crawler
$ docker-compose build
$ docker-compose up -d
$ docker-compose exec crawler bash -c 'scrapy crawl correios'
Once the crawl is done, all the information will be inside the scrapy folder as .jsonl files, separated by UF (Brazilian state).
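To inspect the output, a short Python sketch like the one below can count the records in each file. It assumes the files follow the usual JSON Lines convention (one JSON document per line) and that the output folder is `scrapy` relative to where the script runs. The function names are hypothetical, and no assumption is made about the fields inside each record.

```python
import json
from pathlib import Path


def load_records(path):
    """Read a JSON Lines file and return its records as a list."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def count_by_file(folder):
    """Map each .jsonl file name in folder to its number of records."""
    folder = Path(folder)
    if not folder.is_dir():
        return {}
    return {
        path.name: len(load_records(path))
        for path in sorted(folder.glob("*.jsonl"))
    }


if __name__ == "__main__":
    # "scrapy" is the output folder named in this README; adjust if yours differs.
    for name, total in count_by_file("scrapy").items():
        print(f"{name}: {total} records")
```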