The challenge description is located here.

Requirements
- docker (tested on v19.03.3)
- docker-compose (tested on v1.24.1)
- make (tested on GNU Make v4.2.1)
Start local environment
Starts the Apache Spark cluster, a MySQL database (with fake data), a dummy data generator (over a TCP socket), the Apache Spark driver program, and the REST API.
NOTE: The first run can take some time because dependent Docker images are pulled and built.
$ make up
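Because the first run pulls and builds images, the services may not be reachable immediately after `make up`. Below is a minimal readiness poll in Python, assuming only that the REST API serves on http://localhost:8080 as the curl examples in this README do; the function name and retry parameters are illustrative:

```python
import time
import urllib.error
import urllib.request

def wait_for_api(url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200, or give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet; wait and retry
        time.sleep(delay)
    return False

# e.g. wait_for_api("http://localhost:8080/api/campaigns/timeseries")
```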
Scale Apache Spark worker nodes
$ make scale n=3
Stop local environment
$ make down
OPTIONAL: Data is still persisted in the named volume; remove it with the following command:
$ make clean
Question 1: How many impressions were trafficked each day for each campaign?
$ curl http://localhost:8080/api/campaigns/timeseries
Question 2: How many impressions, interactions, and swipes were trafficked for each ad in a specific campaign?
$ curl http://localhost:8080/api/campaigns/1/ads
Question 3: How many unique users and impressions were trafficked each day for each ad in the last 7 days?
$ curl http://localhost:8080/api/campaigns/ads/lastweek
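The three reports above can also be fetched programmatically. A minimal sketch in Python; the response schema is not specified in this README, so the helper only decodes whatever JSON the API returns:

```python
import json
import urllib.request

BASE = "http://localhost:8080/api"  # same host/port as the curl examples

def decode(body: bytes):
    """Decode a JSON response body; the schema is whatever the API returns."""
    return json.loads(body.decode("utf-8"))

def fetch(path: str):
    """GET an endpoint under BASE and return its decoded JSON."""
    with urllib.request.urlopen(f"{BASE}{path}") as resp:
        return decode(resp.read())

# The three reports from the questions above:
# fetch("/campaigns/timeseries")
# fetch("/campaigns/1/ads")
# fetch("/campaigns/ads/lastweek")
```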
TODO
- Split the driver program into smaller functions
- Move hard-coded configuration to environment variables
- Add unit tests
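The second item in the list above (moving hard-coded configuration to environment variables) is commonly done with a small loader that falls back to defaults. A sketch with hypothetical variable names; the real driver program's settings may differ:

```python
import os

def load_config() -> dict:
    """Read settings from the environment, falling back to defaults.

    The keys and defaults below (SPARK_MASTER_URL, MYSQL_HOST, API_PORT)
    are illustrative placeholders, not the project's actual settings.
    """
    return {
        "spark_master": os.environ.get("SPARK_MASTER_URL", "spark://spark-master:7077"),
        "mysql_host": os.environ.get("MYSQL_HOST", "mysql"),
        "api_port": int(os.environ.get("API_PORT", "8080")),
    }
```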