This is a data engineering test for an energy company in Brazil.
The goal of the challenge is to download an xlsx workbook of Brazilian government fuel data and work with the data held in its pivot tables. We have to structure this data so that we can store it in a DBMS and extract pre-defined KPIs from it.
For the challenge, we were asked to use two of the pivot tables: "Sales of oil derivative fuels by UF and product" and "Sales of diesel by UF and type" (UF is the Brazilian federative unit, i.e. the state).
The environment used was:
Oracle VirtualBox with CentOS 7;
Docker;
Python - Jupyter Notebook container;
MySQL container.
Development was done in two stages: the first handled the file to extract the data from the pivot tables, and the second loaded this data into MySQL.
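As a rough illustration of the two stages, the sketch below unpivots a wide, one-column-per-month table into long rows and then loads them into a SQL table. It is not the code used in the challenge. The column names, the `fuel_sales` table, the `unpivot` and `load` helpers, and the sample values are all hypothetical, and the standard-library `sqlite3` module stands in for the MySQL container so that the example runs without a server.

```python
import sqlite3

# Assumption: the extracted data has one column per month; shortened to three here.
MONTHS = ["Jan", "Feb", "Mar"]


def unpivot(wide_rows, months=MONTHS):
    """Stage 1: turn one row per (uf, product, year) into one row per month."""
    long_rows = []
    for row in wide_rows:
        for month_number, month in enumerate(months, start=1):
            volume = row.get(month)
            if volume is None:  # skip months with no reported value
                continue
            year_month = f"{row['year']}-{month_number:02d}"
            long_rows.append((row["uf"], row["product"], year_month, float(volume)))
    return long_rows


def load(conn, long_rows):
    """Stage 2: create the (hypothetical) target table and insert the long rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fuel_sales ("
        "uf TEXT, product TEXT, year_month TEXT, volume REAL)"
    )
    conn.executemany("INSERT INTO fuel_sales VALUES (?, ?, ?, ?)", long_rows)
    conn.commit()


# Made-up sample values for illustration only, not real government figures.
wide_rows = [
    {"uf": "SP", "product": "GASOLINE", "year": 2020, "Jan": 10.0, "Feb": 12.5, "Mar": None},
    {"uf": "RJ", "product": "DIESEL", "year": 2020, "Jan": 4.0, "Feb": 5.0, "Mar": 6.0},
]

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
long_rows = unpivot(wide_rows)
load(conn, long_rows)

# A simple KPI-style query: total volume per UF.
totals = conn.execute(
    "SELECT uf, SUM(volume) FROM fuel_sales GROUP BY uf ORDER BY uf"
).fetchall()
print(totals)
```

Storing the data in long form (one row per UF, product and month) keeps KPI queries down to plain `GROUP BY` aggregations. With a MySQL driver, the same two functions would mostly need a different connection object and placeholder style.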