
globant_challenge's Introduction

Globant's Data Engineer Challenge

Architecture

Process

  1. I used an Amazon RDS for PostgreSQL instance to create the tables used in the challenge.
  2. I stored the secrets, such as the database connection parameters, in AWS Secrets Manager.
  3. Once a change was ready to test, I built an image with docker build -t <tag> . and pushed it to ECR. This guide was helpful to achieve it.
  4. I used ECS with Fargate, creating a Task Definition and running it as a service on a cluster.
  5. If everything worked correctly, I committed the changes to the repo.
  6. I repeated steps 3 to 5 for each new change.
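
As a rough illustration of step 2, the sketch below turns a secret payload into a PostgreSQL connection URL. It assumes the secret is stored as a JSON string with the keys username, password, host, port and dbname; that layout, the function name build_postgres_url and the sample values are all assumptions, not taken from the repo. In a deployed service the string would typically come from the SecretString field returned by boto3's get_secret_value call, which is left out here so the example runs without AWS access.

```python
import json
from urllib.parse import quote_plus


def build_postgres_url(secret_string: str) -> str:
    """Build a PostgreSQL URL from a JSON secret (key names are assumed)."""
    secret = json.loads(secret_string)
    # Percent-encode credentials so characters like "@" do not break the URL.
    user = quote_plus(secret["username"])
    password = quote_plus(secret["password"])
    host = secret["host"]
    port = secret.get("port", 5432)  # default PostgreSQL port
    dbname = secret["dbname"]
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"


# Hypothetical secret payload, standing in for the value held in Secrets Manager.
example_secret = json.dumps(
    {
        "username": "app",
        "password": "p@ss",
        "host": "db.example.internal",
        "port": 5432,
        "dbname": "challenge",
    }
)

url = build_postgres_url(example_secret)
print(url)
```

Keeping the URL construction in a small pure function like this makes it easy to test locally without touching the real secret.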

TODO

  • Add an API token to restrict access.
  • Add tests.
  • Use Python CDK to automate deployment.
  • Use GitHub Actions for CI.
  • My original idea was to save a CSV file to S3, which would trigger a Lambda function that passes the S3 URL to the API, and the API would then load the file into the database. It would be a nice-to-have.
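
The S3-to-Lambda idea in the last item could start from something like the sketch below. It only covers the first half: extracting the S3 locations from an S3 event notification. The function names and the sample bucket and key are hypothetical, and the call to the API is left as a comment because the endpoint is not described here.

```python
from typing import List
from urllib.parse import unquote_plus


def s3_urls_from_event(event: dict) -> List[str]:
    """Return an s3:// URL for every object referenced in an S3 event."""
    urls = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        urls.append(f"s3://{bucket}/{key}")
    return urls


def handler(event, context):
    """Lambda entry point (sketch): collect the uploaded file locations."""
    urls = s3_urls_from_event(event)
    # A real handler would now send each URL to the API's upload endpoint.
    return {"files": urls}


# Hypothetical event, trimmed to the fields the sketch reads.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-bucket"},
                "object": {"key": "uploads/hired+employees.csv"},
            }
        }
    ]
}

result = handler(sample_event, None)
print(result)
```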

globant_challenge's People

Contributors

nmema


globant_challenge's Issues

FastAPI

Structure the project to use the FastAPI framework.

At a minimum, it should run on localhost. Ideally, it should run in a container.

populate README

Add information about how you completed the challenge, including the architecture and the improvements you would make to it.

Historical Data Endpoint

The API should be able to receive CSV files, process them, and insert them into a DB.

  1. Read CSV files.
  2. Upload the file to an EXISTING table in the DB.
    2.1. Create the DB & tables
  3. Return status

This could be a good moment to deploy the solution in the cloud (optional).
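
The read, insert and return-status steps above can be sketched with the standard library alone. This is not the project's implementation: it uses an in-memory SQLite database as a stand-in for PostgreSQL, assumes header-less CSV input, and the departments table, its columns and the load_csv helper are hypothetical names chosen for illustration.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, columns, csv_text):
    """Insert header-less CSV rows into an existing table; return a status dict.

    `table` and `columns` are interpolated into the SQL, so they must come
    from a trusted allow-list, never from user input.
    """
    rows = [tuple(r) for r in csv.reader(io.StringIO(csv_text)) if r]
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    try:
        with conn:  # commits on success, rolls back if any row fails
            conn.executemany(sql, rows)
    except sqlite3.Error as exc:
        return {"status": "error", "detail": str(exc), "inserted": 0}
    return {"status": "ok", "inserted": len(rows)}


# SQLite stands in for the real database; the table must already exist.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, department TEXT)")

csv_text = "1,Sales\n2,Engineering\n"
first = load_csv(conn, "departments", ["id", "department"], csv_text)
# Loading the same file again violates the primary key and is rolled back.
second = load_csv(conn, "departments", ["id", "department"], csv_text)
print(first, second["status"])
```

Wrapping the insert in a single transaction means a bad row leaves the table unchanged, which keeps the returned status easy to interpret.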

metrics endpoints

Add two endpoints for these two requirements:

  • Number of employees hired for each job and department in 2021, divided by quarter. The table must be ordered alphabetically by department and job.
  • List of the id, name and number of employees hired for each department that hired more employees than the mean number of employees hired in 2021 across all departments, ordered by the number of employees hired (descending).
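
One possible shape for the two queries is sketched below. Several things here are assumptions: the schema (departments, jobs, hired_employees with a hired_at column) is hypothetical, SQLite stands in for PostgreSQL (where EXTRACT(QUARTER FROM ...) would replace the strftime arithmetic), and the second query reads the requirement as counting 2021 hires only and averaging over departments with at least one hire that year.

```python
import sqlite3

# Quarter number (1-4) derived from the month; SQLite-specific expression.
quarter = "(CAST(strftime('%m', e.hired_at) AS INTEGER) + 2) / 3"
quarter_cols = ",\n".join(
    f"SUM(CASE WHEN {quarter} = {q} THEN 1 ELSE 0 END) AS q{q}" for q in range(1, 5)
)

HIRES_BY_QUARTER = f"""
SELECT d.department, j.job,
{quarter_cols}
FROM hired_employees e
JOIN departments d ON d.id = e.department_id
JOIN jobs j ON j.id = e.job_id
WHERE strftime('%Y', e.hired_at) = '2021'
GROUP BY d.department, j.job
ORDER BY d.department, j.job
"""

ABOVE_MEAN_DEPARTMENTS = """
WITH hires AS (
    SELECT department_id, COUNT(*) AS hired
    FROM hired_employees
    WHERE strftime('%Y', hired_at) = '2021'
    GROUP BY department_id
)
SELECT d.id, d.department, h.hired
FROM hires h
JOIN departments d ON d.id = h.department_id
WHERE h.hired > (SELECT AVG(hired) FROM hires)
ORDER BY h.hired DESC
"""

# Tiny made-up data set to exercise both queries.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (id INTEGER PRIMARY KEY, department TEXT);
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, job TEXT);
    CREATE TABLE hired_employees (
        id INTEGER PRIMARY KEY, name TEXT, hired_at TEXT,
        department_id INTEGER, job_id INTEGER
    );
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO jobs VALUES (1, 'Analyst'), (2, 'Developer');
    INSERT INTO hired_employees VALUES
        (1, 'Ana', '2021-02-10', 1, 2),
        (2, 'Ben', '2021-05-03', 1, 2),
        (3, 'Cat', '2021-11-20', 1, 1),
        (4, 'Dan', '2021-08-15', 2, 1),
        (5, 'Eve', '2020-03-01', 2, 1);  -- hired in 2020, so excluded
    """
)

by_quarter = conn.execute(HIRES_BY_QUARTER).fetchall()
above_mean = conn.execute(ABOVE_MEAN_DEPARTMENTS).fetchall()
print(by_quarter)
print(above_mean)
```

If departments with no hires in 2021 should pull the mean down, the hires CTE would need a LEFT JOIN from departments instead.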

init repo

Init repo with:

  • poetry
  • pre-commits
  • data needed for the challenge
