Git Product home page Git Product logo

bfpd-rs's Introduction

bfpd-rs

Rust PostgreSQL and MariaDB libraries for processing USDA Branded Food Products data. The Cargo workspace also includes a CLI utility for loading CSV and REST API and GraphQL servers.

What's here

The repository is organized as a Cargo workspace.

./mariadb -- MariadDB library
./pg -- PostgreSQL library
./graphql -- graphql server
./restapi -- REST server
./ingest-csv -- cli utility for importing the USDA csv files into the database
./data/pg -- Diesel migration scripts to create the PostgreSQL database and schema.rs
./data/mariadb -- Diesel migration scripts to create the MariaDB database and schema.rs

How to Set-up the Database

You can choose either PostgreSQL (13 recommended) or MariaDB 10. In either case, it's assumed you have a running instance of one or the other.

Step 1. Create an empty schema

For PostgreSQL:

createdb bfpd

For MariaDB:

mysql -u user -p -e"create schema bfpd;"

Step 2: Load the data

At this point, you have a couple of options: 1) download a dump of a recent version of the Branded Food Products database from https://go.littlebunch.com/posgresql for a PostgreSQL dump or https://go.littlebunch.com/mariadb for a MariaDB dump and restore to your local instance or 2) build the database from the ground-up by importing the USDA csv files using the provided ingest-csv command line utility.

Using a dump file

If you are using the first option, download the dump files and restore to your database:

For postgreSQL:

dropdb bfpd5 && createdb bfpd5 # only if database exists
xz -d -c < [downloaded.sql.xz] | psql -U [user] bfpd5

For MariaDB:

mysql -u [user] -p -e'drop database bfpd5' && mysql -u [user] -p -e'create database bfpd5' # only if database exists
xz -d -c [download.sql.xz] | mysql -u [user] -p bfpd5

Using the ingest-csv utility

  1. Build the binary Instructions for building the ingest-csv executable are provided in the ingest-csv/README.md.

  2. Download and unzip the latest csv from the FDC website into a directory of your choice. You will need the Branded Foods and Supporting data for All Downloads zip files:

    wget https://fdc.nal.usda.gov/fdc-datasets/FoodData_Central_branded_food_csv_2021-10-28.zip
    wget https://fdc.nal.usda.gov/fdc-datasets/FoodData_Central_Supporting_Data_csv_2021-10-28.zip
  3. Use the Diesel migration scripts in the data directory to create an empty database

    For PostgreSQL:

    psql -U user -W bfpd < data/pg/up.sql

    For MariaDB:

    mysql -u user -p bfpd < data/mariadb/up.sql

    Note: You can use the up.sql and down.sql scripts to create a diesel migration. This is probably more trouble than it's worth unless you need to change the schema or just want to learn a bit more about diesel migrations.

  4. Load the data by pointing the program to the full path containing the csv

    ./target/release/ingest-cvs -p /path/to/csv/

The load takes about 30 minutes to 2+ hours depending on your hardware. Note: you need to set a DATABASE_URL variable as described in Step 4 in the ingest-csv README.

Step 3 Publish the data

You can use the provided graphql or REST binaries to publish the data. Instructions for building and running the graphql server are provided in the graphql/README.md. Instructions for the REST api are provided in the restapi/README.md.

bfpd-rs's People

Contributors

littlebunch avatar

Watchers

 avatar  avatar

Forkers

davgit

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.