


ghuser.io's database scripts

This repository provides scripts to update the database for the ghuser.io Reframe app. The database consists of JSON files. The production data is stored on AWS. The scripts expect the data at ~/data; to use another location, set the GHUSER_DBDIR environment variable.
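As an illustration of the lookup described above, here is a minimal sketch of how a script could resolve the database directory. The helper name `resolveDbDir` is hypothetical and is not taken from this repository; only the ~/data default and the GHUSER_DBDIR variable come from the text.

```javascript
const os = require('os');
const path = require('path');

// Hypothetical helper (not from this repository): returns GHUSER_DBDIR when it
// is set to a non-empty value, otherwise falls back to ~/data.
function resolveDbDir(env = process.env) {
  return env.GHUSER_DBDIR || path.join(os.homedir(), 'data');
}

console.log(resolveDbDir());
```

Passing the environment as a parameter keeps the function easy to exercise without touching the real process environment.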

The fetchBot calls these scripts. It runs every few days on an EC2 instance.


Setup

API keys can be created here.

$ npm install

Usage

Start tracking a user

$ ./addUser.js USER

Stop tracking a user

$ ./rmUser.js USER "you asked us to remove your profile in https://github.com/ghuser-io/ghuser.io/issues/666"

Refresh and clean data for all tracked users

$ export GITHUB_CLIENT_ID=0123456789abcdef0123
$ export GITHUB_CLIENT_SECRET=0123456789abcdef0123456789abcdef01234567
$ export GITHUB_USERNAME=AurelienLourot
$ export GITHUB_PASSWORD=********
$ ./fetchAndCalculateAll.sh
GitHub API key found.
GitHub credentials found.
...
/home/ubuntu/data/users
  2654 users
  largest: gdi2290.json (26 KB)
  total: 5846 KB
/home/ubuntu/data/contribs
  largest: orta.json (144 KB)
  total: 14 MB
/home/ubuntu/data/repos
  112924 repos
  65706 significant repos
  largest: jlord/patchwork.json (712 KB)
  total: 203 MB
/home/ubuntu/data/repoCommits
  largest: CocoaPods/Specs.json (3965 KB)
  total: 397 MB
/home/ubuntu/data/orgs
  11072 orgs
  largest: google-certified-mobile-web-specialists.json (445 B)
  total: 3520 KB
/home/ubuntu/data/nonOrgs.json: 252 KB
/home/ubuntu/data/meta.json: 49 B
total: 623 MB

=> 240 KB/user

real    449m19.774s
user    15m52.644s
sys     2m21.976s
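
The per-user figure in the sample output above can be reproduced from the reported totals. This is a small worked calculation, assuming that 1 MB is counted as 1024 KB and that the result is rounded to the nearest KB; the script's actual rounding is not shown in the output.

```javascript
// Figures copied from the sample output above.
const totalMB = 623;
const users = 2654;

// Assumption: 1 MB = 1024 KB, rounded to the nearest KB.
const kbPerUser = Math.round((totalMB * 1024) / users);

console.log(`=> ${kbPerUser} KB/user`);
```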

Implementation

Several scripts form a pipeline for updating the database. Here is the data flow:

[ ./addUser.js myUser ]   [ ./rmUser.js myUser ]
                 │             │
                 v             v
              ┌───────────────────┐
              │ users/myuser.json │<───────────┐
              └────────────────┬──┘ │─┐        │
                └──────────────│────┘ │        │                    ╔════════╗
                  └────┬───────│──────┘        │                    ║ GitHub ║
                       │       │               │                    ╚════╤═══╝
                       │       v               │                         │
                       │   [ ./fetchUserDetailsAndContribs.js myUser ]<──┤
                       │                                                 │
                       ├────────────>[ ./fetchOrgs.js ]<─────────────────┤
                       │                   ^     ^                       │
                       │                   │     │                       │
                       │                   v     v                       │
                       │      ┌──────────────┐ ┌─────────────────┐       │
                       │      │ nonOrgs.json │ │ orgs/myOrg.json │─┐     │
                       │      └──────────────┘ └─────────────────┘ │─┐   │
                       │                         └─────────────────┘ │   │
                       │                           └──────────┬──────┘   │
                       │                                      │          │
                       ├──>[ ./fetchRepos.js ]<──────────────────────────┘
                       │             ^                        │
                       │             │                        │
                       │             v                        │
                       │  ┌───────────────────────────┐       │
                       │  │ repo*/myOwner/myRepo.json │─┐     │
                       │  └───────────────────────────┘ │─┐   │
                       │    └───────────────────────────┘ │   │
                       │      └────┬──────────────────────┘   │
                       │           │                          │
                       │           │          ┌───────────────┘
                       │           │          │
                       v           v          v
                   [ ./calculateContribsAndMeta.js ]
                           │               │
                           v               v
       ┌──────────────────────┐         ┌───────────┐
       │ contribs/myuser.json │─┐       │ meta.json │
       └──────────────────────┘ │─┐     └───────────┘
         └──────────────────────┘ │
           └──────────────────────┘

NOTES:

  • These scripts also delete unreferenced data.
  • Instead of calling each of these scripts directly, you can call ./fetchAndCalculateAll.sh which will orchestrate them.
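
To make the data flow easier to follow, the sketch below lists the scripts in one order that is consistent with the diagram: user details first, then organizations and repositories, then the calculation step. The helper `pipelineCommands` is hypothetical, and the real ./fetchAndCalculateAll.sh may order or invoke the scripts differently.

```javascript
// Hypothetical helper (not from this repository): builds the list of commands
// in an order consistent with the data-flow diagram. It only describes the
// commands; it does not run them.
function pipelineCommands(users) {
  return [
    // One call per tracked user, as drawn in the diagram.
    ...users.map(user => ['./fetchUserDetailsAndContribs.js', user]),
    // Both of these read the users/ files written by the previous step.
    ['./fetchOrgs.js'],
    ['./fetchRepos.js'],
    // Reads users, orgs and repos, then writes contribs/ and meta.json.
    ['./calculateContribsAndMeta.js'],
  ];
}

for (const command of pipelineCommands(['myUser'])) {
  console.log(command.join(' '));
}
```

According to the diagram, ./fetchOrgs.js and ./fetchRepos.js do not depend on each other's output, so their relative order here is arbitrary.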

Production JSON files

The production JSON files are currently stored on S3 and exposed to the front end over HTTPS.

Every few days a backup named YYYY-MM-DD.tar.gz containing all the JSON files is created, e.g. 2018-10-07.tar.gz.
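
The backup naming scheme can be sketched as follows. The helper `backupName` is hypothetical, and using the UTC date is an assumption; the text does not say which time zone the backup job uses.

```javascript
// Hypothetical helper (not from this repository): formats a backup file name
// as YYYY-MM-DD.tar.gz. Assumption: the date is taken in UTC.
function backupName(date) {
  // toISOString() returns e.g. "2018-10-07T00:00:00.000Z"; the first ten
  // characters are the YYYY-MM-DD part.
  return `${date.toISOString().slice(0, 10)}.tar.gz`;
}

console.log(backupName(new Date(Date.UTC(2018, 9, 7)))); // 2018-10-07.tar.gz
```

Names in this format sort chronologically when sorted alphabetically, which makes finding the latest backup straightforward.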

Contributors

Thanks goes to these wonderful people (emoji key):


Aurelien Lourot

💬 💻 📖 👀

Charles

💻 📖 🤔

Romuald Brillout

🤔

This project follows the all-contributors specification. Contributions of any kind welcome!
