epicollect / epi-collect

🌏🌎🌍 Liberate Google Takeout location data for epidemiological research and local contact tracing https://epi-collect.org

License: MIT License

Makefile 0.79% Dockerfile 0.75% Python 19.95% HTML 0.66% TypeScript 75.29% CSS 2.56%
covid-19 covid19 coronavirus coronavirus-tracking google-takeout epidemiology epidemiology-analysis

epi-collect's Introduction

Epi-Collect uses location data from Google Takeout to build an open source contact tracing dataset.

Website • Slack • Roadmap • FAQ • Privacy

Current Engineering Milestone: Pre-launch (2 active contributors)
Current Researchers: Become our first researcher
  • Establish privacy-respecting best practices for data donation
  • Create a community-driven dataset standard for contact tracing
  • Enable researchers and city health departments to investigate the spread of COVID-19 and other diseases using donated data
Is my data kept safe and private?
Yes, and we empathize with your concern. The biggest problem with recent contact tracing solutions is that they may be a gateway to surveillance capitalism in the name of public safety. There is a shrinking window of opportunity available today to set a precedent for privacy-respecting contact tracing. As an open source project with all documentation in the open, Epi-Collect is in a unique position to do that. No one has scaled open source data donation before, and we're excited to test its potential.

Check out our Privacy living document to see how we think about this and how we hope others will too.
Is my data anonymized?
Yes.

  • We've designed our database such that there is no possible way to associate location data with your identity. If you're an engineer, you can see our very simple database schema here.
  • During data ingestion, we ask users to review every data point and delete those that they believe are personally identifiable. We also give hints about what data points may be personally identifiable.
  • We do not make the dataset available to a researcher unless they pass certain verification requirements.
Please see our Privacy living document for more details.
How do I get access to the data?
Please see our guidance for researchers.

Full FAQ

Setup

Make sure you have yarn and virtualenv installed.

git clone git@github.com:epicollect/epi-collect.git
cd epi-collect
yarn install
virtualenv --python=python3.6 venv
source ./venv/bin/activate
pip install -r requirements.txt

Run for development

To start:

make run-dev
export PYTHONPATH="$PWD"
make run-db-local

To stop:

make stop-dev
make stop-db-local

If you want to test using the docker containers (which is closer to deployment):

make build-docker
make run-docker-local

Local and deployment structure

The frontend is built in React with TypeScript. We use React Bootstrap for the UI.

The backend is built with Flask and uses GeoAlchemy (a GIS extension for SQLAlchemy) to communicate with a PostGIS database for persistent storage.

Local

Locally you can run in two ways:

  1. Using yarn and Flask (make run-dev), in which case all traffic on /api is routed to Flask. In this setup, make run-db-local will spin up a local PostGIS instance with the correct schema.

  2. Using docker-compose, in which case the same Docker containers as in the actual deployment are created but spun up locally. The database doesn't work in this setup.

Testing

Manual

A Google Takeout zip file with location data is located under tests/data/sample_location_history.zip.

Automatic

See tests/test_api.py.

Deployment

We deploy using make deploy (this requires AWS access), which builds the following Docker containers:

  1. nginx container to serve the frontend React app.
  2. gunicorn container to serve the Python backend.

These are pushed to Docker Hub. We then deploy them to AWS Elastic Beanstalk, where an nginx reverse proxy sits behind the AWS load balancer and routes all traffic on /api to the gunicorn container and all other traffic to the frontend nginx container.

There is also a PostGIS database running in AWS RDS (Postgres with PostGIS extensions enabled).
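
The /api routing rule described above can be illustrated with a small sketch. This is not the project's nginx configuration; the function name pick_upstream and the labels "backend" and "frontend" are made up for illustration, and the exact prefix-matching behaviour of the real proxy is an assumption.

```python
def pick_upstream(path):
    """Decide which container should handle a request path.

    Assumption: only "/api" itself and paths below "/api/" go to the
    gunicorn backend; everything else is served by the frontend nginx.
    """
    if path == "/api" or path.startswith("/api/"):
        return "backend"
    return "frontend"


if __name__ == "__main__":
    for example in ("/", "/wizard", "/api/upload", "/apidocs"):
        print(example, "->", pick_upstream(example))
```

Matching on "/api/" rather than the bare prefix "/api" keeps an unrelated frontend path that merely starts with the same letters from being sent to the backend.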

epi-collect's People

Contributors

larsmennen, nessup

epi-collect's Issues

Unzip data on the client

Currently we unzip the Google Takeout data on the server. This works well, but it requires large data uploads and processing power on the server. It would be better to do this on the client.
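
For reference, server-side extraction of this kind might look roughly like the sketch below, which uses only the Python standard library. It is not the project's actual code: the function name read_takeout_locations is hypothetical, and the archive layout (a JSON file under a "Location History" folder with a top-level "locations" list whose entries carry latitudeE7, longitudeE7 and timestampMs) is an assumption about the Takeout export format.

```python
import io
import json
import zipfile


def read_takeout_locations(zip_bytes):
    """Return (latitude, longitude, timestamp_ms) tuples from a Takeout zip."""
    points = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Assumption: location data lives in a JSON file under a
            # "Location History" folder; other archive members are skipped.
            if "Location History" not in name or not name.endswith(".json"):
                continue
            data = json.loads(archive.read(name).decode("utf-8"))
            for entry in data.get("locations", []):
                points.append((
                    entry["latitudeE7"] / 1e7,   # E7 values are degrees * 10^7
                    entry["longitudeE7"] / 1e7,
                    int(entry["timestampMs"]),   # stored as a string in the export
                ))
    return points
```

The whole archive has to reach the server before any of this can run, which is the upload and processing cost this issue wants to avoid.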

In data deletion process, show user number of data points that will be deleted

Current state

At the moment, when a user submits their token on https://epi-collect.org/delete, all their data is deleted as soon as they click submit (after the backend has verified that the token is correct).

Desired state

When a user wants to delete their data:

  1. They enter their token
  2. They see in the UI that this will delete X locations and Y answers to questions they submitted.
  3. They can confirm deletion.

Changes required

  • In the backend (api.py), the current delete route will need to be split into two: one route that returns the number of data points and one to confirm deletion.
  • In the frontend, when submitting, we first call the first route, then show a modal / other UI component to the user stating how many data points there are and allow them to confirm. Then call the second route and delete the data.
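
The two-step flow above could be sketched as follows. This is a framework-free illustration, not the code in api.py: the function names count_user_data and confirm_delete, and the in-memory dict standing in for the database, are assumptions made for the example. In the real backend each function would sit behind its own route and query PostGIS instead.

```python
def count_user_data(store, token):
    """Step 1: report what would be deleted, without deleting anything."""
    if token not in store:
        return None  # unknown token; a route could answer with an error here
    record = store[token]
    return {
        "locations": len(record["locations"]),
        "answers": len(record["answers"]),
    }


def confirm_delete(store, token):
    """Step 2: delete the data once the user has confirmed."""
    counts = count_user_data(store, token)
    if counts is not None:
        del store[token]
    return counts  # counts of what was removed, or None for an unknown token


if __name__ == "__main__":
    # Hypothetical stand-in for the database, keyed by deletion token.
    store = {"abc": {"locations": [1, 2, 3], "answers": ["yes"]}}
    print(count_user_data(store, "abc"))  # shown to the user before confirming
    print(confirm_delete(store, "abc"))   # performed after confirmation
```

Keeping the count step free of side effects means the frontend can call it as often as it likes, for example when the user reopens the confirmation modal.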

Fix warning about state update on unmounted component

Description

When navigating back to the map to filter data, there's a minor bug where clicking on a previously drawn polygon causes an attempted state update on an unmounted component.

Steps to reproduce

  1. Go to http://localhost:3000/wizard
  2. Upload some data (e.g. in tests/data)
  3. Draw some polygons on the map
  4. Click 'Next'
  5. Now go back via the breadcrumbs to 'Review and filter data'.
  6. Click the "hand" icon to go into dragging mode.
  7. Click on the polygon.
  8. Console will throw:
index.js:1 Warning: Can't perform a React state update on an unmounted component. This is a no-op, but it indicates a memory leak in your application. To fix, cancel all subscriptions and asynchronous tasks in the componentWillUnmount method.
    in GeoMap (at GeoMap.tsx:23)
    in Unknown (at SelectData.tsx:21)
    in div (at SelectData.tsx:20)
    in SelectData (created by Context.Consumer)
    in Route (at Wizard.tsx:85)

This warning should not appear.
