Git Product home page Git Product logo

epimodel's Introduction

Epidemics data processing and modelling toolkit

A data library and a toolkit for modelling COVID-19 epidemics.

Main concepts

  • Region database (continents, countries, provinces, GLEAM basins) - codes, names, basic stats, tree structure (TODO).
  • All data, including the region database, is stored in epimodel-covid-data repo for asynchronous updates.
  • Each region has an ISO-based code, all datasets are organized by those codes (as a row index).
  • Built on Pandas with some helpers, using mostly CSVs and HDF5.
  • Algorithms and imports assuming common dataframe structure (with Code and optionally Date row index).
  • All dates are UTC timestamps, stored in ISO format with TZ.

Install

  • Get Poetry
  • Clone this repository.
  • Install the dependencies and this lib poetry install (creates a virtual env by default).
  • Clone the epimodel-covid-data repository. For convenience, I recommend cloninig it inside the epimodel repo directory as data.
## Clone the repositories (or use their https://... withou github login)
git clone [email protected]:epidemics/epimodel.git
cd epimodel
git clone [email protected]:epidemics/epimodel-covid-data.git data

## Install packages
poetry install  # Best run it outside virtualenv - poetry will create its own
# Alternatively, you can also install PyMC3 or Pyro, and jupyter (in both cases):
poetry install -E pymc3
poetry install -E pyro

## Or, if using conda, install (a likely list): pandas pymc3 unidecode jupyter ...

poetry shell # One way to enter the virtualenv (if not active already)
poetry run jupyter notebook  # For example

Basic usage

from epimodel import RegionDataset, read_csv

# Read regions
rds = RegionDataset.load('data/regions.csv')
# Find by name or by code
cz = rds['CZ']
cz = rds.find_one_by_name('Czech Republic')
# Use attribute access on Region
print(cz.Name)
# TODO: attributes for tree-structure access

# Load John Hopkins CSSE dataset with our helper (creates indexes etc.)
csse = read_csv('data/johns-hopkins.csv')
print(csse.loc[('CZ', "2020-03-28")])

Development

  • Use Poetry for dependency management.
  • We enforce black formatting (with the default style).
  • Use pytest for testing, add tests for your code!
  • Use pull requests for both this and the data repository.

epimodel's People

Contributors

mrinanksharma avatar gavento avatar davidoj avatar anniestephenson avatar euanong avatar jez4 avatar jsalvatier avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.