Cough Against COVID-19

Code release for the Cough Against COVID-19 project by the Wadhwani Institute for Artificial Intelligence, supported by USAID and the Gates Foundation.

In order to use this code, you need to follow the steps below. Please check the pre-requisites to decide whether your system is compatible.

Pre-requisites

  • CPU-only machine OR GPU-enabled machine
  • Docker installed on your machine
  • OS: Linux/Mac OS

Note: This code has been tested on Mac OS and Ubuntu.

Setup

We use Docker to manage code dependencies. Please follow the steps here to set up all dependencies. The code runs on both CPU-only and GPU machines. However, a GPU machine is recommended because CPU-only runs are much slower.

Using our trained models

We release models trained to predict COVID-19 status from cough/contextual (symptoms etc.) metadata.

Data version files

For the datasets used in this work, we create our own split files and release them publicly. Please run the following (from inside the Docker container) to download them to the assets/data/ folder.

python setup/download_data_splits.py

Pre-trained Models

Broadly, we release trained checkpoints for three kinds of models:

  • Cough-based ResNet-18 models for cough-detection
  • Cough-based ResNet-18 models for COVID-detection
  • Context-based TabNet models for COVID-detection

Please run the following (from inside the Docker container) to download them to the assets/models/ folder.

aws s3 sync --no-sign-request --region=ap-south-1  s3://covid-ml-data/ assets/
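After both download steps, it can help to confirm that the expected folders were actually populated. The following is a minimal sketch, not part of the repository: the helper name `missing_asset_dirs` is hypothetical, and it only assumes the `assets/data/` and `assets/models/` locations mentioned above.

```python
from pathlib import Path

# Folder names taken from the README text; the helper itself is hypothetical.
EXPECTED_SUBDIRS = ("data", "models")


def missing_asset_dirs(assets_root="assets"):
    """Return the expected sub-folders that are absent or empty."""
    root = Path(assets_root)
    missing = []
    for name in EXPECTED_SUBDIRS:
        folder = root / name
        # A folder counts as present only if it exists and holds at least one entry.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_asset_dirs()
    print("all assets present" if not gaps else f"missing or empty: {gaps}")
```

An empty result suggests both downloads left files behind. It does not verify that the files are complete or uncorrupted.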

Demo notebooks

To try out our model(s) on sample data, please follow the instructions.

  • Cough-based model: Follow the notebook here to predict COVID from cough using a pretrained model released with the repository. If you want to try on your own cough samples, you can record and store them in assets/data/ and run the notebook by changing appropriate paths.

  • Context-based model: Follow the notebook here to predict COVID from contextual features such as age, symptoms and travel history. To try it on your own contextual features, modify the relevant cells and run the notebook.

Training/evaluating/fine-tuning your own models

To use our datasets and other public datasets as part of this work, you first need to download and process the datasets, and then create your own configs to train models.

Datasets

We use a combination of publicly available datasets and datasets we collected ourselves. Please follow the steps here to download and process all datasets.

⚠️ Note: Our own dataset, wiai-facility, collected across 27 facilities in India, has not been released yet because legal constraints prevent us from sharing the data. We are trying to resolve these before releasing the dataset in any form.

Training

Training on existing datasets

To run training on the datasets downloaded in the previous step, please follow the steps here.

Training any custom model on any given dataset

To train on your own dataset(s), first set up the dataset following steps similar to those given here for the existing datasets. This includes downloading it and arranging it in the right folder structure, processing it, and splitting it (train-validation-test). Next, create a new .yml config file (like this) and configure the dataset section:

dataset:
  name: classification_dataset
  config:
    - name: <name-of-your-dataset>
      version: <version-of-your-dataset>

You can also play around with various other hyperparameters in the config like optimizer, scheduler, batch sampler method, random crop duration, network architecture etc.

🚧 More coming soon!

Evaluation

You can evaluate your own trained models or use released model checkpoints on a given dataset. Instructions for both of these are given here.

Evaluating any custom model on any given dataset

🚧 Coming soon!

Documentation

🚧 Coming soon!

Citing us

If you find this code useful, kindly consider citing our papers and starring our repository:

@misc{sharma2021impact,
      title={Impact of data-splits on generalization: Identifying COVID-19 from cough and context}, 
      author={Makkunda Sharma and Nikhil Shenoy and Jigar Doshi and Piyush Bagad and Aman Dalmia and Parag Bhamare and Amrita Mahale and Saurabh Rane and Neeraj Agrawal and Rahul Panicker},
      year={2021},
      eprint={2106.03851},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}

@misc{bagad2020cough,
      title={Cough Against COVID: Evidence of COVID-19 Signature in Cough Sounds}, 
      author={Piyush Bagad and Aman Dalmia and Jigar Doshi and Arsha Nagrani and Parag Bhamare and Amrita Mahale and Saurabh Rane and Neeraj Agarwal and Rahul Panicker},
      year={2020},
      eprint={2009.08790},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}

Code Contributors (in alphabetical order):

And Jigar Doshi, the Research Lead of the project.

Acknowledgements:

Reporting issues/bugs/suggestions: If you need to bring our attention to bugs or suggest modifications, kindly create an issue and we will try our best to address it. Please feel free to contact us if you have queries.

cough-against-covid's People

Contributors

jigar-wiai, ms-wiai, piyush-bagad, shenoynikhil98

cough-against-covid's Issues

Setup README [readability] improvements

Readability

Can we be clear on what the below sentence is trying to say?

Setup data and output folders: In order to run code for this project, we expect a certain directory structure for storing dataset(s) and model outputs. For example, suppose you create a common folder: /Users/piyushbagad/cac/. The data and outputs will reside at /Users/piyushbagad/cac/data/ and /Users/piyushbagad/cac/outputs/ respectively. Next, in the outputs/ folder, create a folder by your name (e.g. piyush/).

Can the above statement be replaced by the text below?

Create following folder structure:

example/folder/
          ├── data
          └── outputs
              └── <user>
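The proposed layout could also be created programmatically. The sketch below is illustrative only: the function name `create_project_folders` is hypothetical and not part of the repository, and it assumes nothing beyond the `data/` and `outputs/<user>/` folders described in this issue.

```python
import tempfile
from pathlib import Path


def create_project_folders(root, user):
    """Create <root>/data and <root>/outputs/<user>, returning both paths."""
    root = Path(root)
    data_dir = root / "data"
    user_dir = root / "outputs" / user
    # exist_ok=True makes the call safe to repeat on an existing layout.
    data_dir.mkdir(parents=True, exist_ok=True)
    user_dir.mkdir(parents=True, exist_ok=True)
    return data_dir, user_dir


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        data_dir, user_dir = create_project_folders(Path(tmp) / "cac", "piyush")
        print(data_dir.is_dir(), user_dir.is_dir())
```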

Issues

[Resolved: Tried CPU only code] When running the command bash create_container.sh -g 0 -n sample-container -e ../data-outputs/ -u apoorv -p 8001, I got the following error.

=> Firing docker container with nvidia-docker
create_container.sh: line 45: nvidia-docker: command not found

I don't have sudo permissions on the on-prem machine, so I cannot install nvidia-docker.
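A pre-flight check along the following lines could catch this before the container script runs. It is a hypothetical sketch, not how create_container.sh works: the function name `pick_docker_launcher` is invented here, and the fallback to plain `docker` simply mirrors the CPU-only workaround mentioned in the issue title.

```python
import shutil


def pick_docker_launcher(want_gpu, which=shutil.which):
    """Return 'nvidia-docker' when a GPU is requested and available, else 'docker'.

    `which` is injectable so the lookup can be replaced in tests.
    """
    if want_gpu and which("nvidia-docker") is not None:
        return "nvidia-docker"
    if which("docker") is None:
        raise RuntimeError("docker was not found on PATH")
    # Either no GPU was requested or nvidia-docker is missing: fall back to CPU.
    return "docker"
```

For example, `pick_docker_launcher(want_gpu=True)` returns `"docker"` on a machine that has Docker but not nvidia-docker, which signals that the run will be CPU-only.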

[INSTALL] Issue in installation step(s)

Select tool that caused the issue:

  • [x] Docker
  • [ ] Jupyter
  • [ ] CUDA related
  • [ ] Missing pip package/other dependencies
  • [ ] Others

Describe the issue and add error trace

1.
python setup/download_data_splits.py

=> Downloading to assets/data.zip
Permission denied: https://drive.google.com/uc?id=1g9B7oDB4d5X-pJ4tLI5NbpyV_qmHi-DK
Maybe you need to change permission over 'Anyone with the link'?
=> Unzipping /workspace/cough-against-covid/assets/data.zip to /workspace/cough-against-covid/assets
Traceback (most recent call last):
  File "setup/download_data_splits.py", line 26, in <module>
    with zipfile.ZipFile(zip_file, 'r') as zip_ref:
  File "/opt/conda/lib/python3.6/zipfile.py", line 1113, in __init__
    self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/cough-against-covid/assets/data.zip'

2.
python setup/download_model_ckpts.py

Neither the data splits nor the checkpoints can be downloaded.
=> Downloading to assets/
Permission denied: https://drive.google.com/uc?id=1fkuQOEL3V7tDSMo0TvzKvWiMzuRjHgvd
Maybe you need to change permission over 'Anyone with the link'?
=> Unzipping None to /workspace/cough-against-covid/assets
Traceback (most recent call last):
  File "setup/download_model_ckpts.py", line 27, in <module>
    with zipfile.ZipFile(zip_file, 'r') as zip_ref:
  File "/opt/conda/lib/python3.6/zipfile.py", line 1131, in __init__
    self._RealGetContents()
  File "/opt/conda/lib/python3.6/zipfile.py", line 1194, in _RealGetContents
    endrec = _EndRecData(fp)
  File "/opt/conda/lib/python3.6/zipfile.py", line 264, in _EndRecData
    fpin.seek(0, 2)
AttributeError: 'NoneType' object has no attribute 'seek'
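In both traces the unzip step runs even though the download failed: once the archive is missing, and once the path is `None`. A guard like the one below would surface the real cause instead of a zipfile error. This is a sketch only: the function name `unzip_if_downloaded` is hypothetical, and how the actual setup scripts obtain `zip_path` is not shown in this report.

```python
import zipfile
from pathlib import Path


def unzip_if_downloaded(zip_path, target_dir):
    """Extract zip_path into target_dir, failing early with a clear message."""
    if zip_path is None:
        # Matches the "Unzipping None" case in the second trace.
        raise RuntimeError("download returned no file; check the link's sharing permissions")
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        # Matches the missing data.zip case in the first trace.
        raise FileNotFoundError(f"expected archive not found: {zip_path}")
    if not zipfile.is_zipfile(zip_path):
        raise RuntimeError(f"not a valid zip archive: {zip_path}")
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(target_dir)
    return Path(target_dir)
```

This does not fix the permission problem on the shared link itself. It only makes the failure point at the download rather than at the archive handling.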

Describe your system configuration

  • Operating system
  • [x] Linux
  • [ ] Mac OS
  • [ ] Windows
  • [ ] Other

NotebookApp.password exposure

The "NotebookApp.password" appears in this file. What is its purpose?

tmux send-keys -t JupLabSession.0 "cd ../; jupyter lab --no-browser --ip 0.0.0.0 --port $1 --NotebookApp.password='sha1:2886cd70fbe8:f9e30a74d8464ac7320cbd6f81566093ac8e1cb2'" ENTER

Is inclusion in this script intentional? Is this something the user should/can generate on their own?
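For context, the value has the shape `sha1:<salt>:<digest>`, which looks like a salted password hash rather than a plain-text password. The sketch below shows how a user might produce and check a value of that shape with the standard library alone. It assumes the scheme used by classic Jupyter Notebook (SHA-1 over the passphrase followed by the salt); the function names `hash_password` and `check_password` are hypothetical, and in practice the `passwd` helper shipped with Jupyter is the safer way to generate one.

```python
import hashlib
import secrets


def hash_password(passphrase, salt=None):
    """Return 'sha1:<salt>:<digest>' for the given passphrase.

    Assumption: digest = SHA-1(passphrase bytes + salt bytes), with a
    12-character hexadecimal salt, as in classic Jupyter Notebook.
    """
    if salt is None:
        salt = secrets.token_hex(6)  # 6 random bytes -> 12 hex characters
    digest = hashlib.sha1(passphrase.encode("utf-8") + salt.encode("ascii")).hexdigest()
    return f"sha1:{salt}:{digest}"


def check_password(hashed, passphrase):
    """Return True if passphrase matches a 'sha1:<salt>:<digest>' value."""
    algorithm, salt, digest = hashed.split(":", 2)
    if algorithm != "sha1":
        raise ValueError(f"unsupported algorithm: {algorithm}")
    candidate = hashlib.sha1(passphrase.encode("utf-8") + salt.encode("ascii")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return secrets.compare_digest(candidate, digest)
```

A user who generates their own value this way could pass it to `--NotebookApp.password` in place of the one committed in the script, so that the login password is not shared by everyone who clones the repository.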
