datameet / covid19

Novel Corona Virus - COVID-19 India Datasets by DataMeet. Sunsetted on 2022-10-21. Will not update anymore.

Home Page: https://projects.datameet.org/covid19/

License: Other

HTML 99.95% Shell 0.01% Python 0.02% CSS 0.01% JavaScript 0.02%
datameet covid19 covid19-data covid19-tracker covid19-india india opendata covid-19

covid19's People

Contributors

aadityadar, arvnd, geohacker, jebertprime, kushalre, tbnv999, thejeshgn, vinaywadhwa

covid19's Issues

No Testing Data for 12-08-20

@thejeshgn
I'm running a cron job for a project that fetches the testing data from icmr_testing_status.json at 05:00 every morning.

The job failed yesterday because the day's data was missing. Was this a one-off error, or do I need to add data validation to the script?

Appreciate the resource you guys have put together.

Update: the error was in the script.
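On the data-validation question, a minimal sketch of a guard a scheduled job could run before using the day's numbers. It assumes each record carries a report_time string like the MoHFW records quoted elsewhere on this page; the actual layout of icmr_testing_status.json is not shown here, and the names records_for_day, require_day and sample are hypothetical.

```python
from datetime import date


def records_for_day(records, day):
    """Return the records whose report_time falls on the given calendar day."""
    prefix = day.isoformat()  # e.g. "2020-08-12"
    return [r for r in records if str(r.get("report_time", "")).startswith(prefix)]


def require_day(records, day):
    """Fail early with a clear message instead of breaking later on missing data."""
    found = records_for_day(records, day)
    if not found:
        raise LookupError(f"no record with report_time on {day.isoformat()}")
    return found


# Hypothetical sample records; the real file's layout may differ.
sample = [
    {"report_time": "2020-08-11T09:00:00.00+05:30"},
    {"report_time": "2020-08-13T09:00:00.00+05:30"},
]
```

Calling require_day(sample, date(2020, 8, 12)) raises LookupError, which a cron wrapper could catch to retry later rather than crash.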

Data inconsistencies

In the daily cases for MP state on 13-04-2021, the confirmed count is wrong: it reads 34464, between 338145 on the 12th and 353632 on the 14th.

    {
      "id": "2021-04-12T08:00:00.00+05:30|mp",
      "key": "2021-04-12T08:00:00.00+05:30|mp",
      "value": {
        "_id": "2021-04-12T08:00:00.00+05:30|mp",
        "_rev": "1-933316a7408ba6a730c23622d73e30f3",
        "state": "mp",
        "report_time": "2021-04-12T08:00:00.00+05:30",
        "cured": 298645,
        "death": 4184,
        "confirmed": 338145,
        "source": "mohfw",
        "type": "cases"
      }
    },
    {
      "id": "2021-04-13T08:00:00.00+05:30|mp",
      "key": "2021-04-13T08:00:00.00+05:30|mp",
      "value": {
        "_id": "2021-04-13T08:00:00.00+05:30|mp",
        "_rev": "1-a5060878199d1a56e015cf657efecad6",
        "state": "mp",
        "report_time": "2021-04-13T08:00:00.00+05:30",
        "cured": 301762,
        "death": 4221,
        "confirmed": 34464,
        "source": "mohfw",
        "type": "cases"
      }
    },
    {
      "id": "2021-04-14T08:00:00.00+05:30|mp",
      "key": "2021-04-14T08:00:00.00+05:30|mp",
      "value": {
        "_id": "2021-04-14T08:00:00.00+05:30|mp",
        "_rev": "1-80b865800fe853b2ee610ea6b1b43d25",
        "state": "mp",
        "report_time": "2021-04-14T08:00:00.00+05:30",
        "cured": 305832,
        "death": 4261,
        "confirmed": 353632,
        "source": "mohfw",
        "type": "cases"
      }
    }
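Since these counts are cumulative, a value lower than the previous day's is a quick signal of this kind of error. The following is a rough sketch, not the maintainers' tooling: find_drops is a hypothetical helper, it expects records for a single state, and it assumes every report_time uses the same format and UTC offset so that string order matches time order.

```python
def find_drops(records, field="confirmed"):
    """Return (previous, current) pairs where a cumulative field decreased."""
    # Assumption: identical timestamp format and offset, so sorting the
    # strings sorts the records chronologically.
    ordered = sorted(records, key=lambda r: r["report_time"])
    drops = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[field] < prev[field]:
            drops.append((prev, cur))
    return drops


# The three "confirmed" values reported in this issue.
mp = [
    {"report_time": "2021-04-12T08:00:00.00+05:30", "confirmed": 338145},
    {"report_time": "2021-04-13T08:00:00.00+05:30", "confirmed": 34464},
    {"report_time": "2021-04-14T08:00:00.00+05:30", "confirmed": 353632},
]
suspects = [cur["report_time"] for _, cur in find_drops(mp)]
```

Here suspects contains only the 2021-04-13 record. A day that is wrong but still higher than the previous day would not be caught by this check.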

Error in 2021-04-13 MoHFW data

Error: #32

Should I fix the raw downloaded JSON file?

2021-04-13T08:00:00.00+05:30_md5_fe0e82466266d5fe3b71e051d289810b.json

For now, I have manually fixed the parsed data.

Error in data.

The data in data/mohfw.json has a slight error. The numbers for Kerala are:

"report_time":"2020-07-30T08:00:00.00+05:30","cured":11365,"death":68,"confirmed":21797

"report_time":"2020-07-31T08:00:00.00+05:30","cured":12159,"death":70,"confirmed":2203,

"report_time":"2020-08-01T08:00:00.00+05:30","cured":13023,"death":73,"confirmed":23613

For the 31st, the confirmed count (2203) is inconsistent with the days on either side, possibly because of a scraping error.

mohfw json file is incomplete

Something went wrong with the last update to mohfw.json, and the file is incomplete.
It terminates abruptly at line 439.
You are doing amazing work, and I used your JSON file to build a dashboard. However, the dashboard now fails because of the incomplete JSON file.
I would really appreciate it if you could fix this.
Thanks in advance.
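On the consumer side, a dashboard can guard against a truncated file by parsing defensively and keeping the last copy that parsed cleanly. This is only a sketch under assumptions: parse_or_keep is a hypothetical helper, and the "rows" key in the sample is illustrative rather than a confirmed description of mohfw.json.

```python
import json


def parse_or_keep(text, last_good):
    """Parse JSON text; on failure, keep the previous data and report where parsing stopped."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return last_good, f"line {err.lineno}, column {err.colno}: {err.msg}"


# Illustrative truncated input; the "rows" key is an assumption.
truncated = '{"rows": [{"id": "a"}, {"id": '
data, problem = parse_or_keep(truncated, last_good={"rows": []})
```

When parsing fails, data is the fallback and problem describes the position of the failure, which is useful to include in a bug report like this one.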

Do you have district wise data for COVID-19 India?

Do you have district-wise data for Indian states? The district-wise PDF maintained on the MoHFW site is not being updated, so I am trying my luck: do you have a source that can be referenced, or do you already have the data?
Also, a request: I hope you have seen my dashboard. If not, please take a look. Could you list it on your website under "Projects Using this dataset"? It would be really great :) Thanks in advance.
https://app.powerbi.com/view?r=eyJrIjoiNWEyNThlZTItYTY3MC00NDM5LWEyYTgtZDBiMzc4MmNlNDdiIiwidCI6ImM4ZWNhM2NhLTEyNzYtNDZkNS05ZDlkLWEwZjJhMDI4OTIwZiIsImMiOjl9

COVID19 Vaccination Data

India

{
    "_id": "2021-01-20T09:00:00.00+05:30|vaccinations",
    "_rev": "1-c557793c024b73c7870406896138f504",
    "report_time": "2021-01-20T09:00:00.00+05:30",
    "total": 674835,
    "source": "mohfw",
    "type": "vaccinations"
}

Karnataka

It's getting published as part of the bulletin here.

Looking for the following ICMR press releases

District-wise COVID-19 test positivity rates - Older Documents

MoHFW is publishing "District-wise COVID-19 test positivity rates" as an Excel file every day. I noticed it yesterday, so I have the files for the 9th and 10th of June.

I don't know when they started, so we will have to get the older ones. But they seem to remove old files; for example, the 9th June XLS is no longer accessible.

For 10th June, it is

https://www.mohfw.gov.in/pdf/COVID19DistrictWisePositivityAnalysis10thJune.xlsx

Currently I am not parsing the XLS files, just archiving them. They are available here:

https://github.com/datameet/covid19/tree/master/downloads/mohfw-backup/district_wise_positivity_rates

If you have the older documents, can you send me a MR?

Load District-wise COVID-19 test positivity rates

The JSON doc format could be:

{
  "_id": "2022-01-22T09:00:00.00+05:30|positivity-rates|wikidataId",
  "report_time": "2022-01-22T09:00:00.00+05:30",
  "district":"wikidataId/<wikidataId>",
  "positivity": null,
  "rtpcr_contribution": null,
  "rat_contribution": null,
  "source": "icmr",
  "type": "district-wise-weekly-positivity-rates"
}
  • report_time is also the end of the week for the data point. Do we need a start date?
  • Is wikidataId a better way to represent the district? The other option is the local government ID.
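To make the proposal concrete, here is a small sketch that builds one document in the shape above. It is not the repository's loader: positivity_doc is a hypothetical helper, "Q0000" is a placeholder identifier, and the week_start field is only one possible answer to the start-date question, assuming a 7-day week that ends on report_time.

```python
from datetime import date, timedelta


def positivity_doc(report_time, wikidata_id, positivity=None,
                   rtpcr_contribution=None, rat_contribution=None):
    """Build one district record in the proposed shape, plus a derived week_start."""
    # Use only the date part so the sketch does not depend on how the
    # fractional seconds in report_time are parsed.
    week_end = date.fromisoformat(report_time[:10])
    return {
        "_id": f"{report_time}|positivity-rates|{wikidata_id}",
        "report_time": report_time,
        # Assumption: the week is the 7 days ending on report_time.
        "week_start": (week_end - timedelta(days=6)).isoformat(),
        "district": f"wikidataId/{wikidata_id}",
        "positivity": positivity,
        "rtpcr_contribution": rtpcr_contribution,
        "rat_contribution": rat_contribution,
        "source": "icmr",
        "type": "district-wise-weekly-positivity-rates",
    }


# "Q0000" is a placeholder, not a real district identifier.
doc = positivity_doc("2022-01-22T09:00:00.00+05:30", "Q0000")
```

If the start date can always be derived this way, storing it is redundant; storing it explicitly would only matter if the reporting window ever varies.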

1 error in data

There is an entry in the non-virus deaths data with the date '2010-04-12'; it should probably be '2020-04-12'.

I sorted the data by date, and that's why it popped out. Thanks 😄

Symbols for states

Thank you for this wonderful piece of work. It is really good to have a source of data which truly represents what the government agencies put out.

mohfw.json uses 2-letter codes to represent state names, but these codes are never explained. For example, does 'la' refer to Lakshadweep or to Ladakh? From the ministry, it appears that 'la' means Lakshadweep, whereas in your dataset 'la' appears to mean Ladakh. I hope you can provide a document that maps the 2-letter state codes to the actual state names.

Loading Vaccination State Data

Hi All,
If you follow the datameet/covid19[1] GitHub repo, you will have seen that we download[2] the cumulative COVID vaccination report daily. As of now, we extract only Total Doses at the India level and add it to the JSON[3]. For each day, the data looks like this.

	{
	  "_id": "2021-09-12T09:00:00.00+05:30|vaccinations",
	  "report_time": "2021-09-12T09:00:00.00+05:30",
	  "total":738207378,
	  "source": "mohfw",
	  "type": "vaccinations"
	}

Now I have written a script to extract other parts of the PDF, including the state-level data. The dataset will be backward compatible (it shouldn't break any of your data pipelines). At the India level it will look like this, with two additional attributes, "1stdose" and "2nddose".

{
  "_id": "2021-09-12T09:00:00.00+05:30|vaccinations",
  "report_time": "2021-09-12T09:00:00.00+05:30",
  "total":738207378,
  "1stdose":561101965,
  "2nddose":177105413,
  "source": "mohfw",
  "type": "vaccinations"
}

There will be new records at the state level, which will look like this.


{
  "_id": "2021-09-12T09:00:00.00+05:30|vaccinations|ka",
  "report_time": "2021-09-12T09:00:00.00+05:30",
  "state": "ka",
  "total": 47445632,
  "1stdose":35196111,
  "2nddose":12249521,  
  "source": "mohfw",
  "type": "vaccinations"
}

For the Unassigned or Miscellaneous state, it will be:

{
  "_id": "2021-09-12T09:00:00.00+05:30|vaccinations|unassigned",
  "report_time": "2021-09-12T09:00:00.00+05:30",
  "state": "unassigned",
  "total": 3458791,
  "1stdose":1556469,
  "2nddose":12249521,  
  "source": "mohfw",
  "type": "vaccinations"
}
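For anyone adapting a pipeline to the new records, a minimal sketch of two checks: separating India-level records from state-level ones by the presence of the "state" key, and comparing total with the sum of the two dose fields. Both helpers are hypothetical, and treating total as exactly 1stdose + 2nddose is an assumption based on the examples above, not a documented guarantee.

```python
def split_levels(records):
    """Separate India-level records (no "state" key) from state-level ones."""
    national = [r for r in records if "state" not in r]
    by_state = [r for r in records if "state" in r]
    return national, by_state


def dose_mismatch(record):
    """Return total - (1stdose + 2nddose), or None for old-style records without dose fields."""
    if "1stdose" not in record or "2nddose" not in record:
        return None
    return record["total"] - (record["1stdose"] + record["2nddose"])


# Figures copied from the example records above.
examples = [
    {"total": 738207378, "1stdose": 561101965, "2nddose": 177105413},
    {"state": "ka", "total": 47445632, "1stdose": 35196111, "2nddose": 12249521},
    {"state": "unassigned", "total": 3458791, "1stdose": 1556469, "2nddose": 12249521},
]
national, by_state = split_levels(examples)
mismatches = [dose_mismatch(r) for r in examples]
```

With these figures the India and ka examples sum exactly, while the unassigned example does not; that may simply be a copy-paste slip in the illustrative record, since its 2nddose matches the ka value.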

I am currently parsing and loading the old data (since 2021-03-08). It should be available by this weekend.

Once this is done, I will look into parsing and loading the district-wise positivity rates.

You can follow the progress of this data load here.

[1] https://github.com/datameet/covid19/

[2] https://github.com/datameet/covid19/tree/master/downloads/mohfw-backup/cumulative_vaccination_coverage

[3] https://github.com/datameet/covid19/blob/master/data/mohfw_vaccination_status.json

Where can I find the test positivity rate?

Hi, I noticed that Our World in Data points to this repository for data on India's test positivity rate. Do you know where I can find this data here? I was able to find the total number of tests provided by ICMR, but not the number or % of tests that are positive. Is this calculated by combining MOHFW data on the number of cases with the ICMR testing numbers? Thanks.

Issue while cloning

I faced the following error while cloning the repo on my laptop
OS - Windows 11

image

The clone is successful after the git restore command in the above image.

Incorrect ICMR testing numbers for 2020-11-06

Just a heads up: the cumulative testing number for the 6th of Nov is incorrect in icmr_testing_status.json – 20864750 instead of the 11,54,29,095 in the ICMR report.
