covid19-public's Introduction

Open data on COVID-19 in Malaysia

The scope and granularity of data in this repo will evolve over time.

All data is correct as of 23:59 on the stated date, unless stated otherwise.


Cases and Testing

  1. cases_malaysia.csv: Daily recorded COVID-19 cases at country level.
  2. cases_state.csv: Daily recorded COVID-19 cases at state level.
  3. clusters.csv: Exhaustive list of announced clusters with relevant epidemiological datapoints.
  4. tests_malaysia.csv: Daily tests (note: not necessarily unique individuals) by type at country level.
  5. tests_state.csv: Daily tests (note: not necessarily unique individuals) by type at state level.

Healthcare

  1. pkrc.csv: Flow of patients to/out of Covid-19 Quarantine and Treatment Centres (PKRC), with capacity and utilisation.
  2. hospital.csv: Flow of patients to/out of hospitals, with capacity and utilisation.
  3. icu.csv: Capacity and utilisation of intensive care unit (ICU) beds.

Deaths

  1. deaths_malaysia.csv: Daily deaths due to COVID-19 at country level.
  2. deaths_state.csv: Daily deaths due to COVID-19 at state level.

Vaccinations

  1. vax_malaysia.csv: Vaccinations (daily and cumulative, by dose type and brand) at country level.
  2. vax_state.csv: Vaccinations (daily and cumulative, by dose type and brand) at state level.
  3. vax_district.csv: Vaccinations (daily and cumulative, by dose type and brand) at district level.
  4. vax_school.csv: Vaccination coverage for public schools.
  5. vax_demog_age.csv: Vaccinations by age group, at district level.
  6. vax_demog_age_children.csv: Vaccinations by age group with single-year granularity for individuals < 18yo, at district level.
  7. vax_demog_sex.csv: Vaccinations by sex, at district level.
  8. vax_demog_ethnicity.csv: Vaccinations by ethnicity, at district level.
  9. vax_demog_nationality.csv: Vaccinations by nationality, at district level.
  10. vax_demog_highrisk.csv: Vaccinations for special categories (healthcare workers, OKU, individuals with comorbidities) at district level.

Mobility and Contact Tracing

  1. checkin_malaysia.csv: Daily checkins on MySejahtera at country level.
  2. checkin_state.csv: Daily checkins on MySejahtera at state level.
  3. checkin_malaysia_time.csv: Time distribution of daily checkins on MySejahtera at country level.
  4. trace_malaysia.csv: Daily casual contacts traced and hotspots identified by HIDE, at country level.

Static data

  1. population.csv (last updated from DOSM 2020 census, as published in 2022):
  • idxs: integer coding for states (employed in cases linelist, cluster file, and school vax file)
  • pop: total population (all other columns are subset of pop)
  • pop_18: population aged 18+
  • pop_60: population aged 60+, also a subset of pop_18
  • pop_12: population aged 12-17
  • pop_5: population aged 5-11

Static data will remain unchanged unless there is an update from the source, e.g. if DOSM makes an update to population estimates. We provide this data here not to supersede the source, but rather to be transparent about the data we use to compute key statistics e.g. the % of the population that is vaccinated. We also hope this ensures synchronisation (across various independent analysts) of key statistics.
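
As an illustration of how the population columns can serve as denominators, here is a minimal sketch in Python. The helper name pct_of_population and all figures are hypothetical, chosen only to show the arithmetic; they are not taken from population.csv or the vaccination files.

```python
def pct_of_population(count: int, population: int) -> float:
    """Return count as a percentage of population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * count / population


# Hypothetical figures for illustration only (not real data).
population_row = {"pop": 1_000_000, "pop_18": 700_000}
fully_vaccinated_adults = 350_000

# Use pop_18 as the denominator when the numerator counts adults only.
adult_coverage = pct_of_population(fully_vaccinated_adults, population_row["pop_18"])
print(f"{adult_coverage:.1f}% of adults")
```

Choosing the denominator that matches the numerator (pop_18 for adults, pop for everyone) is what keeps such statistics comparable across analysts.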

covid19-public's People

Contributors

adibzter, agnes-lyy, aidilsfwn, amirmazmi, atlas-github, khoohaoyit, leeliwei930, lekannao, moh-malaysia, patrickxchong, sameu-cloudtech, seowwj, syafix19, thomassiew, timriffe, weareblahs, wnarifin, zukelah

covid19-public's Issues

Cluster End Date

Can we get the end date for each cluster? Alternatively, could the last-updated timestamp at which total cases equal recovered cases be treated as the end date?

Actual daily positive test numbers

Request for actual daily positive test numbers from tests in tests_malaysia.csv.
Ideally split according to type of tests.

This data will most likely be delayed by a few days, which is fine. Definitely better than making the false assumption that positive cases reflect the positive rate.

Intention here is to link mobility data to positive test cases.

ICU data read through

Hi, a quick check on the ICU data: the aggregate daily icu_covid is far higher than the daily ICU cases announced by MKN. I'm not sure how to interpret these inconsistencies. Can anyone help? Much appreciated.

Demographic breakdown?

Excellent initiative, many thanks for posting this data. I wonder if it would be possible to offer tabulations of cases, deaths, tests, and vaccinations by age groups and sex breakdowns? This would be valuable information for making international comparisons.
Many thanks,
Tim Riffe

hosp_x in hospital.csv

What is the description for these columns in hospital.csv? I'm sorry if this has been clarified but I can't seem to find it anywhere. Thanks.

Request MySejahtera Checkins by State

We would like to request more granular MySejahtera data on check-ins by state and district. Currently, only national-level MySejahtera and HIDE data points are available.

This data would also show the pattern of clusters vs movements, or of MySejahtera-recorded vs non-MySejahtera-recorded clusters.

Thank you

Number of test

Could you include the number of tests conducted per day, instead of just the number of positives, etc.?

checkin_malaysia_time.csv outliers, potentially aggregation issue

Hi,

For checkin_malaysia_time.csv, is there an issue with the counts for the following date and time-density range?

31st May 2021, time density 15 to 21. The counts immediately before and after this range are also strangely low.
I suspect the aggregation was not done correctly around this time.

Thanks

Total registered users for Mysejahtera per day

Requesting a timeline of mysejahtera uptake in the form of total registered users per day.
Please confirm if this data is available.

Will list this in CONTRIB.MD and submit pull request once confirmed that the data is available.

Thanks.

Clarification on total vaccination registration & total population dataset

Hello MoH, I would like to clarify the total number of vaccination registrations (vaxreg_state.csv) and the total population (population.csv). For some dates and states, the former exceeds the latter.

For example, on 23rd July 2021, the total number of vaccination registrations for W.P. Kuala Lumpur is 1,830,956, while the total population of W.P. Kuala Lumpur is 1,773,700.

May I know why the total number of vaccination registrations is higher than the total population? Based on the README.md provided, it seems the total is calculated from the number of unique registrants.

Thank you!

Request daily positive by age

Dear author,
Great data. May we request daily positive cases broken down by age or age group?
This would be important for checking the age composition and working out how to protect each age group based on activities.

Best

tests_malaysia.csv - Column label has hidden TAB chars

File: tests_malaysia.csv

Problem: The labels rtk-ag and pcr each have a hidden TAB character before them

Raw file snippet:

date, rtk-ag, pcr
2020-01-24,0,2
2020-01-25,0,5

ASCII Decode:

rtk-ag -> 009 114 116 107 045 097 103 (TAB R T K - A G)
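
Until the header is corrected, one possible workaround is to strip whitespace from the column labels when loading the file. The following is a minimal sketch using Python's standard csv module. The helper name read_with_clean_headers is hypothetical, and the sample text mirrors the snippet above with the TAB characters written explicitly as \t (their exact position after each comma is an assumption based on the ASCII decode).

```python
import csv
import io


def read_with_clean_headers(text: str) -> list[dict[str, str]]:
    """Parse CSV text, stripping stray whitespace (including TABs) from column labels."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    return [dict(zip(header, row)) for row in reader]


# Sample mirroring the raw snippet above, with the hidden TABs made explicit.
raw = "date,\trtk-ag,\tpcr\n2020-01-24,0,2\n2020-01-25,0,5\n"
rows = read_with_clean_headers(raw)
print(list(rows[0].keys()))
```

Only the header row is cleaned here; data cells are left untouched so that no values are altered silently.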

Request for Additional Data Points - Daily Cases Breakdown by Category (1 to 5)

With the vaccinated population growing, and given that vaccination lessens the severity of COVID-19, it would be very helpful to now
(1) have the breakdown of daily cases by category, from Cat 1 (asymptomatic) to Cat 5, with the classifications provided.
(2) within each category, have a further breakdown by Not Vaccinated, 1st Dose, and Fully Vaccinated.

This would serve as another powerful message to members of the public who have still NOT registered for vaccination.

Corrupted/missing data in Mysejahtera 27-31 May?

In checkin_malaysia_time.csv, for the date range 2021-05-27 to 2021-05-31, the numbers are implausibly low (from bucket 20, i.e. 10:00am, onwards). This doesn't seem to be an app/server issue, as the numbers return to normal at 00:00 on each following day.
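
The reading of bucket 20 as 10:00am is consistent with zero-indexed 30-minute buckets. A minimal sketch of that mapping in Python follows; the 30-minute width is an assumption inferred from this issue, not something the README confirms, and bucket_start is a hypothetical helper name.

```python
from datetime import time

# Assumption: each bucket covers 30 minutes and bucket 0 starts at 00:00.
BUCKET_MINUTES = 30


def bucket_start(bucket: int) -> time:
    """Return the start time of a zero-indexed time-of-day bucket."""
    minutes = bucket * BUCKET_MINUTES
    if not 0 <= minutes < 24 * 60:
        raise ValueError(f"bucket {bucket} falls outside a single day")
    return time(hour=minutes // 60, minute=minutes % 60)


print(bucket_start(20).strftime("%H:%M"))
```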

pkrc.csv - Column name typo

File: pkrc.csv

Problem: Column names are mistyped
Affected: discharge_pui, discharge_covid, discharge_total

Comments:
Based on epidemic/README.md, the columns should be labeled discharged_x. This would also be more consistent with the hospital.csv data columns.

Test Cases not being updated for the past 2 days

Hi, just wondering whether it is intentional that the testing data hasn't been updated for the past 2 days. I'm building a graph from it and just realised it hasn't been updated recently. Is a change to the testing data planned, or have updates been halted?

Seeking clarification for unique_ind on MySejahtera data

Hi there, unique_ind is defined as the "number of unique accounts which checked in". May I know whether it is based on the number of check-ins by location?

If one person checks in at three separate locations today (checking in twice at one of them), what is the unique individual count?

For example, if I check in once at supermarket A, it is counted as one. If I go to a bank later, I check in another time. But if I go back to supermarket A and check in again, is that counted as another new check-in?

Discrepancy in count icu.csv and press release

There are discrepancies between the total numbers of COVID-19 patients in ICU and on ventilator support calculated from icu.csv and the ones reported in the press release (https://kpkesihatan.com).

For example, on 24/7/21, the sums of the icu_covid and vent_covid columns are 1397 and 799 respectively, while the numbers reported (https://kpkesihatan.com/2021/07/24/kenyataan-akhbar-kpk-24-julai-2021-situasi-semasa-jangkitan-penyakit-coronavirus-2019-covid-19-di-malaysia/) were 950 and 468 respectively.
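
For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of summing per-state rows into a national total for each date, using only the Python standard library. The icu_covid and vent_covid column names come from this issue; the date and state column names, the helper name national_totals, and the sample rows are assumptions for illustration and are not real data.

```python
import csv
import io
from collections import defaultdict


def national_totals(text: str, columns=("icu_covid", "vent_covid")) -> dict:
    """Sum the given columns over all rows sharing the same date."""
    totals = defaultdict(lambda: dict.fromkeys(columns, 0))
    for row in csv.DictReader(io.StringIO(text)):
        for col in columns:
            totals[row["date"]][col] += int(row[col])
    return dict(totals)


# Made-up rows for illustration only (assumed layout: one row per date and state).
sample = (
    "date,state,icu_covid,vent_covid\n"
    "2021-07-24,State A,10,4\n"
    "2021-07-24,State B,5,1\n"
    "2021-07-25,State A,7,2\n"
)
totals = national_totals(sample)
print(totals["2021-07-24"])
```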

Missing data in the pkrc.csv file

Hi there,
I noticed that the pkrc.csv file contains no data at all for the "states" WP KL and WP Putrajaya.
Why is this so? I believe there are quarantine centres in both these territories.
I hope KKM is aware and will rectify this gap soon.

Regards
Foo

All-Cause Mortality data

Great initiative!

Could you also provide deaths from all causes (not just COVID) by day/week/month?

Incorrect data on 16/3/2020

For the state data on 16/3/2020 in "covid19-public/epidemic/cases_state.csv", line number 2:16, the numbers marked as new cases are not new cases. They are cumulative cases, and should rightly be left blank/NA.

The same goes for the national data on 16/3/2020 in "covid19-public/epidemic/cases_malaysia.csv", line 2: the reported number of new cases on that date should be 125, not 553, which is actually the cumulative number of cases as of 16/3/2020.
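
The distinction between cumulative and new cases can be sketched as a simple differencing step. In the Python sketch below, the first entry is left as None because no earlier cumulative figure exists to subtract, which corresponds to the blank/NA suggested above. The helper name daily_from_cumulative is hypothetical, and the starting value 428 is not from the file; it is merely implied by the figures quoted in this issue (553 cumulative minus 125 new).

```python
from typing import Optional


def daily_from_cumulative(cumulative: list[int]) -> list[Optional[int]]:
    """Convert a cumulative series to daily increments; the first value is unknown (None)."""
    if not cumulative:
        return []
    daily: list[Optional[int]] = [None]
    for previous, current in zip(cumulative, cumulative[1:]):
        daily.append(current - previous)
    return daily


# 428 is implied by the issue's figures (553 - 125), not read from the dataset.
print(daily_from_cumulative([428, 553]))
```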

checkin_malaysia_time.csv column unit

Dear author,

I have checked checkin_malaysia_time.csv. Please state the unit for the columns. The file has 47 columns; does that mean it records the number of check-ins for every 30 minutes?

Cheers,
Boo

Daily testing at state level

At present, daily testing data is given only at national level, in "covid19-public/epidemic/tests_malaysia.csv". If possible, please also provide the data at state level. This would allow analysis of the positivity rate by state.

Clarification on the definition for population.csv dataset

I need clarification on the column pop_18, which the documentation describes as 18+. Does this mean everyone aged ≥ 18 y.o., including those aged ≥ 60 y.o. (60+)? Or only those aged between 18 and 60? Using the ≥ symbol in the notation might be more descriptive.

Latest Data

covid19-public/epidemic/tests_malaysia.csv - Data is not up to date.

In cases_state.csv, there're two -ve values

Thanks for the data! I would love to know whether the negative values in cases_state.csv are real.

date        state               cases_new
1/8/2021    W.P. Kuala Lumpur   -101
5/19/2020   W.P. Putrajaya      -1
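
Rows like these can be located with a short filter. The sketch below uses Python's standard csv module; the helper name negative_rows is hypothetical, the first two sample rows repeat the values quoted above, and the third row is made up to show a non-matching case.

```python
import csv
import io


def negative_rows(text: str, column: str = "cases_new") -> list[dict[str, str]]:
    """Return the rows whose value in the given column is below zero."""
    return [
        row
        for row in csv.DictReader(io.StringIO(text))
        if int(row[column]) < 0
    ]


# First two rows are quoted from this issue; the last row is invented for contrast.
sample = (
    "date,state,cases_new\n"
    "1/8/2021,W.P. Kuala Lumpur,-101\n"
    "5/19/2020,W.P. Putrajaya,-1\n"
    "5/20/2020,W.P. Putrajaya,3\n"
)
for row in negative_rows(sample):
    print(row["date"], row["state"], row["cases_new"])
```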

Recovered Cases by State

How to get the daily numbers of recovered cases for each state? The discharged numbers do not match the total recovered cases for all of Malaysia combined.
