
Comments (10)

DTPOTO avatar DTPOTO commented on May 21, 2024

Also @DTPOTO ... why would there be more total positive tests in the reporting date methodology? Is that just due to the time of day the results are generated?

I agree the NYC Health Dept should supply both data sets. The time of day has a minor impact, more so when you are using the Report-Date methodology. The Reporting-Date methodology has higher numbers because it is focused on the current date (today). The data files are being restated by BACK-DATING. It's a little like the government revising last month's unemployment number. The TOTAL number of cases is identical; the only difference is when they are reported. @joansobo demonstrated that the total cases were the same, and was able to calculate a new REPORTED Cases figure by comparing Case-Hosp-Deaths.csv over two different days. The issue is Daily Restatement. Getting the latest version of Case-Hosp-Deaths.csv may be your best bet in terms of predictive modeling. I don't like either option, but we may get that clarity or better information in a timely way.
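For anyone who wants to try the two-day comparison described above, here is a minimal sketch. It assumes each day's copy of Case-Hosp-Deaths.csv has already been loaded into a plain mapping from diagnosis date to case count; the function name `newly_reported` and all of the numbers are made up for illustration and are not taken from the actual file.

```python
def newly_reported(earlier, later):
    """Compare two snapshots of the same by-diagnosis-date file.

    earlier, later: dicts mapping a diagnosis date string to a case count
    (hypothetical shape; the real file layout may differ).
    Returns the per-date revisions and their sum.
    """
    dates = sorted(set(earlier) | set(later))
    # A positive revision means cases were back-dated onto that date
    # between the two snapshots.
    revisions = {d: later.get(d, 0) - earlier.get(d, 0) for d in dates}
    return revisions, sum(revisions.values())


# Illustrative numbers only, not real NYC data.
earlier = {"2020-03-29": 3000, "2020-03-30": 2000, "2020-03-31": 500}
later = {"2020-03-29": 3100, "2020-03-30": 2600, "2020-03-31": 1500,
         "2020-04-01": 400}

revisions, total = newly_reported(earlier, later)
for day, delta in revisions.items():
    print(day, delta)
print("newly reported between snapshots:", total)
```

Under these assumptions, the sum of the revisions is the report-date style count for the interval between the two snapshots, while the per-date breakdown shows where the back-dating landed.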

from coronavirus-data.

ptulin avatar ptulin commented on May 21, 2024


dmadeka avatar dmadeka commented on May 21, 2024

They don't seem to match for days, though.

This is NYState's historical estimates vs NYC historical estimates.


ptulin avatar ptulin commented on May 21, 2024

They will not match until after 7 pm, and then only for a moment; they will need to be corrected again the next day.


psylum avatar psylum commented on May 21, 2024

I don't think this is a timing issue. The data shown on the NYS site usually matches what is presented during a Governor briefing. For the past 3 days, the number shown for NYC during the noonish briefing has been ~2000 higher than the evening NYC number.


dmadeka avatar dmadeka commented on May 21, 2024

I'm with @psylum: the afternoon numbers seem higher, and the implied growth rates are very different. Here's a bar chart from the NYC data (I took the last three points and added them to the 31st).

image

Whereas on wiki, NYState has an implied growth rate of 13-10% over the last few days. There seem to be big differences.


DTPOTO avatar DTPOTO commented on May 21, 2024

Hello all, please review the issue thread that started when the NYC Health Dept began using GitHub as data storage for their web page ("Counts vary differently from Yesterday"). At the same time as the switch to GitHub, the Health Dept changed the reporting methodology, using "Diagnosis Date" instead of "Reporting Date". I am sure the State Health Department is stuck with just the "Reporting Date" because they are collecting from too many different sources. The City is now attempting to show the NEW cases as of the date of diagnosis. The original Diagnosis occurs when the doctor suspects the patient has the virus and orders the TEST. The Lab provides data on the Reporting Date, and the Lab results may take 3 to 14 days (OUCH). I have looked at the LAG time between Diagnosis Date and Reporting Date, see here
I am using the level of Back-Dating Revisions as of the Diagnosis Date as a surrogate for Lab-Results lag time. The cumulative graph suggests that 3 days back is under-reported by half and that 4 days back is under-reported by a third. At the current lab turn-around rate, it takes a week before you have a handle on today's real number.
This can be unsettling if you are only focused on yesterday's new cases. All new cases being reported are OLD news (whether they come from the state or the city). The reality is that using Diagnosis Date may be the better method for predicting the APEX. But changing the reporting methodology without an adequate explanation sows the seeds of distrust and certainly undermines everyone's predictive models.
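One way to put numbers on the lag described above is to ask what fraction of a diagnosis date's eventual count had been reported a given number of days later. This is only a sketch under stated assumptions: the snapshots are assumed to be already loaded as mappings from diagnosis date to count, a later "settled" snapshot stands in for the final count, and the function name `completeness_by_lag` and the example figures are hypothetical.

```python
from datetime import date, timedelta


def completeness_by_lag(snapshots, settled, max_lag=14):
    """Average fraction of the settled count visible at each lag.

    snapshots: {snapshot_date: {diagnosis_date: count}} (hypothetical shape)
    settled:   {diagnosis_date: count} from a much later snapshot
    Returns {lag_in_days: mean(count_at_lag / settled_count)}.
    """
    ratios = {}
    for snap_date, counts in snapshots.items():
        for diag_date, count in counts.items():
            lag = (snap_date - diag_date).days
            final = settled.get(diag_date)
            # Skip dates with no settled count, or lags outside the window.
            if final and 0 <= lag <= max_lag:
                ratios.setdefault(lag, []).append(count / final)
    return {lag: sum(v) / len(v) for lag, v in sorted(ratios.items())}


# Illustrative numbers only, chosen to mirror "under-reported by half at
# 3 days back and by a third at 4 days back".
d0 = date(2020, 3, 28)
settled = {d0: 300}
snapshots = {
    d0 + timedelta(days=3): {d0: 150},
    d0 + timedelta(days=4): {d0: 200},
}

result = completeness_by_lag(snapshots, settled)
for lag, frac in result.items():
    print(f"{lag} days back: {frac:.0%} of settled count reported")
```

A completeness of 0.5 at a lag of 3 days corresponds to "under-reported by half"; dividing a recent raw count by its completeness gives a rough adjusted figure, though that step assumes lab turn-around stays stable.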

@madeka @ptulin @psylum @mmontesanonyc


speedplane avatar speedplane commented on May 21, 2024

This is causing a whole lot of confusion. It’s the responsibility of NYC to make these data differences crystal clear, and to provide both sets of data.


speedplane avatar speedplane commented on May 21, 2024

Also @DTPOTO ... why would there be more total positive tests in the reporting date methodology? Is that just due to the time of day the results are generated?


mmontesanonyc avatar mmontesanonyc commented on May 21, 2024

Data from NYC and NYS will always be different for a number of reasons, including the time of day the dataset is cut, de-duplication procedures that differ between the agencies, and data cleaning and QA procedures.

