Comments (9)
Nice comparison! Interesting that even with a pretty large rolling window the sources do not match, due to the heavy weekend effect in the RKI data. It seems likely that JHU logs the announcement date (i.e. they scrape what appears in the state press releases, with data that is presumably processed continuously), while RKI logs the true testing date, and thus the dip is due to fewer people getting tested on weekends. We should be able to confirm that with the daily number of tests performed, but afaik the RKI only gives weekly numbers.
from covid-19-germany-gae.
By now I'm used to the weekly "spread has slowed down" news every Monday =P
I guess a bit of tea-leaf reading is to be expected in such a situation.
You are right, the Meldedatum is the reported date. Fortunately, the RKI added a Refdatum field to the query system with what I assume is the self-reported start of symptoms: https://npgeo-corona-npgeo-de.hub.arcgis.com/datasets/dd4580c810204019a7b8eb3e0b329dd6_0?selectedAttribute=Refdatum
Meldedatum: the date on which the case became known to the Gesundheitsamt (local health office)
Referenzdatum: the date of onset of illness or, if that is not known, the Meldedatum
It paints quite a different picture:
Refdatum (certain) plots only the data where Meldedatum > Refdatum, i.e. where an onset date earlier than the reporting date was actually recorded. Unfortunately, we lose about 30% of the entries and 40% of the cases that way, and the undersampling is very much non-random. So the true start-of-symptoms time series lies somewhere between Refdatum (certain) and Refdatum.
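A minimal sketch of the "Refdatum (certain)" selection in plain Python. The field names Meldedatum, Refdatum and AnzahlFall are those of the RKI data set; the four records and the helper `cases_by_refdatum` are made up for illustration, and this is not the code behind the plots.

```python
from collections import Counter
from datetime import date

# Hypothetical records shaped like rows of the RKI case table (values are made up).
records = [
    {"Meldedatum": date(2020, 3, 16), "Refdatum": date(2020, 3, 12), "AnzahlFall": 3},
    {"Meldedatum": date(2020, 3, 16), "Refdatum": date(2020, 3, 16), "AnzahlFall": 2},
    {"Meldedatum": date(2020, 3, 17), "Refdatum": date(2020, 3, 12), "AnzahlFall": 1},
    {"Meldedatum": date(2020, 3, 18), "Refdatum": date(2020, 3, 18), "AnzahlFall": 4},
]


def cases_by_refdatum(rows, certain_only=False):
    """Sum AnzahlFall per Refdatum.

    With certain_only=True, keep only rows where Meldedatum > Refdatum,
    i.e. rows whose Refdatum is not just a copy of the Meldedatum.
    """
    counts = Counter()
    for row in rows:
        if certain_only and not row["Meldedatum"] > row["Refdatum"]:
            continue
        counts[row["Refdatum"]] += row["AnzahlFall"]
    return dict(sorted(counts.items()))


all_refdatum = cases_by_refdatum(records)
certain = cases_by_refdatum(records, certain_only=True)

# Fraction of entries (rows) and of cases dropped by the "certain" filter.
kept_rows = sum(r["Meldedatum"] > r["Refdatum"] for r in records)
lost_entries = 1 - kept_rows / len(records)
lost_cases = 1 - sum(certain.values()) / sum(all_refdatum.values())
```

Comparing `lost_entries` and `lost_cases` on the real table is one way to check the quoted loss figures.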
It is also interesting to consider that in light of the (unfortunately only weekly) number of tests performed:
So one interpretation is that the ramping up in tests after 15/03 managed to catch the earlier unreported cases.
@joaopn just some quick meta feedback here: thanks for this exchange of ideas!
I've also had a quick look at the work of your research group and I truly appreciate it. Btw, I am a physicist myself; I also had close contact with MPI-PKS in Dresden during my PhD -- truly appreciate the role of the institute you're working in :-).
Thanks for the kind feedback. You are absolutely right about the relevance of the derivatives here. Will certainly look into that.
@joaopn I spent a bit of time on that: https://covid19-germany.appspot.com/
This is a hopefully thorough rate calculation with a subsequent rolling window analysis.
Code:
I'll have to re-check the method and compare numbers to be sure, but I think this is already pretty meaningful for comparing sources.
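For readers who want to reproduce the idea, here is a rough sketch of a rate calculation followed by a rolling-window mean, in plain Python. It is not the code used for the analysis above; the function names and the synthetic weekly pattern are assumptions for illustration.

```python
from itertools import accumulate


def daily_change(cumulative):
    """New cases per day: differences of consecutive cumulative counts."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def rolling_mean(values, window=7):
    """Trailing mean over `window` days; one value per complete window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Synthetic data: five "weekdays" with 10 new cases, two "weekend" days with 3.
weekly_pattern = [10, 10, 10, 10, 10, 3, 3]
cumulative = [0] + list(accumulate(weekly_pattern * 3))

rate = daily_change(cumulative)
smooth_7 = rolling_mean(rate, window=7)
smooth_5 = rolling_mean(rate, window=5)
```

On this idealized series a 7-day window averages the weekend dip away completely, while a 5-day window still oscillates. Real data are not strictly periodic, so some residual weekly structure should be expected even with a 7-day window.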
the heavy weekend effect in the RKI data
Heavy, yes! That weekend effect alone is a really important observation. The entire world goes crazy with quantitative analysis of case count numbers. In Germany, this weekend effect can be used as a great example to corroborate "hey, these data need to be interpreted with great care!" :-).
I think that qualitative statements about the dynamics of the virus spread as well as about the nature of the disease are largely sufficient for estimating risk and for making decisions. But of course for our media that's not convincing enough and they have come up with quantitative analyses (which they had better leave to experts, like your research group).
For example, heute journal (ZDF) plots a "Verdopplungszeit" (doubling time) over time:
This is IMO a pretty severe example of over-quantification... super dubious because the concept of the Verdopplungszeit makes most sense in the context of exponential growth. And then of course the weekend effect makes this look even more stupid. And what was most horrifying: the 9.6 (of course, determined to that precision) was "close" to the 10 that Merkel once declared to be a goal (ZDF claimed: we almost reached the goal, only 0.4 left to go!)
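To make the point about exponential growth concrete, here is a small sketch. It assumes the usual definition N(t) = N0 * 2^(t / T_d), so that two counts taken `days` apart give T_d = days * ln(2) / ln(N_end / N_start). The function name and both synthetic series are hypothetical; this is not how ZDF computed its number.

```python
import math


def doubling_time(n_start, n_end, days):
    """Doubling time implied by two counts, assuming exponential growth between them."""
    if n_start <= 0 or n_end <= n_start:
        raise ValueError("counts must be positive and increasing")
    return days * math.log(2) / math.log(n_end / n_start)


# Truly exponential series with a doubling time of 5 days.
exponential = [100 * 2 ** (t / 5) for t in range(31)]
# Linear series: a constant 50 new cases per day.
linear = [100 + 50 * t for t in range(31)]

# Estimate the doubling time over the preceding 5 days, at days 10, 20 and 30.
exp_estimates = [doubling_time(exponential[t - 5], exponential[t], 5) for t in (10, 20, 30)]
lin_estimates = [doubling_time(linear[t - 5], linear[t], 5) for t in (10, 20, 30)]
```

For the exponential series the estimate stays at 5 days. For the linear series it keeps growing although nothing about the dynamics changes, which is why a time-dependent "doubling time" is hard to interpret once growth is no longer exponential.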
In view of the weekend effect I think we should sarcastically apply a Fourier transform to "show" that the virus is actually a religious entity (because -- clearly -- it operates with a 7-day periodicity and rests on weekends, RIGHT?) :-).
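Joke aside, the 7-day periodicity is easy to make visible. Below is a minimal sketch with a plain discrete Fourier transform written out by hand; the helper `dft_amplitudes` and the synthetic eight-week series are made up for illustration.

```python
import cmath


def dft_amplitudes(values):
    """Map period (in days) to DFT amplitude, after subtracting the mean."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    amplitudes = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(centered)
        )
        amplitudes[n / k] = abs(coeff) / n
    return amplitudes


# Eight weeks of synthetic daily counts with a weekend dip.
daily = [10, 10, 10, 10, 10, 3, 3] * 8

amplitudes = dft_amplitudes(daily)
dominant_period = max(amplitudes, key=amplitudes.get)
```

For this series `dominant_period` comes out as 7.0 days, which reflects the reporting rhythm and says nothing about the virus.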
while RKI logs the true testing date, and thus the dip is due to fewer people getting tested on weekends
It's actually the "Meldedatum", which sadly is not the actual testing date but the day on which a Gesundheitsamt learns about a new case. The test might have been performed days before that. It is also rather sad that we lack that part of the timeline for each individual case.
For the display of newly transmitted cases per day, the Meldedatum is used: the date on which the local Gesundheitsamt became aware of the case and recorded it electronically.
As a rule, the exact time of infection of the reported cases cannot be determined. The Meldedatum at the Gesundheitsamt therefore best reflects the time at which the infection was established (date of diagnosis) and thus the current infection dynamics.
@jgehrcke Thanks for the kind words =)