Comments (5)
Our NICE pipeline broke, so if you do have a nice workflow for this, please share or let me know.
Nice.
from coronawatchnl.
Nice app! We would love to list it on our page :)
I've also got quite a lot more structured data by parsing the RIVM JavaScript; that way I don't have to deal with the PDF files. I also process the data from NICE and LCPS.
Can you clarify what else you collected from RIVM?
And I'm curious what data you collect from LCPS.
Also, did anybody notice that the chart "Aantal overledenen naar datum van overlijden" ("Number of deaths by date of death") at https://www.rivm.nl/coronavirus-covid-19/grafieken has the value 109 for the 25th of March, while the PDF shows 108 for that date? The total used by RIVM is the total in the PDF, so I assume that's the correct one. I noticed the chart is off by one for that date every day...
yup...
Our NICE pipeline broke, so if you do have a nice workflow for this, please share or let me know.
Best Jonathan
from coronawatchnl.
Nice app! We would love to list it on our page :)
Sure, no problem.
It's running on the Firebase Hosting CDN, so it should always be online.
I try to update the data somewhere around 14:30 and again later for the LCPS info.
I've also got quite a lot more structured data by parsing the RIVM JavaScript; that way I don't have to deal with the PDF files. I also process the data from NICE and LCPS.
Can you clarify what else you collected from RIVM?
I'm pretty sure you also process the .js files from RIVM; when I wrote that, I thought you somehow processed the PDF files.
And I'm curious what data you collect from LCPS.
I use puppeteer to render the full html and extract the LCPS info per day.
The result is this:
Date,number of patients on IC,delta
2020-04-14,1303,-35
2020-04-13,1338,-20
2020-04-12,1358,-33
2020-04-11,1391,7
2020-04-10,1384,-33
2020-04-09,1417,9
2020-04-08,1408,-16
2020-04-07,1424,15
2020-04-06,1409,24
2020-04-05,1385,25
2020-04-04,1360,36
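For anyone rebuilding this table, here is a minimal Python sketch of how the delta column follows from the daily counts. It is not the puppeteer script mentioned above; the function names `with_deltas` and `to_csv` are made up for illustration, and the sample rows are copied from the table.

```python
import csv
import io


def with_deltas(counts):
    """Add a day-over-day delta to (iso_date, patients) pairs sorted newest first.

    The oldest row has no previous day to compare with, so its delta is None.
    """
    rows = []
    for i, (day, patients) in enumerate(counts):
        if i + 1 < len(counts):
            delta = patients - counts[i + 1][1]  # today minus the previous day
        else:
            delta = None
        rows.append((day, patients, delta))
    return rows


def to_csv(rows):
    """Render the rows with the same header as the table above."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Date", "number of patients on IC", "delta"])
    for day, patients, delta in rows:
        writer.writerow([day, patients, "" if delta is None else delta])
    return buf.getvalue()


# Sample rows copied from the table above (newest first).
sample = [
    ("2020-04-14", 1303),
    ("2020-04-13", 1338),
    ("2020-04-12", 1358),
    ("2020-04-11", 1391),
]
print(to_csv(with_deltas(sample)))
```

The deltas for the first three rows match the table (-35, -20, -33); the last sample row has an empty delta only because its previous day is not part of the sample.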
Also, did anybody notice that the chart "Aantal overledenen naar datum van overlijden" ("Number of deaths by date of death") at https://www.rivm.nl/coronavirus-covid-19/grafieken has the value 109 for the 25th of March, while the PDF shows 108 for that date? The total used by RIVM is the total in the PDF, so I assume that's the correct one. I noticed the chart is off by one for that date every day...
yup...
Our NICE pipeline broke, so if you do have a nice workflow for this, please share or let me know.
I've got a combination of Java files that download, extract and convert all the data.
But it's not really a better pipeline, and I guess it works because I run it from my home (I read somewhere here that NICE blocks all IP addresses outside NL?)...
I tried running some scripts from Scaleway (a cloud provider), but I guess RIVM also blocks those IPs.
The best I can do (when I have the time) is to upload the NICE JSON files and a CSV file based on the LCPS.nu news section where they publish their numbers.
I also keep a complete daily archive of all the data I download, including all the PDF files from RIVM since March 23rd.
Best Jonathan
Best regards,
Jeroen
from coronawatchnl.
Thanks for this answer.
I'm pretty sure you also process the .js files from RIVM; when I wrote that, I thought you somehow processed the PDF files.
Yes, we do. There are a couple of scripts available in this repo. But the structure of the PDFs is very poor, so a lot of manual editing is required... Feel free to extract the data from this repo. It saves you a lot of time.
The best I can do (when I have the time) is to upload the NICE JSON files and a CSV file based on the LCPS.nu news section where they publish their numbers.
NICE would be great. I will add LCPS asap.
I also keep a complete daily archive of all the data I download, including all the PDF files from RIVM since March 23rd.
I think we are missing March 23rd and 24th. Can you share them?
Thanks a lot for your help in improving the data.
Best Jonathan
from coronawatchnl.
Thanks for this answer.
I'm pretty sure you also process the .js files from RIVM; when I wrote that, I thought you somehow processed the PDF files.
Yes, we do. There are a couple of scripts available in this repo. But the structure of the PDFs is very poor, so a lot of manual editing is required... Feel free to extract the data from this repo. It saves you a lot of time.
I meant that I use the .js files to parse the data they show in the graphs. For a couple of days I used Tabula to parse the PDF files, but that was too cumbersome. After that they published the graphs, and those contained the data I needed, so I stopped using the PDF files.
The best I can do (when I have the time) is to upload the NICE JSON files and a CSV file based on the LCPS.nu news section where they publish their numbers.
NICE would be great. I will add LCPS asap.
I've set up a script that will publish the NICE JSON files at https://corona-map-nl-archive.web.app/
It runs at 13:50 CEST.
If you go there now, it shows the current date. If you click on the date, you get an index.html with a list of the files that are available. The filenames are consistent; the only thing that changes is the date in the directory (https://corona-map-nl-archive.web.app/data/2020-04-16/global.json etc.).
Let me know if this works (or doesn't work) for you.
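As a rough illustration of that directory layout, a small Python sketch that builds the URL for a given day could look like this. It assumes only the pattern shown in the example URL above; the helper name `archive_url` is hypothetical, and `global.json` is the only filename confirmed in this thread.

```python
from datetime import date

# Base path taken from the example URL above.
ARCHIVE_BASE = "https://corona-map-nl-archive.web.app/data"


def archive_url(day: date, filename: str = "global.json") -> str:
    """Build the archive URL for one file on one day.

    Only the directory name (the ISO date) changes from day to day.
    """
    return f"{ARCHIVE_BASE}/{day.isoformat()}/{filename}"


print(archive_url(date(2020, 4, 16)))
```

Other filenames would have to be taken from the index.html listing for that date rather than guessed.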
I also keep a complete daily archive of all the data I download, including all the PDF files from RIVM since March 23rd.
I think we are missing March 23rd and 24th. Can you share them?
Done (I saw you merged the pull request).
Thanks a lot for your help in improving the data.
Sure, no problem.
Best Jonathan
Best Jeroen
from coronawatchnl.