Comments (7)
Snap - I've just been looking at the same thing. The issue is that different sets of postcodes get selected - check the displays in the existing documents. There's a very different set in the vicinity of Casey. In Python the 10 km radius is measured to the centroid, while the R version appears to measure to the nearest point of the postcode. Either is fine, but let's go with the version that is easier to reproduce on both systems.
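To illustrate how the two distance definitions can select different postcodes, here is a minimal pure-Python sketch. The polygons, the hospital location and the function names are all hypothetical, coordinates are assumed to be in projected kilometres, and this is not the code used in either the R or the Python workflow.

```python
import math

def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_point_distance(p, ring):
    """Distance from p to the polygon boundary (assumes p lies outside it)."""
    edges = zip(ring, ring[1:] + ring[:1])
    return min(point_segment_distance(p, a, b) for a, b in edges)

# Hypothetical hospital and postcode polygons, in projected km.
hospital = (0.0, 0.0)
radius_km = 10.0
postcodes = {
    "A": [(2, -1), (6, -1), (6, 1), (2, 1)],
    "B": [(8, -2), (16, -2), (16, 2), (8, 2)],
    "C": [(11, 5), (13, 5), (13, 7), (11, 7)],
}

by_centroid = {name for name, ring in postcodes.items()
               if math.dist(hospital, polygon_centroid(ring)) <= radius_km}
by_nearest = {name for name, ring in postcodes.items()
              if nearest_point_distance(hospital, ring) <= radius_km}

print("centroid rule:", sorted(by_centroid))
print("nearest-point rule:", sorted(by_nearest))
```

In this toy layout, postcode "B" straddles the 10 km circle, so it is selected by the nearest-point rule but not by the centroid rule. That is the kind of mismatch described above.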
from geospatialstroke.
Oh yeah, they really are quite different. I'll incorporate that difference as well and update the code.
EDIT: Bingo! That's the origin of the difference. Code push impending ...
hmm... that commit reduced the load to Casey to the desired 13% or so, but also reduced the load to Dandenong so that Kingston is then around 60%. That means there are still other differences at play here.
The postcodes in the two sets weren't exactly matched; the above commit fixed that. The results still don't agree very closely, something like:
Destination | R (%) | Python (%) |
---|---|---|
CaseyHospital | 13.8 | 12.8 |
DandenongHospital | 31.2 | 37.9 |
KingstonHospital | 55.0 | 49.4 |
I suppose that's not too bad, but it would still be nice to have them somewhat closer ...
So the Python code of @gboeing first estimates the stroke incidence per postcode based on the basic demographic data, which the R code does not do. The latest commit modifies the sampling scheme so that numbers of cases are scaled to the estimated incidence rates per postcode, resulting in:
Destination | R (%) | Python (%) |
---|---|---|
CaseyHospital | 11.5 | 12.8 |
DandenongHospital | 31.6 | 37.9 |
KingstonHospital | 56.9 | 49.4 |
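As a rough sketch of what "scaling the number of cases to the estimated incidence per postcode" could look like, the snippet below splits a fixed total of sampled cases in proportion to incidence. The function name, the postcode labels and the incidence values are hypothetical, and the largest-remainder rounding is an assumption of this sketch, not something taken from the commit.

```python
def allocate_cases(incidence, total_cases):
    """Split total_cases across postcodes in proportion to incidence.

    Uses largest-remainder rounding so the counts sum exactly to total_cases.
    """
    total = sum(incidence.values())
    raw = {pc: total_cases * value / total for pc, value in incidence.items()}
    counts = {pc: int(r) for pc, r in raw.items()}  # floor of each raw share
    leftover = total_cases - sum(counts.values())
    # Hand the remaining cases to the postcodes with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda pc: raw[pc] - counts[pc], reverse=True)
    for pc in by_remainder[:leftover]:
        counts[pc] += 1
    return counts

# Hypothetical incidence estimates (expected strokes per year) for three postcodes.
incidence = {"pc_A": 30.0, "pc_B": 50.0, "pc_C": 20.0}
counts = allocate_cases(incidence, 1000)
print(counts)
```

Postcodes with higher estimated incidence then contribute proportionally more simulated cases to the hospital loads.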
That commit gets it to something like:
Destination | R (%) | Python (%) |
---|---|---|
CaseyHospital | 15.9 +/- 0.9 | 12.8 |
DandenongHospital | 37.6 +/- 1.1 | 37.9 |
KingstonHospital | 46.5 +/- 1.3 | 49.4 |
Those values for R come from:
- Using an unweighted street network retaining only components useable for vehicular routing;
- Sampling a fixed number of random points directly from this network, and allocating those to postcode areas;
- Weighting the final case load by per-postcode estimates of stroke incidence.
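The last step in the list above can be sketched as follows. This is only an illustration under assumed inputs: `shares` (the fraction of each postcode's sampled points routed to each hospital) and `incidence` are hypothetical names and values, not output from the R code.

```python
def weighted_load(shares, incidence):
    """Percentage case load per hospital, weighting each postcode by incidence.

    shares[pc][hospital] is the fraction of that postcode's sampled points
    whose nearest hospital (by routed distance) is `hospital`.
    """
    totals = {}
    for pc, by_hospital in shares.items():
        for hospital, fraction in by_hospital.items():
            totals[hospital] = totals.get(hospital, 0.0) + incidence[pc] * fraction
    grand_total = sum(totals.values())
    return {hospital: 100 * value / grand_total for hospital, value in totals.items()}

# Hypothetical inputs: two postcodes, three hospitals.
shares = {
    "pc_A": {"CaseyHospital": 1.0},
    "pc_B": {"DandenongHospital": 0.5, "KingstonHospital": 0.5},
}
incidence = {"pc_A": 10.0, "pc_B": 30.0}

load = weighted_load(shares, incidence)
print(load)
```

With these made-up numbers, the postcode with three times the incidence contributes three times the weight, regardless of how many points were sampled in it.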
The error estimates come from the code at the end of the README. Importantly, these error estimates are themselves not very reproducible, which indicates that some portion of the R-vs-Python differences must be presumed to arise from sampling effects alone. Because the Python code was based on 1,000 random points in total, while the R code used that number per postcode (and there are 57 of those, so 57,000 points in all), the latter must be presumed more accurate here.
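For a sense of scale, a back-of-the-envelope binomial standard error shows how much a hospital's share could move from sampling alone. This sketch assumes every sampled point is an independent draw, which ignores the per-postcode sampling and incidence weighting described above, so it should not be expected to match the README's error estimates. The function name is hypothetical.

```python
import math

def share_standard_error(share_percent, n_points):
    """Binomial standard error (in percentage points) of an estimated share."""
    p = share_percent / 100
    return 100 * math.sqrt(p * (1 - p) / n_points)

# Dandenong's share from the Python column, at the two sample sizes mentioned.
se_1000 = share_standard_error(37.9, 1_000)
se_57000 = share_standard_error(37.9, 57_000)
print(f"n = 1,000:  +/- {se_1000:.2f} percentage points")
print(f"n = 57,000: +/- {se_57000:.2f} percentage points")
```

Under that independence assumption, 1,000 points give an uncertainty of roughly 1.5 percentage points on a share near 38%, against roughly 0.2 for 57,000 points.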
In contrast to the above values, a more realistic estimate from the R code is likely to come from using the weighted street network, and sampling addresses within each postcode, which corresponds to these values:
Destination | load (%) |
---|---|
CaseyHospital | 13.8 |
DandenongHospital | 39.2 |
KingstonHospital | 46.9 |
Future ref for @gboeing: I'll dig more deeply into osmnx-vs-osmdata, something I've been wanting to do for a while. Thanks @richardbeare for the incentive here.