Comments (5)
from py-ecomplexity.
Thanks Matias for your reply.
The ECI values calculated by this package are normalized: I can confirm that their mean is 0 and their standard deviation is 1.
The pre-calculated ECI in the dataverse data is not exactly normalized, but it is close:
year | mean ECI | sd ECI |
---|---|---|
1995 | -0.0138702 | 0.9869372 |
1996 | -0.0172142 | 0.9702815 |
1997 | -0.0013751 | 1.0293460 |
1998 | -0.0184518 | 1.0017776 |
1999 | -0.0009565 | 0.9855839 |
2000 | 0.0250194 | 0.9951708 |
2001 | 0.0522257 | 0.9701210 |
2002 | 0.0458795 | 0.9585416 |
2003 | 0.0377355 | 0.9721496 |
2004 | 0.0383981 | 0.9639836 |
2005 | 0.0233821 | 0.9737128 |
2006 | 0.0611096 | 0.9824983 |
2007 | 0.0534595 | 0.9712855 |
2008 | 0.0498620 | 0.9636734 |
2009 | 0.0881478 | 0.9712107 |
2010 | 0.0712081 | 0.9720490 |
2011 | 0.0646669 | 0.9753816 |
2012 | 0.0383435 | 1.0044073 |
2013 | 0.0426753 | 1.0085014 |
2014 | 0.0353947 | 0.9829397 |
2015 | 0.0392669 | 0.9958343 |
2016 | 0.0422366 | 0.9964626 |
2017 | 0.0275077 | 0.9895860 |
2018 | 0.0539169 | 0.9958367 |
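For anyone who wants to reproduce this kind of check, here is a minimal pandas sketch. It assumes a long-format table with hypothetical column names `year`, `location`, and `eci`; the real column names in the dataverse files or in this package's output may differ. It also assumes the sample standard deviation (pandas' default, `ddof=1`), which may not be the convention used by either source.

```python
import pandas as pd


def eci_by_location_year(df, year_col="year", loc_col="location", eci_col="eci"):
    """Keep one ECI value per (year, location).

    ECI is a location-level indicator, so it is repeated on every product
    row in a location-product-year table.
    """
    cols = [year_col, loc_col, eci_col]
    return df[cols].drop_duplicates([year_col, loc_col]).reset_index(drop=True)


def eci_summary(df, year_col="year", eci_col="eci"):
    """Mean and standard deviation of ECI for each year."""
    return df.groupby(year_col)[eci_col].agg(["mean", "std"])


def normalize_eci(df, year_col="year", eci_col="eci"):
    """Return a copy with ECI z-scored within each year."""
    out = df.copy()
    grouped = out.groupby(year_col)[eci_col]
    out[eci_col] = (out[eci_col] - grouped.transform("mean")) / grouped.transform("std")
    return out


# Toy data (made up for illustration, not taken from the dataverse).
base = pd.DataFrame({
    "year": [1995, 1995, 1995, 1996, 1996, 1996],
    "location": ["A", "B", "C", "A", "B", "C"],
    "eci": [1.2, -0.3, -1.0, 0.9, 0.1, -0.8],
})
# Repeat every row for two products to mimic a location-product-year table.
toy = pd.concat([base.assign(product="p1"), base.assign(product="p2")])

unique_eci = eci_by_location_year(toy)
summary_before = eci_summary(unique_eci)
summary_after = eci_summary(normalize_eci(unique_eci))
print(summary_before)
print(summary_after)
```

Deduplicating first matters: computing the mean over product-level rows would weight each location by the number of products it exports.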
After normalizing, it's close, but not identical:
As for any other data cleaning, I'm simply using the trade data that comes as part of the pre-calculated data from the dataverse, so I don't see how there can be any differences between the two. The number of locations and products by year in the indicators calculated by this package is identical to the number of locations and products by year in the dataverse data.
year | locs.calculated | prods.calculated | locs.source | prods.source |
---|---|---|---|---|
1995 | 231 | 1247 | 231 | 1247 |
1996 | 227 | 1247 | 227 | 1247 |
1997 | 227 | 1247 | 227 | 1247 |
1998 | 226 | 1247 | 226 | 1247 |
1999 | 226 | 1247 | 226 | 1247 |
2000 | 231 | 1248 | 231 | 1248 |
2001 | 233 | 1248 | 233 | 1248 |
2002 | 234 | 1248 | 234 | 1248 |
2003 | 233 | 1248 | 233 | 1248 |
2004 | 234 | 1248 | 234 | 1248 |
2005 | 233 | 1247 | 233 | 1247 |
2006 | 232 | 1248 | 232 | 1248 |
2007 | 233 | 1247 | 233 | 1247 |
2008 | 233 | 1247 | 233 | 1247 |
2009 | 233 | 1247 | 233 | 1247 |
2010 | 233 | 1245 | 233 | 1245 |
2011 | 235 | 1245 | 235 | 1245 |
2012 | 235 | 1246 | 235 | 1246 |
2013 | 237 | 1243 | 237 | 1243 |
2014 | 236 | 1242 | 236 | 1242 |
2015 | 235 | 1241 | 235 | 1241 |
2016 | 234 | 1240 | 234 | 1240 |
2017 | 236 | 1227 | 236 | 1227 |
2018 | 236 | 1225 | 236 | 1225 |
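A count comparison like the one above can be sketched as follows. The column names `year`, `location`, and `product` are assumptions for illustration, and the two toy frames stand in for the package output and the dataverse data.

```python
import pandas as pd


def counts_by_year(df, year_col="year", loc_col="location", prod_col="product"):
    """Number of distinct locations and products in each year."""
    return df.groupby(year_col).agg(
        locs=(loc_col, "nunique"),
        prods=(prod_col, "nunique"),
    )


def compare_counts(calculated, source):
    """Side-by-side counts for two tables, plus a flag for matching years."""
    merged = counts_by_year(calculated).join(
        counts_by_year(source),
        how="outer",
        lsuffix=".calculated",
        rsuffix=".source",
    )
    # Years missing from one side give NaN counts and are flagged False.
    merged["identical"] = (
        (merged["locs.calculated"] == merged["locs.source"])
        & (merged["prods.calculated"] == merged["prods.source"])
    )
    return merged


# Toy data (made up for illustration).
calculated = pd.DataFrame({
    "year": [1995, 1995, 1995, 1996, 1996],
    "location": ["A", "A", "B", "A", "B"],
    "product": ["p1", "p2", "p1", "p1", "p1"],
})
source = calculated.copy()

comparison = compare_counts(calculated, source)
print(comparison)
```

Matching counts only show that the two tables cover the same number of locations and products. They do not show that the underlying trade values are the same.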
I'll just add that the ECI calculated using the R package referenced in #11 does agree with the ECI calculated using this Python package. So perhaps something different is being done to the data on the dataverse?
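One way to quantify "agrees" between two sets of ECI values is to join them on year and location, then look at the largest absolute difference and the correlation within each year. This is only a sketch under assumed column names (`year`, `location`, `eci`), not the comparison actually run in this thread.

```python
import pandas as pd


def compare_eci(a, b, year_col="year", loc_col="location", eci_col="eci"):
    """Per-year max absolute difference and Pearson correlation of two ECI tables."""
    merged = a.merge(
        b,
        on=[year_col, loc_col],
        suffixes=("_a", "_b"),
        validate="one_to_one",  # fail loudly if a (year, location) pair repeats
    )
    col_a, col_b = f"{eci_col}_a", f"{eci_col}_b"
    merged["abs_diff"] = (merged[col_a] - merged[col_b]).abs()
    by_year = merged.groupby(year_col)
    return pd.DataFrame({
        "max_abs_diff": by_year["abs_diff"].max(),
        "corr": by_year[[col_a, col_b]].apply(lambda g: g[col_a].corr(g[col_b])),
    })


# Toy data (made up for illustration).
eci_a = pd.DataFrame({
    "year": [1995, 1995, 1995, 1996, 1996, 1996],
    "location": ["A", "B", "C", "A", "B", "C"],
    "eci": [1.2, -0.3, -1.0, 0.9, 0.1, -0.8],
})
eci_b = eci_a.copy()

agreement = compare_eci(eci_a, eci_b)
print(agreement)
```

A high correlation with a non-zero maximum difference would be consistent with the same algorithm applied to slightly different input data.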
Sorry for the super-late response, @hamgamb, but if anyone else is looking for answers here, the short but possibly unsatisfying answer is that more data pre-processing goes into the dataverse data. The underlying algorithms used to generate the PCI / ECI values are the same, and the differences you rightly call out are a result of that pre-processing. If you reach out to the team that manages the data uploaded to the dataverse (atlas.cid.harvard.edu), they might be able to give you exact details of the pre-processing.