Comments (5)

matuteiglesias avatar matuteiglesias commented on June 25, 2024

from py-ecomplexity.

hamgamb avatar hamgamb commented on June 25, 2024

Thanks Matias for your reply.

The ECI values calculated by this package are normalized: I can confirm that the mean of ECI is 0 and the standard deviation of ECI is 1.

The pre-calculated ECI from the dataverse data is not normalized but is close:

year mean_ECI sd_ECI
1995 -0.0138702 0.9869372
1996 -0.0172142 0.9702815
1997 -0.0013751 1.0293460
1998 -0.0184518 1.0017776
1999 -0.0009565 0.9855839
2000 0.0250194 0.9951708
2001 0.0522257 0.9701210
2002 0.0458795 0.9585416
2003 0.0377355 0.9721496
2004 0.0383981 0.9639836
2005 0.0233821 0.9737128
2006 0.0611096 0.9824983
2007 0.0534595 0.9712855
2008 0.0498620 0.9636734
2009 0.0881478 0.9712107
2010 0.0712081 0.9720490
2011 0.0646669 0.9753816
2012 0.0383435 1.0044073
2013 0.0426753 1.0085014
2014 0.0353947 0.9829397
2015 0.0392669 0.9958343
2016 0.0422366 0.9964626
2017 0.0275077 0.9895860
2018 0.0539169 0.9958367

After normalizing, it's close, but not identical:

image
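For anyone wanting to repeat the normalization step, here is a minimal sketch using only the standard library. It assumes the ECI values are available as `(year, location, eci)` tuples and that normalization means a per-year z-score using the population standard deviation. Both the input layout and the choice of population rather than sample standard deviation are assumptions, not something confirmed in this thread, and `standardize_by_year` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_year(rows):
    """Return (year, location, z) tuples, z-scoring ECI within each year.

    rows: iterable of (year, location, eci) tuples (assumed layout).
    Uses the population standard deviation (an assumption; swap in
    statistics.stdev for the sample version).
    """
    rows = list(rows)  # we iterate twice, so materialize first

    # Collect the ECI values belonging to each year.
    by_year = defaultdict(list)
    for year, _location, eci in rows:
        by_year[year].append(eci)

    # Per-year mean and standard deviation.
    stats = {}
    for year, values in by_year.items():
        sigma = pstdev(values)
        if sigma == 0:
            raise ValueError(f"ECI has zero variance in year {year}")
        stats[year] = (mean(values), sigma)

    # Rescale every value with the statistics of its own year.
    return [
        (year, location, (eci - stats[year][0]) / stats[year][1])
        for year, location, eci in rows
    ]
```

After this step every year has mean 0 and standard deviation 1, so any remaining gap between the two ECI series cannot come from a difference in scaling alone.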

With regard to any other data cleaning, I'm simply using the trade data that comes as part of the pre-calculated data from the dataverse, so I don't see how there can be any differences between the two. The number of locations and products by year in the indicators calculated by this package is identical to the number of locations and products by year in the dataverse data.

year locs.calculated prods.calculated locs.source prods.source
1995 231 1247 231 1247
1996 227 1247 227 1247
1997 227 1247 227 1247
1998 226 1247 226 1247
1999 226 1247 226 1247
2000 231 1248 231 1248
2001 233 1248 233 1248
2002 234 1248 234 1248
2003 233 1248 233 1248
2004 234 1248 234 1248
2005 233 1247 233 1247
2006 232 1248 232 1248
2007 233 1247 233 1247
2008 233 1247 233 1247
2009 233 1247 233 1247
2010 233 1245 233 1245
2011 235 1245 235 1245
2012 235 1246 235 1246
2013 237 1243 237 1243
2014 236 1242 236 1242
2015 235 1241 235 1241
2016 234 1240 234 1240
2017 236 1227 236 1227
2018 236 1225 236 1225
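A check like the one in the table above can be sketched as follows. The sketch assumes each dataset is available as `(year, location, product)` tuples. That layout and the helper names `counts_by_year` and `mismatched_years` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def counts_by_year(records):
    """Map each year to (number of distinct locations, number of distinct products).

    records: iterable of (year, location, product) tuples (assumed layout).
    """
    locations = defaultdict(set)
    products = defaultdict(set)
    for year, location, product in records:
        locations[year].add(location)
        products[year].add(product)
    return {
        year: (len(locations[year]), len(products[year]))
        for year in sorted(locations)
    }


def mismatched_years(calculated, source):
    """Return the sorted years whose counts differ between the two datasets.

    A year present in only one dataset is reported as a mismatch.
    """
    a = counts_by_year(calculated)
    b = counts_by_year(source)
    return sorted(year for year in set(a) | set(b) if a.get(year) != b.get(year))
```

Note that matching counts only show that the two datasets have the same dimensions per year. They do not show that the sets of locations and products are the same, or that the underlying trade values are.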

matuteiglesias avatar matuteiglesias commented on June 25, 2024

hamgamb avatar hamgamb commented on June 25, 2024

I'll just add that the ECI calculated using the R package referenced in #11 does agree with the ECI calculated using this Python package. So perhaps something different is being done to the data on the dataverse?

shreyasgm avatar shreyasgm commented on June 25, 2024

Sorry for the super-late response, @hamgamb, but if anyone else is looking for answers here, the short but possibly unsatisfying answer is that more data pre-processing goes into the dataverse data. The underlying algorithms used to generate the PCI / ECI values are the same, and the differences you rightly call out are a result of that pre-processing. If you reach out to the team that manages the data uploaded to the dataverse (atlas.cid.harvard.edu), they might be able to give you exact details of the pre-processing.
