Comments (7)

computron avatar computron commented on July 28, 2024

Also, it would be helpful if you could confirm that (i) I listed the correct references and (ii) the datasets I restored are correct (i.e. not out of date relative to your latest commit).

from matminer.

kylebystrom avatar kylebystrom commented on July 28, 2024

Yeah, I'll look into this today and make those updates. I'll see if I can make the dataframes exactly the same (i.e. with the same columns). Sorry I didn't get the tests and references done when I first wrote up the datasets package; I was rushing a bit too much.
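As a rough illustration of the "same columns" check mentioned above, here is a minimal sketch that compares the header rows of two CSV files using only the standard library. The function name `column_diff` and the column names in the sample data are hypothetical and are not taken from the matminer datasets.

```python
import csv
import io


def column_diff(old_csv, new_csv):
    """Compare the header rows of two CSV file objects.

    Returns (missing, extra): columns present only in the old file,
    and columns present only in the new file.
    """
    old_cols = next(csv.reader(old_csv))
    new_cols = next(csv.reader(new_csv))
    missing = [c for c in old_cols if c not in new_cols]
    extra = [c for c in new_cols if c not in old_cols]
    return missing, extra


# Hypothetical sample data standing in for two versions of a dataset.
old = io.StringIO("formula,band_gap,poly_total\nSiO2,5.6,4.1\n")
new = io.StringIO("formula,poly_total,n\nSiO2,4.1,1.5\n")

missing, extra = column_diff(old, new)
print("missing from new:", missing)
print("extra in new:", extra)
```

If both lists come back empty, the two files have the same set of columns, though not necessarily in the same order.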

Also, one of the reasons git lfs is useful (and why we will probably need a new repo or a separate data storage repo if we add a bunch of big datasets) is that making big changes to big files can make the git history very large. For example, every time I commit changes to the CSV files I could be changing about a MB of data, all of which has to go into the history. This is fine for just these three datasets if we can get them the way we want them and leave them that way, but we might want to purge the history for those datasets to save space once we finalize them.

computron avatar computron commented on July 28, 2024

Hi Kyle

Sure, we can use something like git BFG to clean out the history later. I'll probably want to do this for some of the IPython notebooks too, since those are really the biggest culprits. For now I think it's OK.

kylebystrom avatar kylebystrom commented on July 28, 2024

It looks like the citations and data are good to go, except that you listed the dielectric paper as 2017, whereas the link I originally downloaded it from says 2016. I'll fix that when I add the additional columns.

kylebystrom avatar kylebystrom commented on July 28, 2024

Hi @computron,

I pulled the useful descriptors out of the meta dictionary for the piezoelectric and dielectric datasets, and I fixed the date for the dielectric citation. Everything should be up to date, but I saved an exact copy of the previous datasets in case something gets messed up.
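For readers unfamiliar with this kind of change, the following is a minimal sketch of pulling selected fields out of a nested metadata dictionary into top-level columns. It is not the code used for the matminer datasets: the helper `promote_meta_fields`, the `meta` key name, and the sample fields (`point_group`, `nsites`) are assumptions made for illustration, and whether the original `meta` entry was kept afterwards is not stated in the thread.

```python
def promote_meta_fields(rows, keys, meta_column="meta"):
    """Copy selected keys from each row's nested meta dict to the top level.

    Rows are plain dicts; a key absent from a row's meta dict becomes None.
    The original rows are not modified, and the meta dict is kept as-is.
    """
    promoted = []
    for row in rows:
        meta = row.get(meta_column, {})
        new_row = dict(row)  # shallow copy so the input stays untouched
        for key in keys:
            new_row[key] = meta.get(key)
        promoted.append(new_row)
    return promoted


# Hypothetical records; the field names are illustrative only.
rows = [
    {"formula": "BaTiO3", "meta": {"point_group": "4mm", "nsites": 5}},
    {"formula": "AlN", "meta": {"point_group": "6mm"}},
]

flat = promote_meta_fields(rows, ["point_group", "nsites"])
for record in flat:
    print(record["formula"], record["point_group"], record["nsites"])
```

Filling absent keys with `None` keeps every row at the same set of columns, which is what a tabular dataset needs.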

computron avatar computron commented on July 28, 2024

Thanks! I updated the unit tests.

kylebystrom avatar kylebystrom commented on July 28, 2024

Ah shoot, sorry, I updated them too but forgot to push them. Thanks!
