

ardunn commented on August 19, 2024

After examining this a bit, the load time seems to break down as follows:

  • Reading the data file into a dictionary: 58% of the time
  • Creating the Datapath object (from_dict): 42% of the time

By simply not adding the raw data to the object, we can reduce the "Creating the Datapath object" step by about 78%, which reduces the total load time by about 31%. Source: repeatedly loading the large structured test file (250 MB) on a laptop.
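For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of timing the two stages separately. `FakeDatapath`, its field names, and the helper functions are hypothetical stand-ins for illustration, not the beep API. The last helper just spells out the arithmetic: with the rounded figures above, a 78% cut to a stage that takes 42% of the load works out to roughly a third of the total, in the same range as the measured 31%.

```python
import json
import time


class FakeDatapath:
    """Hypothetical stand-in for the real datapath class; not the beep API."""

    def __init__(self, raw_data, structured_data):
        self.raw_data = raw_data
        self.structured_data = structured_data

    @classmethod
    def from_dict(cls, d):
        # Copying each list stands in for the per-field work a real from_dict does.
        return cls(
            raw_data=list(d.get("raw_data", [])),
            structured_data=list(d.get("structured_data", [])),
        )


def time_load(serialized: str):
    """Return (read_seconds, build_seconds) for one load of a JSON string."""
    t0 = time.perf_counter()
    d = json.loads(serialized)      # stage 1: data file -> dictionary
    t1 = time.perf_counter()
    FakeDatapath.from_dict(d)       # stage 2: dictionary -> object
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1


def total_saving(build_share: float, build_reduction: float) -> float:
    """Fraction of total load time saved when only the build stage gets faster."""
    return build_share * build_reduction
```

Averaging `time_load` over several repeats on the same file gives the per-stage shares; `total_saving(0.42, 0.78)` then estimates the overall gain from the rounded percentages.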

This simple fix might also take care of the memory issues.

I'm not sure of the best way to programmatically skip loading the raw data from disk into the dictionary. One way around it would be to change the defaults so that raw data is not saved unless specified; the current BEEPDatapath code would then automatically ignore the missing data and mark the run as "legacy". I'll update my PR with some timing results.
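As a rough illustration of the other option (dropping the raw data after the file has been parsed but before the object is built), something like the following could work. The key name `"raw_data"` is an assumption for the sketch, not necessarily the key the serialized files use. Note that this only saves time in the object-creation step; the raw data is still parsed from disk.

```python
def strip_raw(d: dict, raw_key: str = "raw_data") -> dict:
    """Return a shallow copy of d without the raw-data entry.

    The default key name is an assumption; adjust it to match the real file.
    The input dictionary is left unmodified.
    """
    return {k: v for k, v in d.items() if k != raw_key}
```

The stripped dictionary would then be passed to `from_dict` in place of the full one.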

Edit:

I decided to simply not save raw data by default when using the CLI. It's the easiest and cleanest solution, and it requires no changes to as_dict/from_dict. See #250.
