
Comments (4)

jkoeller commented on August 19, 2024

I just let it run for a while, and you're right @patrickherring-TRI it does eventually process after not appearing to make any progress for a while. I was able to process all of the files with an average time of 15 min / file. Interestingly, the RAM utilization (at least in the beginning, while appearing not to progress) is quite low. Not sure what the slowdown is. The files of size < 160 MB process in around 1 min or so.

Thanks for looking into this!

from beep.

patrickherring-TRI commented on August 19, 2024

@jkoeller Hi, thanks for reporting this!
I tried out a couple of the files that you showed above, and it looks like the structuring succeeds, although it does take longer than I would expect. I will look into which process is taking up so much CPU time.

One thing that might be happening is an out-of-memory issue causing your kernel to crash. Generally, the structuring process takes ~5x the memory footprint of the raw file (pandas being what it is), so a 300MB file might require ~2GB of RAM to complete successfully.

After structuring, the structured file has a considerably smaller memory footprint and is much easier to work with.
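As a rough illustration of the ~5x rule of thumb mentioned above, here is a minimal sketch that estimates how much RAM structuring might need for a given raw file. The function names and the `STRUCTURING_MEMORY_FACTOR` constant are hypothetical and not part of beep; the factor is only the approximate figure quoted in this thread, not a measured value.

```python
import os

# Assumption: ~5x multiplier taken from the rule of thumb in this thread,
# not from any measurement or from the beep codebase.
STRUCTURING_MEMORY_FACTOR = 5


def estimate_structuring_memory_bytes(raw_file_bytes, factor=STRUCTURING_MEMORY_FACTOR):
    """Return a rough RAM estimate (in bytes) for structuring a raw file."""
    return raw_file_bytes * factor


def estimate_for_file(path, factor=STRUCTURING_MEMORY_FACTOR):
    """Apply the same estimate to a file on disk (hypothetical helper)."""
    return estimate_structuring_memory_bytes(os.path.getsize(path), factor)


if __name__ == "__main__":
    raw_bytes = 300 * 10**6  # a 300 MB raw file
    estimate_gb = estimate_structuring_memory_bytes(raw_bytes) / 1e9
    print(f"Estimated RAM needed: {estimate_gb:.1f} GB")
```

For a 300 MB file this gives roughly 1.5 GB, which is in the same range as the ~2GB figure above once some headroom is allowed.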


ardunn commented on August 19, 2024

@patrickherring-TRI Some of these memory issues might be fixed by using in-place operations on the dataframes. For example, in RawCyclerRun.get_interpolated_data() the dataframe seems to be copied in memory several times.
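To make the idea concrete, here is a minimal sketch contrasting a copying pattern with an in-place one in pandas. This is not the actual code in RawCyclerRun.get_interpolated_data(); the column names (`voltage`, `current`, `power`) and both function names are hypothetical and chosen only for illustration.

```python
import pandas as pd


def add_power_copying(df):
    """Copying pattern: duplicates the whole frame before adding a column."""
    out = df.copy()  # allocates a second full copy of the data
    out["power"] = out["voltage"] * out["current"]
    return out


def add_power_in_place(df):
    """In-place pattern: adds the column to the caller's frame directly."""
    df["power"] = df["voltage"] * df["current"]  # only the new column is allocated
    return df


if __name__ == "__main__":
    # Hypothetical toy data standing in for a raw cycler file.
    frame = pd.DataFrame({"voltage": [3.0, 4.0], "current": [1.0, 0.5]})
    result = add_power_in_place(frame)
    print(result is frame)  # the same object is returned, no full copy was made
```

The trade-off is that the in-place version mutates the caller's dataframe, so it only fits where the original data is no longer needed afterwards.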


patrickherring-TRI commented on August 19, 2024

This delay appears to be due to the initial data loading process, which can be resource intensive for large files. Newer pandas functions might be able to speed this up, but a better solution will be the larger refactor of the structuring module. Closing this issue for now since it is system dependent.

