
Comments (3)

DrTimothyAldenDavis avatar DrTimothyAldenDavis commented on August 14, 2024

The timings are misleading in this example. You've built the matrix with setElement, which is the slower way to do it, and it also leaves the matrix as a pile of unsorted tuples. If you want to time just the dup, you need to call GrB_wait on the matrix first, before starting the timer. Otherwise, GrB_Matrix_dup must first call GrB_wait itself on the matrix, which does the work of GrB_Matrix_build on the unsorted tuples from setElement.

Serialization is likely the best way to send a matrix across a network. That's what it's designed for. An alternative would be to use no compression at all, which leads to a faster serialize time, but requires more bytes to send. Another alternative would be to use GxB_Matrix_unpack to unpack the matrix in O(1) time, transmit the pieces, and then GxB_Matrix_pack it again.

from graphblas.

victorstewart avatar victorstewart commented on August 14, 2024

magic!!! i knew about needing to wait for operations to execute of course but i didn't realize it applied to setElement as well.

this order of magnitude drop in time cost buys me a lot of time until i have to worry about this problem again. thanks so much.

i see now that the serialization without compression is essentially an allocate + memcpy as well. so i'll just copy the matrix on the "main thread" then serialize it with GxB_COMPRESSION_ZSTD on a worker thread.

the memory consumption of the matrix also fell to 37.6% of the original after GrB_MATERIALIZE. that's only ~8 bytes per uint8_t element instead of the ~22 bytes i assumed before. the GrB_Matrix is now about 2.8x the size of the serialized matrix, whereas before it was about 7.4x, so the gap shrank by ~2.7x. that makes it much easier to justify overcommitting the database's memory to cover the transient cost of copy then compress.
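The arithmetic behind those figures can be checked with a few lines of C. This is only a sketch: it assumes 1 MB means 10^6 bytes, and the struct and function names are made up for illustration. The inputs are the sizes reported in the test output (770.5MB matrix, 275.6MB compressed blob, 100000000 elements) and the 37.6% figure quoted above.

```c
#include <stdio.h>

// Ratios derived from the reported sizes (hypothetical helper type).
typedef struct {
    double bytes_per_element;  // materialized matrix bytes per stored element
    double ratio_after;        // materialized matrix size / compressed blob size
    double ratio_before;       // pre-materialize matrix size / compressed blob size
} MemRatios;

// matrix_mb: size of the materialized matrix in MB (assumed 10^6 bytes each)
// blob_mb: size of the compressed serialized blob in MB
// n_elements: number of stored entries
// fraction_of_original: materialized size as a fraction of the earlier size
MemRatios mem_ratios(double matrix_mb, double blob_mb,
                     double n_elements, double fraction_of_original)
{
    MemRatios r;
    r.bytes_per_element = matrix_mb * 1e6 / n_elements;
    r.ratio_after = matrix_mb / blob_mb;
    // The earlier size is recovered by undoing the reported shrink.
    r.ratio_before = (matrix_mb / fraction_of_original) / blob_mb;
    return r;
}

// Print the three ratios with one decimal place.
void print_mem_ratios(MemRatios r)
{
    printf("bytes per element: %.1f\n", r.bytes_per_element);
    printf("matrix/blob after materialize: %.1fx\n", r.ratio_after);
    printf("matrix/blob before materialize: %.1fx\n", r.ratio_before);
}
```

Calling `print_mem_ratios(mem_ratios(770.5, 275.6, 1e8, 0.376))` gives roughly 7.7 bytes per element and ratios of about 2.8x and 7.4x, in line with the numbers in the comment.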

i need to preserve the matrices so the unpack way isn't an option, but its time cost would be the same as these other ways.

root@clr-b5df9984821e4d129387e172044f5754~ # ./graphreplication.test
milliseconds to GrB_MATERIALIZE: 3354ms
1000000x1000000 matrix with 100000000 elements consumes 770.5MB
milliseconds to allocate + copy the matrix: 76ms
milliseconds to allocate + serialize the matrix with compression: 776ms
size of serialized compressed matrix: 275.6MB
milliseconds to allocate + serialize the matrix NO compression: 72ms
size of serialized matrix without compression: 770.5MB


victorstewart avatar victorstewart commented on August 14, 2024

circling back on this. i realized the optimal solution is to fork the process to get copy-on-write matrices, and then serialize on that new process. this requires no downtime and the absolute minimal extra memory cost.
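A minimal POSIX sketch of the fork idea, using a plain byte buffer as a stand-in for the matrix. The function name `snapshot_in_child` is hypothetical, and the child here just streams the bytes into a pipe where real code would serialize and send them. Whether a GraphBLAS call is safe inside the forked child depends on the threading runtime in use (fork only carries the calling thread into the child) and is not verified here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child that sees a copy-on-write snapshot of data[0..len) as it was
// at the moment of the fork, and streams that snapshot into a pipe.
// Returns the child's pid and stores the pipe's read end in *read_fd,
// or returns -1 on failure.
pid_t snapshot_in_child(const uint8_t *data, size_t len, int *read_fd)
{
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        // Child: later writes by the parent do not show up here, because the
        // kernel copies a page only when one side modifies it.
        close(fds[0]);
        size_t off = 0;
        while (off < len) {
            ssize_t n = write(fds[1], data + off, len - off);
            if (n <= 0) _exit(1);
            off += (size_t) n;
        }
        close(fds[1]);
        _exit(0);
    }

    // Parent: keeps running and may modify data immediately.
    close(fds[1]);
    *read_fd = fds[0];
    return pid;
}
```

The parent can keep mutating its buffer right after the call returns; the caller should still read from `*read_fd` and `waitpid` on the returned pid so the child is reaped. The extra memory is limited to the pages the parent touches while the child is alive.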
