Comments (11)

DrTimothyAldenDavis commented on September 17, 2024

from graphblas.

DrTimothyAldenDavis commented on September 17, 2024

eriknw commented on September 17, 2024

Aha, thanks, I should be able to make that work!

eriknw commented on September 17, 2024

I'm curious, how do the GxB import and export functions handle zombie bits? Is it always safe to go to/from standard CSC/CSR representation outside of GraphBLAS (such as if you wanted to export these to Matlab or numpy arrays)?

DrTimothyAldenDavis commented on September 17, 2024

Regarding running lots of user threads in GraphBLAS: this will get simpler in V4.0, when GrB_wait(void) is deleted. Currently, to support GrB_wait() with no input arguments, I must maintain a global list of all matrices with pending work, even if those matrices are unrelated and constructed by independent user threads. That's awkward. When GrB_wait(void) is deleted, this global queue of pending matrices will be deleted, and then your user threads will not interact at all.

eriknw commented on September 17, 2024

Sounds much better. Thanks for the info. It seems natural to call GrB_wait with arguments.

I admit, I haven't yet stressed GraphBLAS with lots of user threads, but I'm eagerly pushing forward!

victorstewart commented on September 17, 2024

I wrote a graph database around GraphBLAS, so I touched upon this when persisting my various matrices and vectors to disk as snapshots. This occurs either...

A) during regular operation at various intervals
B) upon process shutdown
or C) when replicating the entire graph to a new instance

For matrices, this is the binary schema I use when writing the data to disk (same idea for vectors):

nrows(8) ncols(8) nvec(8) ApSize(8) AhSize(8) AjSize(8) AxSize(8) sparsity(4) jumbled(1) (pad to 512 bytes) ApBytes(end 2MB aligned) AhBytes(end 2MB aligned) AjBytes(end 2MB aligned) AxBytes(end 2MB aligned)
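A minimal sketch of how a header like this could be packed and how the array offsets could be computed. The struct, the function names, host byte order, and the reading of "end 2MB aligned" as "each array's end offset in the file is rounded up to the next 2MB boundary" are all assumptions for illustration, not the poster's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEADER_BYTES 512u                  /* header is padded to 512 bytes */
#define CHUNK_BYTES  (2u * 1024u * 1024u)  /* assumed 2MB alignment unit */

/* Hypothetical in-memory view of the header fields in the schema. */
typedef struct {
    uint64_t nrows, ncols, nvec;
    uint64_t ap_size, ah_size, aj_size, ax_size;  /* array sizes in bytes */
    uint32_t sparsity;
    uint8_t  jumbled;
} snapshot_header;

/* Round n up to the next multiple of CHUNK_BYTES. */
static uint64_t align_up_2mb(uint64_t n) {
    return (n + CHUNK_BYTES - 1) / CHUNK_BYTES * CHUNK_BYTES;
}

/* Pack the fields back to back (host byte order, an assumption) into a
 * zero-filled 512-byte buffer. Returns the number of field bytes written. */
static size_t pack_header(const snapshot_header *h, uint8_t out[HEADER_BYTES]) {
    const uint64_t u64[7] = { h->nrows, h->ncols, h->nvec,
                              h->ap_size, h->ah_size, h->aj_size, h->ax_size };
    size_t off = 0;
    memset(out, 0, HEADER_BYTES);              /* zero padding after the fields */
    memcpy(out + off, u64, sizeof u64);        off += sizeof u64;  /* 7 x 8 bytes */
    memcpy(out + off, &h->sparsity, 4);        off += 4;
    memcpy(out + off, &h->jumbled, 1);         off += 1;
    assert(off <= HEADER_BYTES);
    return off;
}

/* Compute the file offset at which Ap, Ah, Aj, Ax each begin, assuming every
 * array's end is padded up to a 2MB boundary. Returns the total file size. */
static uint64_t array_offsets(const snapshot_header *h, uint64_t start[4]) {
    const uint64_t sizes[4] = { h->ap_size, h->ah_size, h->aj_size, h->ax_size };
    uint64_t pos = HEADER_BYTES;
    for (int i = 0; i < 4; i++) {
        start[i] = pos;
        pos = align_up_2mb(pos + sizes[i]);
    }
    return pos;
}
```

The seven 8-byte fields plus the 4-byte sparsity and 1-byte jumbled flag occupy 61 bytes, so the rest of the 512-byte header is padding. A reader would parse the header first, then call something like `array_offsets` to find each array.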

I think the current way of export + import is actually optimal for A) and B) given the initial copy elision and freedom to write the bytes to disk however you please (I use io_uring with 2MB chunks and nr_requests in flight).

C) is a bit trickier. Ideally you could transform the matrices into a copy-on-write state, so that you can stream them over the network without needing to either take the database offline or copy the entire thing, which may be memory prohibitive (which is why, for the moment, I elected to just serialize them all to disk and then use sendfile on a worker thread).
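For reference, a rough sketch of the "serialize to disk, then sendfile" fallback on Linux. The function name `send_snapshot` and its error handling are hypothetical; only the `sendfile(2)` call itself is the real system call, and the snapshot file and socket are assumed to be opened elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/* Stream the first `count` bytes of an already-open snapshot file to a
 * connected socket without copying through user space.
 * Returns 0 if all bytes were sent, -1 otherwise. */
static int send_snapshot(int sock_fd, int file_fd, off_t count) {
    off_t offset = 0;  /* sendfile advances this as bytes are sent */
    assert(count >= 0);
    while (offset < count) {
        ssize_t n = sendfile(sock_fd, file_fd, &offset,
                             (size_t)(count - offset));
        if (n < 0) {
            if (errno == EINTR) continue;  /* interrupted, retry */
            return -1;                     /* real error, errno is set */
        }
        if (n == 0) break;                 /* file shorter than expected */
    }
    return offset == count ? 0 : -1;
}
```

A worker thread would open the snapshot file, look up its size (for example with `fstat`), and pass both descriptors to a function like this, so the database itself keeps serving requests while the kernel moves the bytes.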

DrTimothyAldenDavis commented on September 17, 2024

eriknw commented on September 17, 2024

All sounds good. To @victorstewart's point, though, I agree that the current import/export is good enough and efficient enough to support my needs (I dup first then export the copy to leave the original untouched). If I need to send the data over the wire, then the protocol I will be using already samples and compresses if appropriate.

DrTimothyAldenDavis commented on September 17, 2024

I now have a draft implementation of GxB_Matrix_serialize / deserialize, as well as the GrB versions from the draft v2.0 spec (but with some required modifications). The draft v2.0 spec has some problems, and I think it will be updated soon.

My serialize/deserialize use lz4 or lz4hc, both for GrB and GxB versions.

These new functions are in the v5.2.0 alpha pre-release. That version passes all my tests, but there are some features drafted but not yet tested (the new GrB_select) and some others not yet written (GrB_apply with the new GrB_IndexUnaryOp).

I won't post v5.2.0 as a stable release until the draft v2.0 spec stabilizes. It will be pre-releases (v5.2.0 alphaN for N = 1, 2, ...) for a while.

DrTimothyAldenDavis commented on September 17, 2024

Serialization / deserialization is now fully tested in v5.2.0.alpha10, and the documentation is drafted.
