Comments (5)

wence- commented on June 22, 2024

[...]

Any thoughts?

Note that I do not know much about the details of HDF5's VFD interface, so I may well be being naive here.

At what level do you need thread-safety in the VFD interface? It looks to me like you're providing callbacks for read/write that HDF5 can use. If the HDF5 calls are single-threaded, you can presumably do whatever you like internally as long as you expose a "single-thread consistent" interface to HDF5.

Or is it not that easy?
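For illustration, here is roughly the kind of thing I have in mind: a sketch (pure Python, hypothetical names, not the real VFD signatures) of a read entry point that parallelizes internally but only returns once every sub-read has landed, so the caller sees an ordinary blocking read.

```python
import os
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=8)  # internal worker threads, invisible to the caller

def vfd_read(fd: int, addr: int, size: int, buf: bytearray, chunk: int = 4 << 20) -> None:
    """Fill `buf` with `size` bytes starting at file offset `addr`."""
    def read_range(off: int, n: int) -> None:
        data = os.pread(fd, n, addr + off)  # positional read, safe to issue from many threads
        buf[off:off + len(data)] = data

    futures = [
        _pool.submit(read_range, off, min(chunk, size - off))
        for off in range(0, size, chunk)
    ]
    for fut in futures:
        fut.result()  # block until every sub-read finishes; re-raises any worker exception
```

As long as HDF5 itself only ever calls this from one thread at a time, the internal parallelism never leaks out.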

madsbk commented on June 22, 2024

If the HDF5 calls are single-threaded, you can presumably do whatever you like internally as long as you expose a "single-thread consistent" interface to HDF5.

Correct, the VFD itself can be multi-threaded, but Legate uses threads (as opposed to processes) when parallelizing tasks on the same node. E.g., if two Legate tasks run on the same machine, their calls to HDF5 must be serialized.
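Concretely, the serialization I mean is something like this minimal sketch (assuming h5py as the HDF5 entry point; the lock name is illustrative):

```python
import threading
import h5py

_hdf5_lock = threading.Lock()  # one process-wide lock guarding every HDF5 call

def read_slice(path, dataset, selection):
    # Two Legate tasks running as threads on the same node take turns here,
    # so HDF5 never sees concurrent calls from within one process.
    with _hdf5_lock:
        with h5py.File(path, "r") as f:
            return f[dataset][selection]
```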

manopapad commented on June 22, 2024
  1. Supporting writes is pretty important, so I would vote against relying on Kerchunk for the long term.

  2. I am in favor of this one; more comments after (3).

  3. Legate specifically might be OK with single-thread-per-process (or at least serialized access from different threads within the same process), so the VFD approach doesn't need to wait on multi-threaded HDF5, at least for Legate. The reason is that we may have to switch to a rank-per-GPU default anyway (for the benefit of other libraries that just don't work under rank-per-node).

    The more fundamental problem for Legate is that we would have multiple processes trying to read/write the same HDF5 file; can the VFD approach handle that mode? On another thread you linked to https://forum.hdfgroup.org/t/parallel-read-of-a-single-hdf5-file/7960/4, which seems to suggest that the (only?) way to get safe multi-process access is to use an MPI-based VFD, and Legate is trying to move away from depending on MPI (as that throws a wrench into, e.g., the redistributability of builds).

    Implementing a Legate+Kvikio-aware VFD might be even more work than (2), but it would presumably work out-of-the-box with all HDF5 features.

    Also, you possibly have less control over how the underlying file I/O is invoked, so it might not be done in the most performant way possible (this is speculative; possibly this is not an issue, depending on what contract the VFD interface provides to the implementor).

    Note: All of the above is from the point of view of Legate; other clients might be more strict about the need for true multi-threading, and not care about including MPI.

So at this point I believe the question is: is it better to go through the "official" VFD extension interface, or to use the HDF5 API only up to the point where we get access to the underlying buffers, and from that point on proceed independently? The latter would be less constrained by the main HDF5 library's quirks and would have clearer performance characteristics, but wouldn't be as fully featured. Which alternative is more programming effort is unclear.

I am in favor of (2), but I am absolutely not an expert here.
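For concreteness, a rough sketch of what the latter alternative could look like for a contiguous, uncompressed dataset (assuming h5py plus KvikIO's Python CuFile API; the function name is illustrative):

```python
import cupy
import h5py
import kvikio

def read_dataset_to_gpu(path: str, name: str) -> cupy.ndarray:
    # Step 1: use HDF5 only for metadata -- where do the dataset's bytes live?
    with h5py.File(path, "r") as f:
        dset = f[name]
        offset = dset.id.get_offset()  # byte offset of the raw data within the file
        if offset is None:
            raise ValueError("dataset is not stored contiguously")
        shape, dtype = dset.shape, dset.dtype

    # Step 2: bypass HDF5 for the transfer and read straight into GPU memory.
    out = cupy.empty(shape, dtype=dtype)
    with kvikio.CuFile(path, "r") as f:
        f.read(out, out.nbytes, file_offset=offset)  # may go through cuFile/GDS underneath
    return out
```

A chunked or compressed dataset would instead require walking chunk offsets and handling filters, which is exactly where this route starts losing HDF5 features.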

madsbk commented on June 22, 2024

The more fundamental problem for Legate is that we would have multiple processes trying to read/write the same HDF5 file; can the VFD approach handle that mode?

In principle, yes. The MPI backend in HDF5 is itself implemented as a VFD. Reading should be straightforward, but in order to support writing we would have to implement something similar to the MPI-IO VFD.

madsbk commented on June 22, 2024

So at this point I believe the question is, is it better to go through the "official" VFD extension interface, or only use the HDF5 API up to the point where we get access to the underlying buffers, and from that point on proceed independently. The latter would be less constrained by the main HDF5 library's quirks, and would have clearer performance characteristics, but wouldn't be as fully-featured.

Very well put.

Which alternative is more programming effort is unclear.

That I can answer: option (2) is significantly less work, particularly if we want to support parallel writes to a single (uncompressed) file.
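To sketch why the uncompressed case is tractable (illustrative names; assumes h5py plus KvikIO's Python CuFile API): create the dataset once while serialized, record where its raw bytes start, and then each task writes its own disjoint slab of rows at the right byte offset without touching HDF5 again.

```python
import h5py
import kvikio

def create_file(path, name, shape, dtype):
    """Run once, serialized: reserve contiguous, uncompressed storage for a 2-D dataset."""
    with h5py.File(path, "w") as f:
        dset = f.create_dataset(name, shape=shape, dtype=dtype)  # no compression, no chunking
        dset[(0,) * len(shape)] = 0   # touch one element so HDF5 allocates the space now
        return dset.id.get_offset()   # byte offset where the raw data starts

def write_rows(path, data_offset, row_start, block):
    """Run per task/process: write rows [row_start, row_start + len(block)).
    `block` is a device (CuPy) or host array with the dataset's dtype and row width."""
    row_nbytes = block.shape[1] * block.dtype.itemsize
    with kvikio.CuFile(path, "r+") as f:
        f.write(block, block.nbytes,
                file_offset=data_offset + row_start * row_nbytes)
```

Since every writer touches a disjoint byte range, the writes need no coordination beyond the initial (serialized) file creation; compression or chunked layouts would break this, hence the "(uncompressed)" caveat.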
