Comments (17)
I'm trying to figure out if HDF5 compound types support self-referential types like this.
from hdf5.jl.
I had a quick look and couldn't find support for it. If HDF5 doesn't support it, one approach would be to wrap the data in a variable-length array type that stores all referenced copies together with indices that describe how they relate.
from hdf5.jl.
I just tried this patch and this does not seem to work either:
diff --git a/src/typeconversions.jl b/src/typeconversions.jl
index 1e82c659..a3f3a6b3 100644
--- a/src/typeconversions.jl
+++ b/src/typeconversions.jl
@@ -73,7 +73,8 @@ function hdf5_type_id(::Type{T}, isstruct::Val{true}) where {T}
     dtype = API.h5t_create(API.H5T_COMPOUND, sizeof(T))
     for (idx, fn) in enumerate(fieldnames(T))
         ftype = fieldtype(T, idx)
-        API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), hdf5_type_id(ftype))
+        _hdf5_type_id = ftype == T ? dtype : hdf5_type_id(ftype)
+        API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), _hdf5_type_id)
     end
     return dtype
 end
I get the following:
julia> using HDF5

julia> struct A
           x::A
       end

julia> HDF5.hdf5_type_id(A)
ERROR: HDF5.API.H5Error: Error adding field x to compound datatype
libhdf5 Stacktrace:
 [1] H5Tinsert: Invalid arguments to routine/Bad value
     can't insert compound datatype within itself
from hdf5.jl.
- We could proceed with my patch, which essentially lets the HDF5 library itself raise the error.
- We could just detect the condition and error directly. This would let us provide a more direct and friendlier error message.
from hdf5.jl.
Is this even supported by Julia? How do you construct an instance of A?
from hdf5.jl.
Ah, you have to do it via inner constructors... the problem is that you end up with an undef at some point, which I don't think we can really support.
from hdf5.jl.
julia> mutable struct A
           x::A
           function A()
               self = new()
               self.x = self
               return self
           end
       end

julia> a = A()
A(A(#= circular reference @-1 =#))

julia> pointer_from_objref(a)
Ptr{Nothing} @0x000000770cacabc0

julia> pointer_from_objref(a.x)
Ptr{Nothing} @0x000000770cacabc0

julia> unsafe_load(Ptr{Ptr{Nothing}}(pointer_from_objref(a)))
Ptr{Nothing} @0x000000770cacabc0
For a mutable struct, this just turns into a bunch of pointers.
To store this kind of structure in HDF5, I think we would need some kind of analog to pointers. Either we store the array index of what is being pointed at, or we use references somehow.
That said, I'm not sure I can think of a way to do that generically.
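To make the "store the array index" idea concrete, here is a minimal sketch in plain Julia. `Node`, `FlatNode`, `flatten`, and `unflatten` are hypothetical names, not part of HDF5.jl, and the sketch assumes a single chain of `next` references rather than an arbitrary object graph.

```julia
# Hypothetical example type: a mutable node that may point at another node (or itself).
mutable struct Node
    value::Int
    next::Union{Node,Nothing}
end

# Plain isbits record: `next` is a 1-based index into the flattened vector,
# with 0 meaning "no reference".
struct FlatNode
    value::Int
    next::Int
end

# Walk the chain starting at `root`, assigning each distinct object an index.
function flatten(root::Node)
    index = IdDict{Node,Int}()   # object identity -> position in `order`
    order = Node[]
    node = root
    while node !== nothing && !haskey(index, node)
        push!(order, node)
        index[node] = length(order)
        node = node.next
    end
    return [FlatNode(n.value, n.next === nothing ? 0 : index[n.next]) for n in order]
end

# Rebuild the objects, including cycles, from the flat records.
function unflatten(flat::Vector{FlatNode})
    nodes = [Node(f.value, nothing) for f in flat]
    for (n, f) in zip(nodes, flat)
        f.next != 0 && (n.next = nodes[f.next])
    end
    return nodes
end
```

The point of the design is that `FlatNode` is an isbits struct, so a `Vector{FlatNode}` has a fixed layout with no pointers in it. A fully generic version would need to handle several reference fields and mixed types, which is the hard part mentioned above.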
from hdf5.jl.
Actually, I think there are non-obvious ways to create multi-type self-referential cycles that still freeze the patched hdf5_type_id:
mutable struct B
    x::Any
end
struct A
    x::B
end
a = A(B(nothing))
a.x.x = a
So to catch everything, the types encountered so far all need to be kept track of. Serializing this kind of stuff generically seems quite challenging. Maybe it is possible using the HDF5-native reference system?
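As a rough illustration of "keeping track of the types encountered so far", the sketch below walks declared field types and reports whether a type is reachable from its own fields. `has_type_cycle` and the example types `Inner`, `Middle`, and `Outer` are hypothetical names, not HDF5.jl API. It inspects declared field types only, so an abstract field such as `x::Any` is treated as a leaf.

```julia
# Hypothetical example types that form a cycle through type parameters.
struct Inner{T}
    x::T
end
struct Middle{T}
    x::Inner{T}
end
struct Outer
    x::Middle{Outer}
end

# Return true if `T` is reachable from its own (declared) field types.
# `in_progress` is the stack of types currently being descended into.
function has_type_cycle(T::Type, in_progress::Vector{Any}=Any[])
    any(S -> S === T, in_progress) && return true
    # Only concrete struct types have a fixed field layout to descend into.
    (isconcretetype(T) && isstructtype(T)) || return false
    push!(in_progress, T)
    try
        return any(i -> has_type_cycle(fieldtype(T, i), in_progress), 1:fieldcount(T))
    finally
        pop!(in_progress)   # unwind so sibling fields start from the same stack
    end
end

has_type_cycle(Outer)                # true: Outer -> Middle{Outer} -> Inner{Outer} -> Outer
has_type_cycle(Tuple{Int,Float64})   # false
```

A check like this could run before any HDF5 call, so the error can name the offending type directly instead of hanging.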
from hdf5.jl.
To store this kind of structure in HDF5, I think we would need some kind of analog to pointers. Either we store the array index of what is being pointed at, or we use references somehow.
One solution would be to only support isbitstypes by default: any other Julia types would require some sort of manual conversion function.
from hdf5.jl.
I think this would be fine if it errored instead of hanging. I think we need a type id cache to break cycles.
from hdf5.jl.
Self-referential structs are not isbitstype, so failing for non-isbitstypes also removes any hangs.
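A guard along these lines could be as small as the following sketch. `require_isbits` is a hypothetical helper name, not an existing HDF5.jl function, and the error text is only a suggestion.

```julia
# Hypothetical guard: reject any type that is not plain bits before a datatype is built.
function require_isbits(::Type{T}) where {T}
    isbitstype(T) || throw(ArgumentError(
        "$T is not an isbits type; convert it to an isbits representation " *
        "(for example a NamedTuple of plain fields) before writing."))
    return T
end

require_isbits(Tuple{Int,Float64})   # passes and returns the type
# require_isbits(Vector{Int})        # would throw an ArgumentError
```

Because `isbitstype` is false for every mutable or self-referential type, the check never recurses and cannot hang.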
from hdf5.jl.
My current solution is the following.
import HDF5: hdf5_type_id, API
function hdf5_type_id(::Type{T}, isstruct::Val{true}) where {T}
    cache = try
        task_local_storage(:hdf5_type_id_cache)::Dict{DataType,Int}
    catch
        task_local_storage(:hdf5_type_id_cache, Dict{DataType, Int}())
    end
    if haskey(cache, T)
        error("Cannot create a HDF5 datatype with fields containing that datatype.")
    end
    dtype = API.h5t_create(API.H5T_COMPOUND, sizeof(T))
    cache[T] = dtype
    try
        for (idx, fn) in enumerate(fieldnames(T))
            ftype = fieldtype(T, idx)
            _hdf5_type_id = hdf5_type_id(ftype)
            API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), _hdf5_type_id)
        end
    catch err
        rethrow(err)
    finally
        delete!(cache, T)
    end
    return dtype
end
from hdf5.jl.
julia> struct C{A}
           x::A
       end

julia> struct B{A}
           x::C{A}
       end

julia> struct A
           x::B{A}
       end

julia> HDF5.hdf5_type_id(A)
ERROR: Cannot create a HDF5 datatype with fields containing that datatype.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] hdf5_type_id(#unused#::Type{A}, isstruct::Val{true})
    @ Main ./REPL[3]:8
  [3] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [4] hdf5_type_id(#unused#::Type{C{A}}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [5] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [6] hdf5_type_id(#unused#::Type{B{A}}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [7] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [8] hdf5_type_id(#unused#::Type{A}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [9] hdf5_type_id(#unused#::Type{A})
    @ HDF5 ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71
 [10] top-level scope
    @ REPL[13]:1
from hdf5.jl.
Any feedback on the above?
from hdf5.jl.
It seems pretty bulletproof to me. I'm not an expert on task_local_storage, but I also don't see a simpler way to do it without. In any case, it is much better than the current behavior.
from hdf5.jl.
Can we even write non-isbitstypes? Why not just throw an error?
from hdf5.jl.
A mutable struct is not a bitstype. I'm not sure if we can write them yet, but I do think it would be possible in the future for us to write some simple mutable structs. If a mutable struct contains all bitstypes, then I think we can map it to a corresponding NamedTuple, which we could then write.
julia> struct Foo
           x::Int
       end

julia> mutable struct Bar
           x::Int
       end

julia> HDF5.Datatype(HDF5.hdf5_type_id(Foo))
HDF5.Datatype: H5T_COMPOUND {
      H5T_STD_I64LE "x" : 0;
   }

julia> HDF5.Datatype(HDF5.hdf5_type_id(Bar))
HDF5.Datatype: H5T_COMPOUND {
      H5T_STD_I64LE "x" : 0;
   }

julia> isbitstype(Foo)
true

julia> isbitstype(Bar)
false

julia> to_namedtuple(x::T) where T = NamedTuple{fieldnames(T)}(ntuple(i->getfield(x,i), fieldcount(T)))
to_namedtuple (generic function with 1 method)

julia> to_namedtuple(Bar(5))
(x = 5,)
from hdf5.jl.