Git Product home page Git Product logo

Comments (17)

mkitti avatar mkitti commented on June 4, 2024

I'm trying to figure out if HDF5 compound types support self-referential types like this.

from hdf5.jl.

lukas-weber avatar lukas-weber commented on June 4, 2024

When I had a short look, I couldn’t find it. If it doesn’t, one approach would be to wrap it in a variable length array type that stores all referenced copies together with indices that show their relation.

from hdf5.jl.

mkitti avatar mkitti commented on June 4, 2024

I just tried this patch and this does not seem to work either:

diff --git a/src/typeconversions.jl b/src/typeconversions.jl
index 1e82c659..a3f3a6b3 100644
--- a/src/typeconversions.jl
+++ b/src/typeconversions.jl
@@ -73,7 +73,8 @@ function hdf5_type_id(::Type{T}, isstruct::Val{true}) where {T}
     dtype = API.h5t_create(API.H5T_COMPOUND, sizeof(T))
     for (idx, fn) in enumerate(fieldnames(T))
         ftype = fieldtype(T, idx)
-        API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), hdf5_type_id(ftype))
+        _hdf5_type_id = ftype == T ? dtype : hdf5_type_id(ftype)
+        API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), _hdf5_type_id)
     end
     return dtype
 end

I get the following:

julia> using HDF5

julia> struct A
           x::A
       end

julia> HDF5.hdf5_type_id(A)
ERROR: HDF5.API.H5Error: Error adding field x to compound datatype
libhdf5 Stacktrace:
 [1] H5Tinsert: Invalid arguments to routine/Bad value
     can't insert compound datatype within itself

from hdf5.jl.

mkitti avatar mkitti commented on June 4, 2024
  1. We could proceed with my patch which essentially allows the HDF5 library itself to error.
  2. We could just detect the condition and error directly. This may allow for us to provide a more direct and friendlier error message.

from hdf5.jl.

simonbyrne avatar simonbyrne commented on June 4, 2024

Is this even supported by Julia? How do you construct an instance of A?

from hdf5.jl.

simonbyrne avatar simonbyrne commented on June 4, 2024

ah, you have to do it via inner constructors... the problem is that you end up with an undef at some point, which I don't think we can really support.

from hdf5.jl.

mkitti avatar mkitti commented on June 4, 2024
julia> mutable struct A
           x::A
           function A()
               self = new()
               self.x = self
               return self
           end
       end

julia> a = A()
A(A(#= circular reference @-1 =#))

julia> pointer_from_objref(a)
Ptr{Nothing} @0x000000770cacabc0

julia> pointer_from_objref(a.x)
Ptr{Nothing} @0x000000770cacabc0

julia> unsafe_load(Ptr{Ptr{Nothing}}(pointer_from_objref(a)))
Ptr{Nothing} @0x000000770cacabc0

For a mutable struct, this just turns into a bunch of pointers.

To store this kind of structure in HDF5 I think we would need some kind of analog to pointers. Either we store the array index to what is being pointed at or we use references some how.

That said I'm not sure if I can think of a way to do that generically.

from hdf5.jl.

lukas-weber avatar lukas-weber commented on June 4, 2024

Actually, I think there are non-obvious ways to create multi-type self-referential cycles that still freeze the patched hdf5_type_id

mutable struct B
   x::Any
end

struct A
   x::B
end

a = A(B(nothing))
a.x.x=a

So to catch everything, the types encountered so far all need to be kept track of. Serializing this kind of stuff generically seems quite challenging. Maybe it is possible using the HDF5-native reference system?

from hdf5.jl.

simonbyrne avatar simonbyrne commented on June 4, 2024

To store this kind of structure in HDF5 I think we would need some kind of analog to pointers. Either we store the array index to what is being pointed at or we use references some how.

One solution would be to only support isbitstypes by default: any other Julia types would require some sort of manual conversion function.

from hdf5.jl.

mkitti avatar mkitti commented on June 4, 2024

I think this would be fine if it errored instead of hung. I think we need a type id cache to break cycles.

from hdf5.jl.

lukas-weber avatar lukas-weber commented on June 4, 2024

Self-referential structs are not isbitstype so failing for non-isbitstypes also removes any hangs.

from hdf5.jl.

mkitti avatar mkitti commented on June 4, 2024

My current solution is the following.

import HDF5: hdf5_type_id, API

function hdf5_type_id(::Type{T}, isstruct::Val{true}) where {T}
    cache = try
        task_local_storage(:hdf5_type_id_cache)::Dict{DataType,Int}
    catch
        task_local_storage(:hdf5_type_id_cache, Dict{DataType, Int}())
    end
    if haskey(cache, T)
        error("Cannot create a HDF5 datatype with fields containing that datatype.")
    end
    dtype = API.h5t_create(API.H5T_COMPOUND, sizeof(T))
    cache[T] = dtype
    try
        for (idx, fn) in enumerate(fieldnames(T))
            ftype = fieldtype(T, idx)
            _hdf5_type_id = hdf5_type_id(ftype)
            API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), _hdf5_type_id)
        end
    catch err
        rethrow(err)
    finally
        delete!(cache, T)
    end
    return dtype
end

from hdf5.jl.

mkitti avatar mkitti commented on June 4, 2024
julia> struct C{A}
           x::A
       end

julia> struct B{A}
           x::C{A}
       end

julia> struct A
           x::B{A}
       end

julia> HDF5.hdf5_type_id(A)
ERROR: Cannot create a HDF5 datatype with fields containing that datatype.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] hdf5_type_id(#unused#::Type{A}, isstruct::Val{true})
    @ Main ./REPL[3]:8
  [3] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [4] hdf5_type_id(#unused#::Type{C{A}}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [5] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [6] hdf5_type_id(#unused#::Type{B{A}}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [7] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [8] hdf5_type_id(#unused#::Type{A}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [9] hdf5_type_id(#unused#::Type{A})
    @ HDF5 ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71
 [10] top-level scope
    @ REPL[13]:1

from hdf5.jl.

mkitti avatar mkitti commented on June 4, 2024

Any feedback on the above?

from hdf5.jl.

lukas-weber avatar lukas-weber commented on June 4, 2024

It seems pretty bullet proof to me. I’m not an expert on task_local_storage, but also don’t see a simpler way to do without. In any case, it is much better than the current behavior.

from hdf5.jl.

simonbyrne avatar simonbyrne commented on June 4, 2024

Can we even write non-isbitstypes? Why not just throw an error?

from hdf5.jl.

mkitti avatar mkitti commented on June 4, 2024

A mutable struct is not a bitstype. I'm not sure if we can write them yet, but I do think it would be possible in the future for us to write some simple mutable structs. If a mutable struct contains all bitstypes, then I think we can map it to a corresponding NamedTuple which we could then write.

julia> struct Foo
           x::Int
       end

julia> mutable struct Bar
           x::Int
       end

julia> HDF5.Datatype(HDF5.hdf5_type_id(Foo))
HDF5.Datatype: H5T_COMPOUND {
      H5T_STD_I64LE "x" : 0;
   }

julia> HDF5.Datatype(HDF5.hdf5_type_id(Bar))
HDF5.Datatype: H5T_COMPOUND {
      H5T_STD_I64LE "x" : 0;
   }

julia> isbitstype(Foo)
true

julia> isbitstype(Bar)
false

julia> to_namedtuple(x::T) where T = NamedTuple{fieldnames(T)}(ntuple(i->getfield(x,i), fieldcount(T)))
to_namedtuple (generic function with 1 method)

julia> to_namedtuple(Bar(5))
(x = 5,)

from hdf5.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.