gavalian / hipo
High Performance Output Data format for experimental Physics
Some users may rely on the submodule version of lz4 if find_package(LZ4) fails. The submodule version's ref is somewhere between lz4 tags v1.9.0 and v1.9.1.
The latest tag is v1.9.4, and although this is a difference in patch version number, the lz4 release notes include several updates, including some performance improvements.
Currently the schema parse command should not include spaces after the commas, like this:
schemaPart.parse("pid/S,px/F,py/F,pz/F");
With spaces introduced, the behavior is erratic. Spaces should either be allowed, or a proper warning or error should be raised (or an exception thrown) when a space is present.
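Until the parser handles this itself, one possible workaround is to normalize the format string before passing it on. The sketch below is only an illustration: stripSchemaWhitespace is a hypothetical helper, not part of hipo, and it assumes that whitespace is never meaningful inside a schema format string.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of hipo): remove every whitespace character,
// so that "pid/S, px/F" becomes "pid/S,px/F" before it reaches parse().
std::string stripSchemaWhitespace(std::string format) {
  format.erase(std::remove_if(format.begin(), format.end(),
                              [](unsigned char c) { return std::isspace(c) != 0; }),
               format.end());
  return format;
}
```

A call could then look like schemaPart.parse(stripSchemaWhitespace("pid/S, px/F, py/F, pz/F").c_str()); whether parse takes a C string or a std::string is an assumption here.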
I'm using the Python API you presented at the last software tutorial to read hipo files (super useful!!) but some bank columns are not retrieved correctly.
After a bit of investigation, I think the issue is due to a missing definition for one of the column types.
Specifically, the missing definition is in the getType method in hipolib.py, which establishes whether you want to get an int or a float. For a type greater than 4 it does not enter any if case, and an empty list is returned.
https://github.com/gavalian/hipo/blob/master/extensions/python/hipolib.py#L97
Snippet code to be run in the hipo/extensions/python dir (I provide the path to the file I used)
# to be run in hipo/extensions/python dir
from hipolib import hreader
reader = hreader('../../slib/')
f = reader.open('/work/clas12/vmascagn/ltcc_run006380.hipo')
# define the 2 banks I'm interested in for this example
reader.define('LTCC::adc')
reader.define('RUN::config')
# loop on the first 4 events, print some values just to check, and use the show method to check the column type
counter = 0
while reader.next():
    # 2 random banks of LTCC
    print('LTCC sector',reader.getEntry('LTCC::adc', 'sector'))
    print('LTCC sector',reader.getEntry('LTCC::adc', 'ADC'))
    # 3 from RUN::config
    print('Number of run',reader.getEntry('RUN::config',"run"))
    print('RUN event',reader.getEntry('RUN::config',"event"))
    print('RUN trigger',reader.getEntry('RUN::config',"trigger"))
    reader.show('RUN::config',"trigger")
    print("-"*30)
    counter += 1
    if counter > 3:
        break
The output I get is this one, where I can see the "trigger" value from "RUN::config" bank is not retrieved (an empty list is always returned). I also noticed that the type is "8" which has no defined behavior as explained above.
The same is valid for "timestamp" (even if not printed here)
file open handle = 3
LTCC sector [3, 5]
LTCC sector [0, 186]
Number of run [6380]
RUN event [1]
RUN trigger []
entry = trigger type = 8
------------------------------
LTCC sector [5]
LTCC sector [235]
Number of run [6380]
RUN event [2]
RUN trigger []
entry = trigger type = 8
------------------------------
LTCC sector [3, 3]
LTCC sector [214, 7532]
Number of run [6380]
RUN event [5]
RUN trigger []
entry = trigger type = 8
------------------------------
LTCC sector [3]
LTCC sector [219]
Number of run [6380]
RUN event [7]
RUN trigger []
entry = trigger type = 8
At the moment, the hipo::reader::open method in the C++ API does not understand the wildcard "*".
It would be good to make it work with wildcards.
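Until then, callers on POSIX systems could expand the pattern themselves and open each match in turn. The following is a minimal sketch using glob() from <glob.h>; expandWildcard is a hypothetical helper name, and nothing here is part of the hipo API.

```cpp
#include <cstddef>
#include <glob.h>
#include <string>
#include <vector>

// Hypothetical helper (not part of hipo): expand a shell-style pattern such
// as "data/*.hipo" into the list of existing paths that match it.
std::vector<std::string> expandWildcard(const std::string& pattern) {
  std::vector<std::string> paths;
  glob_t results{};
  // glob() returns 0 on success; no match or an error leaves the list empty.
  if (glob(pattern.c_str(), 0, nullptr, &results) == 0) {
    for (std::size_t i = 0; i < results.gl_pathc; ++i)
      paths.emplace_back(results.gl_pathv[i]);
  }
  globfree(&results);
  return paths;
}
```

Each returned path could then be passed to reader.open(path.c_str()) in a loop.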
hipo4/node.h:83:29: warning: comparison of integer expressions of different signedness: 'std::vector::size_type' {aka 'long unsigned int'} and 'int' [-Wsign-compare]
Not a major issue, but this warning comes up hundreds of times when compiling gemc.
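A common way to silence -Wsign-compare without changing behavior is to reject negative values first and then compare in the container's own unsigned type. The sketch below shows the pattern on a hypothetical bounds check; it is not the actual code at node.h:83.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bounds check in the style of the flagged comparison.
bool indexInRange(const std::vector<int>& values, int index) {
  // Reject negatives first, then compare as unsigned: no mixed-sign comparison.
  return index >= 0 && static_cast<std::size_t>(index) < values.size();
}
```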
auto px = particleBank.get("px", row); // type is `double`, even though px is `float` in the schema
auto px = particleBank.get<float>("px", row); // still appears to be a `double`
This is because the default type in template<typename T = double> T get is set to double, to give the compiler a "default", but that's not the correct thing to do here. We need to fix the type deduction, but the types are stored in hipo::Type, an enumerator:
Lines 36 to 43 in 9122a65
I tried something based on https://stackoverflow.com/a/58622942 but I was unsuccessful, since I need bankSchema in the trailing return type (to look up the type), but bankSchema is not static; I was also not successful simply declaring bankSchema as static, since parameter item seems to not be accessible in the trailing return type.
After a few hours trying to resolve this, it's time to ask ChatGPT.
Alternatively, give up and remove these templated get methods (whereas the templated put seems to work fine, since the type is deduced by the parameter type).
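The underlying obstacle is that a function's return type must be fixed at compile time, while the column type is only known at run time from the schema. One alternative, sketched here under assumptions, is to return a std::variant (C++17) that carries whichever alternative the schema names. ColumnType, ColumnValue, and makeValue are hypothetical stand-ins; the real hipo::Type enumerator and storage differ.

```cpp
#include <cstdint>
#include <stdexcept>
#include <variant>

// Stand-in for an enumerator like hipo::Type; the real values may differ.
enum class ColumnType { Int, Float, Double };

using ColumnValue = std::variant<std::int32_t, float, double>;

// The column type is a run-time value, so the return type cannot depend on
// it; the variant holds the matching alternative instead.
ColumnValue makeValue(ColumnType type, double raw) {
  switch (type) {
    case ColumnType::Int:    return static_cast<std::int32_t>(raw);
    case ColumnType::Float:  return static_cast<float>(raw);
    case ColumnType::Double: return raw;
  }
  throw std::invalid_argument("unknown column type");
}
```

The caller would then use std::visit or std::get<float> to recover the value, which is more verbose than auto but keeps the schema type intact.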
The cmake project version is:
Line 11 in 380d54e
This discrepancy may confuse downstream dependency resolution, where a minimum version number is useful.
Make is nice, cmake is much better :-)
Create a proper CMakeLists.txt that builds the library and installs it in the proper location, installs the headers, and generates a hipo4Config.cmake.
(I'm working on it, expect a pull request shortly....)
This repository lacks a license.
For example, given hipo::bank b with the REC::Particle schema,
b.getFloat("pid", row);
returns 0.0 and no error is printed. All getters should either attempt to typecast or throw an error upon failure.
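The "throw on failure" option could look roughly like the following sketch. Column, ColumnType, and getFloatChecked are hypothetical stand-ins for illustration and do not reflect how hipo::bank stores its data.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins: a one-value column tagged with its schema type.
enum class ColumnType { Int, Float };

struct Column {
  std::string name;
  ColumnType type;
  double value;
};

// Refuse to read a non-float column as float instead of silently returning 0.
float getFloatChecked(const Column& column) {
  if (column.type != ColumnType::Float)
    throw std::runtime_error("column '" + column.name + "' is not a float");
  return static_cast<float>(column.value);
}
```

The typecast option would replace the throw with a conversion from the stored type, at the cost of hiding mistakes such as reading pid as a float.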
When I run make on my Apple Silicon Mac ($OSTYPE==darwin22), I get an error:
make: *** [shlib] Error 1
I found that this line is the one it fails on:
@test -f lz4/lib/liblz4.$(SHAREDEXT) && cp lz4/lib/liblz4.$(SHAREDEXT) slib/.
I found that $SHAREDEXT should be dylib, but it is being set to so. The issue is with these lines
SHAREDEXT=so
ifneq (,$(findstring darwin,$(OSTYPE)))
SHAREDEXT=dylib
endif
Hardcoding SHAREDEXT to dylib fixes the issue on my Mac, so for some reason the conditional above is not setting SHAREDEXT to dylib.
Running Clang's UndefinedBehaviorSanitizer in iguana, which depends on this repo, reveals the following alignment issue in hipo::bank::getFloatAt:
/__w/iguana/iguana/hipo/lib/pkgconfig/../../include/hipo4/bank.h:116:16: runtime error: load of misaligned address 0x55e52f404da6 for type 'float', which requires 4 byte alignment
0x55e52f404da6: note: pointer points here
00 00 00 00 00 00 80 bf 00 00 80 bf 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
#0 0x7fb0b4ad3ee4 in hipo::structure::getFloatAt(int) const /__w/iguana/iguana/hipo/lib/pkgconfig/../../include/hipo4/bank.h:116:16
#1 0x7fb0b4ad1d73 in hipo::bank::getFloat(char const*, int) const /__w/iguana/iguana/hipo/lib/pkgconfig/../../include/hipo4/bank.h:449:16
#2 0x7fb0b4ae3d26 in iguana::clas12::MomentumCorrection::Run(std::vector<hipo::bank, std::allocator<hipo::bank>>&) const /__w/iguana/iguana/build-iguana/../src/iguana/algorithms/clas12/MomentumCorrection.cc:20:29
#3 0x7fb0b4a9f6f3 in iguana::AlgorithmSequence::Run(std::vector<hipo::bank, std::allocator<hipo::bank>>&) const /__w/iguana/iguana/build-iguana/../src/iguana/algorithms/AlgorithmSequence.cc:11:46
#4 0x55e52d1cab07 in main /__w/iguana/iguana/build-iguana/../src/iguana/tests/iguana-test.cc:103:11
#5 0x7fb0b42dcccf (/usr/lib/libc.so.6+0x25ccf) (BuildId: c0caa0b7709d3369ee575fcd7d7d0b0fc48733af)
#6 0x7fb0b42dcd89 in __libc_start_main (/usr/lib/libc.so.6+0x25d89) (BuildId: c0caa0b7709d3369ee575fcd7d7d0b0fc48733af)
#7 0x55e52d193dd4 in _start (/__w/iguana/iguana/build-iguana/src/iguana/tests/iguana-test+0x7add4) (BuildId: 1a4186039a18b5d38dec21484f462abf569d1f4c)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /__w/iguana/iguana/hipo/lib/pkgconfig/../../include/hipo4/bank.h:116:16 in
For now I've suppressed it in iguana's sanitizer, but I am documenting the issue here.
See https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
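The usual fix for this class of error is to copy the bytes out with std::memcpy rather than dereferencing a cast pointer, since memcpy has no alignment requirement. A minimal sketch, assuming a raw byte buffer and a byte offset; readFloatAt is a hypothetical name, not the actual hipo accessor.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical accessor: read a float stored at any byte offset of a buffer.
float readFloatAt(const char* buffer, std::size_t offset) {
  float value;
  // Copy byte-wise; unlike *reinterpret_cast<const float*>(buffer + offset),
  // this is well defined even when the address is not 4-byte aligned.
  std::memcpy(&value, buffer + offset, sizeof(value));
  return value;
}
```

Compilers typically turn a fixed-size memcpy like this into a single load on platforms that allow unaligned access, so the cost should be small.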
For the sake of stability for downstream users, it would be nice to have build and test CI jobs. The examples would work nicely. You could also add a benchmark test.
In the Java API, it is possible to call
var particleBank = event.getBank("REC::Particle");
This feature does not appear to be in this C++ API.
I would like to simplify the following example as much as possible; it is an event loop which prints out the contents of REC::Particle for each event.
The current version of this repository requires the user to get the schema, leading to the following verbose implementation:
int main() {
  // read input file
  hipo::reader reader;
  reader.open("data.hipo");
  // get bank schema
  hipo::dictionary factory;
  reader.readDictionary(factory);
  hipo::bank particleBank(factory.getSchema("REC::Particle"));
  // event loop
  hipo::event event;
  while(reader.next(event)) {
    event.getStructure(particleBank);
    particleBank.show();
  }
  return 0;
}
This is the desired implementation (or something similar):
int main() {
  // read input file
  hipo::reader reader;
  reader.open("data.hipo");
  // event loop
  hipo::event event;
  while(reader.next(event)) {
    event.getBank("REC::Particle").show();
  }
  return 0;
}
To implement a filtering algorithm with signature
void algorithm(bank::list &l);
which is designed to mask/remove certain undesired rows from certain banks, hipo::bank would benefit from a masking method.
One possible implementation, which zeros the row:
void hipo::bank::MaskRow(int row) {
  for(int item = 0; item < getSchema().getEntries(); item++)
    put(item, row, 0);
}
Using memset to zero the row would be faster.
Removing undesired rows from a bank is not ideal, since that would require updating other banks' row references; however, this would be a nicer solution if it could be done cleanly.
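Another option that keeps row numbering stable is to record which rows are masked without touching the data. The sketch below illustrates that idea only: RowMask is a hypothetical class, not part of hipo, and it says nothing about how hipo::bank lays out its rows in memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical companion to a bank: tracks which rows are still visible.
class RowMask {
 public:
  explicit RowMask(std::size_t rows) : keep_(rows, true) {}

  // Hide a row; at() throws std::out_of_range for an invalid index.
  void maskRow(std::size_t row) { keep_.at(row) = false; }
  bool isMasked(std::size_t row) const { return !keep_.at(row); }

  // Indices of the unmasked rows, in their original numbering, so that
  // row references held by other banks remain valid.
  std::vector<std::size_t> visibleRows() const {
    std::vector<std::size_t> rows;
    for (std::size_t i = 0; i < keep_.size(); ++i)
      if (keep_[i]) rows.push_back(i);
    return rows;
  }

 private:
  std::vector<bool> keep_;
};
```

An algorithm would call maskRow for each rejected row, and downstream code would loop over visibleRows() instead of over every row.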