Comments (4)
Thanks @bkhelifi for bringing this up.
What stacking should do to the MetaData was discussed at length in #4853 without conclusion.
- The main point was whether we should have parallel lists, leading to code duplication. An approach with a `RootModel` and `BaseModel` was proposed by @adonath (see #4853 (comment)).
- How much info should be kept on a stacked dataset? Currently we have a minimal approach where we throw away all the meta info and keep only the creation info. If required, a meta container can be created from the meta_table. This approach is obviously ill-suited. In #4853 I had initially tried keeping everything, but that was ill-planned and difficult to maintain.
A similar question might arise for the estimators, where the question would be what meta info is propagated from the individual datasets.
from gammapy.
What should be the difference between `Datasets` metadata and a stacked dataset?
from gammapy.
For the fixity metadata, there is no stacking (of course).
For the context metadata, it depends a bit on the retained data model. But if, e.g., it contains the datapipe version and the calibration version, one should keep only one instance, as these data will be unique for a fixed release.
For the reference metadata, I can propose that we append the ObsId list...
We have to go through all individual metadata fields and make a proposal to VODF/CTA (i.e. @kosack, myself, ...). We could collect them in a spreadsheet and then discuss which to keep as unique, which to append, which to skip...
I think that this is the hardest part of this 'project'.
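To make the "unique / append / skip" idea concrete, here is a minimal plain-Python sketch of a per-field stacking policy. It is not Gammapy code: the field names (`datapipe_version`, `calibration_version`, `obs_id`, `checksum`), the `StackRule` enum and the `stack_metadata` helper are all hypothetical and only stand in for whatever the spreadsheet would eventually define.

```python
from enum import Enum


class StackRule(Enum):
    UNIQUE = "unique"  # all inputs must agree; keep a single instance
    APPEND = "append"  # collect the values of all inputs in a list
    SKIP = "skip"      # do not propagate to the stacked metadata


# Hypothetical field names and rules, for illustration only.
STACK_RULES = {
    "datapipe_version": StackRule.UNIQUE,
    "calibration_version": StackRule.UNIQUE,
    "obs_id": StackRule.APPEND,
    "checksum": StackRule.SKIP,  # fixity-like field: never stacked
}


def stack_metadata(metas, rules=STACK_RULES):
    """Combine a list of metadata dicts according to per-field rules."""
    stacked = {}
    for name, rule in rules.items():
        values = [meta[name] for meta in metas if name in meta]
        if rule is StackRule.SKIP or not values:
            continue
        if rule is StackRule.UNIQUE:
            if len(set(values)) > 1:
                raise ValueError(f"Conflicting values for '{name}': {values}")
            stacked[name] = values[0]
        else:  # StackRule.APPEND
            stacked[name] = values
    return stacked
```

With such a table, deciding the behaviour of a new field would only require adding one entry to the rules, and conflicting "unique" fields would fail loudly instead of being silently dropped.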
from gammapy.
@bkhelifi Internally in Gammapy I think we can almost always just propagate the meta data to the higher level by building hierarchical structures. There is not necessarily a need to reduce the meta data in each step, only if we find performance issues with Pydantic. The reduction can then finally happen when serializing. The problem with reducing the meta data "on the fly" is that different data formats might require different meta data. And "a priori" we cannot know to which format the user will serialize.
> What should be the difference between Datasets metadata and a stacked dataset

The metadata for the stacked dataset is transposed and homogeneous in the type of datasets. The datasets metadata is not.
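As a rough illustration of propagating everything and reducing only at serialization time, the sketch below keeps the per-dataset metadata as children of the stacked metadata and picks the fields to write per output format. It uses plain dataclasses rather than Pydantic, and every name in it (`DatasetMeta`, `StackedMeta`, the `"compact"` and `"full"` format labels) is an assumption made for the example, not an existing Gammapy class or format.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMeta:
    """Hypothetical metadata of a single dataset."""
    obs_id: int
    creator: str
    telescope: str


@dataclass
class StackedMeta:
    """Hypothetical metadata of a stacked dataset; keeps all children."""
    creator: str
    children: list = field(default_factory=list)

    def stack(self, other):
        # Nothing is thrown away while stacking.
        self.children.append(other)

    def reduce(self, fmt):
        # The reduction happens only here, once the target format is known.
        if fmt == "compact":
            return {"creator": self.creator, "n_stacked": len(self.children)}
        if fmt == "full":
            return {
                "creator": self.creator,
                "obs_ids": [child.obs_id for child in self.children],
                "telescopes": sorted({child.telescope for child in self.children}),
            }
        raise ValueError(f"Unknown format: {fmt}")
```

The `"full"` branch also shows the "transposed" view: a list of per-dataset records becomes one record holding lists of values.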
from gammapy.