Comments (4)

AtreyeeS commented on June 24, 2024

Thanks @bkhelifi for bringing this up.
What stacking should do to the MetaData was discussed at length in #4853 without reaching a conclusion.

  1. The main point was whether we should keep parallel lists, which leads to code duplication. @adonath proposed an approach based on RootModel and BaseModel (see #4853 (comment)).

  2. How much info should be kept on a stacked dataset? Currently we take a minimal approach: we throw away all the meta info and keep only the creation info. If required, a meta container can be created from the meta_table. This approach is clearly ill-suited. In #4853 I initially tried keeping everything, but that was ill-planned and difficult to maintain.

A similar question might arise for the estimators: which meta info is propagated from the individual datasets?

AtreyeeS commented on June 24, 2024

What should be the difference between the Datasets metadata and the metadata of a stacked dataset?

bkhelifi commented on June 24, 2024

For the fixity metadata, there is no stacking (of course).
For the context metadata, it depends a bit on the retained data model. But if, e.g., it contains the datapipe version and the calibration version, one should keep only one instance, as these values will be unique for a fixed release.
For the reference metadata, I can propose that we append the ObsId list...

We have to go through all individual metadata fields and make a proposal to VODF/CTA (i.e. @kosack, myself, ...). We could start with a spreadsheet and then discuss which fields to keep as unique, which to append, and which to skip...
I think that this is the hardest part of this 'project'.
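
As a rough illustration of the "unique / append / skip" idea above, the per-field decisions could be written down as a small rule table. This is only a sketch in plain Python, not Gammapy or VODF code: the field names (`datapipe_version`, `calibration_version`, `obs_ids`, `checksum`), the `StackRule` enum and the `stack_metadata` helper are all hypothetical.

```python
from enum import Enum


class StackRule(Enum):
    UNIQUE = "unique"  # keep a single value; inputs must agree
    APPEND = "append"  # concatenate the list values of all inputs
    SKIP = "skip"      # drop the field on the stacked result


# Hypothetical field names and rules, for illustration only.
STACK_RULES = {
    "datapipe_version": StackRule.UNIQUE,
    "calibration_version": StackRule.UNIQUE,
    "obs_ids": StackRule.APPEND,
    "checksum": StackRule.SKIP,
}


def stack_metadata(metas):
    """Combine a list of metadata dicts according to STACK_RULES.

    Fields without a rule are dropped, like SKIP fields.
    """
    stacked = {}
    for name, rule in STACK_RULES.items():
        values = [meta[name] for meta in metas if name in meta]
        if rule is StackRule.SKIP or not values:
            continue
        if rule is StackRule.UNIQUE:
            if len(set(values)) != 1:
                raise ValueError(f"Conflicting values for {name!r}: {values}")
            stacked[name] = values[0]
        elif rule is StackRule.APPEND:
            merged = []
            for value in values:
                merged.extend(value)
            stacked[name] = merged
    return stacked


meta_a = {
    "datapipe_version": "1.0",
    "calibration_version": "2.1",
    "obs_ids": [101],
    "checksum": "abc",
}
meta_b = {
    "datapipe_version": "1.0",
    "calibration_version": "2.1",
    "obs_ids": [102, 103],
    "checksum": "def",
}
stacked = stack_metadata([meta_a, meta_b])
```

Raising on conflicting "unique" values is one possible choice here; a real implementation might prefer to warn or to fall back to appending.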

adonath commented on June 24, 2024

@bkhelifi Internally in Gammapy, I think we can almost always just propagate the metadata to the higher level by building hierarchical structures. There is not necessarily a need to reduce the metadata at each step, unless we find performance issues with Pydantic. The reduction can then happen at the end, when serializing. The problem with reducing the metadata "on the fly" is that different data formats might require different metadata, and we cannot know "a priori" which format the user will serialize to.
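
A minimal sketch of the "propagate first, reduce when serializing" idea could look like the following. It uses standard-library dataclasses rather than Pydantic to stay self-contained, and every name in it (`ObservationMeta`, `StackedMeta`, `to_format`, the `"minimal"` and `"detailed"` format labels) is hypothetical, not an existing Gammapy API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObservationMeta:
    """Metadata of one input dataset (hypothetical fields)."""

    obs_id: int
    telescope: str
    calibration_version: str


@dataclass
class StackedMeta:
    """Keeps the full per-input metadata instead of reducing it while stacking."""

    children: list = field(default_factory=list)

    def stack(self, meta):
        # Propagate: store the child metadata as-is, discard nothing.
        self.children.append(meta)

    def to_format(self, fmt):
        # Reduce only here, because each output format needs different fields.
        if fmt == "minimal":
            return {"n_obs": len(self.children)}
        if fmt == "detailed":
            return {
                "obs_ids": [child.obs_id for child in self.children],
                "telescopes": sorted({child.telescope for child in self.children}),
            }
        raise ValueError(f"Unknown format: {fmt!r}")


stacked_meta = StackedMeta()
stacked_meta.stack(ObservationMeta(obs_id=1, telescope="A", calibration_version="2.1"))
stacked_meta.stack(ObservationMeta(obs_id=2, telescope="A", calibration_version="2.1"))

minimal = stacked_meta.to_format("minimal")
detailed = stacked_meta.to_format("detailed")
```

Because nothing is thrown away in `stack`, a new output format only needs a new branch in `to_format`; the cost is that the container grows with the number of stacked inputs.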

Quoting @AtreyeeS: "What should be the difference between Datasets metadata and a stacked dataset"

The metadata for the stacked dataset is transposed and homogeneous in the type of datasets. The Datasets metadata is not.
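
One possible reading of "transposed and homogeneous" is that a list of per-dataset records with identical fields becomes a single record holding one list per field. The sketch below illustrates only that interpretation; `transpose_metadata` and the field names `obs_id` and `livetime` are hypothetical.

```python
def transpose_metadata(records):
    """Turn a list of same-shaped dicts into one dict of lists."""
    if not records:
        return {}
    keys = list(records[0])
    for record in records:
        # Homogeneity check: every record must expose the same fields.
        if set(record) != set(keys):
            raise ValueError("All records must have the same fields")
    return {key: [record[key] for record in records] for key in keys}


per_dataset = [
    {"obs_id": 1, "livetime": 10.0},
    {"obs_id": 2, "livetime": 20.0},
]
transposed = transpose_metadata(per_dataset)
```

A collection of datasets of different types would fail the homogeneity check, which matches the remark that the Datasets metadata cannot be represented this way.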
