
Comments (16)

oruebel avatar oruebel commented on June 14, 2024

My feeling is that "help" should probably be required and "source" should be optional. I think having help as a required field is also less of a problem, since it is generally populated via the schema rather than requiring user input.

from nwb-schema.

ajtritt avatar ajtritt commented on June 14, 2024

help should be a constant value that gets set when someone declares a new type. I don't think this should be an optional field--if someone feels so compelled to create a new type, they should at least be able to provide a brief help string about what they are creating.

source serves a similar purpose, but gets used when someone instantiates a type. My feelings are similar on this issue. If someone is creating a new container of data, they should be able to provide some description about where it came from.


neuromusic avatar neuromusic commented on June 14, 2024

First, I'm not clear what "source" offers that "session_description" does not.

There's tons of information that users "should be able to provide" but we shouldn't be prohibitive about letting them save a file if they omit it.

Similarly, users "should be able to provide" an institution, experimenter, lab, experiment_description, and file_create_date without much difficulty, but again, omitting these should not prohibit a user from creating a file. These are optional in the schema, as they should be.


ajtritt avatar ajtritt commented on June 14, 2024

I would be okay with making source optional on a per-type basis within the write API, and having the API fill it with another field that makes sense. So, in the case of NWBFile, source and session_description will be identical if nothing was passed in for source.
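
A minimal sketch of the fallback described above, assuming a hypothetical helper named `resolve_source`. It is not part of PyNWB or any other real API; only the field names `source` and `session_description` come from this discussion, and the sample strings are made up.

```python
from typing import Optional


def resolve_source(session_description: str, source: Optional[str] = None) -> str:
    """Return the value to write for ``source`` (hypothetical helper).

    Falls back to ``session_description`` when the caller passed nothing.
    """
    if source is None:
        return session_description
    return source


# No source given: both fields end up holding the same text.
print(resolve_source("Recording from rig A"))
# Explicit source given: it is used unchanged.
print(resolve_source("Recording from rig A", source="exported from acquisition software"))
```

Note that in the first call the same string would be stored twice, once per attribute.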


neuromusic avatar neuromusic commented on June 14, 2024

that solution sounds weird to me... pushing a schema-level concern to the API


ajtritt avatar ajtritt commented on June 14, 2024

If I understand this issue correctly, the proposal to make source optional was a response to it being required in the API.


neuromusic avatar neuromusic commented on June 14, 2024

Indeed, I noticed the issue when trying to use the API. I thought it was an API issue because the nwb-schema documentation had no indication of "source" being part of the schema at this level. When @oruebel noted that the API was simply reflecting undocumented changes in the schema, I moved the issue here.


ajtritt avatar ajtritt commented on June 14, 2024

Is having a field be required a problem if the extra burden of having to specify another field is handled by a write API?


neuromusic avatar neuromusic commented on June 14, 2024

yes, because now every API (Matlab, Igor, R, whatever) needs to implement the same workaround for a schema-level problem.

further, the proposed workaround involves the API storing redundant information in two different attributes.

this is a problem with the schema and it should be fixed in the schema.


ajtritt avatar ajtritt commented on June 14, 2024

This isn't a problem with the schema. There were intentions behind this decision. As @oruebel mentioned, one of the reasons for making Interface/NWBContainer the base class for everything was to ensure that every container had a place, i.e. source, for provenance information. Before we go ahead with removing this requirement, we should talk to everyone who was involved with the decision to see if they are okay with altering the NWB philosophy to make the format easier to develop against.


neuromusic avatar neuromusic commented on June 14, 2024


t-b avatar t-b commented on June 14, 2024

Just my 2 cents.

I think help and source should stay in the schema as they currently serve a purpose in adding information to the type. I'm already using source to add stuff like "Device=ITC1600_Dev_0;Sweep=0;AD=3;ElectrodeNumber=0;ElectrodeName=XXX". I'm also not in favour of making things optional, as it makes reading the files, and learning from existing files, harder.

I don't have any opinion wrt pynwb.


neuromusic avatar neuromusic commented on June 14, 2024

Oh, I certainly don't want them removed @t-b. I just don't want source required.

Your example is an interesting one... you've used this particular attribute to add your own non-spec metadata with your own serialization.

While your point with making things optional is valid, it only holds if it is unambiguous what should go into a given field. Your justification for #50 is a perfect example of this.

On the other hand, your own example here highlights that this field does not meet your goal of easing file reading & learning from existing files, as anyone opening your file needs to (a) know that you've put this info there and (b) write the necessary code to deserialize the extra metadata you've encoded.
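
As a rough illustration of point (b), a reader would need something like the sketch below. The `key=value;key=value` convention is only inferred from the example string quoted earlier in this thread, and `parse_source` is a hypothetical name, not part of any NWB API.

```python
def parse_source(source: str) -> dict:
    """Split a 'key=value;key=value' string into a dict of strings."""
    fields = {}
    for item in source.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        # partition splits on the first '=' only, so values may contain '='
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


example = "Device=ITC1600_Dev_0;Sweep=0;AD=3;ElectrodeNumber=0;ElectrodeName=XXX"
print(parse_source(example)["Device"])  # prints ITC1600_Dev_0
```

All values come back as strings; a reader would still have to know which ones are meant to be numbers.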

More broadly, forcing users to fill in "required" fields that are not strictly required results in junk metadata.... users end up filling source with "brain" or worse, which I would argue is less reliable than if it is empty.

When considering what should be required vs optional, I think we need to look beyond the needs & resources of a few specific labs & institutions to those of the wider community. Having watched the NWB "standard" languish for years with very little community uptake & having watched physiologists "try" NWB only to abandon it because of the overhead needed to conform to the schema, I think that ease in writing files is a bigger barrier for the success of the project. There's a reason that none of my data from grad school is in NWB: it was not worth my time. If it took less time, perhaps that would have tipped the scale.

So again, the schema already has other useful metadata that is optional. What is it about source that makes it critical information in a way that experimenter etc is not?


t-b avatar t-b commented on June 14, 2024

@neuromusic Thanks for your comments. I do agree that my use of source is more on the fuzzy side of things.

The official documentation for source states

Name of TimeSeries or Modules that serve as the source for the data
contained here. It can also be the name of a device, for stimulus or
acquisition data

which does leave quite some room for interpretation.

Regarding optional vs. required: What is the worse case? Having a user fill in garbage in a required field or creating a custom field with garbage? I'd say the latter, as with required fields people reading the data will have at least some documentation of what the field is supposed to be.


nicain avatar nicain commented on June 14, 2024

@t-b Can you expand a little bit on:

creating a custom field with garbage


t-b avatar t-b commented on June 14, 2024

@nicain NWB spec 1.0.x allows adding arbitrary other datasets/groups as long as they have an attribute neurodata_type set to Custom. This is what I call a custom field. Does that make sense?
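
To make the idea concrete, here is a small sketch that lists such custom fields. It assumes the file has already been loaded into nested dicts with `attrs` and `children` keys; that layout, the function name `find_custom`, and the sample tree are all hypothetical stand-ins for a real HDF5 file, with only the `neurodata_type` attribute and the `Custom` value taken from the description above.

```python
def find_custom(node: dict, path: str = "") -> list:
    """Return paths of all nodes whose neurodata_type attribute is 'Custom'."""
    found = []
    if node.get("attrs", {}).get("neurodata_type") == "Custom":
        found.append(path or "/")
    # Recurse into child groups/datasets, building the path as we go.
    for name, child in node.get("children", {}).items():
        found.extend(find_custom(child, f"{path}/{name}"))
    return found


# Hypothetical in-memory stand-in for a file with one custom group.
tree = {
    "attrs": {},
    "children": {
        "acquisition": {"attrs": {}, "children": {}},
        "my_notes": {"attrs": {"neurodata_type": "Custom"}, "children": {}},
    },
}
print(find_custom(tree))  # prints ['/my_notes']
```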
