Git Product home page Git Product logo

Comments (34)

neuromusic avatar neuromusic commented on June 14, 2024 2

@ajtritt you are absolutely right that the two proposals are the same wrt to the data they represent. in fact, if I were implementing this in a python class, I would hide the implementation in hidden attributes and expose BOTH approaches as properties of the class, use the math you outlined to query either, and have setters and getters that either compute the timestamp_offset if timestamps_reference_time is passed or compute timestamps_reference_time if timestamp_offset is passed.

so practically, pynwb should be able to both proposals, regardless of what is decided here w/r/t to the schema.

so this discussion is about the

  1. there are cases in which an experimenter does not want to use session_start_time as a the reference to map "experiment time" to "real world time"
  2. sometimes an experimenter wants to define an alternative reference to "real world time"
  3. sometimes an experimenter does not want a reference to "real world time"

from the perspective of someone who needs to write these files and coerce my data into the spec, I find the explicit approach of Proposal 2 to be far more intuitive. Further, Proposal 1 only meets needs 1 & 2.

again, we've already made a decision to break backward compatibility with the schema, hence the "2.0" designation. if there is ever an appropriate time to break more things in the pursuit of clarity and improvement, this is it.

from nwb-schema.

lfrank avatar lfrank commented on June 14, 2024 1

Just to add my final two cents, I don't feel at all strongly about this. I'm perfectly happy for my lab to effectively ignore session_start_time by setting it to '1970-01-01T00:00:00Z' and using the (implicit) assumption (which probably should be made explicit) that session start time is the reference time for all timestamps. But if adding a field is helpful to others, that's fine with me, and I don't have a preference as to how it is done.

My final comment is that I think that if any of us were presented with code that implemented either solution, we would quickly adjust to it and then not think much more about it. As soon as the Technical Advisory Board for NWB is fully formed they can deal with these issues, but until then my suggestion would be for Andrew and Oliver to make a decision that makes sense to them and to move on so we can keep the development going.

from nwb-schema.

tjd2002 avatar tjd2002 commented on June 14, 2024 1

@t-b Yes.

Proposal 1 = "Timestamps_offset"
Proposal 2 = "Timestamps_reference_time"

from nwb-schema.

t-b avatar t-b commented on June 14, 2024 1

I can live with both proposals and would also be fine with Proposal 2. ISO8601 strings to the rescue!

from nwb-schema.

ajtritt avatar ajtritt commented on June 14, 2024 1

Okay, proposal 2 it is... timestamps_reference_time as an ISO8601 string!

Any feelings on what timestamps_reference_time defaults to? I'm guessing most people will just want to store relative (to session_start_time) timestamps, so timestamps_reference_time can default to session_start_time.

@t-b @tjd2002 thoughts?

from nwb-schema.

oruebel avatar oruebel commented on June 14, 2024 1

Okay, proposal 2 it is.

I agree.

timestamps_reference_time can default to session_start_time.

Yes, based on the discussion in #50

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

doesn't this also necessitate a units field and expanding the allowed dtype?

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

"since this will explicitly specify the offset to obtain relative timestamps, eliminating the need for APIs to guess based on range." can you clarify this point?

instead of adding a new field, wouldn't it be more simple to deprecate session_start_time and replace it with timestamps_reference_time as an optional top-level field? however, if people want to use POSIX timestamps (the use case that motivated this), this still necessitates expanding the dtype to include int64 & adding a units field for timestamps, no?

from nwb-schema.

nicain avatar nicain commented on June 14, 2024

Fwiw, I think that every dimension should have a required unit spec, including time

from nwb-schema.

ajtritt avatar ajtritt commented on June 14, 2024

@neuromusic timestamps already has a unit field. Right now it is sent to the constant 'Seconds'. That would need to be changed to support this proposal. We would need to allow ints as well.

Getting rid of session_start_time does not preserve backwards compatibility. By adding this field, we preserve the theme of having timestamps relative to a global start.

Regarding my statement about API's guessing.... storing the offset provides a mechanism for explicitly specifying what stored timestamps mean. I am trying to strike the balance of keeping relative timestamps as default, but allowing users to specifying absolute time. By specifying timestamps_offset, users can store absolute time, which they may want to do queries or analyses on. However, this breaks the concept of storing timestamps relative to a start time. By storing timestamps_offset, a user is effectively saying "subtract this number from all timestamps to get timestamps relative to the global start time". By explicitly storing this number, we avoid the situation where a users or API has to inspect timestamps to determine which frame they are in. Under the rules I have laid out, It will always be the case that timestamps - timestamps_offset will always produce timestamps that are relative to session_start_time.

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

"Under the rules I have laid out, It will always be the case that timestamps - timestamps_offset will always produce timestamps that are relative to session_start_time."

Currently, timestamps will always produce timestamps that are relative to session_start_time. This proposal adds the overhead of an additional field to resolve this data.

We are already in the middle backwards compatibility breaking changes to the schema, no? ElectrodeTable, etc. I'm happy to break backwards compatibility for increased clarity and generalization.

And to clarify, all timestamps are relative, always. The need here is not absolute timestamps per se but timestamps relative to an explicit session-agnostic date.

  • not all experiments need to cast session-time into absolute time. indeed, many experiments do not have the precision in their data collection to do so. for these scientists, the only need is that the data in the nwb is in a common timebase & for these experiments, session_start_time may be off from timestamp[0] by seconds or minutes or hours

my updated proposal:

  1. we keep session_start_time
  2. we add timestamps_reference_time as an optional top-level field, which all timestamps in the file are referenced to

this has the added benefit that a user using POSIX timestamps does not need to compute the delta between Jan 1970 and session_start_time simply to write a file, as they would need to in your proposal.... they can just drop the POSIX reference time into the relevant field and put an experimentally-relevant value into session_start_time

from nwb-schema.

ajtritt avatar ajtritt commented on June 14, 2024

I am a little unclear about what you are proposing. Can you tell me how one would store posix timestamps using this scheme?

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

minimal example

session_start_time = None
timestamps_reference_time = None

# then, for a given timeseries...
timestamps = <array of float64>
timestamps.dtype = float64
timestamps.unit = 'second'

POSIX example

session_start_time = None
timestamps_reference_time = '1970-01-01T00:00:00Z'

# then for a given TimeSeries....

timestamps = <an array of int64>
timestamps.datatype = int64
timestamps.unit = 'POSIX64'

non-POSIX, but "absolute" reference which is different from session start

session_start_time = '2017-08-31T10:40:01-07:00'
timestamps_reference_time = '2017-08-07T00:00:00Z'

# then for a given TimeSeries....

timestamps = <an array of float64>
timestamps.datatype = float64
timestamps.unit = 'second'

in this case, even though we designated that the session started on 8/31 using local time, all timestamps are relative to GMT at an earlier date (perhaps the animal's birth date? or first day of training? or some other experimentally-relevant time that the experimenter wants to reference their timestamps to)

from nwb-schema.

oruebel avatar oruebel commented on June 14, 2024

In the current format, time encodings that are not relative to the global clock would be stored in the sync group of TimeSeries http://nwb-schema.readthedocs.io/en/latest/format.html#id94 . As such, I think it might make sense to define these structures within sync. In this way TimeSeries itself would not change (i.e.., always have relative times) and a specific spec would be added to the sync folder to describe absolute times. To make the sync group easier to extend it would probably be useful to add a neurodata_type: LabTime group that people can extend to describe lab-specific time encodings.

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

@oruebel I don't understand. sync addresses a totally separate issue of tracking provenance information related to synchronizing timestamps between timeseries acquired from different hardware. AFTER sync'ing you need to put everything in the same time reference, which what this issue is concerned with.

in my proposal, all timestamps are still relative to the global clock, we're just being more explicit about the the global clock's reference.

from nwb-schema.

ajtritt avatar ajtritt commented on June 14, 2024

@neuromusic Thanks for clarifying. So then session_start_time and timestamps_reference_time are both optional, and there are three rules for how they get interpreted?

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

yes, both optional.

three rules?

from nwb-schema.

ajtritt avatar ajtritt commented on June 14, 2024

Yeah, the three scenarios you listed and their meanings

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

rule 1: session_start_time is metadata. while machine-readable, the specification offers no guidance on how it should be interpreted beyond what the experimenter describes, however, by convention, this should be the datetime of the experiment.
rule 2: timestamps_reference_time is interpreted as the ISO8601 time where timestamp=0.
rule 3: if timestamps_refernece_time is empty, then the experiment has no ISO8601 reference & its timestamps can only be referenced internally to the NWB file

n.b. I keep saying "file" but I know we want to break free from HDF5 eventually... should I say "container"? "session"?

from nwb-schema.

oruebel avatar oruebel commented on June 14, 2024

What I like about Andrews proposal is that it does not unnecessarily break backwards compatibility. If a user does not specify timestamps_offset then all times are relative to session_start_time as they have always been. It only adds additional functionality without breaking existing functionality.

Refining the specification of session_start_time in terms of defining exactly how it should be stored is I think in general helpful and has been raised in #50

from nwb-schema.

ajtritt avatar ajtritt commented on June 14, 2024

After thinking about this and doing some math, these two proposals are one and the same in terms of how they get used. Here's my explanation for why I think that...

To explain this, let me define some short hands I will use below.
abs := absolute timestamps
rel := relative timestamps
ts_off = timestamps_offset
trt := timestamps_reference_time
sst := session_start_time
ts := timestamps
POSIX(xyz) := the posix time representation of an ISO8601 encoded timestamps

Proposal 1: store offset of timestamps in field timestamps_offset

abs = ts + sst - ts_off
rel = ts - ts_off

If a user has saved absolute timestamps i.e. ts == abs and ts_off == POSIX(sst), then

rel = ts - tsoff

If a users has saved relative timestamps i.e. ts == rel and ts_off == 0, then

abs = ts + sst

Proposal 2: store reference time for timestamps in field timestamps_reference_time

abs = ts + trt
rel = ts + trt - sst

If a users has saved absolute timestamps i.e. ts == abs and trt == 1970-01-01T00:00:00Z, then

rel = ts - POSIX(sst)

If a users has saved relative timestamps i.e. ts == rel and trt=sst, then

abs = ts + trt

With that said, the issue becomes one of human interpretability, so, I think the decision is up to those who will be interpreting.

from nwb-schema.

oruebel avatar oruebel commented on June 14, 2024

@ajtritt that makes sense.

After looking at it a bit more, I think I like the timestamps_offset approach better, mainly because the meaning/behavior of session_start_time does not change compared to the current approach if ``timestamps_offset=0. In contrast, with timestamps_reference_time``` we change the name of the reference and, hence, require folks that are used to the concept of times relative to ```session_start_time``` to change to use ```timestamps_reference_time``` and set ```timestamps_reference_time=session_start_time```

Either way, I think session_start_time should remain a required field.

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

Regarding backwards compatibility, in the current schema, there is only an implicit link between session_start_time and the reference time for timestamps: "Timestamps here have all been corrected to the common experiment master-clock. Time is stored as seconds and all timestamps are relative to experiment start time." It is not clear that session_start_time is what is being referenced here. The implicit link between the two is simply a convention that I suspect is largely unused.

So the change I propose does not actually break backwards compatability but instead makes what has been implicit explicit.

The key disadvantage of Proposal 1 is that two values are necessary to represent "absolute" time (both session_start_time and timestamps_offset). Further, the offset must be computed. To represent POSIX time, this means that there is now a burden on the person creating the file to compute the time delta (in POSIX timestamps) between Jan 1 1970 and session_start_time. e.g.

session_start_time = '2017-08-31T10:40:01-07:00'
timestamps_offset = <some computed int64 representing the delta between sst & Jan 1 1970>

# then for a given TimeSeries....

timestamps = <an array of int64>
timestamps.datatype = int64
timestamps.unit = 'POSIX64'

While both proposals represent the data, if I want to save my file with a different session_start_time, then I need to recompute my timestamps_offset.

Under both proposals, on reading, I can't assume that my timestamps are in POSIX64... I need to check what they are referenced to. Under proposal 1, I need to subtract (or add?) the timestamps_offset from session_start_time and check it against '1970-01-01T00:00:00Z'. Under Proposal 2, I just have to match the string saved in timestamps_reference_time against '1970-01-01T00:00:00Z'

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

in general, NWB needs fewer required fields, but I'm willing to save that argument for another day

from nwb-schema.

t-b avatar t-b commented on June 14, 2024

Time is stored as seconds and all timestamps are relative to experiment start time." It is not clear that session_start_time is what is being referenced here. The implicit link between the two is simply a convention that I suspect is largely unused.

I understood it like that and also create NWB files following this convention. It was a quite a hassle to base all measurement data on this common time zero of experiment start.

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

how do I get on this technical advisory board?

from nwb-schema.

lfrank avatar lfrank commented on June 14, 2024

Right now the plan is for Lydia to the the Allen representative (I think), so I'd suggest touching base with her or Christoph.

from nwb-schema.

nicain avatar nicain commented on June 14, 2024

My understanding of the issue at play here is that there are two distinct temporal frames of reference under discussion: 1) RealTime (i.e. our shared experience of the world, and ignoring any effects from special relativity), and ExperimentTime. The “timestamps” field of the nwb file provides a coordinate system for ExperimentTime. The original spec requires that a required point, the beginning of the experiment (session_start_time), provide a common reference point to relate this ExperimentTime coordinate system (timestamps[0]) with RealTime, and there has been discussion on this thread about using either ISO_8601 or POSIX as the coordinate system for RealTime.

More controversy arises from the modeling community that claims that this RealTime/ExperimentTime relationship is not relevant for their data, and therefore want to make this session_start_time attribute optional. I agree with this.

Also, I want to point out that the units on the “timestamps“ field are only necessary if 1) there is no RealTime/ExperimentTime correspondence AND 2) there is not other common reference point between the RealTime coordinate system (either ISO_8601 or POSIX) and “timestamps”. This means that no units are required if both session_start_time and something like session_end_time are provided (and of course the value of timestamps[0] and timestamps[-1] coincide with these two fields).

My preference is to adopt ISO_8601 as the coordinate system for RealTime via session_start_time, but make its inclusion optional, as well as a new optional session_stop_time. Also, I would be fine making the “timestamps” unit optional, however if it is not specified then both session_start_time and session_stop_time must be provided. This is a backwards compatible change that increases flexibility for building the files, while ensuring that “timestamps” is always a unitful quantity.

If we cannot agree on a coordinate system for RealTime, then an additional piece of metadata that specifies either ISO_8601 or POSIX will be necessary, required any time session_start_time is required.

from nwb-schema.

neuromusic avatar neuromusic commented on June 14, 2024

@nicain to clarify, the discussion on whether to use POSIX or ISO8601 for "RealTime" reference is #50.

This issue was prompted by the feature request by @lfrank (now retracted, I guess?) to support POSIX values (instead of seconds relative to experiment start) on the "ExperimentTime" timestamps themselves.

from nwb-schema.

tjd2002 avatar tjd2002 commented on June 14, 2024

With NWB 2.x spec being finalized, and now that solid ISO8601 support is in pynwb ( via NeurodataWithoutBorders/pynwb#641 ), it seems like a good time to try to reach a decision on this issue!

I would like to throw my support to @neuromusic 's proposal ("proposal 2") for an additional, optional timestamps_reference_time field that encodes the timestamp zero time as an ISO8601 string. (As opposed to "Proposal 1": an optional field representing an offset (in seconds) that is subtracted from the session_start_time to obtain the reference time).

I believe we can achieve backwards-compatibility by declaring that: "if timestamps_reference_time is not specified, then the timestamp reference time is taken to be the session_start_time".

I (quite strongly) prefer the specification of a reference time (rather than an offset from session_start_time) as being simpler to use and more intuitive during both file write and read. For instance:

  • An important use case for us is to confirm that a set of NWB files (possibly recorded from the same animal across multiple sessions) all share the same time base. As @neuromusic notes, under proposal 2 this reduces to a simple inspection of the "timestamps_reference_time" field. Whereas in proposal 1 we need to do timestamp math with 2 different representations of time (ISO8601 and elapsed time in seconds). (Possibly leading to floating point precision issues, given that an identical reference time would be represented by different sums across files?).
  • When writing data, we typically want to specify a hard-coded reference time. Under proposal 2 this involves no (sadly often error-prone) timestamp math.
  • Using pynwb, all the requisite timestamp information (session_start_time, file_create_date, and timestamps_reference_time can be provided by passing in a timezone-aware datetime object, which leads to clear and expressive code: (with my apologies to PEP8 for alignment!).
# For a file that uses POSIX timestamps

from datetime import datetime
from dateutil import tz

session_start_time        = datetime(2017, 10, 31,  9, 59,  0, tzinfo=tz.gettz('US/Pacific'))
timestamps_reference_time = datetime(1970,  1,  1,  0,  0,  0, tzinfo=tz.tzutc())
file_create_date          = datetime.now(tz.tzlocal())

  • Miscellaneous points:
    1. By "POSIX time", I just mean I want to use seconds since the Unix Epoch (not including leap seconds). Contra some discussion above, we do not want a new 'POSIX64' dtype--we don't want to represent time as anything other than float64's with units of 'Seconds'.
    2. The schema should make it clear that we follow the POSIX convention of ignoring leap seconds in date/time computations. This is what the libraries everyone is relying on will be doing, but doesn't hurt to state it explicitly.
    3. I vote -1 on @nicain's suggestion to add a "session_end_time", at least insofar as it is used to map a range of timestamps onto real-world time. I think we should stick to SI units for time. Also, many different objects in a file have associated timestamps, so it's not particularly meaningful to refer to a global timestamps[0] and timestamps[-1].
    4. This issue also seems to discuss whether session_start_time should be made optional (e.g. for simulation data). I don't have any objection to this.

from nwb-schema.

t-b avatar t-b commented on June 14, 2024

@tjd2002 Proposal 2 from Andrew's post in #49 (comment)?

from nwb-schema.

tjd2002 avatar tjd2002 commented on June 14, 2024

Great that we seem to have a consensus!

timestamps_reference_time can default to session_start_time.

This behavior at the API level sounds works for us and affords a level of backwards compatibility.

Am I correct that this means that timestamps_reference_time will become a required field in the spec, with a default value set by the API at write time? For those who wanted to make session_start_time optional, this might be unwelcome news (though it's not a lot worse for them than the current situation).

from nwb-schema.

t-b avatar t-b commented on June 14, 2024

@tjd2002 Just a FYI. According to http://neurodatawithoutborders.github.io/development_plan this monday (11/05/2018) is the cut off date for schema changes.

from nwb-schema.

tjd2002 avatar tjd2002 commented on June 14, 2024

from nwb-schema.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.