
bids-validator's People

Contributors

adam2392, choldgraf, chrisgorgo, david-nishi, dependabot[bot], depfu[bot], dimitripapadopoulos, effigies, etiennebergeron, exbotanical, happy5214, hoechenberger, jasmainak, konantian, marcocastellaro, mariehbourget, maxvandenboom, mnoergaard, naveau, nellh, oesteban, patsycle, remi-gau, rob-luke, robertoostenveld, rwblair, sappelhoff, setcodestofire, sjeung, suyashdb


bids-validator's Issues

Acknowledge version control

The validator complains about the presence of a .git directory. It would be useful to have it ignore Git or SVN (etc.) files and directories. Reporting each of those files can easily yield several thousand reports and clutter the output.

Validation of .bvec/.bval files

Each _dwi.nii[.gz] file needs to have .bvec and .bval files defined (at any level of the hierarchy, similarly to _events.tsv). Those files need to meet the following specification: https://docs.google.com/document/d/1HFUkAEE-pB-angVcYe6pf_-fVf4sCpOHKesUvfb8Grc/edit#heading=h.xfuiufnb319

Things to check (a sketch follows the list):

  • .bvec has 3 rows
  • .bval has one row
  • the number of columns in the .bvec file, the number of values (columns) in the .bval file, and the number of volumes in the corresponding _dwi file all have to be the same
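
A minimal sketch of what such a check could look like, assuming the raw file contents and the volume count from the _dwi NIfTI header (dim[4]) are already available (function and variable names here are illustrative, not the validator's actual API):

// Hypothetical check: bvecText/bvalText are the raw file contents, nVolumes is
// dim[4] from the corresponding _dwi NIfTI header.
function checkBvecBval(bvecText, bvalText, nVolumes) {
  var issues = [];
  var bvecRows = bvecText.trim().split('\n').map(function (row) {
    return row.trim().split(/\s+/);
  });
  var bvalRows = bvalText.trim().split('\n').map(function (row) {
    return row.trim().split(/\s+/);
  });

  if (bvecRows.length !== 3) {
    issues.push('.bvec should contain exactly 3 rows');
  }
  if (bvalRows.length !== 1) {
    issues.push('.bval should contain exactly 1 row');
  }
  // Every row in both files must describe the same number of volumes as the image.
  var allRows = bvecRows.concat(bvalRows);
  var consistent = allRows.every(function (row) { return row.length === nVolumes; });
  if (!consistent) {
    issues.push('.bvec/.bval column counts must match the number of volumes (' + nVolumes + ')');
  }
  return issues;
}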

Ignores suggested metadata location

/sub-04/ses-r08/func/sub-04_ses-r08_task-orientation_rec-dico_run-01_bold.nii.gz
         You have to define 'RepetitionTime' for this file. It can be included one of the following locations: /sub-04/ses-r08/func/sub-04_ses-r08_task-orientation_rec-dico_run-01_bold.json, /sub-04/ses-r08/sub-04_ses-r08_task-orientation_rec-dico_bold.json, /sub-04/sub-04_task-orientation_rec-dico_bold.json, /task-orientation_rec-dico_bold.json

but

% grep RepetitionTime task-orientation_rec-dico_bold.json
  "RepetitionTime": 2.0,

maybe an issue with "rec-dico"?

Validity of JSON files should be tested before meta-data availability

At present, metadata checks (e.g. the RepetitionTime specification) run before JSON validity checks. In case of a typo in the JSON (it happened to me once!) this leads to countless "false positives". If nothing speaks against it, switching the order of these checks could help people retain more hair ;-)

See #94 for an example of me being misguided.
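
A rough sketch of the suggested ordering (the function and helper names are illustrative, not the current validator code): parse the JSON first, and only run field checks such as RepetitionTime when parsing succeeds.

// Illustrative only: bail out on invalid JSON before any metadata-field checks,
// so a single typo does not fan out into dozens of missing-field errors.
function validateJson(contents, callback) {
  var parsed;
  try {
    parsed = JSON.parse(contents);
  } catch (e) {
    return callback([{ reason: 'Not a valid JSON file.' }]);
  }
  // Only now check required fields such as RepetitionTime.
  callback(checkMetadataFields(parsed)); // checkMetadataFields is a hypothetical helper
}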

Hardcoded paths and a test that should be failing

I recently noticed that a path in one of the tests is hardcoded to a value that works only on @constellates' workstation. Despite that fact the tests are still passing (both on my machine and on CircleCI), even though they should fail because the file does not exist. Maybe for some reason the callback with the assert was not called?
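
That is the usual way such a test can silently pass: if the assertion lives in a callback that never fires, mocha sees a synchronous test that finished without errors. A sketch of how accepting mocha's done callback makes a missed callback surface as a timeout failure instead (the require path and helper name are made up):

var assert = require('assert');
var utils = require('../utils'); // hypothetical path

it('should read the NIfTI header', function (done) {
  utils.readNiftiHeader('/path/that/may/not/exist.nii.gz', function (header) {
    // Only runs if the callback fires; without done() a never-called callback
    // would leave this test passing vacuously.
    assert(header && !header.error);
    done();
  });
});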

Error reading NIFTI file

I'm getting this error when validating this file (on the command line):

/Users/filo/drive/workspace/bids-validator/node_modules/nifti-js/nifti.js:25
    throw new Error("The buffer is not even large enough to contain the minimal header I would expect from a NIfTI file!")
    ^

Error: The buffer is not even large enough to contain the minimal header I would expect from a NIfTI file!
    at Object.parseNIfTIHeader (/Users/filo/drive/workspace/bids-validator/node_modules/nifti-js/nifti.js:25:11)
    at /Users/filo/drive/workspace/bids-validator/utils/files.js:125:36
    at Gunzip.onEnd (zlib.js:227:5)
    at emitNone (events.js:72:20)
    at Gunzip.emit (events.js:166:7)
    at endReadableNT (_stream_readable.js:903:12)
    at doNTCallback2 (node.js:439:9)
    at process._tickCallback (node.js:353:17)
orange:bids-validator filo$ open /tmp/AA//sub-S2529LVY1263171/func/sub-S2529LVY1263171_task-rest_acq-RL_run-2_sbref.nii.gz

Hangs with dangling symlink

If the validator hits a symlink that points straight to Valhalla, it hangs forever and eventually exits with an exception -- without producing any other output. I guess it should simply report a dangling symlink as such. Those can be present if one uses git-annex for data management.
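
One way to detect and report this case (the issue shape and message below are my own, not the validator's): fs.lstat succeeds on the link itself, while fs.stat follows the link and fails when the target is missing.

var fs = require('fs');

// Sketch: report a dangling symlink instead of trying to read its target.
function checkSymlink(path, callback) {
  fs.lstat(path, function (lstatErr, stats) {
    if (lstatErr || !stats.isSymbolicLink()) { return callback(null); }
    fs.stat(path, function (statErr) {
      // stat follows the link; an error here means the target does not exist.
      callback(statErr ? { file: path, reason: 'Dangling symlink.' } : null);
    });
  });
}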

Slice encoding direction was only partially removed.

I can't tell whether #83 was intended to remove only the check that SliceEncodingDirection is correctly defined, the check that it is defined at all, or both.

It removed the check for whether SliceEncodingDirection is defined (which throws issue 14)
https://github.com/INCF/bids-validator/pull/83/files#diff-60709260fc11f2ede4c5ea6724588606L147
but it removed the issue definition for issue 35.
https://github.com/INCF/bids-validator/pull/83/files#diff-2acab63a7435ea4ff6b83cfb4c840f52L145

There's still a check that throws issue 35
https://github.com/INCF/bids-validator/blob/master/validators/json.js#L98
and an issue definition for 14
https://github.com/INCF/bids-validator/blob/master/utils/issues.js#L61

It's a quick fix, but I'm not sure which check the original merge was meant to remove or if it was meant to remove both.

RepetitionTime Rounding in NIfTIs

Another thing I've noticed when running the validator is that very often I'll get warnings like the following:

1: Repetition time did not match between the scan's header and the associated JSON metadata file.
(code: 12)
/sub/ses/func/_func_scan_name__bold.nii.gz
Repetition time defined in the JSON (2.0 sec.) did not match the one defined in the NIFTI header (2.01 sec.)

When I inspect the NIfTI headers with 3dinfo, I do indeed get 2.01 seconds. However, when I open up the original DICOM headers I get 2000 ms. My guess is that there is some sort of rounding error when the NIfTI headers are created (we've been using mcverter). Would it be possible to make the validator more robust to this sort of scenario (maybe using something similar to numpy's isclose method)?
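
A tolerance-based comparison in the spirit of numpy's isclose could look like the sketch below (the tolerance values are purely illustrative):

// Treat 2.0 s (JSON) and 2.01 s (NIfTI pixdim[4]) as equal within tolerance.
function isClose(a, b, relTol, absTol) {
  relTol = relTol || 1e-3;
  absTol = absTol || 0.01;
  return Math.abs(a - b) <= Math.max(relTol * Math.max(Math.abs(a), Math.abs(b)), absTol);
}

isClose(2.0, 2.01); // true  -> no warning
isClose(2.0, 2.5);  // false -> still reported as a mismatch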

command line client silently fails to read .nii.gz files

When I use the command line client on a dataset it exits with 0 and no output (not even saying that the dataset is valid). I did some investigation and found that this callback is never called: https://github.com/INCF/bids-validator/blob/master/utils/files.js#L119. I looked a bit more and found that this is the first commit that introduces the bug: INCF/bids-validator@3507444. I have seen this on both Node 4 and Node 5.

The fact that this error is silent is really troubling...

Choking on a big dataset

I have tried to run the validator on a large dataset (>100 subjects) and it seems to run forever without producing any output. I'm happy to provide access to the TACC system if you want to try it out there directly.

Error in 0.16.1

Recently I upgraded the BIDS validator to 0.16.1 on one of our servers. When I tried to point it to a directory, I got the following error:

/usr/local/lib/node_modules/bids-validator/validators/session.js:50
all_files = new Set(subject_files);
^
ReferenceError: Set is not defined
at missingSessionFiles (/usr/local/lib/node_modules/bids-validator/validators/session.js:50:21)
at /usr/local/lib/node_modules/bids-validator/validators/bids.js:187:50
at /usr/local/lib/node_modules/bids-validator/node_modules/async/lib/async.js:52:16
at done (/usr/local/lib/node_modules/bids-validator/node_modules/async/lib/async.js:246:17)
at /usr/local/lib/node_modules/bids-validator/node_modules/async/lib/async.js:44:16
at /usr/local/lib/node_modules/bids-validator/validators/bids.js:180:33
at NIFTI (/usr/local/lib/node_modules/bids-validator/validators/nii.js:220:5)
at /usr/local/lib/node_modules/bids-validator/validators/bids.js:178:29
at Gunzip. (/usr/local/lib/node_modules/bids-validator/utils/files.js:126:21)
at Gunzip.EventEmitter.emit (events.js:95:17)

Any idea what might be going on? Does your code left-pad at any point?
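
For what it's worth, "ReferenceError: Set is not defined" usually means the runtime predates the ES6 globals (e.g. Node 0.10), since session.js uses the built-in Set. A startup guard along these lines (the message and exit behaviour are my own suggestion) would turn the crash into an actionable error:

// Fail early with a readable message instead of crashing deep inside session.js.
if (typeof Set === 'undefined') {
  console.error('bids-validator requires a Node.js version with native Set support; please upgrade Node.');
  process.exit(1);
}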

Code 38: Each session should contain the same number of files with the same naming scheme.

When I run the BIDS validator on some data with multiple visits I get the following error:

3: Not all sessions contain the same corresponding files. Each session should contain the same number of files with the same naming scheme. (code: 38)

This is then followed by many, many lines stating that a number of files should exist according to the validator, but don't. The problem I'm having is that in our study some acquisitions are performed in certain visits/sessions but not in others, so pretty much all of these hypothetical missing files are for acquisitions that wouldn't be taken in a given visit. Is this something that BIDS mandates, or is it more meant as a warning for studies where multiple sessions should have the same number/type of acquisitions?

Please add continuous integration testing

  1. Go to circleci.com
  2. Log in with a GitHub account with admin rights to Squishymedia/bids-validator
  3. Add (plus sign on the left-hand side menu) the Squishymedia/bids-validator repo
  4. You are done! CircleCI will automatically recognise this as a Node.js project with mocha testing, install all of the dependencies, and run the tests. This is what the end result will look like: https://circleci.com/gh/chrisfilo/bids-validator/1

This way tests will be run for each commit and each PR, making sure nothing gets broken. I would've done this myself, but I don't have admin rights to this repo.

Inheritance principle does not seem to apply for a single, real fieldmap image (case 3) JSON

I have a fieldmap JSON at the root of a BIDS directory. This JSON contains a "Units" field to be inherited by nested NIfTIs, with the optional "IntendedFor" field omitted. I've tried two file names for the fieldmap JSON, neither of which seems to work: ses-NFB3_fieldmap.json and fieldmap.json.

   1: You have to define 'Units' for this file. (code: 17)
           /sub-A00037112/ses-NFB3/fmap/sub-A00037112_ses-NFB3_fieldmap.nii.gz
           /sub-A00038998/ses-NFB3/fmap/sub-A00038998_ses-NFB3_fieldmap.nii.gz
           /sub-A00052181/ses-NFB3/fmap/sub-A00052181_ses-NFB3_fieldmap.nii.gz
           /sub-A00052340/ses-NFB3/fmap/sub-A00052340_ses-NFB3_fieldmap.nii.gz
           /sub-A00054504/ses-NFB3/fmap/sub-A00054504_ses-NFB3_fieldmap.nii.gz
           /sub-A00054857/ses-NFB3/fmap/sub-A00054857_ses-NFB3_fieldmap.nii.gz
           /sub-A00054914/ses-NFB3/fmap/sub-A00054914_ses-NFB3_fieldmap.nii.gz
           /sub-A00055121/ses-NFB3/fmap/sub-A00055121_ses-NFB3_fieldmap.nii.gz
           /sub-A00055373/ses-NFB3/fmap/sub-A00055373_ses-NFB3_fieldmap.nii.gz
           /sub-A00055446/ses-NFB3/fmap/sub-A00055446_ses-NFB3_fieldmap.nii.gz
           /sub-A00055447/ses-NFB3/fmap/sub-A00055447_ses-NFB3_fieldmap.nii.gz
           /sub-A00055738/ses-NFB3/fmap/sub-A00055738_ses-NFB3_fieldmap.nii.gz
           /sub-A00056452/ses-NFB3/fmap/sub-A00056452_ses-NFB3_fieldmap.nii.gz
           /sub-A00057005/ses-NFB3/fmap/sub-A00057005_ses-NFB3_fieldmap.nii.gz
           /sub-A00057035/ses-NFB3/fmap/sub-A00057035_ses-NFB3_fieldmap.nii.gz
           /sub-A00057182/ses-NFB3/fmap/sub-A00057182_ses-NFB3_fieldmap.nii.gz

   1: This file is not part of the BIDS specification, make sure it isn't included in the dataset by accident. Data derivatives (processed data) should be placed in /derivatives folder. (code: 1)
           /ses-NFB3_fieldmap.json
                   Evidence: ses-NFB3_fieldmap.json

Run gulp build as a smoke test

To improve code quality we should incorporate gulp build (generation of the web version of the validator) as a smoke test (run it even if we are not pushing to gh-pages). This would let us catch errors such as #112 before merging.

_defacemask for BOLD

Here is the file in question:

/sub-16/ses-r08/func/sub-16_ses-r08_task-orientation_rec-dico_run-05_recording-motion_physio.tsv

It contains motion estimates recorded as part of a scanner-side distortion correction (the corrected BOLD data is stored as "rec-dico"). AFAICS this should be valid, but the validator thinks otherwise.

Test consistency of scanning parameters

Several basic scanning parameters should be read from data and checked for consistency across subjects:

  • data dimensionality (x, y, z and time when applicable - _bold and _dwi)
  • resolution (in mm for the first three dimensions and seconds for the last one when applicable - _bold and _dwi)

Those parameters can be read from NIFTI headers (see here for details: http://brainder.org/2012/09/23/the-nifti-file-format/)

The consistency of those parameters should be checked across subjects (not necessarily across sessions). For each parameter and scan, the value shared by the majority of subjects should be found, and subjects deviating from this value should be reported in a warning.

This check is similar to nipreps/mriqc#29
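
A sketch of the majority-vote step (all names are illustrative); the input maps each subject to a header-derived value, e.g. the voxel resolution serialized to a string:

// Find the value shared by most subjects and list the subjects that deviate.
function reportDeviants(valuesBySubject) {
  var counts = {};
  Object.keys(valuesBySubject).forEach(function (sub) {
    var val = valuesBySubject[sub];
    counts[val] = (counts[val] || 0) + 1;
  });
  var majority = Object.keys(counts).sort(function (a, b) {
    return counts[b] - counts[a];
  })[0];
  var deviants = Object.keys(valuesBySubject).filter(function (sub) {
    return valuesBySubject[sub] !== majority;
  });
  return { majority: majority, deviants: deviants };
}

// reportDeviants({'sub-01': '2x2x2 mm', 'sub-02': '2x2x2 mm', 'sub-03': '3x3x3 mm'})
// -> { majority: '2x2x2 mm', deviants: ['sub-03'] }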

Not validating top-level JSONs that are inherited by scans in multiple session configuration

The validator says that I need to define 'RepetitionTime' for the following NIfTI (the directory structure is configured to accommodate multiple visits using 'ses-'):

/sub-A00028185/ses-NFB3/func/sub-A00028185_ses-NFB3_task-DMNTRACKINGTEST_bold.nii.gz

This, however, is defined in the following top-level JSON file:

/task-DMNTRACKINGTEST_bold.json

Ditto for other fields ('EchoTime', etc.). I don't get any other errors; the validator just isn't pairing the top-level file with the nested NIfTIs.

Report warnings for inconsistent or missing sessions

A common problem with experimental data is missing scans for one or two subjects. We should give a warning in such situations. I propose the following procedure (a sketch follows the list):

  • For session label session_id
    • For every subject calculate a set of file names for session session_id, remove subject identifiers from those file names.
    • Calculate a union of those sets across all subjects.
    • Report those files from the union that are not present for all subjects. List which subjects are missing a particular file in the warning message.
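
A sketch of that procedure (all names are illustrative; filesBySubject maps a subject label to its file paths for the given session):

// For one session, strip subject labels from file names, take the union across
// subjects, and report every file that is missing for some subject.
function missingSessionFiles(filesBySubject) {
  var stripped = {};
  Object.keys(filesBySubject).forEach(function (sub) {
    stripped[sub] = new Set(filesBySubject[sub].map(function (file) {
      return file.split(sub).join('sub-<label>');
    }));
  });
  var union = new Set();
  Object.keys(stripped).forEach(function (sub) {
    stripped[sub].forEach(function (file) { union.add(file); });
  });
  var warnings = [];
  union.forEach(function (file) {
    var missing = Object.keys(stripped).filter(function (sub) {
      return !stripped[sub].has(file);
    });
    if (missing.length > 0) {
      warnings.push({ file: file, missingForSubjects: missing });
    }
  });
  return warnings;
}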

Move CircleCI config details to circle.yml

Currently some of the CircleCI build commands are stored on CircleCI's servers (in the project Settings) rather than in circle.yml. We should keep everything (apart from NPM passwords) in circle.yml.

Tests are out of date

$ npm test

> [email protected] test /Users/filo/drive/workspace/bids-validator-1
> mocha tests/



  BIDS
    1) should verify that NifTi files are compressed using gzip.

  JSON
    2) should catch missing closing brackets
    3) should have key/value pair for "repetition_time"

  TSV
    ✓ should not allow contiguous spaces
    ✓ should not allow different length rows
    ✓ should not allow headers to begin with numbers


  3 passing (24ms)
  3 failing

  1) BIDS should verify that NifTi files are compressed using gzip.:
     TypeError: path must be a string
      at TypeError (native)
      at Object.fs.readdirSync (fs.js:813:18)
      at getFiles (utils.js:85:20)
      at Object.readDir (utils.js:66:17)
      at Object.module.exports [as BIDS] (validators/bids.js:211:11)
      at Context.<anonymous> (tests/bids.spec.js:18:18)

  2) JSON should catch missing closing brackets:
     TypeError: Cannot read property 'hasOwnProperty' of null
      at repetitionTime (validators/json.js:64:21)
      at Object.module.exports [as JSON] (validators/json.js:27:13)
      at Context.<anonymous> (tests/json.spec.js:7:12)

  3) JSON should have key/value pair for "repetition_time":
     TypeError: callback is not a function
      at Object.module.exports [as JSON] (validators/json.js:31:9)
      at Context.<anonymous> (tests/json.spec.js:14:12)



npm ERR! Test failed.  See above for more details.

It seems that the tests are out of date - for example, the .nii.gz test calls the validator with a dictionary of files instead of a directory name.

[idea] error and warning codes

Right now a dataset that is very close to being BIDS compatible can generate a lot of errors and warnings (imagine RepetitionTime was missing - it takes one line at the top level to fix it, but the validator will show as many errors as there are bold files). This can be very off-putting and overwhelming from the user perspective. One solution would be to group errors and warnings of the same type in the UI. To allow that we can add error/warning codes (as one more field in the error class).

Let me know what you think.
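
One possible shape for that extra field plus the grouping step (this is illustrative, not the current error class, and the code numbers are made up):

// Each issue carries a numeric code; the UI can then collapse repeated
// occurrences of the same problem into a single entry with a file list.
function groupByCode(issues) {
  var groups = {};
  issues.forEach(function (issue) {
    if (!groups[issue.code]) {
      groups[issue.code] = { reason: issue.reason, files: [] };
    }
    groups[issue.code].files.push(issue.file);
  });
  return groups;
}

// groupByCode([{ code: 10, reason: 'Missing RepetitionTime', file: 'sub-01_bold.nii.gz' },
//              { code: 10, reason: 'Missing RepetitionTime', file: 'sub-02_bold.nii.gz' }])
// -> { '10': { reason: 'Missing RepetitionTime', files: ['sub-01_bold.nii.gz', 'sub-02_bold.nii.gz'] } }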

Error reading some .nii.gz files

I'm getting errors reading this file https://drive.google.com/folderview?id=0B2JWN60ZLkgkcndhTWpLYmswRDA&usp=sharing&tid=0B2JWN60ZLkgkN0dMTVQ1QU1IUEk (actually all NIfTI files from this dataset - ds109):

On the command line:

orange:bids-validator filo$ node nameOfFile.js /Volumes/Samsung_T1/bids_examples/symlinked/ds109/sub-01/anat/sub-01_T1w.nii.gz
{ error: 'Unable to read /Volumes/Samsung_T1/bids_examples/symlinked/ds109/sub-01/anat/sub-01_T1w.nii.gz' }

With the online version, when I try to validate the full version of ds109, I get a JS error:

r   @   app.min.js:24606
i.onloadend @   app.min.js:30419

(which is also concerning - why would I get a different error on the command line and online?)

BTW the file is not corrupted:

orange:bids-validator filo$ fslinfo /Volumes/Samsung_T1/bids_examples/symlinked/ds109/sub-01/anat/sub-01_T1w.nii.gz
data_type      INT16
dim1           144
dim2           192
dim3           192
dim4           1
datatype       4
pixdim1        1.200000
pixdim2        1.197917
pixdim3        1.197917
pixdim4        2.200000
cal_max        0.0000
cal_min        0.0000
file_type      NIFTI-1+

Make uncaught exceptions visible on the online validator

We had a few instances where users tried to validate a dataset and waited hours for the spinning wheel to go away, not aware that the validator had crashed (for example #107). It would be a better user experience if uncaught validator exceptions were displayed on the page. This most likely also applies to the use of the validator on CRN.
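
A minimal way to do this in the browser build (the element id and message wording are assumptions): hook window.onerror and render the message instead of leaving the spinner running.

// Show crashes in the page instead of an endless spinner.
window.onerror = function (message, source, lineno, colno, error) {
  var box = document.getElementById('validator-error'); // assumed element
  if (box) {
    box.textContent = 'The validator crashed: ' + message + ' (' + source + ':' + lineno + ')';
    box.style.display = 'block';
  }
  return false; // keep the browser's default console logging
};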

gulp build is currently broken

events.js:141
throw er; // Unhandled 'error' event
^
Error
at new JS_Parse_Error (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :1526:18)
at js_error (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :1534:11)
at croak (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :2026:9)
at token_error (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :2034:9)
at expect_token (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :2047:9)
at expect (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :2050:36)
at regular_for (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :2280:9)
at for_ (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :2276:16)
at eval (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :2158:24)
at eval (eval at (/home/ubuntu/bids-validator/node_modules/gulp-uglify/node_modules/uglify-js/tools/node.js:22:1), :2073:24)

gulp build returned exit code 1

Deploy the web version of the validator on GitHub gh-pages

If we deploy the web version of the BIDS validator (in an automated way through CircleCI), we will be able to quickly test it outside of other projects (for example CRN). This way we will also be able to provide a validation service to people who do not (yet!) intend to share their data.

The user interface can be very basic since the main goal is to expose/test the underlying JS library.

Considers suggested metadata filename invalid

/sub-04/ses-r08/func/sub-04_ses-r08_task-orientation_rec-dico_run-01_bold.nii.gz
        You should define 'EchoTime' for this file. If you don't provide this information field map correction will not be possible. It can be included one of the following locations: /sub-04/ses-r08/func/sub-04_ses-r08_task-orientation_rec-dico_run-01_bold.json, /sub-04/ses-r08/sub-04_ses-r08_task-orientation_rec-dico_bold.json, /sub-04/sub-04_task-orientation_rec-dico_bold.json, /task-orientation_rec-dico_bold.json

but at the same time

/sub-04/ses-r08/sub-04_ses-r08_task-orientation_rec-dico_bold.json
        This file is not part of the BIDS specification, make sure it isn't included in the dataset by accident. Data derivatives (processed data) should be placed in /derivatives folder.

Support Policy (browser & node)

Now that there are more of us contributing to the validator and it's starting to be tested/used more broadly we should figure out what versions of node and browsers we want to officially support. Outside of those we can decide on a case by case basis whether it's worth it to add support.

Node

This seems relatively straightforward. We can continue targeting the most recent LTS version of Node. Right now that is v4.4.1. We can regularly update the version of Node in CircleCI to match this, to get day-to-day test coverage on this version.

Potential Conflicts
  • Users who are unable to update their system to the latest LTS version of node due to policies or access limitations where they're working.

Browser

This may also be straightforward. The web validator currently only supports Chrome. This is because Chrome is the only browser that supports selecting a full directory in a file select input. If we only target support for Chrome, there should only be a handful of language features Node supports but Chrome doesn't (block scoping, etc.). We can also be relatively sure users have an updated version because Chrome auto-updates.

Potential Conflicts
  • People making applications that reconstruct a dataset in the browser from a database and then run validation on it.
  • People not using the core BIDS validation function and instead using the validations that can run on single files, such as testing an individual .tsv file.

Recommendation

It seems like we can go with the latest LTS version of Node and the latest Chrome. Let me know if any of the above conflicts are serious enough to consider, or if I'm missing other reasons to broaden support.

Too strict for beginners

It seems that if the validator encounters anything that it doesn't know in quickTest it calls it a day, screams "this is not a BIDS dataset" and goes home crying ;-)

It would be nice to have it 1) be more verbose about the reason for rejection, and 2) be more tolerant of extra directories that are present but do not interfere with content that is declared in BIDS. Right now it only acknowledges derivatives and leaves that alone. I think a warning about unknown content is more appropriate than a flat-out rejection.

Trying to work around this, but my JavaScript fu is not worth the label.

Thanks!
