
mrQA's Introduction

mrQA : automatic protocol compliance checks on MR datasets

Codacy code quality badge: https://app.codacy.com/project/badge/Grade/8cd263e1eaa0480d8fac50eba0094401

Documentation: https://open-minds-lab.github.io/mrQA/


mrQA is a tool for automatic evaluation of protocol compliance in MRI datasets. It analyzes MR acquisition parameters from DICOM headers and compares them against a reference protocol to determine the level of compliance. It takes as input a dataset in DICOM or BIDS format, and outputs a compliance report in HTML format with a percent compliance score for each sequence/modality in the dataset, as well as a JSON file with the compliance scores for each modality. In addition, it highlights any deviations from the protocol. The tool was created specifically with those who directly acquire the data in mind, such as MR physicists and technologists, but it can be used by anyone who wants to verify that MR scans are acquired according to a pre-defined protocol and to minimize errors in the acquisition process.

mrQA uses the MRdataset library to efficiently parse various neuroimaging dataset formats.

Key features:

  • protocol compliance checks along two key dimensions:
    • a horizontal audit (within-sequence, across-dataset), as well as
    • a vertical audit (within-session, across-sequence)
  • continuous monitoring of incoming data (hourly or daily, on an XNAT server or similar)
  • parallel processing of very large datasets (such as ABCD or UK Biobank) on an HPC cluster
  • a few more features to be released soon, including automatic artifact detection and rating

Simple schematic of the library:

(see docs/schematic_mrQA.png)

mrQA's People

Contributors

raamana, sinhaharsh, yarikoptic


mrQA's Issues

Extract the shim setting values from DICOM header

Shim values are currently extracted as tune-up, standard, or advanced, which is not very informative. Consider extracting the actual shim values used, e.g. the ShimSetting field in the JSON sidecar:
"ShimSetting": [ -3141, -7171, 3588, 510, 25, -150, 52, 0 ],
See more information about the associated DICOM tags in dcm2niix.
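
As a stopgap, a minimal sketch of how the sidecar route could work, assuming dcm2niix-style JSON sidecars are available; the dataset path is illustrative:

# Sketch: compare ShimSetting arrays across dcm2niix JSON sidecars and flag outliers.
# Assumes a tree of *.json sidecars; the root path is illustrative.
import json
from collections import Counter
from pathlib import Path

def collect_shim_settings(root):
    shims = {}
    for sidecar in Path(root).rglob('*.json'):
        meta = json.loads(sidecar.read_text())
        if 'ShimSetting' in meta:
            shims[str(sidecar)] = tuple(meta['ShimSetting'])
    return shims

shims = collect_shim_settings('/path/to/bids_dataset')
if shims:
    majority, _ = Counter(shims.values()).most_common(1)[0]
    for path, values in shims.items():
        if values != majority:
            print(f'{path}: shim {values} differs from majority {majority}')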

Real-time mrQA

Although not urgent, how can we make these compliance checks real-time?

  1. We cannot install anything on the main desktop which is connected to the scanner.
  2. If scanned data is constantly pushed to a destination folder, mrQA can check it against a reference protocol and send a message to an adjacent desktop, which can be checked by the technician (a sketch of such a folder watcher follows below).

Useful resource: YARRA
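
A minimal sketch of such a folder watcher, using the third-party watchdog package; run_compliance_check() is a hypothetical placeholder for invoking mrQA on the updated folder and notifying the technician:

# Sketch: watch a destination folder and trigger a compliance check on new files.
# Requires the third-party 'watchdog' package; run_compliance_check() is a
# hypothetical placeholder for calling mrQA and notifying the adjacent desktop.
import time
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

def run_compliance_check(path):
    print(f'New data at {path}: run mrQA against the reference protocol here')

class NewScanHandler(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory:
            run_compliance_check(event.src_path)

observer = Observer()
observer.schedule(NewScanHandler(), '/path/to/incoming_dicoms', recursive=True)
observer.start()
try:
    while True:
        time.sleep(60)
finally:
    observer.stop()
    observer.join()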

ENH: Filepaths to each non-compliant subject and session

Given a report, it is very difficult to know where a non-compliant subject/session is stored on disk. The path is required to address the non-compliance and for further analysis. Add a hyperlink to each non-compliant subject name, pointing to a text file that contains the complete path on disk.

Report: Add Warnings

Add warnings to the report for different scenarios, e.g. the majority (reference) value could not be calculated because of an equal count or fewer than 3 subjects.

fieldmaps: distinguish _dir-ap and _dir-pa

Per our discussion, mrQA seems not to consider AP/PA pairing in BIDS datasets and then just reports an inconsistency in PhaseEncodingDirection, e.g.

Parameter Ref. Value Found Subject_Session
PhaseEncodingDirection j-, j, 0026_03, 0026_02, 0026_04, 0026_01, 0117_01, 0112_03, 0112_02, 0112_04, 0112_01, 0055_03, 0055_02, 0055_04, 0055_01, 0002_03, 0002_02, 0002_04, 0002_01, 0080_03, 0080_02, 0080_04, 0080_01, 0101_03, 0101_02, 0101_04, 0101_01, 0019_03, 0019_02, 0019_04, 0019_01, 0032_03, 0032_02, 0032_04, 0032_01, 0076_03, 0076_02, 0076_04, 0076_01, 0119_01, 0059_03, 0059_02, 0059_04, 0059_01, 0036_03, 0036_02, 0036_04, 0036_01, 0060_03, 0060_02, 0060_04, 0060_01, 0104_03, 0104_02, 0104_04, 0104_01, 0115_03, 0115_02, 0115_04, 0115_01, 0095_03, 0095_02, 0095_04, 0095_01, 0127_03, 0127_02, 0127_04, 0127_01, 0018_03, 0018_02, 0018_04, 0018_01, 0031_03, 0031_02, 0031_04, 0031_01, 0057_03, 0057_02, 0057_04, 0057_01, 0001_03, 0001_02, 0001_04, 0001_01, 0091_03, 0091_02, 0091_04, 0091_01, 0077_03, 0077_02, 0077_04, 0077_01, 0109_03, 0109_02, 0109_04, 0109_01, 0030_01, 0086_03, 0086_02, 0086_04, 0086_01, 0132_03, 0132_02, 0132_04, 0132_01, 0028_01, 0089_03, 0089_02, 0089_04, 0089_01, 0014_03, 0014_02, 0014_04, 0014_01, 0065_03, 0065_02, 0065_04, 0065_01, 0007_03, 0007_02, 0007_01, 0011_03, 0011_02, 0011_04, 0011_01, 0062_03, 0062_02, 0062_04, 0062_01, 0129_03, 0129_02, 0129_04, 0129_01, 0111_03, 0111_02, 0111_04, 0111_01, 0107_03, 0107_02, 0107_04, 0107_01, 0122_03, 0122_02, 0122_04, 0122_01, 0079_03, 0079_02, 0079_04, 0079_01, 0083_03, 0083_02, 0083_04, 0083_01, 0120_01, 0092_03, 0092_02, 0092_04, 0092_01, 0133_03, 0133_02, 0133_04, 0133_01, 0074_03, 0074_02, 0074_04, 0074_01, 0020_03, 0020_02, 0020_04, 0020_01, 0016_03, 0016_02, 0016_04, 0016_01, 0075_03, 0075_02, 0075_01, 0123_02, 0123_01, 0106_03, 0106_02, 0106_04, 0106_01, 0070_03, 0070_02, 0070_04, 0070_01, 0024_03, 0024_02, 0024_04, 0024_01, 0038_03, 0038_02, 0038_04, 0038_01, 0064_04, 0064_01, 0081_03, 0081_02, 0081_04, 0081_01, 0009_03, 0009_02, 0009_04, 0009_01, 0087_03, 0087_02, 0087_04, 0087_01, 0047_01, 0003_03, 0003_02, 0003_04, 0003_01, 0044_03, 0044_02, 0044_04, 0044_01, 0043_03, 0043_02, 0043_04, 0043_01, 0066_03, 0066_02, 0066_04, 0066_01, 0008_03, 0008_02, 0008_04, 0008_01, 0097_01, 0037_03, 0037_02, 0037_04, 0037_01, 0094_03, 0094_04, 0094_01, 0025_03, 0025_02, 0025_04, 0025_01, 0098_03, 0098_02, 0098_04, 0098_01, 0021_03, 0021_02, 0021_04, 0021_01, 0131_03, 0131_02, 0131_04, 0131_01, 0029_03, 0029_02, 0029_04, 0029_01, 0105_03, 0105_02, 0105_04, 0051_03, 0051_02, 0051_04, 0051_01, 0099_03, 0099_02, 0099_04, 0099_01, 0114_02, 0114_01, 0061_03, 0061_02, 0061_04, 0061_01, 0035_03, 0035_02, 0035_04, 0035_01, 0023_02, 0023_04, 0023_01, 0017_03, 0017_02, 0017_04, 0017_01, 0058_03, 0058_02, 0058_04, 0058_01, 0102_03, 0102_02, 0102_04, 0102_01, 0013_03, 0013_02, 0013_04, 0013_01, 0040_03, 0040_02, 0040_04, 0040_01, 0124_03, 0124_02, 0124_04, 0124_01, 0046_03, 0046_02, 0046_04, 0126_03, 0126_02, 0126_01, 0088_03, 0088_02, 0088_04, 0088_01, 0050_03, 0050_02, 0050_04, 0050_01, 0034_03, 0034_02, 0034_04, 0034_01, 0090_03, 0090_02, 0090_04, 0090_01, 0130_03, 0130_02, 0130_04, 0130_01, 0052_03, 0052_02, 0052_04, 0052_01, 0078_03, 0078_02, 0078_04, 0078_01, 0100_03, 0100_02, 0100_04, 0100_01, 0068_01, 0041_03, 0041_01, 0010_03, 0010_02, 0010_04, 0010_01, 0128_03, 0128_02, 0128_01, 0053_03, 0053_02, 0053_04, 0053_01, 0093_03, 0093_02, 0093_04, 0093_01, 0073_03, 0073_02, 0073_04, 0073_01, 0116_03, 0116_02, 0116_04, 0116_01, 0069_03, 0069_02, 0069_04, 0069_01, 0084_03, 0084_02, 0084_04, 0084_01, 0006_03, 0006_02, 0006_04, 0006_01, 0118_02, 0118_01, 0103_03, 0103_04, 0103_01, 0033_03, 0033_02, 0033_04, 0033_01, 0004_03, 0004_02, 
0004_04, 0004_01, 0015_01, 0063_01, 0056_03, 0056_02, 0056_04, 0056_01, 0005_03, 0005_02, 0005_04, 0005_01, 0085_02, 0085_01, 0082_03, 0082_02, 0082_01, 0071_01, 0039_02, 0039_04, 0039_01,

Filed an issue so I get alerted when it gets fixed ;)

Providing a reference protocol for checking compliance, (and not inferring it from data)

The user should be able to specify a reference protocol for each modality. It should support multi-echo sequences and values for each parameter. The vendor should also be specified. A single JSON file may contain multiple protocols, and the software should use them to check compliance for the corresponding scans.

The JSON file could also include enhancements such as allowing multiple values for a single parameter, accepting a range of values, or even specifying a tolerance. This JSON file should be read in as an instance of a custom class, which is then used to check compliance.
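
A minimal sketch of what such a file and its loader could look like; the schema and the ReferenceProtocol class below are illustrative assumptions, not an existing mrQA format:

# Sketch of a hypothetical reference-protocol JSON and a loader class.
# The schema (keys like "value", "values", "range", "tolerance") is illustrative.
import json

EXAMPLE_PROTOCOL = '''
{
  "T1w": {
    "vendor": "Siemens",
    "RepetitionTime": {"value": 2.3, "tolerance": 0.01},
    "EchoTime": {"values": [0.00226]},
    "FlipAngle": {"range": [7, 9]}
  }
}
'''

class ReferenceProtocol:
    def __init__(self, spec):
        self.spec = spec

    @classmethod
    def from_json(cls, text):
        return cls(json.loads(text))

    def is_compliant(self, modality, parameter, observed):
        rule = self.spec[modality][parameter]
        if 'value' in rule:
            return abs(observed - rule['value']) <= rule.get('tolerance', 0)
        if 'values' in rule:
            return observed in rule['values']
        if 'range' in rule:
            low, high = rule['range']
            return low <= observed <= high
        return True

protocol = ReferenceProtocol.from_json(EXAMPLE_PROTOCOL)
print(protocol.is_compliant('T1w', 'FlipAngle', 8))          # True
print(protocol.is_compliant('T1w', 'RepetitionTime', 2.5))   # False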

Provide a timeline

Add a timeline of non-compliance to the report

  • Should provide information about when the non-compliance occurred.
  • Useful to understand whether the most recent sessions were non-compliant or whether it happened years ago.

Analysing compliance w.r.t. vendor, model, and software version

We observed that Siemens scans were much more compliant (>95%) than GE and Philips (~75%). As discussed in the faculty meeting, it would be much more informative to know whether this issue stems from a particular vendor or from some hidden cause. It was suggested to check the software version, model information, and site name for each scan.

Site names are not present in the DICOM tags, but the software version and model name are. A preliminary analysis on 375 subjects shows:

  1. Siemens had no issues with software versions. All scans were performed with the same software version (syngo MR E11), even though two scanner models were used: Prisma and Prisma-Fit.
  2. Philips scans have multiple software versions:
    5.3.0.0
    5.3.1.0
    5.3.1.1
    5.3.0.3
    Two different models were used: Achieva dStream and Ingenia.
  3. Similarly, GE scans have multiple versions:
    27_LX_MR Software release:DV26.0_R01_1725.a
    27_LX_MR Software release:DV26.0_R02_1810.b
    27_LX_MR Software release:DV25.1_R01_1617.b
    25_LX_MR Software release:DV25.0_R02_1549.b
    Two models were used: DISCOVERY MR750 and Signa Creator.

It would be interesting to investigate whether these software versions are indeed leading to the inconsistencies we observe in the parameters.
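
A quick way to tabulate this, assuming pydicom is installed and one representative DICOM file per series; the path and glob pattern are illustrative:

# Sketch: count scans by vendor, model and software version using pydicom.
# The root path and '*.dcm' pattern are illustrative.
from collections import Counter
from pathlib import Path
import pydicom

counts = Counter()
for dcm_path in Path('/path/to/dicoms').rglob('*.dcm'):
    ds = pydicom.dcmread(dcm_path, stop_before_pixels=True)
    key = (str(ds.get('Manufacturer', 'NA')),
           str(ds.get('ManufacturerModelName', 'NA')),
           str(ds.get('SoftwareVersions', 'NA')))
    counts[key] += 1

for (vendor, model, version), n in counts.most_common():
    print(f'{vendor:20} {model:25} {version:45} {n}')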

A special vertical audit: ensure multiple runs of the same sequences are compliant

We can even cross-check 3 or more sequences at a time, although this would be based on just one run of each of those sequences:

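# three_seqs: names of the three sequences to cross-check
# (here, ABCD-DTI and its AP/PA diffusion fieldmaps; see the output below)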
for subj, sess, runs, seqs in ds.traverse_vertical_multi(*three_seqs):
    print(f'\n{subj} {sess:3}')
    for rr, ss in zip(runs, seqs):
        print(f'\t{str(ss):>120}\t{rr}')


NDAR_INVVA90J10J 1.3.12.2.1107.5.2.43.167003.30000017062011234526400000025
	                        ABCD-DTI,_SIEMENS,_mosaic,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=4200,TE=89)	1.3.12.2.1107.5.2.43.167003.2017062116561732733026206.0.0.0
	                   ABCD-Diffusion-FM-PA,_SIEMENS,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=12400,TE=89)	1.3.12.2.1107.5.2.43.167003.2017062116542590754325348.0.0.0
	                   ABCD-Diffusion-FM-AP,_SIEMENS,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=12400,TE=89)	1.3.12.2.1107.5.2.43.167003.2017062116554953449925777.0.0.0
NDAR_INVVAGD75XZ 1.3.12.2.1107.5.2.43.166003.30000017070618303923200000004
	                        ABCD-DTI,_SIEMENS,_mosaic,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=4200,TE=89)	1.3.12.2.1107.5.2.43.166003.2017070616245938834992278.0.0.0
	                   ABCD-Diffusion-FM-PA,_SIEMENS,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=12400,TE=89)	1.3.12.2.1107.5.2.43.166003.201707061624039605491420.0.0.0
	                   ABCD-Diffusion-FM-AP,_SIEMENS,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=12400,TE=89)	1.3.12.2.1107.5.2.43.166003.201707061624328940391849.0.0.0
NDAR_INVVA82JDEJ 1.3.12.2.1107.5.2.43.67064.30000017061416254770700000103
	                        ABCD-DTI,_SIEMENS,_mosaic,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=4200,TE=89)	1.3.12.2.1107.5.2.43.67064.2017061815005347256561769.0.0.0
	                   ABCD-Diffusion-FM-PA,_SIEMENS,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=12400,TE=89)	1.3.12.2.1107.5.2.43.67064.2017061814592319229060911.0.0.0
	                   ABCD-Diffusion-FM-AP,_SIEMENS,_original_(baseline_year_1_arm_1)(FA=90,PED=COL,SSEQ=EP,TR=12400,TE=89)	1.3.12.2.1107.5.2.43.67064.2017061815002572536761340.0.0.0

incremental mode of operation

Description

We would like to introduce mrQA into our "pipeline" of data acquisition. I wish there were a mode where mrQA could be used on DICOMs in an incremental fashion. E.g. we collect the 1st subject/session and run mrQA, which extracts all metadata etc., stores it, and produces a report on what it can tell up to that point. Then, for the next subject/session, we point mrQA to the previously saved extracts and the new acquisition(s), so it can update the extracts and the report.

Let me know if you need me to elaborate more on this.

Also carry out "exam" audit?

Not quite sure how to name it in addition to "vertical" and "horizontal" (IMHO something easier to grasp like "cross-{dimension}" would be better, i.e. "cross-site", "cross-subject", and here "cross-sequence"). The idea is to do some consistency analysis, under assumptions, across the different sequences within a specific exam.
E.g. correspondence (of shims, geometry, etc.) of fieldmaps to func/dwi in BIDS "style", based on their assignment via the IntendedFor field in the sidecar files. Also a check that all func/dwi do have fieldmaps, if typically there is a fieldmap (see the sketch below).
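
A minimal sketch of the IntendedFor part of such a check, assuming a standard BIDS layout with JSON sidecars; the subject/session path is illustrative, and the comparison is simplified to filenames only:

# Sketch: verify every func/dwi run is covered by some fieldmap's IntendedFor list.
# Assumes a standard BIDS layout; IntendedFor entries are typically paths relative
# to the subject directory, so we simplify and compare by filename only.
import json
from pathlib import Path

subject_session = Path('/path/to/bids_dataset/sub-0001/ses-01')

covered = set()
for sidecar in (subject_session / 'fmap').glob('*.json'):
    meta = json.loads(sidecar.read_text())
    intended = meta.get('IntendedFor', [])
    if isinstance(intended, str):
        intended = [intended]
    covered.update(Path(entry).name for entry in intended)

for datatype in ('func', 'dwi'):
    for nii in (subject_session / datatype).glob('*.nii*'):
        if nii.name not in covered:
            print(f'No fieldmap IntendedFor entry covers {nii.name}')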

Use 'acq' and 'task' as a suffix for modality name

Problem: Using just the datatype (e.g. anat or func) is not sufficient. It leads to erroneous compliance reports.
Why:

  1. A user may use a custom label to distinguish a different set of parameters for acquiring the same modality. This can be highlighted in the filename by the acq label.
  2. Similarly, different tasks in fMRI may have different sets of acquisition parameters.

Therefore, it is not right to raise alarms for non-compliance when the changes in parameters are intended (see the sketch below).
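
A minimal sketch of deriving a richer modality name from a BIDS filename using the acq and task entities; the naming scheme is an assumption for illustration, not mrQA's current behaviour:

# Sketch: build a modality name that includes the BIDS 'task' and 'acq' entities.
# The resulting naming scheme is illustrative.
import re

def modality_name(bids_filename, datatype):
    name = datatype
    for entity in ('task', 'acq'):
        match = re.search(rf'_{entity}-([a-zA-Z0-9]+)', bids_filename)
        if match:
            name += f'_{entity}-{match.group(1)}'
    return name

print(modality_name('sub-01_task-rest_acq-highres_bold.nii.gz', 'func'))
# func_task-rest_acq-highres
print(modality_name('sub-01_acq-mprage_T1w.nii.gz', 'anat'))
# anat_acq-mprage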

Poster Issues

  1. See if we can also show multiple sites in the same audit figure.

Sub-class modality

Inherit from the modality class to create multi-echo and normal modalities; the compliance code will have to change accordingly.

Differentiating between sessions with and without CO2

Currently, the series description for MRI sequences acquired with or without CO2 inhalation is identical. This poses a significant challenge in distinguishing between the two sets of sequences when analyzing the data. How can we accurately identify and separate the sequences acquired with CO2 inhalation from those acquired without?

Potential directions

  • Look for a DICOM tag
  • Look for annotations by MR technicians
  • Anything in private headers?

No DICOMs found

  • mrQA version: current version from pip
  • Python version: 3.10
  • Operating System: Centos 7

Description

Siemens enhanced DICOMs without a .dcm extension are not found.

What I Did

(qctools) [adamraikes@gpu68 allo_phase2]$ mrqa --data-source $PWD/dicoms/000052/Screening/666e4ef2/ --format dicom --name test     
Traceback (most recent call last):
  File "/home/u26/adamraikes/.conda/envs/qctools/bin/mrqa", line 8, in <module>
    sys.exit(main())
  File "/home/u26/adamraikes/.conda/envs/qctools/lib/python3.10/site-packages/mrQA/cli.py", line 85, in main
    check_compliance(dataset=dataset,
  File "/home/u26/adamraikes/.conda/envs/qctools/lib/python3.10/site-packages/mrQA/project.py", line 63, in check_compliance
    raise DatasetEmptyException
MRdataset.config.DatasetEmptyException: Expected Sidecar DICOM/JSON files in --data_source. Got 0 DICOM/JSON files.

--style dicom is not listed as an option

  • mrQA version:

"impossible" to figure out -- please add --version:

bids@rolando:/inbox/BIDS/Wager/Wager/1076_spacetop$ ~/.local/bin/mr_proto_compl --version
usage: mr_proto_compl -d DATA_ROOT [-o OUTPUT_DIR] [-s STYLE] [-n NAME] [-h] [-r] [-v] [-ref REFERENCE_PATH] [--strategy STRATEGY] [--include_phantom] [--metadata_root METADATA_ROOT] [-l LOGGING] [--skip SKIP [SKIP ...]]
mr_proto_compl: error: the following arguments are required: -d/--data_root
  • Operating System: singularity reproin container
bids@rolando:/inbox/BIDS/Wager/Wager/1076_spacetop$ ~/.local/bin/mr_proto_compl --help | grep -A1 -e '-s STYLE,'
  -s STYLE, --style STYLE
                        type of dataset, one of [xnat|bids|other]

So no dicom style is given as an option at all.

Moreover, I am not sure what that intriguing `other` style is about:
bids@rolando:/inbox/BIDS/Wager/Wager/1076_spacetop$ ~/.local/bin/mr_proto_compl --data_root ./sourcedata/sub-0001/ses-01/ --output_dir .heudiconv/mrQA --style other
/home/bids/singularity_home/.local/lib/python3.9/site-packages/mrQA/cli.py:84: UserWarning: Expected a unique identifier for caching data. Got NoneType. Using a random name. Use --name flag for persistent metadata
  dataset = import_dataset(data_root=args.data_root,
Traceback (most recent call last):
  File "/home/bids/singularity_home/.local/bin/mr_proto_compl", line 8, in <module>
    sys.exit(main())
  File "/home/bids/singularity_home/.local/lib/python3.9/site-packages/mrQA/cli.py", line 84, in main
    dataset = import_dataset(data_root=args.data_root,
  File "/home/bids/singularity_home/.local/lib/python3.9/site-packages/MRdataset/base.py", line 81, in import_dataset
    dataset_class = find_dataset_using_style(style.lower())
  File "/home/bids/singularity_home/.local/lib/python3.9/site-packages/MRdataset/base.py", line 113, in find_dataset_using_style
    dataset_lib = importlib.import_module(dataset_modulename)
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'MRdataset.other_dataset'

How to identify sequences in the same session?

The session id (via SeriesNumber) changes even for different sequences within the same folder (which I assume come from the same session). Slack

Can we use these DICOM tags instead?
(0008, 0012) Instance Creation Date
(0008, 0013) Instance Creation Time
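
A minimal sketch of grouping files by Instance Creation Date and the hour/minute of Instance Creation Time, assuming pydicom; the folder path is illustrative:

# Sketch: group DICOM files by InstanceCreationDate and the HHMM part of
# InstanceCreationTime using pydicom. The folder path is illustrative.
from collections import defaultdict
from pathlib import Path
import pydicom
from pydicom.errors import InvalidDicomError

groups = defaultdict(list)
for path in Path('/path/to/dicom_folder').rglob('*'):
    if not path.is_file():
        continue
    try:
        ds = pydicom.dcmread(path, stop_before_pixels=True)
    except InvalidDicomError:
        continue
    key = (str(ds.get('InstanceCreationDate', 'NA')),
           str(ds.get('InstanceCreationTime', 'NA'))[:4])
    groups[key].append(path.name)

for (date, hhmm), files in sorted(groups.items()):
    print(date, hhmm, len(files), 'files')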

Do all sessions/subjects have all the sequences?

Currently, mrQA compares each sequence from an MRI session against the reference protocol to check for non-compliant parameters. But if a particular sequence was not acquired in a session, we don't raise any warning about it.

It is very important that any missing sequences for a particular subject in a session are reported.

This issue makes more sense when a reference protocol is provided.
I feel this would raise many more issues, so there should be a flag to disable it in case we know that the MRI dataset we are working with is not research-grade (a sketch of such a check follows below).
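
A minimal sketch of such a check, assuming the expected sequence names come from a reference protocol and the acquired names from the parsed dataset; both data structures below are illustrative stand-ins:

# Sketch: report sequences that the protocol expects but a session is missing.
# Both data structures are illustrative stand-ins for the protocol and the dataset.
expected_sequences = {'T1w', 'T2w', 'bold_task-rest', 'dwi'}

acquired_by_session = {
    ('sub-0001', 'ses-01'): {'T1w', 'T2w', 'bold_task-rest', 'dwi'},
    ('sub-0002', 'ses-01'): {'T1w', 'bold_task-rest'},
}

for (subject, session), acquired in acquired_by_session.items():
    missing = expected_sequences - acquired
    if missing:
        print(f'{subject} {session}: missing {sorted(missing)}')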

No DICOMs found (Bruker 7T Preclinical)

  • mrQA version: pip installed
  • Python version: 3.10
  • Operating System: Centos 7

Description

Attempted to run this on mouse DICOMs from a Bruker 70/20 7-Tesla system running ParaVision 360 v3.3 (backend software). The outputs include enhanced DICOMs. mrQA appears to look for DICOMs in the right location but then reports that no DICOMs were found.

What I Did

(qctools) [adamraikes@gpu68 app_apoe]$ mrqa -d dicoms -f dicom -o mrqa -n compliance -v
2023-10-20 08:11:59,272 - INFO - Created temp file in mrqa
2023-10-20 08:12:24,233 - INFO - Localizer: Skipping /xdisk/adamraikes/app_apoe/dicoms/20230512_114500_apoemouse_APPCA_C1_VM6979_1_11/1/pdata/1/dicom
2023-10-20 08:12:29,049 - INFO - ACR/Phantom: /xdisk/adamraikes/app_apoe/dicoms/20230512_114500_apoemouse_APPCA_C1_VM6979_1_11/5/pdata/1/dicom
DicomDataset compliance is empty.
Traceback (most recent call last):
  File "/home/u26/adamraikes/.conda/envs/qctools/bin/mrqa", line 8, in <module>
    sys.exit(main())
  File "/home/u26/adamraikes/.conda/envs/qctools/lib/python3.10/site-packages/mrQA/cli.py", line 85, in main
    check_compliance(dataset=dataset,
  File "/home/u26/adamraikes/.conda/envs/qctools/lib/python3.10/site-packages/mrQA/project.py", line 63, in check_compliance
    raise DatasetEmptyException
MRdataset.config.DatasetEmptyException: Expected Sidecar DICOM/JSON files in --data_source. Got 0 DICOM/JSON files.

Here's the contents of that folder:

(qctools) [adamraikes@gpu68 app_apoe]$ ls /xdisk/adamraikes/app_apoe/dicoms/20230512_114500_apoemouse_APPCA_C1_VM6979_1_11/5/pdata/1/dicom
4_CIBS_MultishellDWI_EnIm1.dcm

Just as a sanity check:

(qctools) [adamraikes@gpu68 dicom]$ python
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pydicom import dcmread
>>> ds = dcmread('4_CIBS_MultishellDWI_EnIm1.dcm')
>>> ds
Dataset.file_meta -------------------------------
(0002, 0000) File Meta Information Group Length  UL: 196
(0002, 0001) File Meta Information Version       OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID         UI: Enhanced MR Image Storage
(0002, 0003) Media Storage SOP Instance UID      UI: 2.16.756.5.5.200.8323328.195705.1683918037.311.3.0
(0002, 0010) Transfer Syntax UID                 UI: Explicit VR Little Endian
(0002, 0012) Implementation Class UID            UI: 1.2.276.0.7230010.3.0.3.6.6
(0002, 0013) Implementation Version Name         SH: 'OFFIS_DCMTK_366'

....

Additional functionalities

  • Allow a range of values for a single parameter
  • Maybe even more, different types of checks
  • Would be helpful in hierarchical checks (e.g. within-modality, within-session)

Exclude MoCo Series from compliance evaluations

Siemens automatically generates motion-corrected series for fMRI sequences. We observe that several fMRI sequences are collected together under a new series called MoCoSeries. As they are different sequences with different protocols, they show up in the report as non-compliant. We saw that the computed reference protocol is very similar to fmri-ringrewards.

How can we differentiate between raw series and derivatives? Is there a DICOM flag?
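
One possible heuristic (vendor behaviour varies, so treat this as an assumption to verify): the ImageType (0008,0008) multi-value is typically ORIGINAL for raw series and DERIVED for scanner-generated ones, and Siemens MoCo series usually carry a tell-tale SeriesDescription. A sketch with pydicom:

# Sketch: flag scanner-derived series (e.g. Siemens MoCo) using pydicom.
# Heuristics only: vendors differ, so adjust the checks for your scanners.
import pydicom

def is_derived_series(path):
    ds = pydicom.dcmread(path, stop_before_pixels=True)
    image_type = [str(value).upper() for value in ds.get('ImageType', [])]
    description = str(ds.get('SeriesDescription', '')).upper()
    return 'DERIVED' in image_type or 'MOCO' in image_type or 'MOCO' in description

print(is_derived_series('/path/to/some_file.dcm'))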

Produce compliant subset

A useful feature/script would be to produce subsets of the input dataset that are compliant, so that users can account for non-compliance in their analysis in some manner.
With MRdataset, this should be easy to achieve by producing clusters of subjects/sessions with identical parameter values; there are usually no more than 2 or 3 clusters.

Reference: Slack

Non-compliance vs Multiple values for parameter

How do we decide whether the variance in values is expected? A modality may legitimately have multiple echo times, but multiple values might also be due to non-compliance.

Use echo numbers from the DICOM header.
Add an option for stratification.

Documentation Issues

  1. The deviations arise because the dataset is aggregated in this manner; they are not necessarily an issue of non-compliance, but an uninformed user may not know this, which may lead to confounds.
  2. Make a small table on categorizing parameters as critical/important. But how do we do that?

Testing

Create dummy datasets which can be posted publicly, and re-run the tests.

Monitoring Log: delta updates since last compliance check

mrQA monitoring produces a full report on the entire dataset. But for ongoing projects, it might sometimes be useful to have a log, for example:

  • What was the update?
  • Are there new subjects?
  • Was there a new compliance issue?

If we can reconfigure the monitoring module to take any previous reports into account and produce a small log, that would be useful too.
Reference: Slack

Text-based report on console

Description

Print the report in text format to the terminal itself, if there is an easy HTML-to-text export option.
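
A minimal sketch of such an export, assuming the third-party html2text package and an existing HTML report; the report path is illustrative:

# Sketch: dump an existing HTML compliance report to the terminal as plain text.
# Requires the third-party 'html2text' package; the report path is illustrative.
from pathlib import Path
import html2text

report_html = Path('/path/to/output_dir/compliance_report.html').read_text()
print(html2text.html2text(report_html))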

provide single entry CLI mrQA (and "enable" shell completion)

ATM the functionality is exposed on the command line via a collection of scripts:

        'console_scripts': [
            'protocol_compliance=mrQA.cli:main',
            'mr_proto_compl=mrQA.cli:main',
            'mrpc_subset=mrQA.run_subset:main'
        ],

which have no common prefix or consistency in naming. This makes it hard (or impossible) to recall which command to run for doing mrQA.

Similarly to many other tools (git, datalad, ...), I recommend providing a single entry-point mrqa command-line tool which exposes those three commands (perhaps renamed to be a bit more descriptive).

Creating such interfaces is very easy with the click library, which we use e.g. for dandi-cli. Here is the click documentation on creating such "groups" of CLIs: https://click.palletsprojects.com/en/8.1.x/commands/ . See e.g. a slightly "advanced" example defining dandi upload, where we also reuse definitions for some options (--instance): https://github.com/dandi/dandi-cli/blob/master/dandi/cli/cmd_upload.py etc.

You could then also benefit from shell completion for the script; see https://click.palletsprojects.com/en/8.1.x/shell-completion/#shell-completion and our "helper" that makes it easier to activate: https://github.com/dandi/dandi-cli/blob/master/dandi/cli/cmd_shell_completion.py#L29 .
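
A minimal sketch of such a click group; the subcommand names and the wiring into mrQA internals are assumptions, not the actual mrQA CLI:

# Sketch: a single 'mrqa' entry point built as a click group.
# Subcommand names and the calls into mrQA internals are illustrative assumptions.
import click

@click.group()
def cli():
    """mrQA: protocol compliance checks on MR datasets."""

@cli.command()
@click.option('--data-source', required=True, type=click.Path(exists=True))
@click.option('--format', 'fmt', default='dicom', show_default=True)
def check(data_source, fmt):
    """Run a compliance check on a dataset."""
    click.echo(f'Would check {data_source} ({fmt}) here')

@cli.command()
@click.option('--data-source', required=True, type=click.Path(exists=True))
def subset(data_source):
    """Produce a compliant subset of a dataset."""
    click.echo(f'Would produce a compliant subset of {data_source} here')

if __name__ == '__main__':
    cli()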

include coil info parameter

We need to add coil info parameter(s) to the protocol compliance checks.

Getting this info is not easy and unfortunately varies across vendors (see here), but for Siemens it seems to be at (0051, 100F), so it should be easy for us to add.
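
A minimal sketch of reading that Siemens private tag with pydicom (hedged: the tag and its contents are vendor-specific and may be absent elsewhere):

# Sketch: read the Siemens coil-string private tag (0051,100F) with pydicom.
# This tag is vendor-specific and may not exist on other scanners.
import pydicom

ds = pydicom.dcmread('/path/to/siemens_file.dcm', stop_before_pixels=True)
coil_element = ds.get((0x0051, 0x100F))
if coil_element is not None:
    print('Coil string:', coil_element.value)
else:
    print('Coil tag (0051,100F) not present in this file')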

support some kind of `dicom-archives` style?

In heudiconv, and in the reproin heuristic in particular, we not only convert to BIDS datasets but also "archive" the original DICOMs under sourcedata/, in a hierarchy mirroring the converted BIDS data. See e.g. https://datasets.datalad.org/?dir=/dbic/QA/sourcedata/sub-emmet/ses-20180531/fmap which accompanies the nii.gz's in http://datasets.datalad.org/?dir=/dbic/QA/sub-emmet/ses-20180531/fmap .

Since BIDS analytics are limited to the metadata fields extracted, I thought it would be cool for mrQA to just operate on those original DICOMs which we have. (BTW, heudiconv can convert from DICOMs wrapped in such tarballs, which comes in handy.) But I do not think that mrQA supports that as part of the dicom style:

bids@rolando:/inbox/BIDS/Wager/Wager/1076_spacetop$ ~/.local/bin/mr_proto_compl --data_root ./sourcedata/sub-0001/ses-01/ --output_dir .heudiconv/mrQA --style dicom
/home/bids/singularity_home/.local/lib/python3.9/site-packages/mrQA/cli.py:84: UserWarning: Expected a unique identifier for caching data. Got NoneType. Using a random name. Use --name flag for persistent metadata
  dataset = import_dataset(data_root=args.data_root,
Traceback (most recent call last):
  File "/home/bids/singularity_home/.local/bin/mr_proto_compl", line 8, in <module>
    sys.exit(main())
  File "/home/bids/singularity_home/.local/lib/python3.9/site-packages/mrQA/cli.py", line 84, in main
    dataset = import_dataset(data_root=args.data_root,
  File "/home/bids/singularity_home/.local/lib/python3.9/site-packages/MRdataset/base.py", line 82, in import_dataset
    dataset = dataset_class(
  File "/home/bids/singularity_home/.local/lib/python3.9/site-packages/MRdataset/dicom_dataset.py", line 70, in __init__
    self.save_dataset()
  File "/home/bids/singularity_home/.local/lib/python3.9/site-packages/MRdataset/base.py", line 351, in save_dataset
    raise EOFError('Dataset is empty!')
EOFError: Dataset is empty!
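
A minimal sketch of reading DICOM headers directly out of such tarballs, using only pydicom and the standard library; the path and tarball extension are illustrative:

# Sketch: read DICOM headers from DICOMs archived inside tarballs under sourcedata/.
# The path and '*.tgz' pattern are illustrative.
import io
import tarfile
from pathlib import Path
import pydicom
from pydicom.errors import InvalidDicomError

for tarball in Path('/path/to/bids/sourcedata/sub-0001/ses-01').rglob('*.tgz'):
    with tarfile.open(tarball) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            raw = tar.extractfile(member).read()
            try:
                ds = pydicom.dcmread(io.BytesIO(raw), stop_before_pixels=True)
            except InvalidDicomError:
                continue
            print(tarball.name, member.name, ds.get('SeriesDescription', 'NA'))
            break  # one readable DICOM per tarball is enough for header checks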

Different number of sequences for Subjects across Sessions on different dates

Currently, sessions conducted on different dates have different numbers of MRI sequences acquired for each subject. This inconsistency hinders our ability to reconcile and compare the data effectively, as we need a consistent set of sequences across sessions for accurate analysis.

Note that the difference in the number of sequences might be part of the project requirements. But ensuring a consistent set of sequences in each session is important to avoid misinterpretation of protocol compliance reports. If neglected, we may end up comparing protocols for sequences which are not meant to be comparable.

For example, the number of sequences acquired with CO2 inhalation may exceed the number acquired without CO2 inhalation. These differences are important, and we cannot use a blanket algorithm to check for protocol compliance.

Allow user to specify parameters to check compliance for

As of now, there is a dictionary in MRdataset/config.py which stores the different parameters that are extracted from the DICOM header. Although it is possible to extend the dictionary as per the needs of the user, it would be even better if there were a CLI option to specify which parameters to check compliance for.

  • Should we include a separate JSON/YAML file to read the list of parameters from?
  • We would need the DICOM tag associated with each parameter to read it from the header. What if the user is not aware of the tag? Can we accommodate that?
  • How can we incorporate custom tags?

Checking field maps in each session

In our projects, we acquire two field maps in each session, one in the Anterior-Posterior (AP) direction and the other in the Posterior-Anterior (PA) direction. Currently, there is no mechanism to confirm whether both field maps were acquired in each session and whether they have opposite phase-encoding directions (PED).

  • Add a session-level validation step to verify the acquisition of both field maps in each session (a sketch follows below).
  • Develop a mechanism to check the consistency of PED in the acquired field maps.
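
A minimal sketch of the session-level presence check for a BIDS layout, assuming dir-AP/dir-PA entities in the fieldmap filenames; the dataset path is illustrative:

# Sketch: check that each session has both an AP and a PA fieldmap in a BIDS tree.
# Assumes dir-AP / dir-PA entities in the fmap filenames; the root path is illustrative.
from pathlib import Path

bids_root = Path('/path/to/bids_dataset')
for session_dir in bids_root.glob('sub-*/ses-*'):
    fmap_names = [p.name for p in (session_dir / 'fmap').glob('*.nii*')]
    has_ap = any('dir-AP' in name for name in fmap_names)
    has_pa = any('dir-PA' in name for name in fmap_names)
    if not (has_ap and has_pa):
        print(f'{session_dir}: AP fieldmap found={has_ap}, PA fieldmap found={has_pa}')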
