fets-ai / challenge
The repo for the FeTS Challenge
Home Page: https://www.synapse.org/#!Synapse:syn28546456
Is there a code implementation available for the Task 1 optimisation, which requires improving the weight aggregation? Is there also an implementation of the evaluation metrics, namely the Dice Similarity Coefficient (DSC) and the 95% Hausdorff distance (HD95)?
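For readers who want something to experiment with while waiting for an answer, here is a minimal sketch of the two metrics in plain Python. It is not the challenge's evaluation code: the function names, the flat-sequence input format, the "both masks empty scores 1.0" convention and the nearest-rank percentile are all assumptions made for illustration.

```python
import math


def dice_coefficient(pred, truth, label):
    """Dice similarity coefficient for one label over two flat label sequences."""
    pred_hits = [p == label for p in pred]
    truth_hits = [t == label for t in truth]
    intersection = sum(p and t for p, t in zip(pred_hits, truth_hits))
    total = sum(pred_hits) + sum(truth_hits)
    # Assumed convention: if the label is absent from both masks, score 1.0.
    return 1.0 if total == 0 else 2.0 * intersection / total


def _euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _directed_distances(source, target):
    # Distance from every source point to its nearest target point.
    return [min(_euclidean(s, t) for t in target) for s in source]


def hausdorff_95(points_a, points_b):
    """95th percentile of the pooled directed distances between two point sets.

    Both point sets must be non-empty; points are coordinate tuples.
    """
    distances = sorted(
        _directed_distances(points_a, points_b)
        + _directed_distances(points_b, points_a)
    )
    # Nearest-rank percentile (an assumption; other tools may interpolate).
    rank = max(1, math.ceil(0.95 * len(distances)))
    return distances[rank - 1]
```

On real volumes the same formulas would normally be applied to array data (boundary voxels for the Hausdorff distance), which is much faster than these list-based loops.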
Dear FeTS-AI team,
following the instructions on how to install the infrastructure for Task 1, I encountered the following error for step 8, the pip install step:
~/Challenge/Task_1$ pip install .
Processing ~/Challenge/Task_1
Preparing metadata (setup.py) ... done
Collecting openfl@ git+https://github.com/intel/openfl.git@f4b28d710e2be31cdfa7487fdb4e8cb3a1387a5f (from fets-challenge==2.0)
Cloning https://github.com/intel/openfl.git (to revision f4b28d710e2be31cdfa7487fdb4e8cb3a1387a5f) to /tmp/pip-install-tglnj1g1/openfl_bf417e151ecd4f06983321e60bc4d466
Running command git clone --filter=blob:none --quiet https://github.com/intel/openfl.git /tmp/pip-install-tglnj1g1/openfl_bf417e151ecd4f06983321e60bc4d466
Running command git rev-parse -q --verify 'sha^f4b28d710e2be31cdfa7487fdb4e8cb3a1387a5f'
Running command git fetch -q https://github.com/intel/openfl.git f4b28d710e2be31cdfa7487fdb4e8cb3a1387a5f
Running command git checkout -q f4b28d710e2be31cdfa7487fdb4e8cb3a1387a5f
Resolved https://github.com/intel/openfl.git to commit f4b28d710e2be31cdfa7487fdb4e8cb3a1387a5f
Preparing metadata (setup.py) ... done
Collecting GANDLF@ git+https://github.com/CBICA/GaNDLF.git@e4d0d4bfdf4076130817001a98dfb90189956278 (from fets-challenge==2.0)
Cloning https://github.com/CBICA/GaNDLF.git (to revision e4d0d4bfdf4076130817001a98dfb90189956278) to /tmp/pip-install-tglnj1g1/gandlf_c4f0036a896d455eae2d9f7d2fd57d46
Running command git clone --filter=blob:none --quiet https://github.com/CBICA/GaNDLF.git /tmp/pip-install-tglnj1g1/gandlf_c4f0036a896d455eae2d9f7d2fd57d46
Running command git rev-parse -q --verify 'sha^e4d0d4bfdf4076130817001a98dfb90189956278'
Running command git fetch -q https://github.com/CBICA/GaNDLF.git e4d0d4bfdf4076130817001a98dfb90189956278
Running command git checkout -q e4d0d4bfdf4076130817001a98dfb90189956278
Resolved https://github.com/CBICA/GaNDLF.git to commit e4d0d4bfdf4076130817001a98dfb90189956278
Running command git submodule update --init --recursive -q
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Collecting fets@ git+https://github.com/FETS-AI/Algorithms.git@fets_challenge (from fets-challenge==2.0)
Cloning https://github.com/FETS-AI/Algorithms.git (to revision fets_challenge) to /tmp/pip-install-tglnj1g1/fets_a6232463deb046dfa00c89ef47b688d0
Running command git clone --filter=blob:none --quiet https://github.com/FETS-AI/Algorithms.git /tmp/pip-install-tglnj1g1/fets_a6232463deb046dfa00c89ef47b688d0
Running command git checkout -b fets_challenge --track origin/fets_challenge
Switched to a new branch 'fets_challenge'
Branch 'fets_challenge' set up to track remote branch 'fets_challenge' from 'origin'.
Resolved https://github.com/FETS-AI/Algorithms.git to commit 60e0b8761229edde18e3d707e3e3e5eb0c0fb80f
Running command git submodule update --init --recursive -q
fatal: No url found for submodule path 'GANDLF' in .gitmodules
error: subprocess-exited-with-error
× git submodule update --init --recursive -q did not run successfully.
│ exit code: 128
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× git submodule update --init --recursive -q did not run successfully.
│ exit code: 128
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Looking at Algorithms/.gitmodules, I noticed that the GANDLF submodule was renamed to GANDLF_module a while back.
I got around this by manually cloning openfl, GaNDLF and fets from the repos listed in the install_requires block of Challenge/Task_1/setup.py, checking out the corresponding commits and pip installing them into the env. Then I commented out the install_requires block and ran pip install . on Challenge/Task_1 again.
My conda env was then still missing the correct libstdc++ (GLIBCXX) version, erroring out with libstdc++.so.6: version `GLIBCXX_3.4.30' not found.
conda install -c conda-forge libgcc=5.2.0
fixed this for me.
Lastly, when installing torchvision, one needs to pin it to an older version, as the newer ones expect a torch.ao module that is not available in this environment:
pip install torchvision==0.9.1
So, easy fix. Just leaving this here in case someone else stumbles upon this. 🙂
Best,
Manu
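The manual workaround described above can be sketched as a small script that prints the commands to run. The repository URLs and revisions are the ones that appear in the pip log; the local folder names are made up for illustration, and submodules are deliberately not initialised here, so this is only a sketch of the steps, not a tested installer.

```python
# (folder, repository URL, revision) taken from the pip log above.
# Folder names are hypothetical; pick whatever suits your setup.
DEPENDENCIES = [
    ("openfl", "https://github.com/intel/openfl.git",
     "f4b28d710e2be31cdfa7487fdb4e8cb3a1387a5f"),
    ("GaNDLF", "https://github.com/CBICA/GaNDLF.git",
     "e4d0d4bfdf4076130817001a98dfb90189956278"),
    ("Algorithms", "https://github.com/FETS-AI/Algorithms.git",
     "fets_challenge"),
]


def manual_install_commands(dependencies=DEPENDENCIES):
    """Return clone, checkout and local pip install commands for each dependency."""
    commands = []
    for folder, url, revision in dependencies:
        commands.append(["git", "clone", url, folder])
        commands.append(["git", "-C", folder, "checkout", revision])
        # Installing from a local folder avoids pip's automatic
        # 'git submodule update' step that failed in the log above.
        commands.append(["pip", "install", "./" + folder])
    return commands


if __name__ == "__main__":
    for command in manual_install_commands():
        print(" ".join(command))
```

After running the printed commands, the install_requires block in Challenge/Task_1/setup.py still has to be commented out before running pip install . there, as described in the post.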
Hi all, I am working on Task 1, and when I tried to run the baseline code for the very first time, I got this error:
"Task_1/fets_challenge/gandlf_csv_adapter.py", line 13, in <module>
from fets.data.base_utils import get_appropriate_file_paths_from_subject_dir
ModuleNotFoundError: No module named 'fets'
Please also refer to the following screenshot.
BTW, I have already installed openfl.
Thank you very much. Looking forward to your kind help.
Traceback (most recent call last):
File ".\FeTS_Challenge.py", line 584, in <module>
restore_from_checkpoint_folder = restore_from_checkpoint_folder)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\fets_challenge\experiment.py", line 364, in run_challenge_experiment
checkpoint_folder = setup_checkpoint_folder()
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\fets_challenge\checkpoint_utils.py", line 18, in setup_checkpoint_folder
checkpoint_num = sorted([int(x.replace('checkpoint/experiment_','')) for x in existing_checkpoints])[-1] + 1
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\fets_challenge\checkpoint_utils.py", line 18, in <listcomp>
checkpoint_num = sorted([int(x.replace('checkpoint/experiment_','')) for x in existing_checkpoints])[-1] + 1
ValueError: invalid literal for int() with base 10: 'checkpoint\experiment_1'
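The ValueError comes from stripping a hard-coded 'checkpoint/experiment_' prefix, which does not match the backslash-separated paths that Windows returns. A separator-independent way to extract the number could look like the sketch below; the function name and the choice to start at 1 for an empty folder are assumptions, not the repository's actual fix.

```python
import re

# Matches the trailing experiment number regardless of the path separator.
_EXPERIMENT_PATTERN = re.compile(r"experiment_(\d+)$")


def next_checkpoint_num(existing_checkpoints):
    """Return the next experiment number for paths like 'checkpoint/experiment_3'."""
    numbers = []
    for path in existing_checkpoints:
        match = _EXPERIMENT_PATTERN.search(path)
        if match:
            numbers.append(int(match.group(1)))
    # Assumption: numbering starts at 1 when no experiment folder exists yet.
    return max(numbers, default=0) + 1
```

Because the pattern only looks at the end of the string, 'checkpoint\experiment_1' and 'checkpoint/experiment_1' are handled the same way.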
I had an error:
No module named 'openfl.federated.data.loader_fets_challenge'
Would anyone help me out? Thank you so much!
Hi,
I am getting this error at the pip install . step given in the README.
I have been trying to solve this for the past week and do not understand the cause of this error.
Please let me know what I might be doing wrong or what changes I need to make. I'm running exactly the code given in the README, including for the venv setup.
Thank you
Thank you for being generous with your time and organizing this challenge in such a polished manner. I followed the instructions exactly as they are detailed in README and went through your notebook, changing only the dataset path to the one corresponding to my own directory where the MICCAI_FeTS2021_TrainingData is stored.
However, when I try to run all cells, I am getting this error at the final cell:
I wonder if anyone else has had this problem so far?
Is the model based on ResUnet? I used a ResUnet as the baseline and loaded the weights, but the results show it doesn't help at all.
Dear Organizers,
You may consider adding these lines to the internal code. I think this would also help when comparing team results at the end of the challenge, though I am not sure they are enough :)
torch.manual_seed(torch_manual_seed)
torch.cuda.manual_seed_all(torch_manual_seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
(maybe random_state can also be fixed for train_test_split)
Best,
Ece
Originally posted by @eceisik in #80 (comment)
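The seeding lines suggested above could be wrapped in one helper that also covers Python's own random module and NumPy. This is only a sketch: the helper name is made up, NumPy and PyTorch are treated as optional so the snippet runs even where they are not installed, and these settings reduce but may not fully eliminate run-to-run variation.

```python
import random


def seed_everything(seed):
    """Seed the random number generators that are available in this environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; nothing more to do.
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # Safe to call without a GPU.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Calling seed_everything(torch_manual_seed) once before building the data loaders and the model would apply the same settings as the four lines in the post.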
What are the aggregator and collaborator functions that you are going to use as the baseline?
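As a point of reference while this is unanswered: a common starting point in federated learning is sample-weighted averaging of the collaborators' weights (FedAvg). Whether the organisers' baseline matches this is not stated in this thread. The sketch below uses plain lists instead of tensors, and the function name and input format are assumptions.

```python
def weighted_average(updates):
    """Average collaborator weights, weighted by their number of samples.

    updates: list of (weights, num_samples) pairs, where weights maps a
    tensor name to a flat list of floats. All collaborators are assumed
    to report the same names and lengths.
    """
    total_samples = sum(num_samples for _, num_samples in updates)
    if total_samples <= 0:
        raise ValueError("at least one collaborator must report samples")

    reference_weights = updates[0][0]
    aggregated = {}
    for name, reference in reference_weights.items():
        aggregated[name] = [
            sum(weights[name][i] * num_samples for weights, num_samples in updates)
            / total_samples
            for i in range(len(reference))
        ]
    return aggregated
```

A custom aggregation for Task 1 would replace the sample-count weights with whatever weighting scheme is being explored.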
Hello, I received the following errors when installing OpenFL using pip on Windows 10:
ERROR: Could not find a version that satisfies the requirement openfl (from versions: none)
ERROR: No matching distribution found for openfl
My Python version is 3.7.
Dear Organizers:
Thank you very much for your hard work in organizing this challenge. I have three questions about Task 2:
What is the cross-validation setup? What ratio of the training data is used to evaluate and compute the performance metrics? Is the data from one collaborator used for evaluation after training the U-Net on the other collaborators in a selected epoch, or is there some other scheme?
After running ./mconfig, running cd builddir and make times out.
To fix this, go into the builddir folder, open the Makefile in a text editor, find the line GOPROXY := https://proxy.golang.org and change it to GOPROXY := https://proxy.golang.cn
I installed the dependencies according to the official FeTS Challenge guide, but something goes wrong:
`No 'TrainOrVal' column found in split_subdirs csv, so performing automated split using percent_train of 0.8
[08:53:18] INFO Updating aggregator.settings.rounds_to_train to 70... native.py:83
INFO Updating aggregator.settings.db_store_rounds to 2... native.py:83
/home/zss/anaconda3/envs/FL/lib/python3.7/site-packages/pandas/core/frame.py:4913: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
errors=errors,
Traceback (most recent call last):
File "fets_challenge_task1.py", line 654, in <module>
device=device)
File "/data1/zsscode/Fong/code/FL/HFL/Task_1/fets_challenge/experiment.py", line 289, in run_challenge_experiment
task_runner = copy(plan).get_task_runner(list(collaborator_data_loaders.values())[0])
File "/home/zss/anaconda3/envs/FL/lib/python3.7/site-packages/openfl/federated/plan/plan.py", line 340, in get_task_runner
self.runner_ = Plan.Build(**defaults)
File "/home/zss/anaconda3/envs/FL/lib/python3.7/site-packages/openfl/federated/plan/plan.py", line 179, in Build
module = import_module(module_path)
File "/home/zss/anaconda3/envs/FL/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/root/.local/workspace/impl/fets_challenge_model.py", line 53, in <module>
from fets.models.pytorch.brainmage.losses import MCD_loss, MCD_MSE_loss, dice_loss
File "/home/zss/anaconda3/envs/FL/lib/python3.7/site-packages/fets/models/pytorch/brainmage/__init__.py", line 1, in <module>
from .brainmage import BrainMaGeModel
File "/home/zss/anaconda3/envs/FL/lib/python3.7/site-packages/fets/models/pytorch/brainmage/brainmage.py", line 38, in <module>
from openfl import load_yaml
ImportError: cannot import name 'load_yaml' from 'openfl' (/home/zss/anaconda3/envs/FL/lib/python3.7/site-packages/openfl/__init__.py)
(FL) root@omnisky:/data1/zsscode/Fong/code/FL/HFL/Task_1# python
Python 3.7.11 (default, Jul 27 2021, 14:32:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from openfl.models.pytorch import PyTorchFLModel
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'openfl.models'
`
I also tried installing from source using the specified branch, but I get the same error. Is there anything wrong with my installation process?
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\torch\nn\functional.py", line 2186, in instance_norm
input, weight, bias, running_mean, running_var, use_input_stats, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: [enforce fail at ..\c10\core\CPUAllocator.cpp:75] data. DefaultCPUAllocator: not enough memory: you tried to allocate 67108864 bytes. Buy new RAM!
Hi, I tried using the setup.py and also independently installing GANDLF with pip,
pip install GANDLF@git+https://github.com/CBICA/GaNDLF.git@e4d0d4bfdf4076130817001a98dfb90189956278
but both get stuck on a git checkout. Was wondering if there's an alternate hash?
git checkout -q e4d0d4bfdf4076130817001a98dfb90189956278
Describe the bug
During step 8 in https://github.com/FeTS-AI/Challenge/tree/main/Task_1, an openfl installation error occurred while installing 'openfl @ git+https://github.com/intel/openfl.git@771fc05d57612e2fd0f133ee301f5cd9678cf9d9z', one of the install_requires entries in setup.py.
To Reproduce
Steps to reproduce the behavior:
Line 29 in 524d6b9
Line 31 in 524d6b9
Expected behavior
I found an issue reporting the same bug in the openfl repository, along with a commit that fixes it.
Please check the link below:
securefederatedai/openfl@771fc05
So I took this commit hash and modified the openfl install path in setup.py, changing
Line 31 in 524d6b9
to
'openfl @ git+https://github.com/intel/openfl.git@771fc05d57612e2fd0f133ee301f5cd9678cf9d9',
Desktop (please complete the following information):
Windows 10
22hz
OpenFL had to be installed independently of setup.py because OpenFL itself doesn't install on Windows.
Successfully installed packages from C:\Users\15702\.local\workspace/requirements.txt.
New workspace directory structure:
workspace
├── .workspace
├── agg_to_col_one_signed_cert.zip
├── agg_to_col_two_signed_cert.zip
├── cert
├── data
├── logs
├── partitioning_1.csv
├── partitioning_2.csv
├── plan
│ ├── cols.yaml
│ ├── data.yaml
│ ├── defaults
│ └── plan.yaml
├── requirements.txt
├── save
├── small_split.csv
├── src
│ ├── challenge_assigner.py
│ ├── fets_challenge_model.py
│ └── __init__.py
└── validation.csv
6 directories, 15 files
Setting Up Certificate Authority...
1. Create Root CA
1.1 Create Directories
1.2 Create Database
1.3 Create CA Request and Certificate
2. Create Signing Certificate
2.1 Create Directories
2.2 Create Database
2.3 Create Signing Certificate CSR
2.4 Sign Signing Certificate CSR
3 Create Certificate Chain
Done.
Creating AGGREGATOR certificate key pair with following settings: CN=openvessel.ptd.net, SAN=DNS:openvessel.ptd.net
Writing AGGREGATOR certificate key pair to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/server
The CSR Hash for file server/agg_openvessel.ptd.net.csr = 2dffb3b6b3429066358c48f7817b37def87f94c4b6538a7511d9ec15d3eb64227561744b638709da4b7b3119e3f8062d
Signing AGGREGATOR certificate
Creating COLLABORATOR certificate key pair with following settings: CN=one, SAN=DNS:one
Moving COLLABORATOR certificate to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/col_one
The CSR Hash for file col_one.csr = 788406d2db5277603ff520c9f86237fead29fd890db72e5f01eaec7671b2c92dd13eb69b28eb8f821958e5d67e3c7e2d
Signing COLLABORATOR certificate
Registering odeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert\client\col_one in C:\Users\15702\.local\workspace\plan\cols.yaml
Creating COLLABORATOR certificate key pair with following settings: CN=two, SAN=DNS:two
Moving COLLABORATOR certificate to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/col_two
The CSR Hash for file col_two.csr = cc4568bc2f495a46ceff0f3708b1a24a43ddff4aa6797cfd00eb51abdfa9a1078b8ed349a0fd0fcfb5ccabc9815b7f88
Signing COLLABORATOR certificate
Registering odeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert\client\col_two in C:\Users\15702\.local\workspace\plan\cols.yaml
C:\Users\15702\.local\workspace
No 'TrainOrVal' column found in split_subdirs csv, so performing automated split using percent_train of 0.8
Traceback (most recent call last):
File ".\FeTS_Challenge.py", line 566, in <module>
restore_from_checkpoint_folder = restore_from_checkpoint_folder)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\fets_challenge\experiment.py", line 254, in run_challenge_experiment
gandlf_csv_path)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\fets_challenge\gandlf_csv_adapter.py", line 147, in construct_fedsim_csv
inner_dict = get_appropriate_file_paths_from_subject_dir(os.path.join(pardir, subdir), include_labels=True)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\fets\data\base_utils.py", line 14, in get_appropriate_file_paths_from_subject_dir
filesInDir = os.listdir(dir_path)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '/raid/datasets/FeTS22/MICCAI_FeTS2022_TrainingData\\FeTS2022_01333'
So the workspace is set up in VS Code, but experiment.py cannot find /raid/datasets/FeTS22/MICCAI_FeTS2022_TrainingData\FeTS2022_01333
OS: Windows 11
Python 3.7.9
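The path in the FileNotFoundError starts with /raid/datasets/..., which looks like a Linux location rather than a folder on this Windows machine, so the data path configured in the script is probably the first thing to check. A small pre-flight check along these lines can surface that before the experiment starts; the function name and arguments are hypothetical and not part of the challenge code.

```python
from pathlib import Path


def missing_subject_dirs(data_root, subject_ids):
    """Return the subject IDs that have no folder under data_root.

    Raises FileNotFoundError if data_root itself does not exist.
    """
    root = Path(data_root)
    if not root.is_dir():
        raise FileNotFoundError(f"Data root does not exist: {root}")
    return [subject for subject in subject_ids if not (root / subject).is_dir()]
```

Running it with the local training data folder and a few subject IDs from the partitioning CSV shows immediately whether the configured root is wrong or only individual subjects are missing.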
When we run python .\FeTS_Challenge.py, we get the error below. I understand that we are loading data with the data_loader, which gets a dict with headers built in experiment.py, and the failure occurs while building the task runner:
File ".\FeTS_Challenge.py", line 584, in <module>
restore_from_checkpoint_folder = restore_from_checkpoint_folder)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\fets_challenge\experiment.py", line 292, in run_challenge_experiment
task_runner = copy(plan).get_task_runner(collaborator_data_loaders[col])
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\openfl\federated\plan\plan.py", line 389, in get_task_runner
self.runner_ = Plan.build(**defaults)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\openfl\federated\plan\plan.py", line 182, in build
instance = getattr(module, class_name)(**settings)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\openfl\federated\task\runner_fets_challenge.py", line 43, in __init__
model, optimizer, train_loader, val_loader, scheduler, params = create_pytorch_objects(fets_config_dict, train_csv=train_csv, val_csv=val_csv, device=device)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\generic.py", line 48, in create_pytorch_objects
train_loader = get_train_loader(parameters)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\data\__init__.py", line 24, in get_train_loader
loader_type="train",
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\data\ImagesFromDataFrame.py", line 65, in ImagesFromDataFrame
preprocessing = parameters["data_preprocessing"]
KeyError: 'data_preprocessing'
When we print() the parameters, we can see that "data_preprocessing" is actually missing. Is this a breaking change in GaNDLF?
{'batch_size': 1, 'data_augmentation': {},
'data_postprocessing': {},
'enable_padding': False,
'in_memory': True,
'inference_mechanism': {'grid_aggregator_overlap': 'crop', 'patch_overlap': 0},
'learning_rate': 0.001,
'loss_function': 'dc',
'medcam_enabled': False,
'metrics': ['dice', 'dice_per_label', 'hd95_per_label'],
'model': {'amp': True, 'architecture': 'resunet', 'base_filters': 32, 'class_list': [0, 1, 2, 4], 'dimension': 3, 'final_layer': 'softmax', 'norm_type': 'instance', 'type': 'torch', 'num_channels': 4, 'num_classes': 4},
'nested_training': {'testing': 1, 'validation': -5},
'num_epochs': 1, 'optimizer': {'type': 'sgd'},
'output_dir': '.', 'parallel_compute_command': '', 'patch_sampler': 'label',
'patch_size': [64, 64, 64], 'patience': 100, 'pin_memory_dataloader': False, 'print_rgb_label_warning': True, 'q_max_length': 100, 'q_num_workers': 0, restore_from_checkpoint_folder = restore_from_checkpoint_folder)
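One way to test whether the missing key is the only problem is to fill it in before the parameters reach GaNDLF. The helper below is a hypothetical sketch: its name is made up, and using an empty dict for "data_preprocessing" (meaning no preprocessing) is an assumption that should be checked against the GaNDLF version pinned in setup.py.

```python
def with_default_keys(params, defaults=None):
    """Return a copy of params with missing top-level keys filled in."""
    if defaults is None:
        # Assumption: an empty dict stands for "no preprocessing".
        defaults = {"data_preprocessing": {}}
    merged = dict(params)  # Shallow copy; the caller's dict is left untouched.
    for key, value in defaults.items():
        merged.setdefault(key, value)
    return merged
```

Keys that are already present are never overwritten, so an explicit data_preprocessing section in the config still takes precedence.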
(venv) PS C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1> python .\FeTS_Challenge.py
Creating Workspace Directories
Creating Workspace Templates
Requirement already satisfied: torchvision in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from -r C:\Users\15702\.local\workspace/requirements.txt (line 1)) (0.9.2+cu111)
Requirement already satisfied: torch in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from -r C:\Users\15702\.local\workspace/requirements.txt (line 2)) (1.8.2+cu111)
Requirement already satisfied: numpy in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from torchvision->-r C:\Users\15702\.local\workspace/requirements.txt (line 1)) (1.21.0)
Requirement already satisfied: pillow>=4.1.1 in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from torchvision->-r C:\Users\15702\.local\workspace/requirements.txt (line 1)) (9.1.1)
Requirement already satisfied: typing-extensions in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from torch->-r C:\Users\15702\.local\workspace/requirements.txt (line 2)) (4.2.0)
Successfully installed packages from C:\Users\15702\.local\workspace/requirements.txt.
New workspace directory structure:
workspace
├── .workspace
├── agg_to_col_one_signed_cert.zip
├── agg_to_col_two_signed_cert.zip
├── cert
├── checkpoint
├── data
├── gandlf_paths.csv
├── logs
├── output_validation
│ └── 0
├── partitioning_1.csv
├── partitioning_2.csv
├── plan
│ ├── cols.yaml
│ ├── data.yaml
│ ├── defaults
│ └── plan.yaml
├── raid
│ └── datasets
│ └── FeTS22
├── requirements.txt
├── save
│ └── fets_seg_test_init.pbuf
├── seg_test_train.csv
├── seg_test_val.csv
├── small_split.csv
├── src
│ ├── challenge_assigner.py
│ ├── fets_challenge_model.py
│ ├── __init__.py
│ └── __pycache__
│ ├── challenge_assigner.cpython-37.pyc
│ ├── fets_challenge_model.cpython-37.pyc
│ └── __init__.cpython-37.pyc
└── validation.csv
13 directories, 22 files
Setting Up Certificate Authority...
1. Create Root CA
1.1 Create Directories
1.2 Create Database
1.3 Create CA Request and Certificate
2. Create Signing Certificate
2.1 Create Directories
2.2 Create Database
2.3 Create Signing Certificate CSR
2.4 Sign Signing Certificate CSR
3 Create Certificate Chain
Done.
Creating AGGREGATOR certificate key pair with following settings: CN=openvessel.ptd.net, SAN=DNS:openvessel.ptd.net
Writing AGGREGATOR certificate key pair to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/server
The CSR Hash for file server/agg_openvessel.ptd.net.csr = f713b37863866bd5a82473efd30b8e494ef0243b4470fae2ae40e7d75f5415475f38c91986391d95436bce024df14bf1
Signing AGGREGATOR certificate
Creating COLLABORATOR certificate key pair with following settings: CN=one, SAN=DNS:one
Moving COLLABORATOR certificate to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/col_one
The CSR Hash for file col_one.csr = 58fdc5a503366177f1556335d22295b6d598078341ad3b40ad7301c2cf3dac5252d8feea1f03bb7fa6077b2541562860
Signing COLLABORATOR certificate
Registering odeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert\client\col_one in C:\Users\15702\.local\workspace\plan\cols.yaml
Creating COLLABORATOR certificate key pair with following settings: CN=two, SAN=DNS:two
Moving COLLABORATOR certificate to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/col_two
The CSR Hash for file col_two.csr = 374efb23a8b7af15d53eb824db7136e5996b418c38e9b65a12384788aff27fb0c5d59de2418784030bc3196d4342cf27
Signing COLLABORATOR certificate
Registering odeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert\client\col_two in C:\Users\15702\.local\workspace\plan\cols.yaml
C:\Users\15702\.local\workspace\gandlf_paths.csv
No 'TrainOrVal' column found in split_subdirs csv, so performing automated split using percent_train of 0.8
[]
[]
[]
[20:09:15] INFO Updating aggregator.settings.rounds_to_train to 5... native.py:102
INFO Updating aggregator.settings.db_store_rounds to 5... native.py:102
WARNING Did not find tasks.train.aggregation_type in config. Make sure it should exist. Creating... native.py:105
INFO Updating task_runner.settings.device to cpu... native.py:102
WARNING Did not find task_runner.settings.fets_config_dict.data_preprocessing in config. Make sure it should exist. Creating... native.py:105
WARNING Did not find task_runner.settings.fets_config_dict.ignore_label_validation in config. Make sure it should exist. Creating... native.py:105
INFO FL-Plan hash is 601cd0b67629af4d8ea0527f65b8a6613cc7d60f28d1a035e5167db87264c20e2fc1f2844d0df0c45d72ae1b29dcff48 plan.py:234
{
"aggregator.settings.best_state_path": "save/fets_seg_test_best.pbuf",
"aggregator.settings.db_store_rounds": 2,
"aggregator.settings.init_state_path": "save/fets_seg_test_init.pbuf",
"aggregator.settings.last_state_path": "save/fets_seg_test_last.pbuf",
"aggregator.settings.rounds_to_train": 3,
"aggregator.settings.write_logs": true,
"aggregator.template": "openfl.component.Aggregator",
"assigner.settings.training_tasks.0": "aggregated_model_validation",
"assigner.settings.training_tasks.1": "train",
"assigner.settings.training_tasks.2": "locally_tuned_model_validation",
"assigner.settings.validation_tasks.0": "aggregated_model_validation",
"assigner.template": "src.challenge_assigner.FeTSChallengeAssigner",
"collaborator.settings.db_store_rounds": 1,
"collaborator.settings.delta_updates": false,
"collaborator.settings.opt_treatment": "RESET",
"collaborator.template": "openfl.component.Collaborator",
"compression_pipeline.settings": {},
"compression_pipeline.template": "openfl.pipelines.NoCompressionPipeline",
"data_loader.settings.feature_shape.0": 32,
"data_loader.settings.feature_shape.1": 32,
"data_loader.settings.feature_shape.2": 32,
"data_loader.template": "openfl.federated.data.loader_fets_challenge.FeTSChallengeDataLoaderWrapper",
"network.settings.agg_addr": "openvessel.ptd.net",
"network.settings.agg_port": 54937,
"network.settings.cert_folder": "cert",
"network.settings.client_reconnect_interval": 5,
"network.settings.disable_client_auth": false,
"network.settings.hash_salt": "auto",
"network.settings.tls": true,
"network.template": "openfl.federation.Network",
"task_runner.settings.device": "cpu",
"task_runner.settings.fets_config_dict.batch_size": 1,
"task_runner.settings.fets_config_dict.data_augmentation": {},
"task_runner.settings.fets_config_dict.data_postprocessing": {},
"task_runner.settings.fets_config_dict.enable_padding": false,
"task_runner.settings.fets_config_dict.in_memory": true,
"task_runner.settings.fets_config_dict.inference_mechanism.grid_aggregator_overlap": "crop",
"task_runner.settings.fets_config_dict.inference_mechanism.patch_overlap": 0,
"task_runner.settings.fets_config_dict.learning_rate": 0.001,
"task_runner.settings.fets_config_dict.loss_function": "dc",
"task_runner.settings.fets_config_dict.medcam_enabled": false,
"task_runner.settings.fets_config_dict.metrics.0": "dice",
"task_runner.settings.fets_config_dict.metrics.1": "dice_per_label",
"task_runner.settings.fets_config_dict.metrics.2": "hd95_per_label",
"task_runner.settings.fets_config_dict.model.amp": true,
"task_runner.settings.fets_config_dict.model.architecture": "resunet",
"task_runner.settings.fets_config_dict.model.base_filters": 32,
"task_runner.settings.fets_config_dict.model.class_list.0": 0,
"task_runner.settings.fets_config_dict.model.class_list.1": 1,
"task_runner.settings.fets_config_dict.model.class_list.2": 2,
"task_runner.settings.fets_config_dict.model.class_list.3": 4,
"task_runner.settings.fets_config_dict.model.dimension": 3,
"task_runner.settings.fets_config_dict.model.final_layer": "softmax",
"task_runner.settings.fets_config_dict.model.norm_type": "instance",
"task_runner.settings.fets_config_dict.nested_training.testing": 1,
"task_runner.settings.fets_config_dict.nested_training.validation": -5,
"task_runner.settings.fets_config_dict.num_epochs": 1,
"task_runner.settings.fets_config_dict.optimizer.type": "sgd",
"task_runner.settings.fets_config_dict.output_dir": ".",
"task_runner.settings.fets_config_dict.parallel_compute_command": "",
"task_runner.settings.fets_config_dict.patch_sampler": "label",
"task_runner.settings.fets_config_dict.patch_size.0": 64,
"task_runner.settings.fets_config_dict.patch_size.1": 64,
"task_runner.settings.fets_config_dict.patch_size.2": 64,
"task_runner.settings.fets_config_dict.patience": 100,
"task_runner.settings.fets_config_dict.pin_memory_dataloader": false,
"task_runner.settings.fets_config_dict.print_rgb_label_warning": true,
"task_runner.settings.fets_config_dict.q_max_length": 100,
"task_runner.settings.fets_config_dict.q_num_workers": 0,
"task_runner.settings.fets_config_dict.q_samples_per_volume": 40,
"task_runner.settings.fets_config_dict.q_verbose": false,
"task_runner.settings.fets_config_dict.save_output": false,
"task_runner.settings.fets_config_dict.save_training": false,
"task_runner.settings.fets_config_dict.scaling_factor": 1,
"task_runner.settings.fets_config_dict.scheduler.type": "triangle_modified",
"task_runner.settings.fets_config_dict.track_memory_usage": false,
"task_runner.settings.fets_config_dict.verbose": false,
"task_runner.settings.fets_config_dict.version.maximum": "0.0.14",
"task_runner.settings.fets_config_dict.version.minimum": "0.0.14",
"task_runner.settings.fets_config_dict.weighted_loss": true,
"task_runner.settings.train_csv": "seg_test_train.csv",
"task_runner.settings.val_csv": "seg_test_val.csv",
"task_runner.template": "src.fets_challenge_model.FeTSChallengeModel",
"tasks.aggregated_model_validation.function": "validate",
"tasks.aggregated_model_validation.kwargs.apply": "global",
"tasks.aggregated_model_validation.kwargs.metrics.0": "valid_loss",
"tasks.aggregated_model_validation.kwargs.metrics.1": "valid_dice",
"tasks.locally_tuned_model_validation.function": "validate",
"tasks.locally_tuned_model_validation.kwargs.apply": "local",
"tasks.locally_tuned_model_validation.kwargs.metrics.0": "valid_loss",
"tasks.locally_tuned_model_validation.kwargs.metrics.1": "valid_dice",
"tasks.settings": {},
"tasks.train.function": "train",
"tasks.train.kwargs.epochs": 1,
"tasks.train.kwargs.metrics.0": "loss",
"tasks.train.kwargs.metrics.1": "train_dice"
}
INFO Building 🡆 Object FeTSChallengeDataLoaderWrapper from openfl.federated.data.loader_fets_challenge Module. plan.py:173
INFO Building 🡆 Object FeTSChallengeDataLoaderWrapper from openfl.federated.data.loader_fets_challenge Module. plan.py:173
INFO Building 🡆 Object FeTSChallengeDataLoaderWrapper from openfl.federated.data.loader_fets_challenge Module. plan.py:173
INFO Building 🡆 Object FeTSChallengeModel from src.fets_challenge_model Module. plan.py:173
Constructing queue for train data: 100%|████████████████████████████████████████████████████████████████████████████████| 3/3 [00:01<00:00, 2.10it/s]
Calculating weights
Constructing queue for penalty data: 100%|██████████████████████████████████████████████████████████████████████████████| 3/3 [00:01<00:00, 2.26it/s]
Looping over training data for penalty calculation: 100%|███████████████████████████████████████████████████████████████| 3/3 [00:01<00:00, 1.77it/s]
Constructing queue for validation data: 100%|███████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.38it/s]
All Keys : ['subject_id', '2', 'spacing', '3', '4', '5', 'label', 'path_to_metadata']
Since Device is CPU, Mixed Precision Training is set to False
[20:09:22] INFO Building 🡆 Object FeTSChallengeModel from src.fets_challenge_model Module. plan.py:173
Constructing queue for train data: 100%|████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.71it/s]
Calculating weights
Constructing queue for penalty data: 100%|██████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.66it/s]
Looping over training data for penalty calculation: 100%|███████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.01it/s]
Constructing queue for validation data: 100%|███████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.51it/s]
All Keys : ['subject_id', '2', 'spacing', '3', '4', '5', 'label', 'path_to_metadata']
Since Device is CPU, Mixed Precision Training is set to False
[20:09:25] INFO Building 🡆 Object FeTSChallengeModel from src.fets_challenge_model Module. plan.py:173
Constructing queue for train data: 100%|████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.65it/s]
Calculating weights
Constructing queue for penalty data: 100%|██████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.86it/s]
Looping over training data for penalty calculation: 100%|███████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.10it/s]
Constructing queue for validation data: 100%|███████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.80it/s]
All Keys : ['subject_id', '2', 'spacing', '3', '4', '5', 'label', 'path_to_metadata']
Since Device is CPU, Mixed Precision Training is set to False
Loading pretrained model...
[20:09:28] INFO Building 🡆 Object NoCompressionPipeline from openfl.pipelines Module. plan.py:173
[20:09:29] INFO Creating aggregator... experiment.py:323
INFO Building 🡆 Object FeTSChallengeAssigner from src.challenge_assigner Module. plan.py:173
INFO Building 🡆 Object Aggregator from openfl.component Module. plan.py:173
INFO Creating collaborators... experiment.py:330
INFO Building 🡆 Object Collaborator from openfl.component Module. plan.py:173
INFO Building 🡆 Object Collaborator from openfl.component Module. plan.py:173
INFO Building 🡆 Object Collaborator from openfl.component Module. plan.py:173
INFO Starting experiment experiment.py:338
INFO experiment.py:366
Created experiment folder experiment_1...
INFO Collaborators chosen to train for round 0: experiment.py:403
['1', '2', '3']
INFO Hyper-parameters for round 0: experiment.py:425
learning rate: 5e-05
epochs_per_round: 1
INFO Waiting for tasks... collaborator.py:178
INFO Sending tasks to collaborator 3 for round 0 aggregator.py:312
INFO Received the following tasks: ['aggregated_model_validation', 'train', 'locally_tuned_model_validation'] collaborator.py:168
[20:09:30] INFO Using TaskRunner subclassing API collaborator.py:253
********************
Starting validation :
********************
Looping over validation data: 0%| | 0/1 [00:02<?, ?it/s]
Traceback (most recent call last):
File ".\FeTS_Challenge.py", line 584, in <module>
restore_from_checkpoint_folder = restore_from_checkpoint_folder)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\fets_challenge\experiment.py", line 468, in run_challenge_experiment
collaborators[col].run_simulation()
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\openfl\component\collaborator\collaborator.py", line 170, in run_simulation
self.do_task(task, round_number)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\openfl\component\collaborator\collaborator.py", line 259, in do_task
**kwargs)
File "C:\Users\15702\.local\workspace\src\fets_challenge_model.py", line 48, in validate
mode="validation")
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\forward_pass.py", line 276, in validate_network
result = step(model, image, label, params, train=True)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\step.py", line 88, in step
loss, metric_output = get_loss_and_metrics(image, label, output, params)
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\loss_and_metric.py", line 141, in get_loss_and_metrics
metric_function, predicted, ground_truth, params
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\loss_and_metric.py", line 13, in get_metric_output
metric_output = metric_function(predicted, ground_truth, params).detach().cpu()
File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\metrics\segmentation.py", line 42, in multi_class_dice
if i != params["model"]["ignore_label_validation"]:
KeyError: 'ignore_label_validation'
Solution: override the plan.yaml settings as shown below, with model.ignore_label_validation set to False:
overrides = {
    'aggregator.settings.rounds_to_train': rounds_to_train,
    'aggregator.settings.db_store_rounds': db_store_rounds,
    'tasks.train.aggregation_type': aggregation_wrapper,
    'task_runner.settings.device': device,
    'task_runner.settings.fets_config_dict.data_preprocessing': {},
    'task_runner.settings.fets_config_dict.model.ignore_label_validation': False
}
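To illustrate why the extra override removes the KeyError, here is a minimal sketch of how a dotted override key can map onto a nested config dictionary. The helper `apply_overrides` and the toy `plan` dictionary are hypothetical and only stand in for the real plan handling; this is not OpenFL's actual implementation.

```python
from typing import Any, Dict


def apply_overrides(config: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Set each dotted key in `overrides` on the nested `config` dict, in place.

    Hypothetical helper for illustration; not part of OpenFL or GaNDLF.
    """
    for dotted_key, value in overrides.items():
        node = config
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # Create intermediate sections that the config does not define yet.
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


# Toy stand-in for the relevant part of plan.yaml (not the real plan).
plan = {"task_runner": {"settings": {"fets_config_dict": {"model": {}}}}}

apply_overrides(plan, {
    "task_runner.settings.fets_config_dict.model.ignore_label_validation": False,
})

params = plan["task_runner"]["settings"]["fets_config_dict"]
# The same lookup that failed in GANDLF/metrics/segmentation.py now finds the key.
print(params["model"]["ignore_label_validation"])
```

With the key present, the `params["model"]["ignore_label_validation"]` lookup from the traceback no longer raises.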
In many of my tests, validation metrics consume the lion's share of the compute time, resulting in very long experiment runtimes (e.g. a week+ to converge). This makes it hard to iterate on ideas.
I'd like to propose two ideas to address this:
Hey, I'm following the instructions in https://github.com/FETS-AI/Challenge/tree/main/Task_1#getting-started
However, the pip install . step fails with this error:
INFO: pip is looking at multiple versions of nnunet to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of medpy to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of batchgenerators to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of fets to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of fets-challenge to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install fets and fets-challenge because these package versions have conflicting dependencies.
The conflict is caused by:
nnunet 1.6.6 depends on torch>=1.6.0a
gandlf 0.0.14.dev0 depends on torch==1.8.2
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
Note that I installed pytorch using this command:
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
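One thing worth checking here is whether the torch build installed by the command above can satisfy the exact pin torch==1.8.2 reported for gandlf. The sketch below is a simplified check under stated assumptions: the version string "1.12.1+cu116" is illustrative only (the post does not say which version was installed), and the helper names are hypothetical. It ignores finer points of version matching such as zero padding and pre-releases.

```python
def public_version(version: str) -> str:
    """Drop a local version label such as '+cu116' from a version string."""
    return version.split("+", 1)[0]


def satisfies_exact_pin(installed: str, pinned: str) -> bool:
    """Return True if `installed` would satisfy an `==pinned` requirement.

    Simplified: a pin without a local label ignores the candidate's local
    label, so only the public parts are compared as plain strings.
    """
    return public_version(installed) == pinned


# Illustrative versions, not taken from the post.
for candidate in ("1.12.1+cu116", "1.8.2+cu111"):
    ok = satisfies_exact_pin(candidate, "1.8.2")
    print(f"torch {candidate} satisfies torch==1.8.2: {ok}")
```

If the installed build does not match the pin, pip has to find a matching torch release on its own during pip install ., which is where the resolution error can surface.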