ml-jku / deeprc
DeepRC: Immune repertoire classification with attention-based deep massive multiple instance learning
Hi,
I was just going through the code in architectures.py and the paper side by side. I can't seem to find the query*key operation in the code. As I understand it, this should happen in AttentionNetwork. From what I see, this is the "attention SNN" from Figure 2 in the paper, followed by a linear layer that computes attention weights straight from the keys?
Please let me know if I misunderstood something here; from the paper I assumed that a query*key operation must be performed there.
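One possible reading, offered as an assumption rather than a confirmed description of architectures.py: if the query is a fixed learned vector that does not depend on the input, then computing query*key for every sequence is the same operation as a bias-free linear layer with a single output unit applied to the keys. The sketch below uses toy sizes and hypothetical names (keys, query, linear) and leaves out any scaling factor; it only illustrates that equivalence.

```python
import torch

torch.manual_seed(0)

n_sequences, d_k = 5, 4           # toy sizes, chosen only for illustration
keys = torch.randn(n_sequences, d_k)

# Assumption: a fixed (input-independent) learned query vector.
query = torch.randn(d_k)

# Explicit query*key attention scores, one per sequence.
scores_dot = keys @ query

# The same scores from a bias-free linear layer whose single weight row is the query.
linear = torch.nn.Linear(d_k, 1, bias=False)
with torch.no_grad():
    linear.weight.copy_(query.unsqueeze(0))
scores_linear = linear(keys).squeeze(-1)

# Softmax over the sequences of the bag gives the attention weights.
attention_dot = torch.softmax(scores_dot, dim=0)
attention_linear = torch.softmax(scores_linear, dim=0)
```

Under that assumption, a linear layer on top of the keys would already contain the query as its weights, so no separate query*key call would need to appear in the code.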
Unfortunately, I could not tell from the paper whether I can build a training dataset from BCR AIRR-seq data. Could you clarify in the README and/or manuscript whether this is possible and, if so, whether any additional steps (compared with TCRs) are needed?
At the moment the library uses an old torch version that forces the user to use Python 3.7; it would be nice to update to a newer one. I would also consider publishing a conda package.
Hi, I am trying to train a model using the tutorial dataset, but only on CPU, without a GPU. I use [example_single_task_cnn.py]:
train(model, task_definition=task_definition, trainingset_dataloader=trainingset,
... trainingset_eval_dataloader=trainingset_eval, learning_rate=args.learning_rate,
... early_stopping_target_id='binary_target_1', # Get model that performs best for this task
... validationset_eval_dataloader=validationset_eval, n_updates=args.n_updates, evaluate_at=args.evaluate_at,
... device=device, results_directory="/users/sli1/deeprc_result/",show_progress=True)
Saving checkpoint to memory... done!
Training model...
loss= nan: 0%| | 0/1000 [00:00<?, ?it/s]
Saving checkpoint to file... done!
Loading checkpoint from memory "0"... done!
Saving checkpoint to file... done!
Finished Training!
However, I get the following error.
Traceback (most recent call last):
File "<stdin>", line 5, in <module>
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/deeprc/training.py", line 282, in train
raise e
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/deeprc/training.py", line 193, in train
labels, inputs, sequence_lengths, counts_per_sequence)
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/deeprc/architectures.py", line 375, in reduce_and_stack_minibatch
in zip(inputs_list, sequence_lengths)]))
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/deeprc/architectures.py", line 374, in <listcomp>
for inp, sequence_lengths
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/deeprc/architectures.py", line 511, in reduce_sequences_for_bag
emb_seqs = self.sequence_embedding(inputs_mb, sequence_lengths=sequence_lengths_mb).to(dtype=torch.float32)
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/deeprc/architectures.py", line 84, in forward
conv_acts = self.network(inputs)
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/users/sli1/.conda/envs/deeprc/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 298, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: "unfolded2d_copy" not implemented for 'Half'
How should I fix this?
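The error message says that the convolution received half-precision (float16) inputs on the CPU, which this PyTorch build does not support. A generic workaround, sketched here with a toy convolution and not with DeepRC's actual network or options, is to use float16 only on CUDA and fall back to float32 on CPU. The helper pick_dtype and the layer sizes are hypothetical.

```python
import torch

def pick_dtype(device: torch.device) -> torch.dtype:
    """Use half precision only on CUDA; fall back to float32 on CPU."""
    return torch.float16 if device.type == "cuda" else torch.float32

device = torch.device("cpu")
dtype = pick_dtype(device)

# Toy stand-in for a sequence-embedding CNN (not DeepRC's actual network).
conv = torch.nn.Conv1d(in_channels=23, out_channels=8, kernel_size=9)
conv = conv.to(device=device, dtype=dtype)

# Batch of 4 toy sequences with 23 input features and length 30.
inputs = torch.randn(4, 23, 30, device=device).to(dtype=dtype)
outputs = conv(inputs)
print(outputs.dtype, tuple(outputs.shape))
```

If the installed DeepRC version has a setting that controls 16-bit sequence embedding, switching it off for CPU runs would be the equivalent change; whether such a setting exists should be checked against the model's constructor arguments.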
The conda install directions should use:
conda env create -f condal_install.yml --name deeprdc_env
I tried to run example_single_task_cnn.py with the data provided in the example_dataset folder and got a ROC AUC of 0.5, a bacc of 0.5, and an f1_score of 0.0. This is clearly not reasonable, but I don't know what is wrong. Thanks.
I'm trying to run the examples as in the README, but I keep getting the error:
KeyError: "Samples ['' '' '' ... '' '' ''] could not be found in hdf5 file. Please add the samples and re-create the hdf5 file or remove the sample keys from the used samples of the metadata file."
This happens with all the datasets I tried in the "Training DeepRC on pre-defined datasets" section. Instead, when I try running the code in the "Training DeepRC on a custom dataset" section, on the example dataset deeprc/datasets/example_dataset, I get:
ValueError: not enough values to unpack (expected 6, got 0)
The versions of the libraries I'm using are the same as yours. What am I missing?
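Since the KeyError lists every sample key as an empty string, one thing worth checking is whether the sample ID column of the metadata file is being read as blank. The sketch below is a standalone diagnostic using only the standard library. The column name "ID", the tab delimiter, and the helper find_empty_ids are assumptions for illustration and may not match the actual files.

```python
import csv
import io

def find_empty_ids(metadata_text: str, id_column: str = "ID", delimiter: str = "\t"):
    """Return the row numbers (1-based, header excluded) whose ID is missing or blank."""
    reader = csv.DictReader(io.StringIO(metadata_text), delimiter=delimiter)
    if id_column not in (reader.fieldnames or []):
        raise KeyError(f"column {id_column!r} not found; columns are {reader.fieldnames}")
    return [i for i, row in enumerate(reader, start=1)
            if not (row[id_column] or "").strip()]

# Two tiny in-memory metadata tables: one well-formed, one with a blank ID.
good = "ID\tbinary_target_1\nsample_a\t1\nsample_b\t0\n"
bad = "ID\tbinary_target_1\n\t1\nsample_b\t0\n"

print(find_empty_ids(good))  # []
print(find_empty_ids(bad))   # [1]
```

For a real file, the text can be read with open(path).read() and passed in. A KeyError from the helper would point to a wrong column name or delimiter, while a long list of row numbers would point to blank IDs.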
I ran the cmv_with_implanted_signals code, but it downloaded the wrong tsv and hdf5 files. When I open the link in the browser, it says that the link has expired. Could you update the link and the code? Thanks.
Hi,
I am wondering how to extract the attention weights for the sequences in a sample, so that we can rank the sequences by their importance. Thanks! Suppose I have trained a model named model based on the code in example_single_task_cnn.py.
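A generic way to do this in PyTorch, sketched here on a toy model and not on DeepRC's own classes, is to register a forward hook on the module that produces the per-sequence attention scores, run one sample through the model, and apply a softmax over the captured scores. ToyAttentionModel, its attribute names, and the tensor sizes are hypothetical; in a real model the attention submodule would have to be located first.

```python
import torch

torch.manual_seed(0)

class ToyAttentionModel(torch.nn.Module):
    """Hypothetical stand-in: scores embedded sequences and pools them by attention."""
    def __init__(self, n_features: int = 6):
        super().__init__()
        self.attention = torch.nn.Linear(n_features, 1)  # one raw score per sequence
        self.output = torch.nn.Linear(n_features, 1)

    def forward(self, bag: torch.Tensor) -> torch.Tensor:
        scores = self.attention(bag)                 # (n_sequences, 1)
        weights = torch.softmax(scores, dim=0)       # normalise over the bag
        pooled = (weights * bag).sum(dim=0)          # (n_features,)
        return self.output(pooled)

model = ToyAttentionModel()
captured = {}

def save_scores(module, inputs, output):
    # Called after the attention module's forward pass; keep its raw scores.
    captured["scores"] = output.detach()

handle = model.attention.register_forward_hook(save_scores)
bag = torch.randn(10, 6)          # one sample (bag) of 10 embedded sequences
with torch.no_grad():
    model(bag)
handle.remove()

# Softmax over the sequences of the bag, then rank from most to least attended.
attention_weights = torch.softmax(captured["scores"].squeeze(-1), dim=0)
ranking = torch.argsort(attention_weights, descending=True)
```

Here ranking[0] is the index of the sequence with the highest attention weight within that sample. The weights are only comparable within one sample, because the softmax is taken per bag.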