
radfm's People

Contributors

chaoyi-wu · xiaoman-zhang


radfm's Issues

release requirement.txt

Could you please release a requirement.txt? I have run into quite a few problems preparing the environment.

Rad3D Dataset

Hello, I downloaded the radiopaedia data you updated recently and found that the rad3d_train.json seems to be incomplete after line 764425. Could you double-check and share the full version with us? Additionally, it seems that all image files whose "image_path" values in the JSON files end with ".npy" cannot be found. Could you fix this as well?
Thank you for your great contribution to the community.

It looks like transformers==4.28.1 will introduce weird output

Thanks for your great work! Compared to some other works I have reproduced, this repo's results look convincing. However, I ran into problems when training with transformers==4.28.1: correct results are produced with transformers==4.28.0.dev0, while 4.28.1 produces weird output.
(Screenshots of the outputs under 4.28.0.dev0 and 4.28.1 were attached.)
However, if I use 4.28.0.dev0, it throws errors like

ImportError: cannot import name 'strtobool' from 'transformers.utils'

Do you have any idea about this weird output or dependency issue? Thanks.

Can the model work beyond the 5000 diseases?

Dear Author

Thanks for the exciting work! It's amazing that the model has been trained on, and thus might cover, over 5000 diseases. Since your goal is to build a generalist foundation model for radiology, I am wondering: does the model work beyond the 5000 diseases it was trained on?

For example, I work with a heart disease that is diagnosed via ultrasound (a modality covered by your model). Can I ask the model a diagnosis question about this disease?

Thanks!

How to pass multiple images

Can anyone share demo code for sending multiple images in a prompt? I can only see one image with questions in the demo.
(Screenshot attached.)

Gender: Male Perianal purulent discharge and tenderness. Please caption this scan with finding and impression.
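As a starting point, here is a hedged sketch of interleaving several image slots in one prompt. The `<image>` placeholder token, the helper function, and the dict layout below are assumptions for illustration, not RadFM's confirmed API — check how `Quick_demo/test.py` builds its single-image input and mirror that format.

```python
# Hypothetical sketch: interleave several image placeholders in one prompt.
# The "<image>" token and the dict layout are assumptions, not RadFM's
# confirmed input format -- adapt to whatever Quick_demo/test.py actually uses.

def build_multi_image_prompt(question, image_paths, placeholder="<image>"):
    """Prepend one placeholder per image so the model sees them in order."""
    slots = "".join(placeholder for _ in image_paths)
    prompt = slots + question
    images = [{"img_path": p, "position": i} for i, p in enumerate(image_paths)]
    return prompt, images

prompt, images = build_multi_image_prompt(
    "Please caption this scan with finding and impression.",
    ["view1.png", "view2.png"],
)
print(prompt)
# -> "<image><image>Please caption this scan with finding and impression."
```

The key point is that each image needs its own placeholder in the text stream, with the image list supplied in the same order.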

GPU memory requirements

Hi! The paper mentions training the model on NVIDIA A100 GPUs (80 GB) with a batch size of 1 per GPU. If I want to fine-tune the proposed pre-trained 3D ViT on my own 3D data, how much GPU memory do I need exactly? Shouldn't the required memory depend on the input image size?
The paper says preprocessing limits input 3D images to at most (64, 256, 256); if exactly (64, 256, 256) is used as the input size, how much memory would be needed?
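A rough floor can be computed from the 12-layer, 768-dimensional ViT configuration quoted from the paper elsewhere in this thread. Activation memory, which scales with the number of 3D patches (and hence with the (64, 256, 256) input size and batch size), usually dominates and is not estimated here, so treat this as a lower bound only:

```python
# Back-of-envelope memory estimate for fine-tuning a 12-layer, 768-dim ViT
# (the encoder configuration quoted from the paper in another issue here).
# Activation memory depends on patch size, input size, and batch size and is
# NOT included -- this is a floor, not a full answer.

def vit_param_count(layers=12, d=768):
    # ~4*d^2 for attention (q, k, v, out) plus ~8*d^2 for a 4x MLP, per layer
    return layers * 12 * d * d

def finetune_weight_memory_gb(params, bytes_per_param=16):
    # fp32 weights (4 B) + gradients (4 B) + Adam moments (8 B) = 16 B/param
    return params * bytes_per_param / 1024**3

params = vit_param_count()
print(f"{params / 1e6:.0f}M params, "
      f"~{finetune_weight_memory_gb(params):.1f} GB for weights+grads+Adam")
```

So yes, the required memory is input-size dependent: the fixed weight/optimizer cost above is modest, and the growth comes from activations on large 3D volumes.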

How to do few-shot and CoT prompting

Dear Author

Thank you for the exciting work.
I have run your demo and it works. How do I modify your code to do few-shot and chain-of-thought (CoT) prompting?

Would you be able to provide some examples?
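In the absence of an official example, few-shot and CoT prompting usually reduce to prompt construction: concatenate worked exemplars before the real question, then append a reasoning cue. The layout below (separators, `Question:`/`Answer:` labels, the CoT cue) is an assumption to adapt to the demo's actual input format, not RadFM's documented interface:

```python
# Hypothetical sketch of few-shot / chain-of-thought prompt construction.
# The "Question:"/"Answer:" labels and the CoT cue are illustrative choices,
# not RadFM's confirmed prompt format -- splice into test.py's input string.

def build_few_shot_prompt(examples, question,
                          cot_cue="Let's think step by step."):
    """examples: list of (question, answer) pairs shown before the real query."""
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    parts.append(f"Question: {question}\nAnswer: {cot_cue}")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    [("Is there a fracture in this X-ray?", "No acute fracture is seen.")],
    "What abnormality does this scan show?",
)
```

For multimodal few-shot, each exemplar would also need its own image placeholder, in the same order as the supplied images.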

change prompt

The work you've done is truly commendable and holds immense potential.

However, I've come across an issue that I'd like to bring to your attention. When using specific queries, particularly those related to medical imaging such as "Please write a report for this X-ray image" or "Please write a radiology report consists of findings that explains this medical scan", the model either provides no response or consistently outputs "MS Plaques".

So what prompt should I use to get the model to produce meaningful output?

Rad3D data not complete

Hello, I downloaded the radiopaedia data you provided and found that the rad3d_train.json seems to be invalid. Loading it in Python shows "Unterminated string starting at: line 1704502 column 13 (char 80168752)". I checked the JSON file, and it seems to be incomplete after line 1704502. Could you double-check and share the full version with us? Thank you very much!

Rad3D dataset

Hi, are there any plans to release the Rad3D dataset?

Questions about model inference

Background: my machine has multiple GPUs, but each card has only 16 GB of memory. When I run test.py for model inference, it reports insufficient GPU memory.
Question 1: What is the minimum GPU memory required for RadFM inference, e.g. N GB?
Question 2: If no single GPU has N GB, is there another way to run inference, e.g. DeepSpeed? Could you give a concrete description of such a setup?
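One general option when no single card is big enough is to place different submodules on different devices and move activations between them. On real hardware you would use `"cuda:0"`/`"cuda:1"` (or HuggingFace Accelerate's `load_checkpoint_and_dispatch` with a `device_map`); the toy below uses `"cpu"` only so it runs anywhere, and its module names are hypothetical, not RadFM's actual attributes:

```python
import torch
import torch.nn as nn

# Toy sketch of splitting a model's submodules across devices by hand.
# "vision"/"lang" are hypothetical stand-ins for RadFM's actual submodules;
# on a multi-GPU machine replace "cpu" with "cuda:0" / "cuda:1".

class ToyMultimodal(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision = nn.Linear(32, 32)   # stand-in for the vision encoder
        self.lang = nn.Linear(32, 32)     # stand-in for the language model

    def forward(self, x):
        # Move activations to each stage's device as they flow through.
        x = self.vision(x.to(next(self.vision.parameters()).device))
        return self.lang(x.to(next(self.lang.parameters()).device))

model = ToyMultimodal()
dev0, dev1 = "cpu", "cpu"   # e.g. "cuda:0", "cuda:1" on real hardware
model.vision.to(dev0)
model.lang.to(dev1)
out = model(torch.randn(2, 32))
print(out.shape)  # torch.Size([2, 32])
```

Whether this fits on 16 GB cards still depends on the largest single submodule; half precision (see the Quick_demo OOM issue below) can be combined with sharding.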

Error loading the weights

Thank you for making your model available. Awesome work!

I was trying to load the model and got the error message below. I downloaded all the parts and concatenated them with cat RadFM.z* > model.zip, then ran unzip model.zip and got the file pytorch_model.bin. Am I doing this procedure right? Thank you.

RuntimeError: Error(s) in loading state_dict for MultiLLaMAForCausalLM: Unexpected key(s) in state_dict: "lang_model.model.layers.0.self_attn.rotary_emb.inv_freq", "lang_model.model.layers.1.self_attn.rotary_emb.inv_freq", … (the same rotary_emb.inv_freq key repeated for layers 2 through 39) …, "embedding_layer.bert_model.embeddings.position_ids".
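The unexpected keys are all non-learned buffers (rotary-embedding `inv_freq` values and a BERT `position_ids` buffer) that some transformers versions store in the checkpoint and others recompute on the fly, so the concatenation step is likely fine. One common workaround, untested here against RadFM itself, is loading with `strict=False`; a toy reproduction of the symptom and the fix:

```python
import torch
import torch.nn as nn

# The unexpected keys in the traceback are non-learned buffers whose presence
# depends on the transformers version. They carry no trained weights, so
# load_state_dict(strict=False) is a common workaround. Toy reproduction:

model = nn.Sequential(nn.Linear(4, 4))
state = model.state_dict()
state["0.rotary_emb.inv_freq"] = torch.ones(2)  # stale buffer in checkpoint

result = model.load_state_dict(state, strict=False)
print(result.unexpected_keys)  # ['0.rotary_emb.inv_freq']
# Sanity check: no *learned* weights should be reported missing.
assert result.missing_keys == []
```

Matching the transformers version the repo was developed against avoids the mismatch entirely; `strict=False` is only safe when, as here, the skipped keys are buffers rather than weights.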

lang_encoder_path

I am in the process of training a model and I have a question regarding the lang_encoder_path. I've placed the pytorch_model.bin file in it. However, I encountered an issue where I received the following message: "Some weights of the model checkpoint at ./Language_files were not used when initializing LlamaForCausalLM."

Could you provide guidance on this matter?

Image Format Inconsistency between the Codes and the Downloaded Disease Diagnosis Datasets

Hi, thanks for sharing such meaningful work!

I encountered an issue while testing the downloaded disease diagnosis datasets (e.g., Vindr-SpineXR, Vindr-PCXR, and Vindr-Mammo). The image formats in these datasets are ".dicom", whereas the .csv file you uploaded to Hugging Face uses the ".png" format. I have attached a screenshot illustrating this issue.


Given this situation, I kindly request your assistance in providing a straightforward guide or instructions on converting the ".dicom" files to ".png" format. This guidance would be immensely beneficial not only to me but potentially to other users who might encounter the same challenge.
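Pending official instructions, a minimal conversion sketch is below. It assumes the third-party packages pydicom and Pillow are installed; the min-max normalisation is a simplification (proper CT windowing would apply `RescaleSlope`/`RescaleIntercept` and a window/level), so it is a starting point, not the dataset's official recipe:

```python
import numpy as np

# Hedged .dicom -> .png conversion sketch, assuming pydicom and Pillow.
# Min-max scaling is a simplification; modality-specific windowing
# (RescaleSlope/Intercept, window center/width) may be needed for CT.

def normalize_to_uint8(pixels: np.ndarray) -> np.ndarray:
    """Linearly rescale any pixel array to the 0-255 uint8 range."""
    pixels = pixels.astype(np.float32)
    lo, hi = pixels.min(), pixels.max()
    if hi > lo:
        pixels = (pixels - lo) / (hi - lo)
    else:
        pixels = np.zeros_like(pixels)
    return (pixels * 255).astype(np.uint8)

def dicom_to_png(dicom_path: str, png_path: str) -> None:
    import pydicom            # third-party: pip install pydicom pillow
    from PIL import Image
    ds = pydicom.dcmread(dicom_path)
    Image.fromarray(normalize_to_uint8(ds.pixel_array)).save(png_path)
```

Renaming the outputs to match the ".png" paths in the released .csv should then make the files resolvable.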

Could this great model be utilized as a pre-trained feature extractor for radiology images without the need for accompanying language inputs?

Dear Chaoyi,

I trust this message finds you well. I recently had the opportunity to delve into your remarkable work, and I must express my admiration for the innovative approach and substantial contributions outlined in your article.

In particular, the integration of language and vision in your proposed large model captured my attention. The versatility showcased in handling combinations of visual images and language questions is indeed impressive. However, my inquiry pertains to the potential applicability of your model as a standalone feature extractor for radiology images.

Given the success of models like CLIP in serving as effective image encoders, I am curious to know if your model, too, could be employed in a similar capacity. Can it be utilized as a pre-trained feature extractor for radiology images without the need for accompanying language inputs? I am interested in understanding the extent to which your model's capabilities extend to image processing tasks in the domain of radiology.

Thank you for your time and consideration. I look forward to gaining insights into this aspect of your work and exploring potential applications in the realm of medical imaging.
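In general, image-only feature extraction from a multimodal model amounts to running just the vision backbone and pooling its patch tokens. The attribute path to RadFM's actual 3D ViT inside MultiLLaMAForCausalLM is not documented in this thread, so `ToyEncoder` below is a stand-in and the whole sketch illustrates the generic recipe, not the repo's API:

```python
import torch
import torch.nn as nn

# Sketch of using a vision backbone as a standalone feature extractor.
# ToyEncoder is a hypothetical stand-in for the real 3D ViT; swap in the
# actual encoder submodule once its attribute path is known.

class ToyEncoder(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):            # x: (batch, n_patches, dim)
        return self.proj(x)

@torch.no_grad()
def extract_features(encoder, patch_tokens):
    """Mean-pool patch tokens into one embedding per image -- no text needed."""
    encoder.eval()
    tokens = encoder(patch_tokens)   # (B, N, D)
    return tokens.mean(dim=1)        # (B, D)

feats = extract_features(ToyEncoder(), torch.randn(2, 16, 768))
print(feats.shape)  # torch.Size([2, 768])
```

Whether such features are competitive with contrastively trained encoders like CLIP is an empirical question for the authors.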

What is the shape of the 3D data in your dataset?

Hi Wu,

Thank you for your great contribution to the community. I have a small question about the data shape. I found this operation here; it seems that the raw 3D CT shape is N,C,H,W,D. In image_dict, the "image" shape is C,H,W,D after indexing. However, in the contain_nan branch here, the image shape is C,H,W,D, which cannot be indexed by image.shape[0]. I am not sure whether my understanding is correct. I just want to figure out what the shape of "image" in image_dict is, and whether the shapes are uniform regardless of whether they are 2D or 3D images.

Looking forward to your reply.
Best regards,
BAI Fan
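For what it's worth, a uniform layout across 2D and 3D inputs is usually achieved by giving 2D images a depth axis of size 1 and zero-padding depths to a common value. The sketch below illustrates that convention under the (C, H, W, D) reading described above; it is not RadFM's confirmed preprocessing:

```python
import torch
import torch.nn.functional as F

# Sketch of unifying 2D and 3D images into one (C, H, W, D) layout:
# 2D images get a depth axis of 1, then depths are zero-padded to a common D.
# This follows the questioner's reading of the code, as an illustration only.

def to_chwd(img: torch.Tensor) -> torch.Tensor:
    """(C, H, W) -> (C, H, W, 1); a (C, H, W, D) tensor passes through."""
    return img.unsqueeze(-1) if img.dim() == 3 else img

def pad_depth(img: torch.Tensor, target_d: int) -> torch.Tensor:
    """Zero-pad the last (depth) axis of a (C, H, W, D) tensor to target_d."""
    return F.pad(img, (0, target_d - img.shape[-1]))

xr = to_chwd(torch.randn(3, 256, 256))   # 2D X-ray -> depth 1
ct = torch.randn(3, 256, 256, 40)        # 3D CT volume
batch = torch.stack([pad_depth(xr, 64), pad_depth(ct, 64)])
print(batch.shape)  # torch.Size([2, 3, 256, 256, 64])
```

Under this convention image.shape[0] is always the channel axis, consistent for 2D and 3D inputs.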

question about perceiver in code and paper

Thank you for your work, but I have a small question about the paper and the code.
In the paper, the perceiver is described as:
"For the visual encoder, we adopt a 12-layer 3D ViT with 768 feature dimensions and the
perceiver is chosen as 6-layer transformer decoder with the learnable latent array in 32 × 5120 dimension, so
that all images will be embeded as a 32 × 5120 feature embedding after passing visual encoding and perceiver
aggregation."
However, in the code the latent dimension is 32 × 768, and it is extended to 32 × 5120 by a fully connected (fc) layer afterwards; I do not quite understand the effect of this fc layer.
I think not using 5120 inside the perceiver may be due to GPU memory limitations, and extending to 5120 with the fc layer may decode the latent information to facilitate processing by subsequent models.
So I would like to ask about the effect of the fc layer.
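The structure the question describes can be sketched as follows: a learnable 32 × 768 latent array cross-attends to image tokens through a 6-layer transformer decoder, and a final linear layer lifts 768 → 5120 to match the language model's hidden size. Layer hyperparameters (heads, FFN width) below are guesses, not the repo's values:

```python
import torch
import torch.nn as nn

# Minimal perceiver-resampler sketch matching the dims in the question:
# 32x768 learnable latents, 6-layer transformer decoder, then fc to 5120.
# nhead and the FFN width are illustrative guesses, not the repo's settings.

class PerceiverSketch(nn.Module):
    def __init__(self, dim=768, out_dim=5120, n_latents=32, layers=6):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_latents, dim))
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=layers)
        # The fc layer only changes width: attention runs at 768, which is
        # far cheaper in memory/compute than attending at 5120.
        self.fc = nn.Linear(dim, out_dim)

    def forward(self, image_tokens):            # (B, N, 768)
        b = image_tokens.shape[0]
        lat = self.latents.expand(b, -1, -1)    # (B, 32, 768)
        lat = self.decoder(lat, image_tokens)   # cross-attend to image tokens
        return self.fc(lat)                     # (B, 32, 5120)

out = PerceiverSketch()(torch.randn(2, 100, 768))
print(out.shape)  # torch.Size([2, 32, 5120])
```

This supports the questioner's reading: aggregation happens at the ViT's 768 width to save memory, and the fc layer is a pure width projection so the 32 tokens slot into the 5120-dimensional LLM embedding space.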

Quick_demo does not work

Hi,
thank you all for this awesome work, and also for sharing it!

I tried to run the Quick_demo by following the README instructions:

  1. Downloaded the pytorch_model.bin from [1] and placed it in folder Quick_demo/

  2. Following issue #5, downgraded to transformers 4.28.1

  3. Ran python test.py with a) 4x RTX3090, then b) 8x RTX3090, but got the following OOM error both times:
    Setup tokenizer
    Finish loading tokenizer
    Setup demo case
    Finish loading demo case
    Setup Model
    Finish loading model
    Traceback (most recent call last):
    File "/storage/homefs/lz20w714/git/radfm/Quick_demo/test.py", line 121, in
    main()
    File "/storage/homefs/lz20w714/git/radfm/Quick_demo/test.py", line 105, in main
    model = model.to('cuda')
    File "/storage/homefs/lz20w714/anaconda3/envs/llmood/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
    File "/storage/homefs/lz20w714/anaconda3/envs/llmood/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
    File "/storage/homefs/lz20w714/anaconda3/envs/llmood/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
    File "/storage/homefs/lz20w714/anaconda3/envs/llmood/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
    [Previous line repeated 3 more times]
    File "/storage/homefs/lz20w714/anaconda3/envs/llmood/lib/python3.9/site-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
    File "/storage/homefs/lz20w714/anaconda3/envs/llmood/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 100.00 MiB (GPU 0; 23.70 GiB total capacity; 23.39 GiB already allocated; 10.69 MiB free; 23.39 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Should a quantized version of the model be used, or something else?

Thanks for your help!

Best,
Lukas

[1] https://huggingface.co/chaoyi-wu/RadFM/blob/main/pytorch_model.zip
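Since the failure happens at `model.to('cuda')`, the full-precision weights alone exceed a single 24 GB card, and extra GPUs do not help because a plain `.to('cuda')` places everything on GPU 0. One common mitigation, untested here against RadFM itself, is casting to fp16 before the move (`model = model.half()` before `model.to('cuda')`), which halves the weight footprint; a toy demonstration of the effect:

```python
import torch
import torch.nn as nn

# model.to('cuda') OOMs because fp32 weights alone exceed 24 GB on one card.
# Casting to fp16 first halves weight memory; shown here on a toy module.

def param_bytes(model: nn.Module) -> int:
    """Total bytes occupied by a module's parameters."""
    return sum(p.numel() * p.element_size() for p in model.parameters())

model = nn.Linear(1024, 1024)
fp32 = param_bytes(model)
model.half()          # in the demo: model = model.half() before .to('cuda')
fp16 = param_bytes(model)
print(fp32 // fp16)   # 2 -- fp16 halves the weight footprint
```

If fp16 is still too large for one card, combining it with per-submodule sharding or 8-bit quantization would be the next things to try.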
