
st-net's Introduction

Integrating spatial gene expression and breast tumour morphology via deep learning

ST-Net is a machine learning model for predicting spatial transcriptomics measurements from haematoxylin-and-eosin-stained pathology slides. For more details, see the accompanying paper:

Integrating spatial gene expression and breast tumour morphology via deep learning
by Bryan He, Ludvig Bergenstråhle, Linnea Stenbeck, Abubakar Abid, Alma Andersson, Åke Borg, Jonas Maaskola, Joakim Lundeberg & James Zou.
Nature Biomedical Engineering (2020).

Downloading Dataset and Configuring Paths

By default, the raw data must be downloaded from here and placed at data/hist2tscript/. The processed files will then be written to data/hist2tscript-patch/. These locations can be changed by creating a config file. The config filenames are checked in the following order of priority: stnet.cfg, .stnet.cfg, ~/stnet.cfg, ~/.stnet.cfg. An example config file is given as example.cfg.
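
The config filename priority described above can be sketched as follows. This is a minimal illustration, not the package's actual loader: the helper name `find_config` and its `cwd` parameter are hypothetical, and only the four filenames and their order come from the text.

```python
import os

# Candidate config filenames, highest priority first (order taken from the text).
CANDIDATES = ["stnet.cfg", ".stnet.cfg", "~/stnet.cfg", "~/.stnet.cfg"]


def find_config(candidates=CANDIDATES, cwd="."):
    """Return the first existing config path, or None if no candidate exists.

    Hypothetical helper: "~" is expanded to the user's home directory,
    and any remaining relative name is resolved against `cwd`.
    """
    for name in candidates:
        path = os.path.expanduser(name)
        if not os.path.isabs(path):
            path = os.path.join(cwd, path)
        if os.path.isfile(path):
            return path
    return None
```

Calling `find_config()` from the directory where the commands are run returns the file that would take precedence under this ordering, or `None` when the defaults apply.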

Preparing Spatial Data

This code assumes that the raw data has been extracted into SPATIAL_RAW_ROOT as specified by the config file.

python3 -m stnet prepare spatial  # caches the counts and tumor labels into npz files
bin/create_tifs.sh                  # converts jpegs into tiled tif files

Training models

The models for the main results can be trained by running:

ngenes=250
model=densenet121
window=224
for patient in `python3 -m stnet patients`
do
    bin/cross_validate.py output/${model}_${window}/top_${ngenes}/${patient}_ 4 50 ${patient} --lr 1e-6 --window ${window} --model ${model} --pretrain --average --batch 32 --workers 7 --gene_n ${ngenes} --norm
done
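
For readers who prefer to drive the runs from Python, the loop above can be sketched as below. The helpers `cross_validate_cmd`, `list_patients` and `main` are hypothetical names introduced for illustration. The positional arguments and flags are copied verbatim from the shell command and are passed through unchanged, since their meaning is not documented here.

```python
import subprocess


def cross_validate_cmd(patient, model="densenet121", window=224, ngenes=250):
    """Build the argument list for one patient, mirroring the shell loop above."""
    root = f"output/{model}_{window}/top_{ngenes}/{patient}_"
    return [
        "bin/cross_validate.py", root, "4", "50", patient,
        "--lr", "1e-6", "--window", str(window), "--model", model,
        "--pretrain", "--average", "--batch", "32", "--workers", "7",
        "--gene_n", str(ngenes), "--norm",
    ]


def list_patients():
    """Return the whitespace-separated IDs printed by `python3 -m stnet patients`."""
    result = subprocess.run(
        ["python3", "-m", "stnet", "patients"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def main():
    """Run cross-validation for every patient; call this from the repository root."""
    for patient in list_patients():
        subprocess.run(cross_validate_cmd(patient), check=True)
```

Passing the command as a list avoids shell quoting issues, and `check=True` stops the loop at the first failing run instead of silently continuing.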

To run the comparison for different window sizes:

ngenes=250
model=densenet121
for window in 128 299 512
do
    for patient in `python3 -m stnet patients`
    do
        bin/cross_validate.py output/${model}_${window}/top_${ngenes}/${patient}_ 4 50 ${patient} --lr 1e-6 --window ${window} --model ${model} --pretrain --average --batch 32 --workers 7 --gene_n ${ngenes} --norm
    done
done

To run the comparison for different magnifications:

ngenes=250
window=224
model=densenet121
for downsample in 2 4
do
    for patient in `python3 -m stnet patients`
    do
        bin/cross_validate.py output/${model}_${window}/top_${ngenes}_downsample_${downsample}/${patient}_ 4 50 ${patient} --lr 1e-6 --window ${window} --model ${model} --pretrain --average --batch 32 --workers 4 --gene_n ${ngenes} --norm --downsample ${downsample}
    done
done

To run the comparison against random initialization:

ngenes=250
model=densenet121
window=224
for patient in `python3 -m stnet patients`
do
    bin/cross_validate.py output/${model}_${window}/top_${ngenes}_rand/${patient}_ 4 50 ${patient} --lr 1e-6 --window ${window} --model ${model} --average --batch 32 --workers 7 --gene_n ${ngenes} --norm
done

To run the comparison against individual training of genes:

ngenes=250
model=densenet121
window=224
for i in `seq 10`
do
    ensg=`python3 -m stnet ensg ${i}`
    for patient in `python3 -m stnet patients`
    do
        bin/cross_validate.py output/${model}_${window}/top_${ngenes}_singletask_${i}/${patient}_ 4 50 ${patient} --lr 1e-6 --window ${window} --model ${model} --pretrain --average --batch 32 --workers 7 --gene_list ${ensg} --norm
    done
done

To run the comparison against hand-crafted features:

ngenes=250
window=224
model=rf
for patient in `python3 -m stnet patients`
do
    root=output/${model}_${window}/top_${ngenes}/${patient}_ 
    python3 -m stnet run_spatial --gene --logfile ${root}gene.log --epochs 1 --pred_root ${root} --testpatients ${patient} --window ${window} --model ${model} --batch 32 --workers 7 --gene_n ${ngenes} --norm --cpu
done

Analysis

The main results can be generated by running:

bin/generate_figs.py output/densenet121_224/top_250/ cv

The corresponding results for the comparisons are generated by running:

for i in output/densenet121_224/*; do bin/generate_figs.py $i cv; done
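
The shell loop above can also be sketched in Python. The helpers `experiment_dirs` and `figure_cmds` are hypothetical names. Unlike the shell glob, this sketch keeps only directories, on the assumption that each experiment lives in its own sub-directory of output/densenet121_224/.

```python
from pathlib import Path


def experiment_dirs(root="output/densenet121_224"):
    """Return the non-hidden sub-directories of `root`, sorted by name."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        p for p in base.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def figure_cmds(root="output/densenet121_224"):
    """Build one `bin/generate_figs.py <dir> cv` command per experiment directory."""
    return [["bin/generate_figs.py", str(d), "cv"] for d in experiment_dirs(root)]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.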

Generating Figures

The following blocks of code are used for generating several of the figures.

Visualization of prediction across whole slide:

bin/visualize.py output/sherlock/BC23450_cv.npz --gene FASN
bin/visualize.py output/sherlock/BC23903_cv.npz --gene FASN

UMAP Clustering:

bin/cluster.py


st-net's Issues

create_tifs.sh requires "histonet"

Hi @bryanhe,
I have two questions:

  1. Why did you choose to convert the JPEGs to TIF tiles? Are there any specific reasons why JPG patches should be avoided?
  2. What is the "histonet" package? I couldn't find it on PyPI. Could you please give a link to it?

Thank you

No ensembl.tsv in the utils folder

Hi Bryan,

I am trying to run the code. I downloaded the stnet folder and put it into /lib/python3.6/site-packages/. I also downloaded the whole repository and ran setup.py to set up the environment. I downloaded the data from https://data.mendeley.com/datasets/29ntw7sh4r/2, which is the breast cancer data. I put the stnet.cfg file in my test folder and ran python3 -m stnet prepare spatial from the test folder, but it shows the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'my environment path/lib/python3.6/site-packages/stnet/utils/ensembl.tsv'

My cfg file looks like this:
SPATIAL_RAW_ROOT = "test folder"/input
SPATIAL_PROCESSED_ROOT = "test folder"/input-patch

The data in the input folder is BC23209_C1_stdata.tsv, HE_BC23209_C1.jpg and spots_BC23209_C1.csv.

I am new to Python, so maybe my question is silly. It would be super helpful if you could tell me where I went wrong. Thanks!

Jiawen

ensembl.tsv

Hello Bryan,

I recently read your work on stnet and am very interested in the model. I tried your command "python3 -m stnet prepare spatial", but there are no ensembl.pkl or ensembl.tsv files in the utils folder.

Thanks,
Kenong

Cannot find "_Coords.tsv" file

Hi

I am trying to replicate the stnet results on the original data. While running python3 -m stnet prepare spatial, the code asks for a _coords.tsv file, which is not present in the dataset.
Do I need to create it from the metadata, or can I download it from some source?

Please help

Thanks

Computer beeps while running cross validation with GPU

Hello everyone!
I have a mysterious problem.
Whenever my colleague runs cross-validation in GPU mode, the computer makes a beeping noise. This happens only when running cross-validation on the GPU; there is no issue when he uses only the CPU to run this program.

It occurs within 3-5 minutes of starting the cross-validation process.
Can somebody help me out here?

This is our workstation configuration:
Processor (AMD RYZEN Threadripper 3rd Gen): 32-Core 3.70 GHz AMD RYZEN Threadripper 3970X 3rd Gen
Air-cooling (CPU)
Memory (DDR4 3200 MHz): 128 GB
GPU Support: 4 GPU-Ready (only for 4x NVIDIA RTX 2080 Ti; 1600W power supply)
Operating System: Ubuntu 20.04
Graphics Card: 2 x NVIDIA RTX A6000 48 GB
NVIDIA NVLink Bridge: 1 x NVLink for 2 x NVIDIA RTX (up to +15% performance)
HDD #1 (Operating System, Applications):1 TB PCI-E 3.0 NVMe SSD
HDD #2:1 TB PCI-E 4.0 NVMe SSD
HDD #3:8 TB HDD
HDD #4:8 TB HDD

Decompression Bomb Warning & KeyError: 'pixel_x'

Not sure if @bryanhe or anyone here has encountered the same problem as me during the pre-processing stage:

While running the command python3 -m stnet prepare spatial, there is a DecompressionBombWarning followed by a TypeError, as below:

INFO     [08/29 09:46:19] Loading raw data...
100%|██████████████████████████████████████████████████████████████| 68/68 [00:09<00:00,  8.02it/s]
INFO     [08/29 09:46:29] Loading raw data took 9.390724658966064 seconds.
INFO     [08/29 09:46:29] Finding list of genes: 0.22011828422546387
INFO     [08/29 09:46:29] Processing 1 / 23: BT23508
INFO     [08/29 09:46:29] Processing BT23508 D2...
INFO     [08/29 09:46:30] Adding zeros and ordering columns: 1.4410650730133057
INFO     [08/29 09:46:33] Extracting counts: 3.0241270065307617
INFO     [08/29 09:46:34] Extracting tumors: 0.189774751663208
/home/xx/anaconda3/envs/stnet/lib/python3.7/site-packages/PIL/Image.py:2735: DecompressionBombWarning: Image size (92491075 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  DecompressionBombWarning,
INFO     [08/29 09:46:35] Loading image: 1.348219394683838
Traceback (most recent call last):
  File "/home/xx/anaconda3/envs/stnet/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 4380, in get_value
    return libindex.get_value_box(s, key)
  File "pandas/_libs/index.pyx", line 52, in pandas._libs.index.get_value_box
  File "pandas/_libs/index.pyx", line 48, in pandas._libs.index.get_value_at
  File "pandas/_libs/util.pxd", line 113, in pandas._libs.util.get_value_at
  File "pandas/_libs/util.pxd", line 98, in pandas._libs.util.validate_indexer
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception, KeyError: 'pixel_x', occurred:

Traceback (most recent call last):
  File "/home/george/anaconda3/envs/stnet/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/george/anaconda3/envs/stnet/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/george/ST-Net-master/stnet/__main__.py", line 8, in <module>
    stnet.main()
  File "/home/george/ST-Net-master/stnet/main.py", line 15, in main
    func(args)
  File "/home/george/ST-Net-master/stnet/cmd/prepare/spatial.py", line 94, in spatial
    x = int(round(row["pixel_x"]))
  File "/home/george/anaconda3/envs/stnet/lib/python3.7/site-packages/pandas/core/series.py", line 868, in __getitem__
    result = self.index.get_value(self, key)
  File "/home/george/anaconda3/envs/stnet/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 4388, in get_value
    raise e1
  File "/home/george/anaconda3/envs/stnet/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 4374, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas/_libs/index.pyx", line 81, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 89, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'pixel_x'
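
As a reading aid for the log in this issue, the sketch below checks the reported image size against the pixel limit printed in the warning. It assumes Pillow's documented behaviour: a DecompressionBombWarning above the limit and an error only above twice the limit. The function `classify` is a hypothetical helper, not part of Pillow or stnet. Under that assumption the warning is not what stops the run; the traceback ends in the KeyError on 'pixel_x'.

```python
# Pixel limit reported in the warning message (Pillow's default MAX_IMAGE_PIXELS).
MAX_IMAGE_PIXELS = 1024 * 1024 * 1024 // 4 // 3  # 89478485


def classify(pixels, limit=MAX_IMAGE_PIXELS):
    """Mimic Pillow's documented thresholds for oversized images."""
    if pixels > 2 * limit:
        return "error"    # would raise DecompressionBombError
    if pixels > limit:
        return "warning"  # DecompressionBombWarning only; loading continues
    return "ok"


print(classify(92491075))  # image size from the log; prints "warning"
```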
