Git Product home page Git Product logo

spatialdm's People

Contributors

huangyh09 avatar jackywangtj66 avatar leeyoyohku avatar shoufachen avatar wyq977 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

spatialdm's Issues

time consuming about permutation!

adata_sub is about 4k cells and 25k genes,
sdm.spatialdm_global(adata_sub, n_perm = 1000, specified_ind=None, method="z-score", nproc=6) cost about 2 minutes,
however,
sdm.spatialdm_global(adata_sub, n_perm = 1000, specified_ind=None, method="permutation", nproc=6) cost about 2 hours!

The process of 'Spatial Clustering of Local Spots' using SpatialDE is taking an excessive amount of time. Could I have overlooked some parameters?

I ran the 'Spatial Clustering of Local Spots' step using the following code:

from threadpoolctl import threadpool_limits
with threadpool_limits(limits=10, user_api='blas'):
    results = SpatialDE.run(adata.obsm['spatial'], bin_spots.transpose())
    histology_results, patterns = SpatialDE.aeh.spatial_patterns(
        adata.obsm['spatial'], bin_spots.transpose(), results, C=3, l=3, verbosity=1)

For 7,000 spots, it took approximately 4 hours. However, for 30,000 spots, it has been running for more than 48 hours and hasn't finished yet. Does anyone have insights into why it takes so long, or did I miss some parameters?

Load 10xgenomics Visium data using scanpy

Hello,

Thank you for sharing the package.

I am trying to load some Visium data following the tutorial. In there, I see that the data is loaded as adata1 = sdm.datasets.dataset.melanoma(), I thought that I could just do adata = sc.read_visium("visium_ov_orig") to load my own files(using a Visium h5 file, this can be replicated with this file ), but then, I am getting errors in other parts of the tutorial, it starts failing when I call:
sdm.extract_lr(adata, 'human', min_cell=3)
with the error:
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Is there a different way to load this data? I tried checking the codebase and couldn't find a method that seemed more suitable.

Thank you

Supp Note 1 and Example

Hi Zhuoxuan Li (@leeyoyohku).

When I am reading the paper on bioarxic, I couldn't find the Supp. Note 1 that contains the derivation for null distribution of Global Moran's R and other methodology. Can you send me a pdf or update the manuscript? That would be really helpful.

BTW, when I was running the examples (3 jupyter notebook), NaiveDE and SpatialDE needs to installed.

It would be helpful that you could add a short note for the requirement (just for the example not the requirement for the package)

e.g.

To run the examples notebooks, NaiveDE and SpatialDE needs to be installed.

NaiveDE

# latest version
pip install -U git+https://github.com/Teichlab/NaiveDE
# or pypi
# pip install NaiveDE

SpatialDE

pip install spatialde

hilearn

# latest version
pip install -U git+https://github.com/huangyh09/hilearn
# or pypi
# pip install hilearn

Code Update Request: Avoid 'np.float' Deprecation Error in melanoma.ipynb

Thank you for creating such a wonderful software!

I immediately installed SpatialDM and tried running melanoma.ipynb, but I got an error.
https://github.com/StatBiomed/SpatialDM/blob/main/tutorial/melanoma.ipynb

Here is the code cell where the error occurred:

import time
start = time.time()
sdm.spatialdm_global(adata, 1000, specified_ind=None, method='both', nproc=1)     # global Moran selection
sdm.sig_pairs(adata, method='permutation', fdr=True, threshold=0.1)     # select significant pairs
print("%.3f seconds" %(time.time()-start))

The error is as follows:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[8], line 3
      1 import time
      2 start = time.time()
----> 3 sdm.spatialdm_global(adata, 1000, specified_ind=None, method='both', nproc=1)     # global Moran selection
      4 sdm.sig_pairs(adata, method='permutation', fdr=True, threshold=0.1)     # select significant pairs
      5 print("%.3f seconds" %(time.time()-start))

File /opt/conda/lib/python3.11/site-packages/spatialdm/main.py:205, in spatialdm_global(adata, n_perm, specified_ind, method, nproc)
    202     raise ValueError("Only one of ['z-score', 'both', 'permutation'] is supported")
    204 with threadpool_limits(limits=nproc, user_api='blas'):
--> 205     pair_selection_matrix(adata, n_perm, specified_ind, method)
    207 adata.uns['global_res'] = pd.concat((adata.uns['ligand'], adata.uns['receptor']),axis=1)
    208 # adata.uns['global_res'].columns = ['Ligand1', 'Ligand2', 'Ligand3', 'Receptor1', 'Receptor2', 'Receptor3', 'Receptor4']

File /opt/conda/lib/python3.11/site-packages/spatialdm/utils.py:114, in pair_selection_matrix(adata, n_perm, sel_ind, method)
    112 # averaged ligand values
    113 L1 = [pd.Series(x[0]).dropna().values for x in ligand.values]
--> 114 L_mat = [adata[:, L1[l]].X.astype(np.float)[:, 0] for l in range(len(L1))]
    115 for i, k in enumerate(ligand.index):
    116     if len(ligand.loc[k].dropna()) > 1:

File /opt/conda/lib/python3.11/site-packages/spatialdm/utils.py:114, in <listcomp>(.0)
    112 # averaged ligand values
    113 L1 = [pd.Series(x[0]).dropna().values for x in ligand.values]
--> 114 L_mat = [adata[:, L1[l]].X.astype(np.float)[:, 0] for l in range(len(L1))]
    115 for i, k in enumerate(ligand.index):
    116     if len(ligand.loc[k].dropna()) > 1:

File /opt/conda/lib/python3.11/site-packages/numpy/__init__.py:305, in __getattr__(attr)
    300     warnings.warn(
    301         f"In the future `np.{attr}` will be defined as the "
    302         "corresponding NumPy scalar.", FutureWarning, stacklevel=2)
    304 if attr in __former_attrs__:
--> 305     raise AttributeError(__former_attrs__[attr])
    307 # Importing Tester requires importing all of UnitTest which is not a
    308 # cheap import Since it is mainly used in test suits, we lazy import it
    309 # here to save on the order of 10 ms of import time for most users
    310 #
    311 # The previous way Tester was imported also had a side effect of adding
    312 # the full `numpy.testing` namespace
    313 if attr == 'testing':

AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

My environment is

  • Python 3.11.4
  • numpy 1.24.4

So, I think the above issue would occur when users use numpy >1.20.

I would appreciate it if you could update the code to avoid deprecation concerning np.float.

Load 10xgenomics Visium data using scanpy

Hello,

Thank you for sharing the package.

I am trying to load some Visium data following the tutorial. In there, I see that the data is loaded as adata1 = sdm.datasets.dataset.melanoma(), I thought that I could just do adata = sc.read_visium("visium_ov_orig") to load my own files(using a Visium h5 file, this can be replicated with this file ), but then, I am getting errors in other parts of the tutorial, it starts failing when I call:
sdm.extract_lr(adata, 'human', min_cell=3)
with the error:
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Is there a different way to load this data? I tried checking the codebase and couldn't find a method that seemed more suitable.

Thank you

positive mask commented out + other questions

Hi,

Congrats for the well-timed and efficient tool!

I wanted to ask in a previous version you had a positive local mask implemented, which is now commented out:

#adata.uns['local_z_p'] = np.where(pos.T == False, 1, adata.uns['local_z_p'])

What is the reasoning for this?

Also, were the correlations between analytical and permutation-based local p-values calculated before or after applying the mask?

Thanks in advance for your response!

Daniel

Help with the simulations from the paper

Hi,

Great paper and package. Thanks alot for making the code open source. I am trying to simulate the dataset for benchmarking cell-cell interaction tools following the Figure 1 f,g,h of the paper. I read the an adapted version of SVCA is being used for simulation. However I am unable to find the source code (or scripts for running SVCA) to generate or reproduce the ROC plots from the paper. It would be great to get some pointers to simulate the data as described.

Thanks again,
Best

InvalidIndexError when running sdm.extract_lr(adata, 'human', min_cell=3)

I encountered an error when running sdm.extract_lr(adata, 'human', min_cell=3), following is the detailed issue:

sdm.weight_matrix(adata, l=1.2, cutoff=0.2, single_cell=False) # weight_matrix by rbf kernel
----> 3 sdm.extract_lr(adata, 'human', min_cell=3)      # find overlapping LRs from CellChatDB
      5 sdm.spatialdm_global(adata, 1000, specified_ind=None, method='both', nproc=1)     # global Moran selection
      6 sdm.sig_pairs(adata, method='permutation', fdr=True, threshold=0.1)     # select significant pairs

File ~\AppData\Roaming\Python\Python39\site-packages\spatialdm\main.py:149, in extract_lr(adata, species, mean, min_cell, datahost)
    147     meanR = gmean(adata[:, receptor[i]].X, axis=1)
    148 else:
--> 149     meanL = adata[:, ligand[i]].X.mean(axis=1)
    150     meanR = adata[:, receptor[i]].X.mean(axis=1)
    151 if (sum(meanL > 0) >= min_cell) * \
    152         (sum(meanR > 0) >= min_cell):

File ~\AppData\Roaming\Python\Python39\site-packages\anndata\_core\anndata.py:1108, in AnnData.__getitem__(self, index)
   1106 def __getitem__(self, index: Index) -> "AnnData":
   1107     """Returns a sliced view of the object."""
-> 1108     oidx, vidx = self._normalize_indices(index)
   1109     return AnnData(self, oidx=oidx, vidx=vidx, asview=True)

File ~\AppData\Roaming\Python\Python39\site-packages\anndata\_core\anndata.py:1089, in AnnData._normalize_indices(self, index)
   1088 def _normalize_indices(self, index: Optional[Index]) -> Tuple[slice, slice]:
-> 1089     return _normalize_indices(index, self.obs_names, self.var_names)

File ~\AppData\Roaming\Python\Python39\site-packages\anndata\_core\index.py:33, in _normalize_indices(index, names0, names1)
     31 ax0, ax1 = unpack_index(index)
     32 ax0 = _normalize_index(ax0, names0)
---> 33 ax1 = _normalize_index(ax1, names1)
     34 return ax0, ax1

File ~\AppData\Roaming\Python\Python39\site-packages\anndata\_core\index.py:95, in _normalize_index(indexer, index)
     93     return positions  # np.ndarray[int]
     94 else:  # indexer should be string array
---> 95     positions = index.get_indexer(indexer)
     96     if np.any(positions < 0):
     97         not_found = indexer[positions < 0]

File ~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexes\base.py:3904, in Index.get_indexer(self, target, method, limit, tolerance)
   3901 self._check_indexing_method(method, limit, tolerance)
   3903 if not self._index_as_unique:
-> 3904     raise InvalidIndexError(self._requires_unique_msg)
   3906 if len(target) == 0:
   3907     return np.array([], dtype=np.intp)

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Can you help?

spatialdm_global does not work with a list of specified index, pair_selection_matrix would change adata

Hi I found out there's a bug in spatialdm_global

adata.uns['global_stat']['z']['st'] = globle_st_compute(adata)

For example, if there are 400 lr pairs after extract_lr, one select 10 pairs for computation.

User cannot specify specified_ind as in globle_st_compute(adata) would also be still 400 (or 400x400)

That would throw a error in calculation due to

adata.uns['global_stat']['z']['st'].shape != np.zeros(total_len).shape

Also in this function, it would change/remove the ligand, receptor and geneInter in adata. Namely, the changes to adata would NOT be local.

For example, if one only specify 2 pairs, after calculation, ligand, receptor and geneInter would be reduced to 2.

You have to re-run extract_lr to get the full list.

def pair_selection_matrix(adata, n_perm, sel_ind, method):

Question about global permutation test

Hello, I noticed that the package compares the actual global I with the permuted global I's to compute the permutation p-value:

    adata.uns['global_stat']['perm']['global_p'] = 1 - (adata.uns['global_I'] \
                         > adata.uns['global_stat']['perm']['global_perm'].T).sum(axis=0) / n_perm

My question is why do you compare only one direction (">"), i.e., greater than the mean under NULL? Global I can be negative, i.e., less than the mean under NULL indicating negative spatial association.

Please let me know if I misunderstood the package and the paper!

I keep getting error when running spatialdm_local AttributeError: 'NoneType' object has no attribute 'to_adata'

Hi,

I followed the spatial dm example but I keep getting errors when running spatialdm_local.
Below is the code I am running.

start = time.time()
sdm.spatialdm_local(adata, n_perm=1000, method='both', specified_ind=None, nproc=1)     # local spot selection
sdm.sig_spots(adata, method='permutation', fdr=False, threshold=0.1)     # significant local spots
print("%.3f seconds" %(time.time()-start))

And here is the error I am getting.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[47], line 2
      1 start = time.time()
----> 2 sdm.spatialdm_local(adata, n_perm=1000, method='both', specified_ind=None, nproc=1)     # local spot selection
      3 sdm.sig_spots(adata, method='permutation', fdr=False, threshold=0.1)     # significant local spots
      4 print("%.3f seconds" %(time.time()-start))

File /n/data1/mgh/neuro/petti/lab/Users/khan.saad/Conda_envs/python_3.10/lib/python3.10/site-packages/spatialdm/main.py:275, in spatialdm_local(adata, n_perm, method, specified_ind, nproc, scale_X)
    273 ## different approaches
    274 with threadpool_limits(limits=nproc, user_api='blas'):
--> 275     spot_selection_matrix(adata, ligand, receptor, ind, n_perm, method, scale_X)

File /n/data1/mgh/neuro/petti/lab/Users/khan.saad/Conda_envs/python_3.10/lib/python3.10/site-packages/spatialdm/utils.py:179, in spot_selection_matrix(adata, ligand, receptor, ind, n_perm, method, scale_X)
    176 def spot_selection_matrix(adata, ligand, receptor, ind, n_perm, method, scale_X=True):
    177     # local variables (only live in this function scope)
    178     # normalize raw counts
--> 179     raw_norm = adata.raw.to_adata()
    180     raw_norm.X = csr_matrix([norm_max(X) for X in raw_norm.X.T]).T
    181     import scanpy as sc

AttributeError: 'NoneType' object has no attribute 'to_adata'

reading in h5ad file

Hi. I'm trying to read in a h5ad file which was generated from seurat R package. I am using this function: adata = sdm.read_spatialdm_h5ad(filename), but I get KeyError: 'geneInter'. Is this not the right way to read in the data? Thanks.

Bugs discovery for single-cell level spatialDM

Hi, I met a bug when inferring cci in image-based spatial data:

sdm.weight_matrix(adata, l=1.2, cutoff=0.2, single_cell=True) # weight_matrix by rbf kernel

image

Could you please take a look? Thanks.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.