statbiomed / spatialdm Goto Github PK
View Code? Open in Web Editor NEWSpatial direct messaging detected by bivariate Moran
Home Page: https://spatialdm.readthedocs.io
License: Apache License 2.0
Spatial direct messaging detected by bivariate Moran
Home Page: https://spatialdm.readthedocs.io
License: Apache License 2.0
adata_sub is about 4k cells and 25k genes,
sdm.spatialdm_global(adata_sub, n_perm = 1000, specified_ind=None, method="z-score", nproc=6) cost about 2 minutes,
however,
sdm.spatialdm_global(adata_sub, n_perm = 1000, specified_ind=None, method="permutation", nproc=6) cost about 2 hours!
I ran the 'Spatial Clustering of Local Spots' step using the following code:
from threadpoolctl import threadpool_limits
with threadpool_limits(limits=10, user_api='blas'):
results = SpatialDE.run(adata.obsm['spatial'], bin_spots.transpose())
histology_results, patterns = SpatialDE.aeh.spatial_patterns(
adata.obsm['spatial'], bin_spots.transpose(), results, C=3, l=3, verbosity=1)
For 7,000 spots, it took approximately 4 hours. However, for 30,000 spots, it has been running for more than 48 hours and hasn't finished yet. Does anyone have insights into why it takes so long, or did I miss some parameters?
Hello,
Thank you for sharing the package.
I am trying to load some Visium data following the tutorial. In there, I see that the data is loaded as adata1 = sdm.datasets.dataset.melanoma()
, I thought that I could just do adata = sc.read_visium("visium_ov_orig")
to load my own files(using a Visium h5 file, this can be replicated with this file ), but then, I am getting errors in other parts of the tutorial, it starts failing when I call:
sdm.extract_lr(adata, 'human', min_cell=3)
with the error:
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Is there a different way to load this data? I tried checking the codebase and couldn't find a method that seemed more suitable.
Thank you
Hi Zhuoxuan Li (@leeyoyohku).
When I am reading the paper on bioarxic, I couldn't find the Supp. Note 1 that contains the derivation for null distribution of Global Moran's R and other methodology. Can you send me a pdf or update the manuscript? That would be really helpful.
BTW, when I was running the examples (3 jupyter notebook), NaiveDE and SpatialDE needs to installed.
It would be helpful that you could add a short note for the requirement (just for the example not the requirement for the package)
e.g.
To run the examples notebooks, NaiveDE
and SpatialDE
needs to be installed.
NaiveDE
# latest version
pip install -U git+https://github.com/Teichlab/NaiveDE
# or pypi
# pip install NaiveDE
SpatialDE
pip install spatialde
hilearn
# latest version
pip install -U git+https://github.com/huangyh09/hilearn
# or pypi
# pip install hilearn
Thank you for creating such a wonderful software!
I immediately installed SpatialDM and tried running melanoma.ipynb, but I got an error.
https://github.com/StatBiomed/SpatialDM/blob/main/tutorial/melanoma.ipynb
Here is the code cell where the error occurred:
import time
start = time.time()
sdm.spatialdm_global(adata, 1000, specified_ind=None, method='both', nproc=1) # global Moran selection
sdm.sig_pairs(adata, method='permutation', fdr=True, threshold=0.1) # select significant pairs
print("%.3f seconds" %(time.time()-start))
The error is as follows:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[8], line 3
1 import time
2 start = time.time()
----> 3 sdm.spatialdm_global(adata, 1000, specified_ind=None, method='both', nproc=1) # global Moran selection
4 sdm.sig_pairs(adata, method='permutation', fdr=True, threshold=0.1) # select significant pairs
5 print("%.3f seconds" %(time.time()-start))
File /opt/conda/lib/python3.11/site-packages/spatialdm/main.py:205, in spatialdm_global(adata, n_perm, specified_ind, method, nproc)
202 raise ValueError("Only one of ['z-score', 'both', 'permutation'] is supported")
204 with threadpool_limits(limits=nproc, user_api='blas'):
--> 205 pair_selection_matrix(adata, n_perm, specified_ind, method)
207 adata.uns['global_res'] = pd.concat((adata.uns['ligand'], adata.uns['receptor']),axis=1)
208 # adata.uns['global_res'].columns = ['Ligand1', 'Ligand2', 'Ligand3', 'Receptor1', 'Receptor2', 'Receptor3', 'Receptor4']
File /opt/conda/lib/python3.11/site-packages/spatialdm/utils.py:114, in pair_selection_matrix(adata, n_perm, sel_ind, method)
112 # averaged ligand values
113 L1 = [pd.Series(x[0]).dropna().values for x in ligand.values]
--> 114 L_mat = [adata[:, L1[l]].X.astype(np.float)[:, 0] for l in range(len(L1))]
115 for i, k in enumerate(ligand.index):
116 if len(ligand.loc[k].dropna()) > 1:
File /opt/conda/lib/python3.11/site-packages/spatialdm/utils.py:114, in <listcomp>(.0)
112 # averaged ligand values
113 L1 = [pd.Series(x[0]).dropna().values for x in ligand.values]
--> 114 L_mat = [adata[:, L1[l]].X.astype(np.float)[:, 0] for l in range(len(L1))]
115 for i, k in enumerate(ligand.index):
116 if len(ligand.loc[k].dropna()) > 1:
File /opt/conda/lib/python3.11/site-packages/numpy/__init__.py:305, in __getattr__(attr)
300 warnings.warn(
301 f"In the future `np.{attr}` will be defined as the "
302 "corresponding NumPy scalar.", FutureWarning, stacklevel=2)
304 if attr in __former_attrs__:
--> 305 raise AttributeError(__former_attrs__[attr])
307 # Importing Tester requires importing all of UnitTest which is not a
308 # cheap import Since it is mainly used in test suits, we lazy import it
309 # here to save on the order of 10 ms of import time for most users
310 #
311 # The previous way Tester was imported also had a side effect of adding
312 # the full `numpy.testing` namespace
313 if attr == 'testing':
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
My environment is
So, I think the above issue would occur when users use numpy >1.20.
I would appreciate it if you could update the code to avoid deprecation concerning np.float.
Hello,
Thank you for sharing the package.
I am trying to load some Visium data following the tutorial. In there, I see that the data is loaded as adata1 = sdm.datasets.dataset.melanoma()
, I thought that I could just do adata = sc.read_visium("visium_ov_orig")
to load my own files(using a Visium h5 file, this can be replicated with this file ), but then, I am getting errors in other parts of the tutorial, it starts failing when I call:
sdm.extract_lr(adata, 'human', min_cell=3)
with the error:
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Is there a different way to load this data? I tried checking the codebase and couldn't find a method that seemed more suitable.
Thank you
Hi,
Congrats for the well-timed and efficient tool!
I wanted to ask in a previous version you had a positive local mask implemented, which is now commented out:
Line 191 in 20d28c2
What is the reasoning for this?
Also, were the correlations between analytical and permutation-based local p-values calculated before or after applying the mask?
Thanks in advance for your response!
Daniel
Hi,
Great paper and package. Thanks alot for making the code open source. I am trying to simulate the dataset for benchmarking cell-cell interaction tools following the Figure 1 f,g,h of the paper. I read the an adapted version of SVCA is being used for simulation. However I am unable to find the source code (or scripts for running SVCA) to generate or reproduce the ROC plots from the paper. It would be great to get some pointers to simulate the data as described.
Thanks again,
Best
I encountered an error when running sdm.extract_lr(adata, 'human', min_cell=3), following is the detailed issue:
sdm.weight_matrix(adata, l=1.2, cutoff=0.2, single_cell=False) # weight_matrix by rbf kernel
----> 3 sdm.extract_lr(adata, 'human', min_cell=3) # find overlapping LRs from CellChatDB
5 sdm.spatialdm_global(adata, 1000, specified_ind=None, method='both', nproc=1) # global Moran selection
6 sdm.sig_pairs(adata, method='permutation', fdr=True, threshold=0.1) # select significant pairs
File ~\AppData\Roaming\Python\Python39\site-packages\spatialdm\main.py:149, in extract_lr(adata, species, mean, min_cell, datahost)
147 meanR = gmean(adata[:, receptor[i]].X, axis=1)
148 else:
--> 149 meanL = adata[:, ligand[i]].X.mean(axis=1)
150 meanR = adata[:, receptor[i]].X.mean(axis=1)
151 if (sum(meanL > 0) >= min_cell) * \
152 (sum(meanR > 0) >= min_cell):
File ~\AppData\Roaming\Python\Python39\site-packages\anndata\_core\anndata.py:1108, in AnnData.__getitem__(self, index)
1106 def __getitem__(self, index: Index) -> "AnnData":
1107 """Returns a sliced view of the object."""
-> 1108 oidx, vidx = self._normalize_indices(index)
1109 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
File ~\AppData\Roaming\Python\Python39\site-packages\anndata\_core\anndata.py:1089, in AnnData._normalize_indices(self, index)
1088 def _normalize_indices(self, index: Optional[Index]) -> Tuple[slice, slice]:
-> 1089 return _normalize_indices(index, self.obs_names, self.var_names)
File ~\AppData\Roaming\Python\Python39\site-packages\anndata\_core\index.py:33, in _normalize_indices(index, names0, names1)
31 ax0, ax1 = unpack_index(index)
32 ax0 = _normalize_index(ax0, names0)
---> 33 ax1 = _normalize_index(ax1, names1)
34 return ax0, ax1
File ~\AppData\Roaming\Python\Python39\site-packages\anndata\_core\index.py:95, in _normalize_index(indexer, index)
93 return positions # np.ndarray[int]
94 else: # indexer should be string array
---> 95 positions = index.get_indexer(indexer)
96 if np.any(positions < 0):
97 not_found = indexer[positions < 0]
File ~\AppData\Roaming\Python\Python39\site-packages\pandas\core\indexes\base.py:3904, in Index.get_indexer(self, target, method, limit, tolerance)
3901 self._check_indexing_method(method, limit, tolerance)
3903 if not self._index_as_unique:
-> 3904 raise InvalidIndexError(self._requires_unique_msg)
3906 if len(target) == 0:
3907 return np.array([], dtype=np.intp)
InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Can you help?
Hi I found out there's a bug in spatialdm_global
Line 155 in aa17624
For example, if there are 400 lr pairs after extract_lr
, one select 10 pairs for computation.
User cannot specify specified_ind as in globle_st_compute(adata) would also be still 400 (or 400x400)
That would throw a error in calculation due to
adata.uns['global_stat']['z']['st'].shape != np.zeros(total_len).shape
Also in this function, it would change/remove the ligand
, receptor
and geneInter
in adata
. Namely, the changes to adata would NOT be local.
For example, if one only specify 2 pairs, after calculation, ligand
, receptor
and geneInter
would be reduced to 2.
You have to re-run extract_lr
to get the full list.
Line 92 in aa17624
Hello, I noticed that the package compares the actual global I with the permuted global I's to compute the permutation p-value:
adata.uns['global_stat']['perm']['global_p'] = 1 - (adata.uns['global_I'] \
> adata.uns['global_stat']['perm']['global_perm'].T).sum(axis=0) / n_perm
My question is why do you compare only one direction (">"), i.e., greater than the mean under NULL? Global I can be negative, i.e., less than the mean under NULL indicating negative spatial association.
Please let me know if I misunderstood the package and the paper!
Hi, I am wondering why there are two parameters for nearest neighbours in weight matrix calculation.
Line 34 in aa17624
It seems to me that there is no need for both or am I missing sth?
Best
Hi,
I followed the spatial dm example but I keep getting errors when running spatialdm_local.
Below is the code I am running.
start = time.time()
sdm.spatialdm_local(adata, n_perm=1000, method='both', specified_ind=None, nproc=1) # local spot selection
sdm.sig_spots(adata, method='permutation', fdr=False, threshold=0.1) # significant local spots
print("%.3f seconds" %(time.time()-start))
And here is the error I am getting.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[47], line 2
1 start = time.time()
----> 2 sdm.spatialdm_local(adata, n_perm=1000, method='both', specified_ind=None, nproc=1) # local spot selection
3 sdm.sig_spots(adata, method='permutation', fdr=False, threshold=0.1) # significant local spots
4 print("%.3f seconds" %(time.time()-start))
File /n/data1/mgh/neuro/petti/lab/Users/khan.saad/Conda_envs/python_3.10/lib/python3.10/site-packages/spatialdm/main.py:275, in spatialdm_local(adata, n_perm, method, specified_ind, nproc, scale_X)
273 ## different approaches
274 with threadpool_limits(limits=nproc, user_api='blas'):
--> 275 spot_selection_matrix(adata, ligand, receptor, ind, n_perm, method, scale_X)
File /n/data1/mgh/neuro/petti/lab/Users/khan.saad/Conda_envs/python_3.10/lib/python3.10/site-packages/spatialdm/utils.py:179, in spot_selection_matrix(adata, ligand, receptor, ind, n_perm, method, scale_X)
176 def spot_selection_matrix(adata, ligand, receptor, ind, n_perm, method, scale_X=True):
177 # local variables (only live in this function scope)
178 # normalize raw counts
--> 179 raw_norm = adata.raw.to_adata()
180 raw_norm.X = csr_matrix([norm_max(X) for X in raw_norm.X.T]).T
181 import scanpy as sc
AttributeError: 'NoneType' object has no attribute 'to_adata'
Hi. I'm trying to read in a h5ad file which was generated from seurat R package. I am using this function: adata = sdm.read_spatialdm_h5ad(filename), but I get KeyError: 'geneInter'. Is this not the right way to read in the data? Thanks.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.