
sccoda's Introduction

scCODA - Single-cell differential composition analysis

Note: This implementation is no longer maintained. A new version in JAX is available in pertpy.

For more information and contribution guidelines, please visit the associated GitHub repository: https://github.com/theislab/pertpy

scCODA allows for the identification of compositional changes in high-throughput sequencing count data, especially cell compositions from scRNA-seq. It also provides a framework for the integration of cell-type-annotated data directly from scanpy and other sources. Aside from the scCODA model (Büttner, Ostner et al. (2021)), the package also allows easy application of other differential testing methods.


The statistical methodology and benchmarking performance are described in:

Büttner, Ostner et al. (2021). scCODA is a Bayesian model for compositional single-cell data analysis. Nature Communications.

Code for reproducing the analysis from the paper is available here.

For further information on the scCODA package and model, please refer to the documentation and the tutorials.

Installation

Running the package requires a working Python environment (>=3.8).

This package uses the tensorflow (>=2.8) and tensorflow-probability (>=0.16) packages. The GPU computation features of these packages have not been tested with scCODA and are therefore not recommended.

To install scCODA via pip, call:

pip install sccoda

To install scCODA from source:

  • Navigate to the directory that you want to install scCODA in

  • Clone the repository from Github (https://github.com/theislab/scCODA):

    git clone https://github.com/theislab/scCODA

  • Navigate to the root directory of scCODA:

    cd scCODA

  • Install dependencies:

    pip install -r requirements.txt

  • Install the package:

    python setup.py install

Docker container:

We provide a Docker container image for scCODA (https://hub.docker.com/repository/docker/wollmilchsau/scanpy_sccoda).

Usage

Import scCODA in a Python session via:

import sccoda
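
A minimal end-to-end sketch of the typical workflow, pieced together from the tutorials and the issues below (counts_df is a placeholder pandas DataFrame of per-sample cell-type counts with a "condition" column):

    from sccoda.util import cell_composition_data as dat
    from sccoda.util import comp_ana as mod

    # build a compositional data object from a counts table
    data = dat.from_pandas(counts_df, covariate_columns=["condition"])

    # set up the scCODA model and run HMC sampling
    model = mod.CompositionalAnalysis(data, formula="condition", reference_cell_type="automatic")
    results = model.sample_hmc()
    results.summary()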

Tutorials

scCODA provides a number of tutorials for various purposes. Please also visit the documentation for further information on the statistical model, data structure and API.

sccoda's People

Contributors

b-schubert, johannesostner, mbuttner, redst4r, zethson


sccoda's Issues

Which tensorflow version to use?

Hi sccoda team,

While trying to also run sccoda on my laptop (MacBook Air M1) and downloading all the relevant packages for my project, I've come across a version discrepancy. (Note: when I installed sccoda/tensorflow on an iMac a few weeks back, the issue did not arise.)

When installing tensorflow, pip reports:

sccoda 0.1.7 requires numpy>=1.21, but you have numpy 1.19.5 which is incompatible.
sccoda 0.1.7 requires tensorflow>=2.8, but you have tensorflow 2.5.0 which is incompatible.
sccoda 0.1.7 requires tensorflow-probability>=0.16.0, but you have tensorflow-probability 0.12.0 which is incompatible.

The description of the sccoda installation, however, lists other package versions:

This package uses the tensorflow (>=2.4, <2.6) and tensorflow-probability (==0.12) packages.

Which package versions should I be using?

Also, I've been having issues importing sccoda into my Jupyter notebook, as the kernel stops running. Could that issue be related to the package versions? (Note: I do not have this issue using sccoda in a Jupyter notebook on an iMac, only on my MacBook Air.)

Thanks!

how to assess sccoda results

Hi,
scCODA suggests that some cell types are affected by the inhibitor. What further analysis should be performed on the scCODA results?

python convention lower case

Can we change the package name to scdcpy to comply with the lowercase Python convention?
I.e. import scdcpy instead of import SCDCpy - until we have the final name...

Cannot import sccoda.util.comp_ana

I installed scCODA (pip install sccoda) without any problem/warning and performed the basic analysis following the getting-started tutorial, up to the model setup and inference. Unfortunately, I cannot import the sccoda.util.comp_ana function into Python. Following the tutorial:

from sccoda.util import cell_composition_data as dat
from sccoda.util import data_visualization as viz
from sccoda.util.comp_ana import comp_ana as mod

doesn't work for my Python installation (conda, Python 3.9), so I used

import sccoda.util.cell_composition_data as dat
import sccoda.util.data_visualization as viz

successfully. When I try to import comp_ana using:

import sccoda.util.comp_ana as mod

The kernel crashes. If I import sccoda separately:

import sccoda

and look at the available functions, I can only see cell_composition_data and data_visualization, but not comp_ana.

comp_ana.py is under the sccoda/util folder where it is supposed to be. I am not sure what exactly the problem is, but somehow comp_ana is 'invisible'.

I have a MacBook Pro M1 with 8 cores; my environment details are:

anndata     0.7.6
scanpy      1.8.1
sinfo       0.3.4
-----
PIL                 8.2.0
anyio               NA
appnope             0.1.2
argon2              20.1.0
attr                21.2.0
babel               2.9.1
backcall            0.2.0
brotli              NA
cairo               1.19.1
certifi             2021.05.30
cffi                1.14.5
chardet             4.0.0
cloudpickle         1.6.0
cycler              0.10.0
cython_runtime      NA
dask                2021.06.0
dateutil            2.8.1
decorator           4.4.2
fsspec              2021.06.0
google              NA
h5py                3.1.0
idna                2.10
igraph              0.9.6
ipykernel           5.5.5
ipython_genutils    0.2.0
jedi                0.18.0
jinja2              3.0.1
joblib              1.0.1
json5               NA
jsonschema          3.2.0
jupyter_server      1.8.0
jupyterlab_server   2.6.0
kiwisolver          1.3.1
leidenalg           0.8.4
llvmlite            0.36.0
markupsafe          2.0.1
matplotlib          3.4.2
mpl_toolkits        NA
natsort             7.1.1
nbclassic           NA
nbformat            5.1.3
numba               0.53.1
numexpr             2.7.3
numpy               1.19.5
packaging           20.9
pandas              1.2.4
parso               0.8.2
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
prometheus_client   NA
prompt_toolkit      3.0.18
psutil              5.8.0
ptyprocess          0.7.0
pvectorc            NA
pyexpat             NA
pygments            2.9.0
pyparsing           2.4.7
pyrsistent          NA
pytz                2021.1
requests            2.25.1
sccoda              0.1.4
scipy               1.6.2
seaborn             0.11.1
send2trash          NA
six                 1.15.0
sklearn             0.24.2
sniffio             1.2.0
socks               1.7.1
sparse              0.12.0
statsmodels         0.12.2
storemagic          NA
tables              3.6.1
tblib               1.7.0
terminado           0.10.1
texttable           1.6.3
tlz                 0.11.1
toolz               0.11.1
tornado             6.1
traitlets           5.0.5
typing_extensions   NA
urllib3             1.26.5
wcwidth             0.2.5
websocket           0.57.0
yaml                5.4.1
zipp                NA
zmq                 22.1.0
-----
IPython             7.24.1
jupyter_client      6.1.12
jupyter_core        4.7.1
jupyterlab          3.0.16
notebook            6.4.0
-----
Python 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:15) [Clang 11.1.0 ]
macOS-11.2.3-x86_64-i386-64bit
8 logical CPU cores, i386
-----
Session information updated at 2021-09-26 14:07

I have tried this on a linux server as well with identical results.
Thanks for the time

Optimal tuning of parameters for estimation

Hi,

I am running a small chain to infer cell type contributions in a dataset composed of 8 samples with 14 cell types.
I get the following summary:

Compositional Analysis summary (extended):

Data: 8 samples, 14 cell types
Reference index: 7
Formula: C(condition, Treatment("WT"))
Spike-and-slab threshold: 0.642

MCMC Sampling: Sampled 20000 chain states (5000 burnin samples) in 128.565 sec. Acceptance rate: 57.8%

where the acceptance rate is 57.8%.
How can I make sure that I have explored the posterior distribution correctly and do not need a longer chain?
Is an acceptance rate of 57.8% in line with a correct estimation, or are there additional parameters that need to be tuned for this model?
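
For reference, a longer chain can be requested through the sampling arguments used elsewhere in these issues (a hedged sketch; the values are illustrative, not recommendations, and older scdcdm versions named the second argument n_burnin):

    # draw more posterior samples and discard a longer burn-in
    longer_results = model.sample_hmc(num_results=50000, num_burnin=10000)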

Thanks!
Francesco

Which conditions can scCODA work with?

Hi,
Should only external conditions be considered as covariates in the scCODA model, or should control/case status also be used as a condition?
Can I also use a continuous variable, such as treatment exposure time, as a condition, or does scCODA only accept categorical variables?

Thank you

import from scanpy

Sorry, I am not sure where to post issues, so this is a duplicate.

Hi,

I am trying to use scdcdm with an existing scanpy object.

First, I tried using scdcdm.util.cell_composition_data.from_scanpy, but it turns out that this function does not return a CompositionalData object. Instead, it returns a np.array with cell counts and a list with covariates, which cannot be directly used as an input to scdcdm.util.comp_ana.CompositionalAnalysis. Then, I tried using scdcdm.util.cell_composition_data.from_scanpy_list, but it returned the following error:

Traceback (most recent call last):

  File "<ipython-input-26-2483c45c611d>", line 3, in <module>
    covariate_key='Condition')

  File "/Applications/python_modules/SCDCdm/SCDCdm_public/scdcdm/util/cell_composition_data.py", line 77, in from_scanpy_list
    covariate_data = covariate_data.append(pd.Series(covs), ignore_index=True)

  File "/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 7116, in append
    other.values.reshape((1, len(other))),

AttributeError: 'Categorical' object has no attribute 'reshape'

Any tips?
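
A hedged workaround, based only on the traceback above: the failing append operates on a pandas Categorical, so casting the covariate column to plain strings before calling from_scanpy_list may avoid that code path (untested sketch; adata_list is a placeholder for the list of AnnData objects):

    # cast the categorical covariate to str so pandas does not call
    # reshape on a Categorical inside DataFrame.append
    for a in adata_list:
        a.obs["Condition"] = a.obs["Condition"].astype(str)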

est_fdr did not change the result

Thanks for the development of the great scCODA.

It may be my own error, but I found that the result did not change when I changed est_fdr. Could anyone help me?

>>> sim_results.set_fdr(est_fdr=0.05)
>>> sim_results.summary()
Compositional Analysis summary:

Data: 23 samples, 2 cell types
Reference index: 0
Formula: Group

Intercepts:
           Final Parameter  Expected Sample
Cell Type
GABA                 2.843        63.188077
GLU                  3.257        95.594532

Effects:
                          Final Parameter  Expected Sample  log2-fold change
Covariate      Cell Type
Group[T.Mouse] GABA               0.00000       126.398962          1.000260
               GLU              -1.77579        32.383646         -1.561663

>>> sim_results.set_fdr(est_fdr=0.00001)
>>> sim_results.summary()
>>> sim_results.set_fdr(est_fdr=0.000000000000000000000000000000000000000000000000001)
>>> sim_results.summary()
>>> sim_results.set_fdr(est_fdr=0)
>>> sim_results.summary()
>>> sim_results.set_fdr(est_fdr=0.5)
>>> sim_results.summary()

Each of these summary() calls prints output identical to the first one above.

Mismatched sample labels

I have used scCODA in the past without any problems. Now that I am using it with new samples, I noticed that the sample labels on the bar graph don't line up with the data. In the attached graph, I know from UMAPs that sample 10 contains cell type 5, but that specific bar is always located on the far right and given whichever sample label is at the end of 'B'. It doesn't matter if I change the covariate_df or the order of samples within my AnnData; my graph always comes out with mismatched labels. Code below. Any help with this glitch would be greatly appreciated.

B=['DG_4365_WS2','DG_4363_WS3','DG_4367_WS5','DG_4369_WS6','DG_4377_WS10','DG_4365_WR2','DG_4362_WR2','DG_4371_WR6','DG_4376_WR9','DG_4382_WR12','DG_4404_AS7','DG_4408_AS9','DG_4395_AS3','DG_4401_AS5','WR_4396_AS4','DG_4390_AR2','DG_4400_AR4','DG_4405_AR6','DG_4409_AR8','DG_4402_AR5']
cov_df = pd.DataFrame({"Cond": ['1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20']}, index=B)

data_scanpy_1 = dat.from_scanpy(
    adata,
    cell_type_identifier="leiden",
    sample_identifier="batches",
    covariate_df=cov_df
)

viz.stacked_barplot(data_scanpy_1, feature_name="Cond",cmap=umap_cmap)
plt.xticks(rotation='vertical')
plt.title('Sample Composition')

[Attached image: scCODA_mislabeled_bar]

Error + question on model usage

Hi,

After a long time, I finally managed to give it a try. I would be very interested in using this method for an organoid dataset I am currently analysing.

In short, my dataset consists of ~200k samples in ~1.5k conditions and 15 phenotypic classes (the analogue of cell types in scRNA-seq data). Similarly to what you would do with a scRNA-seq dataset, I am interested in assessing compositional differences in classes between the various conditions.

The "frequency" table, as shown in the tutorial, looks like this:

[attached screenshot of the frequency table]

Running the following code:

data = dat.from_pandas(comp_table, covariate_columns=["Compound_name"])

# Extract condition from mouse name and add it as an extra column to the covariates
data.obs["Condition"] = data.obs["Compound_name"]

model = mod.CompositionalAnalysis(data, formula="Condition", baseline_index=1460)

sim_results = model.sample_hmc()
sim_results.summary()

finishes the MCMC sampling with

MCMC sampling finished. (847.986 sec)
Acceptance rate: 23.8%

and then throws the error below.
I will also reduce the size of the dataset and try to rerun it.

Another question I have with respect to usage: would it be possible to add other covariates to the model, like biological replicates and the like?

Thanks a lot in advance!

Error:

ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
----> 1 sim_results = model.sample_hmc()
      2 sim_results.summary()

~/miniconda3/envs/scdc/lib/python3.7/site-packages/scdcdm/model/dirichlet_models.py in sample_hmc(self, num_results, n_burnin, num_leapfrog_steps, step_size)
192 observed_data=observed_data,
193 dims=dims,
--> 194 coords=coords).to_result_data(y_hat, baseline=False)
195
196 def sample_nuts(self, num_results=int(10e3), n_burnin=int(5e3), max_tree_depth=10, step_size=0.01):

~/miniconda3/envs/scdc/lib/python3.7/site-packages/scdcdm/util/result_classes.py in to_result_data(self, y_hat, baseline)
23 def to_result_data(self, y_hat, baseline):
24
---> 25 post = self.posterior_to_xarray()
26 ss = self.sample_stats_to_xarray()
27 postp = self.posterior_predictive_to_xarray()

~/miniconda3/envs/scdc/lib/python3.7/site-packages/arviz/data/base.py in wrapped(cls, *args, **kwargs)
35 if all([getattr(cls, prop_i) is None for prop_i in prop]):
36 return None
---> 37 return func(cls, *args, **kwargs)
38
39 return wrapped

~/miniconda3/envs/scdc/lib/python3.7/site-packages/arviz/data/io_dict.py in posterior_to_xarray(self)
53 )
54
---> 55 return dict_to_dataset(data, library=None, coords=self.coords, dims=self.dims)
56
57 @requires("sample_stats")

~/miniconda3/envs/scdc/lib/python3.7/site-packages/arviz/data/base.py in dict_to_dataset(data, attrs, library, coords, dims)
199 for key, values in data.items():
200 data_vars[key] = numpy_to_data_array(
--> 201 values, var_name=key, coords=coords, dims=dims.get(key)
202 )
203 return xr.Dataset(data_vars=data_vars, attrs=make_attrs(attrs=attrs, library=library))

~/miniconda3/envs/scdc/lib/python3.7/site-packages/arviz/data/base.py in numpy_to_data_array(ary, var_name, coords, dims)
164 # filter coords based on the dims
165 coords = {key: xr.IndexVariable((key,), data=coords[key]) for key in dims}
--> 166 return xr.DataArray(ary, coords=coords, dims=dims)
167
168

~/miniconda3/envs/scdc/lib/python3.7/site-packages/xarray/core/dataarray.py in init(self, data, coords, dims, name, attrs, indexes, fastpath)
342 data = _check_data_shape(data, coords, dims)
343 data = as_compatible_data(data)
--> 344 coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
345 variable = Variable(dims, data, attrs, fastpath=True)
346 indexes = dict(

~/miniconda3/envs/scdc/lib/python3.7/site-packages/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims)
153 "conflicting sizes for dimension %r: "
154 "length %s on the data but length %s on "
--> 155 "coordinate %r" % (d, sizes[d], s, k)
156 )
157

ValueError: conflicting sizes for dimension 'cell_type_nb': length 14 on the data but length 15 on coordinate 'cell_type_nb'


Error in sample_hmc()

Hey there,
I have successfully used this tool on my data a couple of months ago. Unfortunately, I now get the following error (both for the model with and without a baseline):

MCMC sampling finished. (189.715 sec)
Acceptance rate: 45.1%

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-65-b8c3ad8ee3bd> in <module>
      7 for i in range(5):
      8     ca = mod.CompositionalAnalysis(a, "condition", baseline_index=baseline)
----> 9     params_mcmc = ca.sample_hmc()
     10     thisgrp = [i[1] for i in params_mcmc.summary_prepare()[1][params_mcmc.summary_prepare()[1]['final_parameter'] != 0]['final_parameter'].index]
     11     print('Significant:', thisgrp)

/storage/groups/ml01/workspace/leander.dony/projects/organoids_chronic/git/SCDCdm_public/scdcdm/model/dirichlet_models.py in sample_hmc(self, num_results, n_burnin, num_leapfrog_steps, step_size)
    224                                      dims=dims,
    225                                      coords=coords).to_result_data(sampling_stats=sampling_stats,
--> 226                                                                    model_specs=model_specs)
    227 
    228     def sample_nuts(self, num_results=int(10e3), n_burnin=int(5e3), max_tree_depth=10, step_size=0.01):

/storage/groups/ml01/workspace/leander.dony/projects/organoids_chronic/git/SCDCdm_public/scdcdm/util/result_classes.py in to_result_data(self, sampling_stats, model_specs)
     33                 "sample_stats_prior": ssp,
     34                 "prior_predictive": prip,
---> 35                 "observed_data": obs,
     36             }
     37         )

/storage/groups/ml01/workspace/leander.dony/projects/organoids_chronic/git/SCDCdm_public/scdcdm/util/result_classes.py in __init__(self, sampling_stats, model_specs, **kwargs)
     78         self.model_specs = model_specs
     79 
---> 80         intercept_df, effect_df = self.summary_prepare()
     81 
     82         self.intercept_df = intercept_df

/storage/groups/ml01/workspace/leander.dony/projects/organoids_chronic/git/SCDCdm_public/scdcdm/util/result_classes.py in summary_prepare(self, *args, **kwargs)
    132         hpds_new = hpds.str.replace("hpd_", "HPD ")
    133 
--> 134         intercept_df = intercept_df.loc[:, ["final_parameter", hpds[0], hpds[1], "sd", "expected_sample"]]
    135         intercept_df = intercept_df.rename(columns=dict(zip(
    136             intercept_df.columns,

/opt/python/lib/python3.7/site-packages/pandas/core/indexes/base.py in __getitem__(self, key)
   3927         if is_scalar(key):
   3928             key = com.cast_scalar_indexer(key)
-> 3929             return getitem(key)
   3930 
   3931         if isinstance(key, slice):

IndexError: index 0 is out of bounds for axis 0 with size 0

Is the input data expected by scCODA the cell-type counts for each sample?

From the tutorial and the from_pandas example, it appears that scCODA actually uses input data in the form of per-sample cell-type counts rather than single-cell UMI data. What the from_scanpy function does is simply count the cells of each type in each sample, right?

            Mouse  Endocrine  Enterocyte  Enterocyte.Progenitor  Goblet  Stem    TA  TA.Early  Tuft
0       Control_1         36          59                    136      36   239   125       191    18
1       Control_2          5          46                     23      20    50    11        40     5
2       Control_3         45          98                    188     124   250   155       365    33
3       Control_4         26         221                    198      36   131   130       196     4
4  H.poly.Day10_1         42          71                    203     147   271   109       180   146
5  H.poly.Day10_2         40          57                    383     170   321   244       256    71
6   H.poly.Day3_1         52          75                    347      66   323   263       313    51
7   H.poly.Day3_2         65         126                    115      33    65    39       129    59
8          Salm_1         37         332                    113      59    90    47       132    10
9          Salm_2         32         373                    116      67   117    65       168    12

The values in the table above represent the count of cells contained in a particular cell type in a given sample, correct?
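
For illustration, such a table can be built by hand with pandas (a hedged sketch; "sample" and "cell_type" are placeholder columns of an AnnData .obs):

    import pandas as pd

    # count the cells of each type in each sample - essentially the
    # aggregation that from_scanpy performs
    counts = pd.crosstab(adata.obs["sample"], adata.obs["cell_type"])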

sample_hmc() fails without error when NAs are present

This is a bit of an edge case, but I discovered that if any of the cell counts in the df dataframe passed to sccoda.util.cell_composition_data.from_pandas is NA, sample_hmc() will return a result with a 0% acceptance rate, without an error or warning. This is not likely a common issue, but it can occur depending on how the cell-type counts dataframe is generated. It may be helpful to add an error/warning, or simply recode NAs as 0 in data.X under the hood.
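
A hedged sketch of the user-side guard suggested above (file and column names are placeholders):

    import pandas as pd
    from sccoda.util import cell_composition_data as dat

    df = pd.read_csv("counts.csv")
    df = df.fillna(0)  # recode NA counts as zero to avoid a silent 0% acceptance rate
    data = dat.from_pandas(df, covariate_columns=["condition"])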

Influence of subtypes present in the dataset

Hi,

This is more of a question than a bug. If I should post it somewhere else, please let me know.

I am doing some tests with scCODA and was wondering about the relevance of the number of "populations" included in the dataset. Say I originally have 20 groups (= clusters) and 2 conditions and find no compositional difference between conditions. What difference would it make if I subset my dataset to a lower number of groups (e.g. removing 2 groups of less related cells)?

Thanks,

A

Including Bad Cells in Model

Hi there,

I greatly appreciate the way scCODA models changes in whole-sample composition rather than univariate cell counts.

With this in mind, if I am working with a dataset with a prevalence of stress (say, 5-10% of the cells in every sample fail QC), should I include these cells in the final compositional abundance testing, seeing as they are part of the overall sample composition? I do know that stress was unfortunately more prevalent in one condition. And yes, I know avoiding this initially would have been much preferable, but here we are.

Thanks!
Kane

Error in version update_0.1.3: 'Inclusion probability'

Hi.

I am trying to run scCODA with the latest update (0.1.3). However, when I run the ".sample_hmc" function, I get the following error:

KeyError: 'Inclusion probability'

The code I used:

    new_model = mod.CompositionalAnalysis(scDC_data, formula="Condition", reference_cell_type=i)
    new_result = new_model.sample_hmc(num_results=200000, num_burnin=90000)
    new_result.effect_df.to_csv("/Users/juan.henao/Documents/" + i + ".csv")
    No_reference = new_result.effect_df.drop(index=i, level=1)
    No_reference.sort_values("Final Parameter", ascending=False).index.get_level_values("Cell Type")

The entire error I get:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Inclusion probability'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-11-c6ce1911fcf1> in <module>
      2     print(i)
      3     new_model = mod.CompositionalAnalysis(scDC_data, formula="Condition", reference_cell_type=i)
----> 4     new_result = new_model.sample_hmc(num_results= 200000,num_burnin= 90000)
      5     new_result.effect_df.to_csv("/Users/juan.henao/Documents/sc_T1D/CD4_compositional_analysis/"+i+".csv")
      6     No_reference=new_result.effect_df.drop(index=i,level=1)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sccoda/model/scCODA_model.py in sample_hmc(self, num_results, num_burnin, num_adapt_steps, num_leapfrog_steps, step_size)
    347         model_specs = {"reference": self.reference_cell_type, "formula": self.formula}
    348 
--> 349         return res.CAResultConverter(posterior=posterior,
    350                                      posterior_predictive=posterior_predictive,
    351                                      observed_data=observed_data,

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sccoda/util/result_classes.py in to_result_data(self, sampling_stats, model_specs)
     31         obs = self.observed_data_to_xarray()
     32 
---> 33         return CAResult(
     34             sampling_stats, model_specs,
     35             **{

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sccoda/util/result_classes.py in __init__(self, sampling_stats, model_specs, **kwargs)
     96         self.model_specs = model_specs
     97 
---> 98         intercept_df, effect_df = self.summary_prepare()
     99 
    100         self.intercept_df = intercept_df

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sccoda/util/result_classes.py in summary_prepare(self, est_fdr, *args, **kwargs)
    158         # Calculation of columns that are not from az.summary
    159         intercept_df = self.complete_alpha_df(intercept_df)
--> 160         effect_df = self.complete_beta_df(intercept_df, effect_df, est_fdr)
    161 
    162         # Give nice column names, remove unnecessary columns

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sccoda/util/result_classes.py in complete_beta_df(self, intercept_df, effect_df, fdr)
    258             return 1., 0
    259 
--> 260         threshold, fdr = opt_thresh(effect_df, fdr)
    261 
    262         self.model_specs["threshold_prob"] = threshold

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sccoda/util/result_classes.py in opt_thresh(result, alpha)
    246         def opt_thresh(result, alpha):
    247 
--> 248             incs = np.array(result.loc[result["Inclusion probability"] > 0, "Inclusion probability"])
    249             incs[::-1].sort()
    250 

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2798             if self.columns.nlevels > 1:
   2799                 return self._getitem_multilevel(key)
-> 2800             indexer = self.columns.get_loc(key)
   2801             if is_integer(indexer):
   2802                 indexer = [indexer]

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Inclusion probability'

Thanks in advance,

Juan

sample_hmc error

This is a duplicate because I am not sure where to post issues.

Hi,

I am trying to run scdcdm on a tiny fraction of my real data set as a test, and I am basically following the tutorial.

However, I get the error message below, which I do not understand. I would appreciate any advice!

In [110]: import scdcdm

In [111]: from scdcdm.util import cell_composition_data as dat

In [112]: from scdcdm.util import comp_ana as mod

In [113]: dat_pandas
Out[113]: 
      0     1     2  condition
0  1065  1001  1232    control
1   439   435   197  treatment

In [114]: dat_scdcdm = dat.from_pandas(dat_pandas,
     ...:                              covariate_columns=["condition"])
Transforming to str index.

In [115]: dat_model = mod.CompositionalAnalysis(dat_scdcdm,
     ...:                                       formula="condition",
     ...:                                       baseline_index=None)

In [116]: sim_results = dat_model.sample_hmc()
WARNING:tensorflow:5 out of the last 5 calls to <function CompositionalModel.sampling.<locals>.sample_mcmc at 0x7f8fe4d5e8c0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
MCMC sampling finished. (42.321 sec)
Acceptance rate: 35.0%
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py:376: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[key] = _infer_fill_value(value)
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py:576: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[item_labels[indexer[info_axis]]] = value
Traceback (most recent call last):

  File "<ipython-input-116-4c85d20e3019>", line 1, in <module>
    sim_results = dat_model.sample_hmc()

  File "/Applications/python_modules/SCDCdm/SCDCdm_public/scdcdm/model/dirichlet_models.py", line 226, in sample_hmc
    model_specs=model_specs)

  File "/Applications/python_modules/SCDCdm/SCDCdm_public/scdcdm/util/result_classes.py", line 35, in to_result_data
    "observed_data": obs,

  File "/Applications/python_modules/SCDCdm/SCDCdm_public/scdcdm/util/result_classes.py", line 80, in __init__
    intercept_df, effect_df = self.summary_prepare()

  File "/Applications/python_modules/SCDCdm/SCDCdm_public/scdcdm/util/result_classes.py", line 134, in summary_prepare
    intercept_df = intercept_df.loc[:, ["final_parameter", hpds[0], hpds[1], "sd", "expected_sample"]]

  File "/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 4280, in __getitem__
    return getitem(key)

IndexError: index 0 is out of bounds for axis 0 with size 0

"division by zero" error

Hello,

I am working with a single-cell data set with 12 clusters and 3 treatment conditions (including the control). For a couple of different cell types I've tried to set as the reference, I get a division-by-zero error after running model_obj_iggvsnat.sample_hmc(). I am not sure why this happens when I set certain cell types as the reference, but not with others.

There have also been some instances where I have run the code based on the getting-started tutorial and received results, but then got the "division by zero" error when rerunning the same code without changing the reference cell type or anything else.

Any insights into what might be going on here would be really helpful, thank you!

How to approach inference on data with multiple timepoints

Hi,

I wonder if you have any suggestions on how to sample data with multiple timepoints.
I am running it now using timepoint as an additional covariate; however, I was trying to understand whether it would be more correct to run a different model for each timepoint.
I am currently testing the model "condition + timepoint". The reason I used all the data (multiple timepoints) was that I thought it might help and speed up the inference.

Thanks!

Solved: ImportError: cannot import name 'CloudPickler'

Hi all - I was getting the following error:

  • ImportError: cannot import name 'CloudPickler' from 'cloudpickle.cloudpickle'

If anyone else faces a similar issue on a fresh install of scCODA or tensorflow-probability 0.11.0, you need to roll back your cloudpickle install to 1.3.0.

  • pip install cloudpickle==1.3.0

This resolves the issue.

Zero Fold-change in scCODA results

Hi,
I have run scCODA on a few sub-clusterings, and in many of them I get a zero fold-change in the summary of the scCODA run, though there is clearly a non-zero fold-change between the conditions.
Attached is one example of cell counts per condition per cluster and the frequency graph for this data.

What I ran is:

    counts_per_sample = pd.read_csv("counts.csv")
    counts_adata = dat.from_pandas(counts_per_sample, covariate_columns=['Condition'])
    model = mod.CompositionalAnalysis(counts_adata, formula="C(Condition, Treatment('ctl'))", reference_cell_type='automatic')
    sim_results = model.sample_hmc()
    sim_results.summary()

The full output is also attached.

Thanks for your help!
Rachelly.

Attachments: counts.csv, output.txt, [proportions plot]

how to select cell reference

I am having a difficult time selecting reference cell types. The dispersion plot suggests one cell type, but it has a small count. Do you have any recommendation for the minimum number of cells a cell type should have to be considered as a reference?

API design comment

Hi,

A small comment about the API: it's really cool that it makes use of AnnData! However, I feel this would not be very convenient for users with an AnnData fresh out of a scRNA-seq analysis pipeline. For instance, the slot adata.X would already be in use, and all the covariates (cell types, conditions) would already be stored in adata.obs.

What about a simple function that takes:

  • the anndata as is (e.g. just out of a scRNA-seq pipeline)
  • the covariates of choice.

and simply wrangles adata.obs to suit mod.CompositionalAnalysis.

Something simple like

import anndata as ad

def wrangle_input(data: ad.AnnData, covariates: list):
    # aggregate cells into a per-covariate-combination count table
    comp_table = data.obs[covariates].groupby(covariates).size().unstack(fill_value=0).reset_index()
    return comp_table

Anyway, great tool again!

different results in different runs

Hi,

My data set contains 155 clusters and 2 conditions (treatment vs control). I performed 15 sample_hmc runs with the same data and the same NoBaselineModel, and the results could be broadly classified as follows:
10/15 runs: ~15-20 cell types had non-zero final parameter values, all values were negative
5/15 runs: ~50-60 cell types had non-zero final parameter values, all values were positive

Expectedly, the cell types that had negative final parameter values in some runs did not overlap with the cell types that had positive final parameter values in the other runs. However, I am puzzled by the fact that there was not a single run in which I captured both positive and negative cell types at the same time.

What can it mean and how should I go about it? Should I do some kind of bootstrapping?

Thanks!

col_wrap in utils.visualization.boxplots

Hi scCODA team,

Line 265 in data_visualization raises an error when running boxplots with plot_facets=True.
The issue is in the col_wrap parameter passed to the seaborn function: np.floor returns a float, not an int, which leads to an error in the FacetGrid function.
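
A minimal illustration of the type problem (not the actual scCODA source; n_types is a placeholder):

    import numpy as np

    n_types = 12
    col_wrap = np.floor(np.sqrt(n_types))       # float -> seaborn.FacetGrid raises
    col_wrap = int(np.floor(np.sqrt(n_types)))  # casting to int avoids the error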

Obtain p-values for cell type abundance effect

Hi
Thank you for the package! I wonder how I should obtain p-values for cell type abundance comparisons.
Can I use the inclusion probability from summary_extended for this? As far as I understand the methods, it is used for selecting the credible effects.

Also, in equation 12 in the paper, shouldn't there be a max instead of a min?

[screenshot of equation 12]

Thank you

cc @karolinasenkow

Segmentation Fault

Hi, I am excited to use scCODA for my data analysis; however, I am running into an issue. I am following the getting-started tutorial notebook, and I get a segmentation fault when I run this line:

sim_results = model_salm.sample_hmc()

I suspect that I didn't install scCODA properly. I followed the installation guidelines posted on the GitHub repo (fresh conda environment, Python 3.8.5). I noticed that the scCODA GitHub specifically mentioned tensorflow==2.3.2, but after pip-installing scCODA, the installed version was 2.4.1.
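
A hedged first step would be to pin the TensorFlow version mentioned in the README (whether this resolves the segfault is an assumption):

pip install tensorflow==2.3.2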

Stacked barplot not working

Hi, I'm able to get an appropriate/accurate boxplot with viz.boxplots(). However, when I try viz.stacked.barplot() with the same input, I get the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-7b7aa02bf827> in <module>
----> 1 viz.stacked.barplot(data_scanpy, feature_name = "Condition")
      2 plt.show()

AttributeError: module 'sccoda.util.data_visualization' has no attribute 'stacked'

Any help would be appreciated. Thanks.

Fix docs

Hi all, I wanted to follow the examples suggested at the end of the README.md page, but the rendering of the HTML pages is gibberish (see the screenshot).

[screenshot of garbled documentation page]

WARNING:tensorflow:@custom_gradient grad_fn has 'variables' in signature

Hi,

Many thanks for the package! After running ".sample_hmc()" I receive a warning:

WARNING:tensorflow:@custom_gradient grad_fn has 'variables' in signature, but no ResourceVariables were used on the forward pass.

After investigating the results with arviz, I observed that sample_stats -> chain only contains level 0 (I have multiple conditions; I tried with 2 and 3, one being the reference). Furthermore, I could not find 'mu' and 'tau' in the results with az.summary(all_results, var_names=['mu', 'tau'], filter_vars="regex"). Previously I ran:

mod.CompositionalAnalysis(CCM_GMEAN_SUB, formula="C(TREATMENT_COLNAME, Treatment('ELEMENT1_IN_TREATMENT_COLNAME'))",
                          reference_cell_type="CELL_TYPE_REFERENCE")

Thanks for your thoughts.

Result reproducibility issues depending on Python version

Hello,

Together with another member of my lab, we are starting to use scCODA for some analyses, and we have encountered some puzzling results.
Here is the extended summary for a run in Python 3.8.11, using "Undef" as the reference cell type:

[screenshot: extended summary, Python 3.8.11]

As you can see, according to the summary, it looks like only B cells present a significant difference in proportion. However, the credible_effects report states that there is a difference for granulocytes too.

Now, if we repeat the analysis in a Python 3.9.6 environment:

[screenshot: extended summary, Python 3.9.6]

This time, what is reported in the summary and the credible_effects report seems consistent. You can also note the difference in running time.

So, I was wondering which of these results I should trust.
We also noticed that in Python 3.9.6, the plot used to look for a reference cell type is broken (AttributeError: module 'sccoda.util.data_visualization' has no attribute 'rel_abundance_dispersion_plot') and it is not possible to run mod.CompositionalAnalysis() in automatic mode for the selection of the reference cell type (TypeError: '<' not supported between instances of 'str' and 'int').

In case this problem is data-dependent, here is the input table:
Input_scCODA.csv

If that's linked to an installation problem, I usually use conda environments:

  • creating a new environment with the desired python version
  • conda install -c conda-forge notebook
  • pip install sccoda

Thank you for the help and feedback !

Can this be used for differential expression or abundance analysis on non-single-cell data (e.g., metagenomics/transcriptomics)?

I've been incorporating CoDA-valid methodologies in my workflows ever since I realized the caveats of using relative abundance. There are not many packages that implement CoDA versions of differential expression/abundance other than ALDEx[2], ANCOM[-BC], and Songbird.

Are there any assumptions about the data that would render scCODA invalid for 1) non-single-cell transcriptomics in one species; 2) metagenomics; or 3) metatranscriptomics?

Regardless, thank you for writing this software, as I've been wanting to learn how to use tensorflow_probability for quite some time, and your code is very well organized and easy to follow logically. Most of the tutorials I come across are for image recognition.

Importance of warnings/errors in model_scelltype_Time.sample_hmc() output

Hi scCODA team and thank you for making this nice tool available.
I am running into warnings when running model_scelltype_Time.sample_hmc(), apparently involving beta_nonzero_mean.append() and ret.dtype.type(). The function still generates an output with meaningful results.
As I'm not an expert in Python, I wanted to know whether this warning can be ignored and what kind of error is involved (a setup, implementation, or input design issue?). Does it affect the reliability of the output?
Here is what I got as output (I'm using the Singularity image to run scCODA):

model_scelltype_Time = mod.CompositionalAnalysis(data_sccelltype, formula="stage + clone", reference_cell_type="automatic")
Automatic reference selection! Reference cell type set to IPC/nN
2022-05-30 14:55:56.605202: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/lib/R/lib::/.singularity.d/libs
2022-05-30 14:55:56.605258: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-05-30 14:55:56.605300: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (c18n01.ruddle.hpc.yale.internal): /proc/driver/nvidia/version does not exist
2022-05-30 14:55:56.607509: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Zero counts encountered in data! Added a pseudocount of 0.5.
simres_celltype_Time = model_scelltype_Time.sample_hmc()
WARNING:tensorflow:@custom_gradient grad_fn has 'variables' in signature, but no ResourceVariables were used on the forward pass.
WARNING:tensorflow:@custom_gradient grad_fn has 'variables' in signature, but no ResourceVariables were used on the forward pass.
2022-05-30 14:56:32.299744: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
0%| | 0/20000 [00:00<?, ?it/s]
2022-05-30 14:56:33.121718: I tensorflow/compiler/xla/service/service.cc:171] XLA service 0x2b066c011710 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-05-30 14:56:33.121764: I tensorflow/compiler/xla/service/service.cc:179] StreamExecutor device (0): Host, Default Version
2022-05-30 14:56:33.223335: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:210] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2022-05-30 14:56:34.553831: I tensorflow/compiler/jit/xla_compilation_cache.cc:363] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20000/20000 [02:26<00:00, 136.38it/s]
MCMC sampling finished. (186.700 sec)
Acceptance rate: 80.6%
/opt/python/lib/python3.8/site-packages/sccoda/util/result_classes.py:252: RuntimeWarning: Mean of empty slice.
beta_nonzero_mean.append(beta_i_raw[beta_i_raw_nonzero].mean())
/opt/python/lib/python3.8/site-packages/numpy/core/_methods.py:170: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)

Another question concerns the importance of the acceptance rate. Is it an important output? How can it be interpreted, and below which value is it problematic?
Thank you for your help!

add_dots parameter not respecting hue_order arg in boxplot

Hi,

I'm using the boxplot wrapper from this package to plot differences in proportions.
I noticed that, if I want to specify the hue_order of the feature_name categories with:

viz.boxplots(
    data, feature_name='genotype', dpi=150, cmap=['#393e41', '#ca3c25'],
    figsize=(10,10), args_boxplot={'hue_order': ['Control', 'Case']},
    add_dots=True,
)

the dots added by add_dots=True do not respect the hue_order parameter but default to the original ordering (screenshots attached, without and with custom hue_order).

[screenshots: boxplots without and with custom hue_order]


I'm using scCODA==0.1.2.post1.

Thanks

Test All Pairwise Comparisons

Hi there,

My dataset contains three conditions (A, B, C), but none of them are a control. That is, I would be interested in compositional changes between A & B, A & C, and B & C. Is there a way to configure scCODA or the formula parameter that doesn't assign a single value to the control? (removing patsy's "treatment coding"?)

If not, an alternative I may employ (and would be grateful for feedback on!) is based on this response: running scCODA multiple times, with each of my three conditions serving as the control a third of the time (correcting for the +/- results in the final summation of credible non-zero results). Would this be functional?

Thanks a lot!
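
A hedged sketch of the rotation described above (the data object, "cond" column, and level names are placeholders):

    # run scCODA once per condition used as the treatment-coding reference
    results = {}
    for ref in ["A", "B", "C"]:
        model = mod.CompositionalAnalysis(
            data,
            formula=f"C(cond, Treatment('{ref}'))",
            reference_cell_type="automatic",
        )
        results[ref] = model.sample_hmc()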

verbose option for `sample_hmc`

It would be nice to have a verbose option for sample_hmc to turn off the printing of progress, which would be useful when using scCODA in reports.

Warning triggered tf.function retracing

I got this warning when running long chains. Any idea?

WARNING:tensorflow:5 out of the last 5 calls to <function CompositionalModel.sampling..sample_mcmc at 0x7f937941c7a0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.

Error: Object was never used

I get the following issue:

ERROR:tensorflow:
Object was never used (type <class 'tensorflow.python.ops.tensor_array_ops.TensorArray'>):
<tensorflow.python.ops.tensor_array_ops.TensorArray object at 0x7fef746d6b90>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
  File "/home/benni/miniconda3/envs/scanpy/lib/python3.7/site-packages/tensorflow_core/python/ops/control_flow_ops.py", line 2478, in while_loop_v2
    return_same_structure=True)  File "/home/benni/miniconda3/envs/scanpy/lib/python3.7/site-packages/tensorflow_core/python/ops/control_flow_ops.py", line 2757, in while_loop
    return result  File "/home/benni/miniconda3/envs/scanpy/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/internal/util.py", line 389, in _body
    return i + 1, state, trace_arrays  File "/home/benni/miniconda3/envs/scanpy/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/internal/util.py", line 386, in <listcomp>
    a.write(i, v) for a, v in zip(  File "/home/benni/miniconda3/envs/scanpy/lib/python3.7/site-packages/tensorflow_core/python/util/tf_should_use.py", line 237, in wrapped
    error_in_function=error_in_function)

Tensorflow == 2.1.0
TF-probability == 0.9.0

Progress bar possible?

Is there a way to have a progress bar to assess how long the sampling will take?

P-value (FDR) and interpretation of "Final Parameter"

Thanks for the development of the great scCODA.

Here, I wonder how to get the p-value (or FDR) of the analysis.

Also, could you please explain how to interpret the "Final Parameter"? For example, what does it mean if a cell type, "Endothelial cell", got 2.007198 as its "Final Parameter"?

Looping over cell types to use as reference: discrepancy?

Hi All,

I am using scCODA for differential abundance analysis of a scRNA-seq dataset.
If I perform a manual reference cell type selection using e.g. B-cells as reference, and another time using dendritic cells as reference, I find that in both cases, for example, T-cells are differentially abundant between conditions, at a permissive FDR of 0.4.

However, when looping over the different cell types, using each as a reference sequentially to check the stability of the results, I notice that, e.g., the T-cells are only significant once. How can this discrepancy be explained?

Here is the code for automatic/manual selection of these cell types (automatic selection selects a cell type other than B-cells).

## automatic
cellCountData = dat.from_pandas(cellCounts, covariate_columns=["treatment"])
model = mod.CompositionalAnalysis(cellCountData, formula="treatment", reference_cell_type="automatic")
res = model.sample_hmc()
# no credible effect has been found for any cell population
res.summary()
# Some effect at higher FDR
res.set_fdr(est_fdr=0.4)
res.summary()

## manual
model = mod.CompositionalAnalysis(cellCountData, formula="treatment", reference_cell_type="B-cell")
resBCell = model.sample_hmc()
# no credible effect has been found for any cell population
resBCell.summary()
# Some effect at higher FDR
resBCell.set_fdr(est_fdr=0.4)
resBCell.summary()

To loop over the cell types, I am using the code from the vignette (slightly adapted to allow for a permissive FDR).

cell_types = cellCountData.var.index
results_cycle = pd.DataFrame(index=cell_types, columns=["times_credible"]).fillna(0)

for ct in cell_types:
    print(f"Reference: {ct}")

    # Run inference
    model_temp = mod.CompositionalAnalysis(cellCountData, formula="treatment", reference_cell_type=ct)
    temp_results = model_temp.sample_hmc()

    # Select credible effects
    temp_results.set_fdr(est_fdr=0.4)
    cred_eff = temp_results.credible_effects()
    cred_eff.index = cred_eff.index.droplevel(level=0)

    # add up credible effects
    results_cycle["times_credible"] += cred_eff.astype("int")
    

How to save the results as a dataframe or show the whole summary?

Hi there, thank you for providing a new method for evaluating cell type abundance. I tried this tool following the tutorials but still could not get the full results for my own data.
When I try to print, the data is too long to display completely. Can we write it into a CSV so we can use it for other analyses? Thanks a lot!
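
The result tables are pandas DataFrames (they are written to CSV the same way in another issue above), so a hedged sketch:

    # persist the result tables for downstream analysis
    sim_results.effect_df.to_csv("effects.csv")
    sim_results.intercept_df.to_csv("intercepts.csv")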

MCMC reproducibility

Hi scCODA team,
Thanks for creating this great algorithm. I have a question on MCMC reproducibility. I am running the following model:
model_nb = mod.CompositionalAnalysis(data_nb, formula="C(MYCN, Treatment('non-amplified'))", reference_cell_type="Endothelial")
[screenshot: model summary]

When I run the model sampling two separate times, I get slight disagreements in which effects are credible. Below, in one run the stroma cells show up as credible, and in another run they show up as not credible.

[screenshots: credible effects from the two runs]

Do you have recommendations on what to do in this case? I assume these effects are at the borderline of being credible. Is there a random seed I should set to get reproducible MCMC? And do you have any suggestions on setting the false discovery rate in a systematic manner? Thank you so much! Best, Orr
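
On the seed question, a hedged sketch: scCODA samples via TensorFlow Probability, so fixing the global NumPy and TensorFlow seeds before sampling should make chains repeatable (an assumption, not a documented scCODA API):

    import numpy as np
    import tensorflow as tf

    np.random.seed(1234)
    tf.random.set_seed(1234)
    res = model_nb.sample_hmc()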

Exit code 134

Hi,
I ran scCODA the following way:

counts_per_sample = pd.read_csv("counts2.csv")
counts_adata = dat.from_pandas(counts_per_sample, covariate_columns=['Condition'])
model = mod.CompositionalAnalysis(counts_adata, formula="C(Condition, Treatment('ctl'))", reference_cell_type='automatic')
sim_results = model.sample_hmc()
sim_results.summary()

The execution was halted due to Exit code 134. Full output and input files are attached.
Thanks,
Rachelly.

counts2.csv
output2.txt.txt

How to save `sim_results` object?

Hi,
I wanted to run the scdcdm model as a Python script, and I was wondering how I could save the simulation results so that I can inspect them later on. I haven't found a save/write function, and using arviz.to_netcdf() only allows saving the InferenceData, not the model results.
Help is much appreciated.
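
In the absence of a dedicated save function, a hedged workaround is to pickle the result object like any other Python object (whether all attributes survive pickling is an untested assumption):

    import pickle

    # save
    with open("sccoda_result.pkl", "wb") as f:
        pickle.dump(sim_results, f)

    # load
    with open("sccoda_result.pkl", "rb") as f:
        sim_results = pickle.load(f)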

WARNING:tensorflow:@custom_gradient grad_fn has 'variables' in signature, but no ResourceVariables were used on the forward pass.

Dear scCODA team,

Thanks a lot for this package; it's turning out to be very useful for my snRNA-seq analysis!

I am getting the same error mentioned in this other issue (#40); however, I have the versions of tensorflow and tensorflow-probability required by the scCODA package installed:

In [3]: import tensorflow

In [4]: tensorflow.__version__
Out[4]: '2.8.0'
In [6]: import tensorflow_probability

In [7]: tensorflow_probability.__version__
Out[7]: '0.16.0'

Any ideas what's going on, or whether it has any negative effect on the results?

Thanks!
