dendroulab / panpipes
Multi-modal single cell analysis pipelines
License: BSD 3-Clause "New" or "Revised" License
Hi,
a small detail I noticed in panpipes_clustering:
When calculating neighbors (that is, with use_existing=False), the pipeline recalculates the PCA if it is not present, but not the LSI (see https://github.com/DendrouLab/panpipes/blob/main/panpipes/python_scripts/rerun_find_neighbors_for_clustering.py#L53).
It is possible to run neighbors on LSI, but only if the LSI is already present in the object. Otherwise an error is thrown in https://github.com/DendrouLab/panpipes/blob/main/panpipes/funcs/scmethods.py#L267:
ValueError: Did not find X_lsi in .obsm.keys(). You need to compute it first.
A small enhancement would be to also recalculate the LSI when dim_red is X_lsi and it is not present in the object, the same way it is already done for PCA.
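The suggested behaviour could be sketched roughly as follows. This is only an illustration: `ensure_dimred` and `compute_fns` are hypothetical names, not part of panpipes, and a plain dict stands in for `adata.obsm`. In the real script the callables would wrap whatever PCA and LSI calls the pipeline already uses.

```python
from typing import Callable, Dict, MutableMapping


def ensure_dimred(
    obsm: MutableMapping[str, object],
    dim_red: str,
    compute_fns: Dict[str, Callable[[], object]],
) -> bool:
    """Compute the requested embedding only if it is missing.

    Returns True when the embedding had to be computed.
    """
    if dim_red in obsm:
        return False
    if dim_red not in compute_fns:
        raise ValueError(
            f"Did not find {dim_red} in .obsm.keys() and no method is registered to compute it."
        )
    obsm[dim_red] = compute_fns[dim_red]()
    return True


# Stand-in for adata.obsm; real code would pass adata.obsm itself.
obsm = {"X_pca": [[0.1, 0.2]]}
compute_fns = {
    "X_pca": lambda: [[0.0, 0.0]],  # placeholder for the existing PCA call
    "X_lsi": lambda: [[1.0, 2.0]],  # placeholder for an LSI call
}
computed = ensure_dimred(obsm, "X_lsi", compute_fns)
```

With this shape, PCA and LSI go through the same "compute if absent" path, and an existing embedding is never overwritten.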
Hi,
Currently, if both clr and dsb are run on the ADT modality, PCA is always run on dsb. Perhaps this could be parametrised so that the user can decide whether PCA is run on clr or dsb. If this is inconvenient, it should be made clearer in the pipeline.yml so users are aware that when running dsb, PCA is always based on dsb, as this also affects downstream tasks.
Secondly, based on the recent single cell best practices book, removing the isotypes when doing dimensionality reduction might be a sensible choice. Since not everyone might want to do this, perhaps this choice could also be parametrised as an option in the pipeline.yml for the panpipes preprocess workflow.
best,
Devika
pipeline_ingest.concat_filtered_mudatas part of the pipeline throws an error because of missing pytz.
ERROR main control -
Original exception:
Exception #1
'builtins.OSError(Job 29542491 has non-zero exitStatus 1: hasExited=True, wasAborted=FalsehasSignal=False, terminatedSignal=''
Traceback (most recent call last):
File "[path]/envs/panpipes/lib/python3.9/site-packages/panpipes/python_scripts/concat_adata.py", line 1, in <module>
import scanpy as sc
File "[path]/envs/panpipes2/lib/python3.9/site-packages/scanpy/__init__.py", line 6, in <module>
from ._utils import check_versions
File "[path]/envs/panpipes2/lib/python3.9/site-packages/scanpy/_utils/__init__.py", line 21, in <module>
from anndata import AnnData, __version__ as anndata_version
File "[path]/envs/panpipes2/lib/python3.9/site-packages/anndata/__init__.py", line 7, in <module>
from ._core.anndata import AnnData
File "[path]/envs/panpipes2/lib/python3.9/site-packages/anndata/_core/anndata.py", line 21, in <module>
import pandas as pd
File "[path]/envs/panpipes2/lib/python3.9/site-packages/pandas/__init__.py", line 16, in <module>
raise ImportError(
ImportError: Unable to import required dependencies:
pytz: No module named 'pytz'
)' raised in ...
Task = def pipeline_ingest.concat_filtered_mudatas(...):
Traceback (most recent call last):
File "[path]/envs/panpipes/lib/python3.9/site-packages/ruffus/task.py", line 712, in run_pooled_job_without_exceptions
return_value = job_wrapper(params, user_defined_work_func,
File "[path]/envs/panpipes/lib/python3.9/site-packages/ruffus/task.py", line 545, in job_wrapper_io_files
ret_val = user_defined_work_func(*params)
File "[path]/envs/panpipes/lib/python3.9/site-packages/panpipes/panpipes/pipeline_ingest.py", line 178, in concat_filtered_mudatas
P.run(cmd, **job_kwargs)
File "[path]/envs/panpipes/lib/python3.9/site-packages/cgatcore/pipeline/execution.py", line 1244, in run
benchmark_data = r.run(statement_list)
File "[path]/envs/panpipes/lib/python3.9/site-packages/cgatcore/pipeline/execution.py", line 820, in run
stdout, stderr, resource_usage = self.queue_manager.collect_single_job_from_cluster(
File "[path]/envs/panpipes/lib/python3.9/site-packages/cgatcore/pipeline/cluster.py", line 145, in collect_single_job_from_cluster
raise OSError(error_msg)
OSError: Job 29542491 has non-zero exitStatus 1: hasExited=True, wasAborted=FalsehasSignal=False, terminatedSignal=''
Traceback (most recent call last):
File "[path]/envs/panpipes/lib/python3.9/site-packages/panpipes/python_scripts/concat_adata.py", line 1, in <module>
import scanpy as sc
File "[path]/envs/panpipes2/lib/python3.9/site-packages/scanpy/__init__.py", line 6, in <module>
from ._utils import check_versions
File "[path]/envs/panpipes2/lib/python3.9/site-packages/scanpy/_utils/__init__.py", line 21, in <module>
from anndata import AnnData, __version__ as anndata_version
File "[path]/envs/panpipes2/lib/python3.9/site-packages/anndata/__init__.py", line 7, in <module>
from ._core.anndata import AnnData
File "[path]/envs/panpipes2/lib/python3.9/site-packages/anndata/_core/anndata.py", line 21, in <module>
import pandas as pd
File "[path]/envs/panpipes2/lib/python3.9/site-packages/pandas/__init__.py", line 16, in <module>
raise ImportError(
ImportError: Unable to import required dependencies:
pytz: No module named 'pytz'
I removed lines of the log containing individual .h5mu files to protect patient information and replaced my cluster paths with [path]. After conda install -c anaconda pytz the error persists.
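The traceback lists the script under envs/panpipes but the libraries under envs/panpipes2, so one thing worth ruling out is that pytz was installed into a different environment from the one the job actually runs in. A small check, run with the same interpreter the pipeline uses, might look like this (a sketch only; `report_module` is a hypothetical helper, not a panpipes function):

```python
import importlib.util
import sys


def report_module(name: str) -> dict:
    """Report which interpreter is running and whether it can find `name`."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None) if spec else None,
    }


info = report_module("pytz")
print(info)
```

If the reported interpreter path is not inside the environment where pytz was installed, the install went to the wrong place.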
Hi,
I have noticed that violin plots are no longer being drawn for the data. The files are generated and there is a border, but I don't actually see any violins being plotted. This has only started happening recently.
Best,
Devika
Hi
I have installed panpipes using a Python venv on the Oxford BMRC cluster.
The modules I have loaded are
Python/3.10.4-GCCcore-11.3.0 R-bundle-Bioconductor/3.15-foss-2022a-R-4.2.1
I am using muon version 0.1.5, mudata 0.2.3 and mofapy2 0.7.0.
The model training converges, but the pipeline fails with this error (attached screenshot)
Thanks,
Devika
Hi,
I noticed that the preprocessing/QC part of the pipeline doesn't provide a plot that could guide the decision on whether to exclude the first LSI component or not.
The signac package provides a plot of the correlation between the sequencing depth and the components: https://stuartlab.org/signac/reference/depthcor. Including this plot in the pipeline may be a nice extension.
These lines are unnecessary:
The yml for pipeline_preprocess is also wrong about how to name files that have already been filtered.
The correct thing to do is name your file {sample_prefix}.h5mu
message came up while running find cluster markers. @crichgriffin check installation requirements please!
File "/Users/fabiola.curion/Documents/devel/github/panpipes/panpipes/python_scripts/run_find_markers_multi.py", line 213, in <module>
main(adata,
File "/Users/fabiola.curion/Documents/devel/github/panpipes/panpipes/python_scripts/run_find_markers_multi.py", line 183, in main
with pd.ExcelWriter(excel_file_top) as writer:
File "/Users/fabiola.curion/Documents/devel/miniconda3/envs/pipeline_bbknn/lib/python3.9/site-packages/pandas/io/excel/_openpyxl.py", line 56, in __init__
from openpyxl.workbook import Workbook
ModuleNotFoundError: No module named 'openpyxl'
Hi,
Currently the documentation here: https://panpipes-pipelines.readthedocs.io/en/latest/workflows/qc.html says panpipes ingestion, panpipes ingestion config and panpipes ingestion make full. We tried running this today and it didn't work; it only worked with panpipes ingest make full. It would be good to know which one is intended: is the documentation wrong, or does the code need to be updated?
Devika
check downsample background in dsb script
Originally posted by @bio-la in #6 (review)
In the current version of the integration workflow, if wnn is run on modalities without batch correction, it runs neighbours on each modality on the fly in a "no_batch" way (i.e. on a precomputed dimred such as PCA or LSI, if specified), with the same parameters as specified for each of the no_batch unimodal analyses.
This differs from the behaviour when wnn is calculated on batch-corrected unimodal data, because in that case the pipeline expects each batch-corrected object to exist, and this is correctly reflected in the decorators flow.
We need to modify wnn to fetch the precomputed no_batch results instead of running on the fly, to reduce the runtime (it currently runs no_batch twice per modality if wnn is called on no_batch).
I am aware that pip install . does not work as intended; fixing it will require a restructure of the repo.
Current alternative method of installation is as follows
pip install -r requirements_minimal.txt
Rscript r_install_libraries.R
python setup_orig.py develop
Hopefully this will get fixed up in the next couple of days!
currently in branch fc_namescheck
Hello! Am having some issues with preprocessing atac data (paired multiome). I am trying to perform preprocessing to be able to run harmony for batch correction.
The pipeline.yml settings are as follows:
atac:
binarize: False
normalize: log1p
Arguments appear to be read in correctly when running the pipeline:
pid: 45740, system: Linux 3.10.0-1160.62.1.el7.x86_64 #1 SMP Tue Apr 5 16:57:59 UTC 2022 x86_64
2023-09-01 17:42:35,606 INFO main control - atac : {'binarize': False, 'normalize': 'log1p', 'TFIDF_flavour': None, 'feature_selection_flavour': 'scanpy', 'min_mean': None, 'max_mean': None, 'min_disp': None, 'min_cutoff': None, 'dimred': 'PCA', 'dim_remove': None}
atac_TFIDF_flavour : None
atac_binarize : False
atac_dim_remove : None
atac_dimred : PCA
atac_feature_selection_flavour : scanpy
atac_normalize : log1p
But I still have the preprocess log outputs as:
2023-09-01 18:06:29,095: INFO - running with args:
2023-09-01 18:14:14,192: INFO - binarizing peak count matrix
2023-09-01 18:14:15,416: WARNING - Careful, you have decided to binarize data but also to normalize per cell and log1p. Not sure this is meaningful
2023-09-01 18:14:45,824: WARNING - You have 8984 Highly Variable Features
2023-09-01 18:38:51,939: INFO - Done
Is there any other variable causing the atac processing to default to binarizing?
Thank you!
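One possible cause, not confirmed against the panpipes source, is that the value reaches the preprocessing script as the string "False", which Python treats as truthy. The sketch below shows the pitfall and a hypothetical `str2bool` converter (the name and the accepted spellings are assumptions for illustration):

```python
def str2bool(value) -> bool:
    """Convert common textual spellings of a boolean to a real bool."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in {"true", "t", "yes", "y", "1"}:
        return True
    if text in {"false", "f", "no", "n", "0", "none", ""}:
        return False
    raise ValueError(f"Cannot interpret {value!r} as a boolean")


# A non-empty string is always truthy, so a naive check would binarize anyway.
naive = bool("False")
parsed = str2bool("False")
```

If the script compares or casts the argument without a conversion like this, `binarize: False` in the yml could still take the binarizing branch.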
hiya!
I noticed this while looking at plots for the different covariates across the multiple batch correction methods, when running the integration workflow from panpipes. The same colours are not used to depict the same legend categories across all the methods (in my case I noticed it for VDJ receptor subtypes), and when facet plots are created, the order of the headings is different for each method. The latter isn't a problem as such, but it does make it difficult to compare across methods in a facet plot. I'm not sure whether the first plotting issue happens for all covariates or only certain types. Thought I would flag the issue.
best,
Devika
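One way to keep colours and facet order stable across methods is to derive both from a sorted list of categories rather than from the order in which they appear in each object. A minimal sketch, where `fixed_palette`, the colour list and the category names are all made up for illustration:

```python
from typing import Dict, Iterable, Sequence

DEFAULT_COLORS = ("#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b")


def fixed_palette(
    categories: Iterable[str], colors: Sequence[str] = DEFAULT_COLORS
) -> Dict[str, str]:
    """Map each category to a colour, independent of the order it appears in."""
    ordered = sorted(set(categories))
    return {cat: colors[i % len(colors)] for i, cat in enumerate(ordered)}


# Two methods may list the same categories in a different order.
method_a = ["TRA+TRB", "IGH+IGK", "no_receptor"]
method_b = ["no_receptor", "TRA+TRB", "IGH+IGK"]
palette_a = fixed_palette(method_a)
palette_b = fixed_palette(method_b)
```

Both palettes are identical, so the same category gets the same colour in every plot. seaborn plotting functions accept such a dict as `palette`, and the sorted keys can be reused as `col_order` for facets.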
Hi,
while running the pipeline (the QC) on my local machine (Linux) for the first time, I encountered the following error:
/bin/bash: line 1: time: command not found
I solved the issue by following: "https://superuser.com/questions/418325/sh-time-command-not-found" and installing the "time" package on my Linux machine.
Would be good to add the "time" package to the requirements or to mention it as required.
Best,
Sarah
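Until this is documented, a pre-flight check along these lines could report missing system tools before any job is submitted. This is a sketch only; `missing_tools` is a hypothetical helper and the list of tools to check is an assumption.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the names of command-line tools that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# "time" here means the external binary (e.g. /usr/bin/time), not the shell keyword.
missing = missing_tools(["time"])
if missing:
    print("Missing system tools: " + ", ".join(missing))
```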
At the end of the second step of the installation guide, the error "getting requirements to build wheel did not run successfully" pops up. The underlying problem seems to be downloading the pysam package, which from what I have read is only a problem on Windows.
Some of the plots in the preprint have not been implemented fully yet.
We should do this at some point.
Clustering: Make it clearer that if you want to subcluster, you need to re-run preprocess and integration before clustering.
Repertoire: There is no panpipes documentation on what gets incorporated. There doesn't seem to be a column for whether a sequence is productive or not?
Hi,
again just a few little things I ran into while running the steps Preprocessing, Integration, and Clustering.
Preprocessing:
Integration:
Clustering:
Best,
Sarah
When filtering to keep only the top HVGs, the output h5mu does not contain the variables associated with scaling (.var 'std' or 'mean') or PCA (.obsm X_pca), even though other outputs (output_pca.txt.gz, filtered_genes.tsv) indicate these steps are being run:
AnnData object with n_obs × n_vars = 370316 × 61860
obs: 'sample_id', 'doublet_scores', 'predicted_doublets', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'total_counts_hb', 'log1p_total_counts_hb', 'pct_counts_hb', 'total_counts_mt', 'log1p_total_counts_mt', 'pct_counts_mt', 'total_counts_rp', 'log1p_total_counts_rp', 'pct_counts_rp', 'total_counts_ig', 'log1p_total_counts_ig', 'pct_counts_ig', 'MarkersNeutro_score', 'S_score', 'G2M_score', 'batch'
var: 'gene_ids', 'feature_types', 'genome', 'interval', 'hb', 'mt', 'rp', 'ig', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm'
uns: 'hvg', 'log1p'
layers: 'raw_counts'
When not filtering hvgs, this is not an issue.
Hi,
I noticed that Muon's implementation of the LSI for ATAC data doesn't take highly variable features into account.
As far as I see, the function takes the adata.X slot without providing the possibility to first select specific features for the LSI to be run on.
Please see their source code: https://github.com/scverse/muon/blob/master/muon/_atac/tools.py#L28
Also in their documentation, there is no such parameter: https://muon.readthedocs.io/en/latest/api/generated/muon.atac.tl.lsi.html
Meaning, when running LSI in panpipes in run_preprocess_atac.py, it is always run on all the features, even if atac.var["highly_variable"] is defined.
Please correct me if I'm wrong.
Hi,
the following may be possible extensions to panpipes in regards to scATAC-data:
Preprocessing:
Visualization:
Analysis:
Not sure how deep panpipes wants to go on the analysis part, but:
RNA+ATAC:
Hiya
Currently the clustering workflow can calculate prot markers for the different RNA leiden resolutions. However, this function never finishes calculating the prot markers if the number of samples/cells in rna is not the same as in the prot modality.
Hi,
there are some aspects I've noticed while using the QC+Preprocessing for the first time with a RNA+ATAC multiome dataset (filtered_feature_bc_matrix.h5 file):
Sample submission file: unclear to me what is meant by the cellranger "outs" folder in regards to the keys "cellranger" and "cellranger_multi". What files are expected to be in the outs folder? (The barcodes.tsv, genes.tsv and matrix.mtx f.ex.?)
Regarding the QC_mm gene lists: didn't know before running the pipeline that one has to provide a list & that it's not an option, as the documentation of the gene list formats states "...,the user can provide custom gene lists..."
Regarding the QC pipeline.yml file:
Regarding the output of the QC:
I am running the ingest workflow on multiome data. However, panpipes ingest aborts with an error when it reaches the 'run_scanpyQC_atac.py' script. The error I get is:
sc.pl.violin(atac, qc_vars_plot,
File "/miniconda3/envs/pipeline_env/lib/python3.9/site-packages/scanpy/plotting/_anndata.py", line 795, in violin
g = sns.catplot(
File "/seaborn/categorical.py", line 2932, in catplot
p.plot_violins(
File "/seaborn/categorical.py", line 1153, in plot_violins
self._configure_legend(ax, legend_artist, common_kws)
File "/seaborn/categorical.py", line 420, in _configure_legend
handles, _ = ax.get_legend_handles_labels()
AttributeError: 'NoneType' object has no attribute 'get_legend_handles_labels'
I am using panpipes to analyse CITE-seq data. In my submission file, the 'sample_id' column is 'Sample_587'. When I run the ingest workflow, I specify the 'dsb' normalization. However, the pipeline aborts at the 'assess_background.py' script with the following error:
sns.heatmap(plt_df.iloc[1:split_int,:], ax=ax[0])
File "/envs/pipeline_env/lib/python3.9/site-packages/seaborn/matrix.py", line 446, in heatmap
plotter = _HeatMapper(data, vmin, vmax, cmap, center, robust, annot, fmt,
File "/envs/pipeline_env/lib/python3.9/site-packages/seaborn/matrix.py", line 163, in __init__
self._determine_cmap_params(plot_data, vmin, vmax,
File "~/envs/pipeline_env/lib/python3.9/site-packages/seaborn/matrix.py", line 197, in _determine_cmap_params
calc_data = plot_data.astype(float).filled(np.nan)
ValueError: could not convert string to float: 'Sample_587'
Hi,
I ran the visualization part of the pipeline with a muData containing both scRNA- and scATAC- data. (Before the visualization step I ran the QC, Preprocessing, Clustering successfully).
When specifying both categorical & continuous variables for the RNA (not the ATAC, I left the ATAC part empty),
so f.ex. "rna:
- rna:total_counts",
it worked completely fine. But when only wanting to plot categorical variables and leaving the continuous variables empty, errors were thrown and the pipeline stopped. The parameter "continuous_violin" was set to False in both cases.
The errors included:
Error in mutate():
ℹ In argument: mod = ifelse(X1 %in% c("rna", "prot", "atac", "rep"), X1, "multimodal").
Caused by error in X1 %in% c("rna", "prot", "atac", "rep"):
! object 'X1' not found
when running:
Rscript /home/sarah/anaconda3/envs/pipeline_env/lib/python3.8/site-packages/panpipes/R_scripts/plot_metrics.R --mtd_object sample1_cell_metadata.tsv --params_yaml pipeline.yml > logs/plot_metrics.log
Do I need to specify the parameters in the pipeline.yml in a specific way so that it works? What do I have to consider when only wanting to plot categorical variables?
Thanks.
The documentation of the integration step (step 2) mentions a panpipes integration plot_pcas task, which doesn't exist in pipeline_integration.py. The PCA plots are already produced at the panpipes preprocess stage.
We have removed all scib metrics computation from the integration pipeline.
scib metrics were implemented with the scope of evaluating unimodal integration. The use of scib metrics for evaluating multimodal integration and reference mapping has been adopted by the community and can provide useful insights for evaluation of multimodal integration.
However there is currently a lack of benchmarking metrics developed specifically for the evaluation of these tasks, which can result in misleading interpretation of integration results.
We and others in the sc field are currently working on generating ad-hoc benchmarking metrics for these tasks and they will be released in the near future.
Therefore, our aim for the next panpipes release is to:
replace scib with the faster scib-metrics package wherever possible
We have left for now the calculation of scib metrics in the refmap workflow as a legacy example of how these are currently computed, but we will be refactoring them in due time.
If you feel you have ideas on implementing integration and/or refmap benchmarking metrics and want to contribute feel free to reach out!
Dear authors,
First of all, congrats on this amazing project.
I believe that moving dependency and environment management from venv to poetry (https://python-poetry.org/) may significantly increase the robustness of this tool.
Did conda install... and had igraph 0.10.2. When I tried to run ingest config, I got the following error: AttributeError: module 'igraph' has no attribute 'VertexClustering'
Solution: uninstall igraph and reinstall igraph 0.10.8 with pip. That worked.
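A quick way to spot this before running the workflow is to compare the installed igraph version against the one reported to work. A sketch under stated assumptions: treating 0.10.8 as a minimum is a guess based only on this report, and `parse_version` and `installed_version` are hypothetical helpers.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.10.8' into (0, 10, 8); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


# 0.10.8 is the version reported to work above; using it as a minimum is an assumption.
MIN_IGRAPH = (0, 10, 8)
found = installed_version("igraph")
if found is not None and parse_version(found) < MIN_IGRAPH:
    print(f"igraph {found} is older than the version reported to work")
```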
dsb does not run when half the samples are rna + adt, half the samples are rna only (and no intersection between rna,adt is taken).
fix is to take the intersection of the background: mu.pp.intersect_obs(mdata_bg) prior to mu.prot.pp.dsb
I got this error when running clustree: Error in check.length(gparname) : 'gpar' element 'lwd' must not be length 0
Solution: R package ggraph must be 2.1.0 (it won't work if ggraph is 2.0.5)
I am working with multiome data and preparing my submission file for the ingest workflow. Specifying the cellranger 'outs' folder as the x_path and 'cellranger' as the x_filetype results in an error. However, specifying the complete path, i.e. 'outs/filtered_feature_bc_matrix.h5', and filetype '10X_h5' solves the issue.
While running 'panpipes ingest make full --local' on my computer I get this error: AttributeError: 'YTick' object has no attribute 'label'.
I guess it has something to do with matplotlib.
Full error code:
Traceback (most recent call last):
File "/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/ruffus/task.py", line 712, in run_pooled_job_without_exceptions
return_value = job_wrapper(params, user_defined_work_func,
File "/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/ruffus/task.py", line 608, in job_wrapper_output_files
job_wrapper_io_files(params, user_defined_work_func, register_cleanup, touch_files_only,
File "/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/ruffus/task.py", line 540, in job_wrapper_io_files
ret_val = user_defined_work_func(*(params[1:]))
File "/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/panpipes/panpipes/pipeline_ingest.py", line 469, in run_dsb_clr
P.run(cmd, **job_kwargs)
File "/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/cgatcore/pipeline/execution.py", line 1244, in run
benchmark_data = r.run(statement_list)
File "/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/cgatcore/pipeline/execution.py", line 1029, in run
raise OSError(
OSError: ---------------------------------------
Child was terminated by signal -1:
The stderr was:
/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/scvi/_settings.py:63: UserWarning: Since v1.0.0, scvi-tools no longer uses a random seed by default. Run `scvi.settings.seed = 0` to reproduce results from previous versions.
self.seed = seed
/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/scvi/_settings.py:70: UserWarning: Setting `dl_pin_memory_gpu_training` is deprecated in v1.0 and will be removed in v1.1. Please pass in `pin_memory` to the data loaders instead.
self.dl_pin_memory_gpu_training = (
/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/muon/_prot/preproc.py:219: UserWarning: adata.X is sparse but not in CSC format. Converting to CSC.
warn("adata.X is sparse but not in CSC format. Converting to CSC.")
Traceback (most recent call last):
File "/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/panpipes/python_scripts/run_preprocess_prot.py", line 144, in <module>
pnp.plotting.ridgeplot(mdata["prot"], features=plot_features, layer="clr", splitplot=6)
File "/Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/panpipes/funcs/plotting.py", line 299, in ridgeplot
tick.label.set_fontsize(10)
AttributeError: 'YTick' object has no attribute 'label'
python /Users/justina/opt/anaconda3/envs/multiome_panpipes/lib/python3.9/site-packages/panpipes/python_scripts/run_preprocess_prot.py --filtered_mudata test_unfilt.h5mu --figpath ./figures/prot --channel_col sample_id --normalisation_methods clr --quantile_clipping True --clr_margin 0 > logs/run_dsb_clr.log
Matplotlib version: 3.8.0
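Matplotlib 3.8 no longer provides `Tick.label`; `Tick.label1` is the attribute to use instead. A version-tolerant helper could look like the sketch below. `set_tick_fontsize` is a hypothetical name, and the fake tick classes exist only so the snippet runs without Matplotlib.

```python
def set_tick_fontsize(tick, size: float) -> None:
    """Set the tick label font size on old and new Matplotlib versions."""
    # Newer Matplotlib exposes `label1`; older releases also had `label`.
    label = getattr(tick, "label1", None)
    if label is None:
        label = getattr(tick, "label")
    label.set_fontsize(size)


# Tiny stand-ins so the helper can be exercised without Matplotlib installed.
class _FakeLabel:
    def __init__(self):
        self.fontsize = None

    def set_fontsize(self, size):
        self.fontsize = size


class _NewTick:
    def __init__(self):
        self.label1 = _FakeLabel()


class _OldTick:
    def __init__(self):
        self.label = _FakeLabel()


new_tick, old_tick = _NewTick(), _OldTick()
set_tick_fontsize(new_tick, 10)
set_tick_fontsize(old_tick, 10)
```

The helper would replace the `tick.label.set_fontsize(10)` call in the ridgeplot function. Alternatively, `ax.tick_params(axis="y", labelsize=10)` sets the size without touching individual tick objects.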
Typo in step 6 of Steps to run here: https://panpipes-pipelines.readthedocs.io/en/latest/workflows/integration.html
Change panpipes integration make merge_batch_correction to panpipes integration make merge_integration
hi,
I am getting an error because panpipes qc_mm cannot find the script run_scanpyQC_rep.py. This is because the script in the python folder is named differently (run_scanpyQC_REP.py) from how it is called in pipeline_qc_mm.py on line [472].
I have made the necessary amendments locally for my needs , but thought to mention it here.
Devika
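Renaming the file or the call so they match is the simplest fix. As a defensive workaround, a lookup that falls back to a case-insensitive match could be sketched like this (`find_script` is a hypothetical helper, not part of panpipes):

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_script(script_dir: Path, name: str) -> Optional[Path]:
    """Find `name` in `script_dir`, falling back to a case-insensitive match."""
    exact = script_dir / name
    if exact.is_file():
        return exact
    wanted = name.lower()
    matches = [
        p for p in script_dir.iterdir() if p.is_file() and p.name.lower() == wanted
    ]
    # Only accept an unambiguous match.
    return matches[0] if len(matches) == 1 else None


# Reproduce the mismatch from this issue in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    script_dir = Path(tmp)
    (script_dir / "run_scanpyQC_REP.py").write_text("# placeholder\n")
    found = find_script(script_dir, "run_scanpyQC_rep.py")
    found_name = found.name if found else None
```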
running this selection on an atac object that doesn't have "gene_ids" column doesn't work.
@SarahOuologuem can it be substituted with "features" instead? (the function should test if "gene_ids" is present in var_names otherwise use features or else issue warning and automatically set hvf selection to "scanpy")
                            features                    n_cells_by_counts  mean_counts  pct_dropout_by_counts  total_counts
chr1-9962-10510             chr1-9962-10510             12                 0.005464     99.453552              12.0
chr1-180614-181999          chr1-180614-181999          65                 0.031876     97.040073              70.0
chr1-191356-191736          chr1-191356-191736          3                  0.001366     99.863388              3.0
chr1-267811-268201          chr1-267811-268201          13                 0.005920     99.408015              13.0
chr1-586031-586368          chr1-586031-586368          3                  0.001366     99.863388              3.0
...                         ...                         ...                ...          ...                    ...
KI270727.1-52104-52803      KI270727.1-52104-52803      59                 0.028689     97.313297              63.0
KI270728.1-232459-232988    KI270728.1-232459-232988    6                  0.002732     99.726776              6.0
KI270728.1-1791305-1792428  KI270728.1-1791305-1792428  9                  0.005009     99.590164              11.0
KI270734.1-117216-117331    KI270734.1-117216-117331    5                  0.002277     99.772313              5.0
KI270734.1-133749-134116    KI270734.1-133749-134116    8                  0.004098     99.635701              9.0
thank you!
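The fallback described in this issue could be sketched roughly as follows. `pick_feature_column` is a hypothetical helper, and "signac" is used here only as an assumed name for a non-scanpy selection flavour.

```python
import warnings
from typing import Iterable, Optional, Tuple


def pick_feature_column(
    var_columns: Iterable[str], flavour: str
) -> Tuple[Optional[str], str]:
    """Choose the feature-name column, or fall back to the 'scanpy' flavour."""
    columns = set(var_columns)
    for candidate in ("gene_ids", "features"):
        if candidate in columns:
            return candidate, flavour
    warnings.warn(
        "Neither 'gene_ids' nor 'features' found in .var; "
        "falling back to feature selection flavour 'scanpy'."
    )
    return None, "scanpy"


# Columns taken from the ATAC .var table in this issue.
atac_columns = [
    "features",
    "n_cells_by_counts",
    "mean_counts",
    "pct_dropout_by_counts",
    "total_counts",
]
column, flavour = pick_feature_column(atac_columns, "signac")
```

For the table in this issue the helper picks "features" and leaves the requested flavour unchanged; only objects with neither column are switched to "scanpy" with a warning.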
Currently yml says that bbknn works for protein, but isn't in the ruffus pipe.
Also we don't parse any parameters and only use the defaults.
I am using panpipes to analyze CITE-seq data, and managed to run the ingest workflow and get the resulting h5mu file. However, when I try to load the resulting 'x_unfilt.h5mu' file in a Jupyter notebook using 'muon.read_h5mu()', I get the error: TypeError: init_from_dict() got an unexpected keyword argument 'matrix'.
With a small number of samples, and when the number of features is 50, panpipes preprocess incorrectly establishes the number of PCs that should be calculated:
Changing this line to n_comps=min(50,all_mdata['prot'].var.shape[0]-1, all_mdata['prot'].var.shape[1]-1) fixes the issue.
I also suggest considering changing the solver to auto below a certain threshold of cells, as it's more robust (but slower).
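The underlying constraint can be sketched as a small helper, assuming the solver needs the number of components to be strictly smaller than both the number of cells and the number of features (as ARPACK-based solvers do). Note that `.var.shape[1]` counts the columns of the .var table rather than cells, so this sketch takes the two counts explicitly. `safe_n_comps` is a hypothetical name.

```python
def safe_n_comps(n_obs: int, n_vars: int, requested: int = 50) -> int:
    """Largest usable number of PCs: strictly below both matrix dimensions."""
    limit = min(n_obs, n_vars) - 1
    if limit < 1:
        raise ValueError("Need at least 2 cells and 2 features to run PCA")
    return min(requested, limit)


# 50 features and plenty of cells: 50 components would be one too many.
n_comps_many_cells = safe_n_comps(n_obs=5000, n_vars=50)
# Few cells: the cell count becomes the limiting dimension.
n_comps_few_cells = safe_n_comps(n_obs=30, n_vars=50)
```

In the preprocess script the two arguments would come from the prot modality's `n_obs` and `n_vars`.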