Comments (25)
hi @saramoein372, that error indicates you may have an incompatible rpy2/R setup: it is either trying to open R twice or it cannot find R.
I'm not sure what the best way to sort that out is, but the workaround for now is to save the airr file out:
# in python
# write out the airr file
vdj.write_airr('airr.tsv')
and then load airr.tsv in R/RStudio separately, and follow the tutorial at: https://shazam.readthedocs.io/en/stable/vignettes/DistToNearest-Vignette/
When you eventually get to the step which says:
# in R
# Find threshold using density method
output <- findThreshold(dist_ham$dist_nearest, method="density")
threshold <- output@threshold
threshold
Note down the threshold value; that is what you supply to the vdj object:
vdj.threshold = threshold_value
Note that the threshold value is only used by the ddl.tl.define_clones function, which is only a partial wrapper for the actual DefineClones.py script used in changeo. So you could follow changeo's tutorial on how they define clones, and then feed the processed file back to dandelion for the other bits.
from dandelion.
Thanks Kelvin.
I also have another question about
sc.set_figure_params(figsize = [4,4])
ddl.pl.clone_network(adata,
                     color = ['sampleid'],
                     edges_width = 1,
                     size = 20)
I get an error:
KeyError: "Could not find entry in obsm for 'vdj'.\nAvailable keys are: ['X_pca', 'X_umap']."
What is the solution for this error? Thank you.
from dandelion.
KeyError: "Could not find entry in obsm for 'vdj'.\nAvailable keys are: ['X_pca', 'X_umap']."
You need to run ddl.tl.generate_network(vdj) and ddl.tl.transfer(adata, vdj) first.
from dandelion.
Thanks Kelvin.
Sorry if this is not a related question; however, I am having a lot of trouble installing shazam in R. I have tried to fix my installation in different ways, but still could not solve it.
The last error I am getting is:
...............................................
library(shazam)
Error: package or namespace load failed for ‘shazam’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘GenomicAlignments’
................................................
Do you have any thoughts or comments in this regard? I know it is not related to the dandelion package, but I thought maybe you had experienced the same troubles with the shazam installation.
Thanks,
Sara
from dandelion.
No worries. It looks like you need to run BiocManager::install('GenomicAlignments') for this.
from dandelion.
Thanks Kelvin for your help. I am still getting an error. I have been trying to install this package since this morning!
Again the error:
library(shazam)
Error: package or namespace load failed for ‘shazam’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘GenomicAlignments’
May I ask what R version you are using?
Thanks,
Sara
from dandelion.
I use R 4.0.3. You might need to update your R version if you are using <4.
from dandelion.
Thanks Kelvin.
I have another question: after assigning vdj.threshold = 0.27, can I immediately run "ddl.tl.generate_network(vdj)"?
I tried running that but I get the error "ValueError: key sequence_alignment_aa not found in input table."
Is it because I have not run these commands first "ddl.pp.calculate_threshold(vdj, manual_threshold = 0.1)
ddl.tl.define_clones(vdj, key_added = 'changeo_clone_id')"?
from dandelion.
Is it because I have not run these commands first "ddl.pp.calculate_threshold(vdj, manual_threshold = 0.1)
ddl.tl.define_clones(vdj, key_added = 'changeo_clone_id')"?
Partly yes. For ddl.tl.generate_network to work, it needs the clone_id and a couple of other columns in the object. You can refer to https://sc-dandelion.readthedocs.io/en/latest/modules/dandelion.tools.generate_network.html for more details.
I tried running that but I get the error "ValueError: key sequence_alignment_aa not found in input table."
For this, did you go through the entire preprocessing workflow outlined in the tutorial? Are you analyzing TCR or BCR? There's a bug in the version (0.1.9) on PyPI in that the aa columns were not included after the TCR realignment step, but this is corrected in the master version here on GitHub. If you reinstall the master version, or use the singularity container to handle the preprocessing, that should hopefully get rid of the issue. Alternatively, you can supply another key to the function, like sequence_alignment, sequence, junction etc.
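One quick way to see which of these columns a given table actually has is to read only its header. This is a minimal sketch using just the Python standard library; the helper name available_keys and the tiny in-memory table are made up for illustration, and the candidate list simply repeats the column names mentioned above.

```python
import csv
import io

# Columns mentioned in this thread as possible keys.
CANDIDATE_KEYS = ["sequence_alignment_aa", "sequence_alignment", "sequence", "junction"]


def available_keys(tsv_handle, candidates=CANDIDATE_KEYS):
    """Return the candidate columns present in the header of a tab-separated table."""
    header = next(csv.reader(tsv_handle, delimiter="\t"), [])
    return [key for key in candidates if key in header]


# Hypothetical two-line table standing in for a real AIRR file.
example = io.StringIO(
    "sequence_id\tsequence\tjunction\tcell_id\n"
    "cellA_contig_1\tACGT\tTGTGCG\tcellA\n"
)
print(available_keys(example))
```

For a real file, pass an open file handle (for example open('filtered_contig_igblast_db-pass_genotyped.tsv', newline='')) instead of the StringIO object.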
from dandelion.
Thanks Kelvin.
I have BCR. I did the preprocessing. I have the filtered_contig_igblast_db-pass_genotyped.tsv.
Then from which step should I use this file for my analysis? I think that from some step onwards I have been using the wrong tsv file.
Should I use filtered_contig_igblast_db-pass_genotyped.tsv for all my analysis instead of "airr_rearrangement.tsv"?
Thanks,
Sara
from dandelion.
Yes, use filtered_contig_igblast_db-pass_genotyped.tsv.
airr_rearrangement.tsv is from 10x's cellranger output, and it is missing a lot of the columns that dandelion uses by default. It's still usable but would require changing the default options. You can look at the API page for more information: https://sc-dandelion.readthedocs.io/en/latest/api.html
from dandelion.
Thanks Kelvin.
I got past that part. Currently, I am running the "Interoperability with scirpy" section.
Now I am blocked at this step:
irdatax = ddl.to_scirpy(vdj, transfer = True)
irdatax
Getting error:
ValueError: edges=True requires pp.neighbors to be run before.
Do you have any idea how to solve it?
I really appreciate your time.
Sara
from dandelion.
ValueError: edges=True requires pp.neighbors to be run before.
Something like sc.pp.neighbors(adata) should work.
from dandelion.
Thank you so much Kelvin. I could run most of the pipeline in the BCR visualization step. The problem is that my clone network plots are all empty!
I think my adata has a problem.
For vdj, i used:
file_location = '/Users/saramoein/Downloads/BCR1/filtered_contig_igblast_db-pass_genotyped.tsv'
vdj = ddl.read_10x_airr(file_location)
But for adata, I used what I had generated from the previous cell ranger step, which is probably not correct.
Would you please tell me how to get the adata related to BCR, in order to make sure it works for BCR visualization?
Thanks,
Sara
from dandelion.
Just to complete my last message, I used "adata = sc.read_h5ad('/Users/saramoein/downloads/BCR1/adata2.h5ad')"
to get the adata. But that is probably not correct.
from dandelion.
Hmm, can you check if the cell barcodes are correct?
The indices in vdj.metadata and adata.obs should be the same. If not, the transfer would not happen properly.
Can you print out for me what they look like here?
from dandelion.
Thanks. I see that the vdj.metadata indices start with BCR (for example BCR_AAACCTGTCACTGGGC), but for adata they have no "BCR" (for example AAACCTGCAAAGCAAT-1). So I should fix this part. But the other question is that vdj.metadata has 773 rows, while adata.obs has 13800 rows. Can that be a problem?
from dandelion.
Should be fine. The BCR data can be a subset of the adata, but the barcodes must exist in the adata object.
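A quick way to check that requirement is to count how many VDJ barcodes are actually present among the adata barcodes. The sketch below uses plain Python lists; the helper name barcode_overlap is made up, and the barcodes are illustrative values written in the two formats quoted above. In practice the inputs would be vdj.metadata.index and adata.obs_names.

```python
def barcode_overlap(vdj_barcodes, adata_barcodes):
    """Split the VDJ barcodes into those found in, and missing from, the adata barcodes."""
    adata_set = set(adata_barcodes)
    found = [bc for bc in vdj_barcodes if bc in adata_set]
    missing = [bc for bc in vdj_barcodes if bc not in adata_set]
    return found, missing


# Illustrative barcodes: a 'BCR_' prefix on one side, a '-1' suffix on the other.
vdj_barcodes = ["BCR_AAACCTGTCACTGGGC"]
adata_barcodes = ["AAACCTGCAAAGCAAT-1", "AAACCTGTCACTGGGC-1"]

found, missing = barcode_overlap(vdj_barcodes, adata_barcodes)
print(len(found), "found,", len(missing), "missing")
```

If every VDJ barcode ends up in the missing list, the two objects are using different naming conventions, which would explain a transfer that does not happen properly.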
from dandelion.
Hi Kelvin,
I have a question which probably looks basic, but I am stuck and cannot solve it. I am trying to remove the "BCR_" prefix in vdj.metadata, i.e. replace it with an empty string.
I am using the str.replace function, but it looks like the change is not permanent.
When I replace, the index changes, but after a new operation on "vdj" I see the original indexes again.
Can I ask for your help with what command I should use to replace my indexing: for example, replacing "BCR_TTTGGTTGTAGGCATG" with "TTTGGTTGTAGGCATG".
I used vdj.metadata.index = vdj.metadata.index.str.replace('BCR_',''). It works until I call a new function on vdj.
So it looks like my replace command is not working permanently.
Do you have any idea how to replace the index with other characters?
from dandelion.
Hi Sara,
you should change vdj.data['cell_id'] rather than vdj.metadata.index. That should solve your problem.
i would do:
temp = vdj.data.copy()
temp['cell_id'] = temp['cell_id'].str.replace('BCR_', '', regex=False)  # e.g. strip the 'BCR_' prefix; adapt to your barcodes
new_vdj = ddl.Dandelion(temp)
The reason is that the .metadata slot updates from the .data slot with every function call from dandelion.
What I normally do is ensure that my cell barcodes align right at the beginning of the preprocessing, hence the function ddl.pp.format_fastas. If you're not starting from the realignment, but from an airr table sourced from somewhere else, then you have to alter the cell_id column before loading it up in Dandelion.
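To illustrate why an edit to the metadata index alone does not survive, here is a toy model of that behaviour. It is not dandelion's code: the class ToyContainer and its methods are invented for this sketch, and the only assumption carried over is the one stated above, that metadata is rebuilt from data on every call.

```python
class ToyContainer:
    """Toy stand-in (not dandelion's code): metadata is rebuilt from data on every call."""

    def __init__(self, data):
        self.data = data  # list of dicts, one per contig
        self.metadata_index = []
        self.update_metadata()

    def update_metadata(self):
        # Derive the per-cell index from the 'cell_id' values in data.
        self.metadata_index = sorted({row["cell_id"] for row in self.data})

    def some_tool(self):
        # Any tool call refreshes metadata, discarding manual edits to it.
        self.update_metadata()


toy = ToyContainer([{"cell_id": "BCR_TTTGGTTGTAGGCATG"}])

# Editing only the derived index does not stick ...
toy.metadata_index = [x.replace("BCR_", "") for x in toy.metadata_index]
toy.some_tool()
print(toy.metadata_index)  # ['BCR_TTTGGTTGTAGGCATG']

# ... whereas editing the underlying data does.
for row in toy.data:
    row["cell_id"] = row["cell_id"].replace("BCR_", "")
toy.some_tool()
print(toy.metadata_index)  # ['TTTGGTTGTAGGCATG']
```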
from dandelion.
Thanks Kelvin for your response. I used singularity for the whole preprocessing step, and now need to fix the cell ids in adata and vdj.
Although you kindly wrote the code for me, I am still confused.
Is it vdj['cell_id'] that I should match with adata['cell_id']?
My vdj.data.cell_id values are
like "BCR_AAACCTGTCACTGGGC_contig_2 BCR_AAACCTGTCACTGGGC".
My adata.obs.index values are like "AAACCTGAGAACAATC-1".
How should I match the vdj and adata barcodes? I would really appreciate your help. Thank you.
from dandelion.
I think the easiest solution is to change the index names in adata.obs_names first, then make sure that your vdj.data['cell_id'] matches adata.obs_names.
I think a simple fix would be to do the following:
# now, in your anndata object:
adata.obs_names = ['BCR_' + x.split('-')[0] for x in adata.obs_names]
# then do what i describe above for the vdj.data['cell_id']
If that doesn't fix it, then when you run singularity run -B $PWD /path/to/sc-dandelion_latest.sif dandelion-preprocess, don't use the --meta option, as the current default behavior is that it will tag with the sample names (folder names) if more than one sample was provided.
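The renaming suggested above can be tried on a couple of barcodes before touching the real object. This is a small sketch in plain Python; the helper name to_vdj_style is made up, and the two barcodes are the adata-style examples quoted in this thread.

```python
def to_vdj_style(barcode, prefix="BCR_"):
    """Drop the '-1' style suffix and add the prefix, as in the list comprehension above."""
    return prefix + barcode.split("-")[0]


adata_names = ["AAACCTGAGAACAATC-1", "AAACCTGCAAAGCAAT-1"]
renamed = [to_vdj_style(x) for x in adata_names]
print(renamed)

# Renaming must not merge two cells into one name,
# e.g. when barcodes from different samples differ only by their suffix.
assert len(set(renamed)) == len(renamed), "renaming produced duplicate barcodes"
```

The uniqueness check matters mainly when several samples are combined, since dropping the suffix can then make two different cells share one barcode.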
from dandelion.
Thank you Kelvin for the great help!
Regarding the adata for BCR: the current file I have (sample_feature_bc_matrix.h5) was obtained from cell ranger and includes all cell types, including those with BCR. Considering I am doing BCR analysis, should I first filter sample_feature_bc_matrix.h5 based on the BCR cell barcodes?
from dandelion.
I think you will find that the BCR coverage is not as high as the transcriptome data, so I would keep it all in. If there are no further questions on the original question, I will close this issue, as the rest of your problems sound like general python/pandas-related queries.
from dandelion.
Thanks. Sure, close it.
from dandelion.