ImportError: Unable to initialise R instance. Please run this separately through R with Shazam's tutorial. about dandelion HOT 25 CLOSED

zktuong commented on September 17, 2024

ImportError: Unable to initialise R instance. Please run this separately through R with Shazam's tutorial.

from dandelion.

Comments (25)

zktuong commented on September 17, 2024

Hi @saramoein372, that error suggests an incompatible rpy2/R setup: either it is trying to open R twice, or it cannot find R.

I'm not sure what the best way to sort that out is, but the workaround for now is to save the AIRR file out:

# in python
# write out the airr file
vdj.write_airr('airr.tsv')

Then load airr.tsv in R/RStudio separately and follow the tutorial at: https://shazam.readthedocs.io/en/stable/vignettes/DistToNearest-Vignette/

When you eventually get to the step that says:

# in R
# Find threshold using density method
output <- findThreshold(dist_ham$dist_nearest, method="density")
threshold <- output@threshold
threshold

Note down the threshold value; that is what you supply to the vdj object:

vdj.threshold = threshold_value
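
One way to carry the value across is to have R write it to a small text file and read it back in Python. This is a minimal sketch, assuming a hypothetical file threshold.txt that contains only the number, and assuming the default length-normalised distances so that the value falls between 0 and 1. read_threshold is an illustrative helper, not part of dandelion or shazam:

```python
from pathlib import Path


def read_threshold(path):
    """Read a single distance threshold that was written out from R.

    Assumes the file holds one number on its first non-empty line
    (a hypothetical layout, not something shazam writes by itself).
    """
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:
            value = float(line)
            break
    else:
        raise ValueError(f"no threshold value found in {path}")
    # Length-normalised Hamming distances lie between 0 and 1, so anything
    # else most likely means the wrong file or value was supplied.
    if not 0.0 < value < 1.0:
        raise ValueError(f"threshold {value} is outside the expected (0, 1) range")
    return value


# Usage with the vdj object from this thread:
# vdj.threshold = read_threshold("threshold.txt")
```

The range check is only a sanity guard; drop it if you compute distances without normalisation.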

Note that the threshold value is only used for ddl.tl.define_clones function, which is only a partial wrapper for the actual DefineClones.py script used in changeo. So you could follow changeo's tutorial on how they define clones, and then feed the processed file back to dandelion for the other bits.


saramoein372 commented on September 17, 2024

Thanks Kelvin.

I also have another question about
sc.set_figure_params(figsize = [4,4])
ddl.pl.clone_network(adata,
color = ['sampleid'],
edges_width = 1,
size = 20)

I get an error:
KeyError: "Could not find entry in obsm for 'vdj'.\nAvailable keys are: ['X_pca', 'X_umap']."

What is the solution for this error? Thank you.


zktuong commented on September 17, 2024

KeyError: "Could not find entry in obsm for 'vdj'.\nAvailable keys are: ['X_pca', 'X_umap']."

You need to run ddl.tl.generate_network(vdj) and ddl.tl.transfer(adata, vdj) first.
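
A quick way to see whether those two steps have been run is to look at the keys in adata.obsm. This is a minimal sketch, assuming the layout is stored under 'vdj' or 'X_vdj' (the second name is an assumption based on the 'X_' naming visible in the error message). has_embedding is an illustrative helper, not a dandelion or scanpy function:

```python
def has_embedding(obsm_keys, basis="vdj"):
    """Return True if a layout named `basis` or `X_<basis>` is among the obsm keys."""
    keys = set(obsm_keys)
    return basis in keys or f"X_{basis}" in keys


# The keys listed in the error message above do not contain the vdj layout:
print(has_embedding(["X_pca", "X_umap"]))  # False

# In a real session:
# if not has_embedding(adata.obsm.keys()):
#     ddl.tl.generate_network(vdj)
#     ddl.tl.transfer(adata, vdj)
```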


saramoein372 commented on September 17, 2024

Thanks Kelvin.

Sorry if this is not a related question; however, I am having a lot of trouble installing Shazam in R. I have tried to fix my installation in different ways, but still could not solve it.
The last error I am getting is:
...............................................
library(shazam)
Error: package or namespace load failed for ‘shazam’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘GenomicAlignments’
................................................
Do you have any thoughts or comments in this regard? I know it is not related to the dandelion package, but I thought maybe you had experienced the same trouble with the "shazam" installation.

Thanks,
Sara


zktuong commented on September 17, 2024

No worries. It looks like you need to run BiocManager::install('GenomicAlignments') for this.


saramoein372 commented on September 17, 2024

Thanks Kelvin for your help. I am still getting the error. I have been trying to install this package since this morning!
The same error again:
library(shazam)
Error: package or namespace load failed for ‘shazam’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘GenomicAlignments’

May I ask which R version you are using?

Thanks,
Sara


zktuong commented on September 17, 2024

I use R 4.0.3. You might need to update your R version if you are using a version below 4.


saramoein372 commented on September 17, 2024

Thanks Kelvin.

I have another question: after assigning vdj.threshold = 0.27, can I immediately run "ddl.tl.generate_network(vdj)"?
I tried running that but I get the error "ValueError: key sequence_alignment_aa not found in input table."

Is it because I have not run these commands first "ddl.pp.calculate_threshold(vdj, manual_threshold = 0.1)
ddl.tl.define_clones(vdj, key_added = 'changeo_clone_id')"?


zktuong commented on September 17, 2024

Is it because I have not run these commands first "ddl.pp.calculate_threshold(vdj, manual_threshold = 0.1)
ddl.tl.define_clones(vdj, key_added = 'changeo_clone_id')"?

Partly yes. For ddl.tl.generate_network to work, it needs the clone_id and a couple of other columns in the object. You can refer to https://sc-dandelion.readthedocs.io/en/latest/modules/dandelion.tools.generate_network.html for more details.

I tried running that but I get the error "ValueError: key sequence_alignment_aa not found in input table."

For this, did you go through the entire preprocessing workflow outlined in the tutorial? Are you analyzing TCR or BCR? There's a bug in the version (0.1.9) on PyPI in that the aa columns were not included after the TCR realignment step, but this is corrected in the master version here on GitHub. If you reinstall the master version, or use the singularity container to handle the preprocessing, that should hopefully get rid of the issue. Alternatively, you can supply another key to the function, like sequence_alignment, sequence, junction etc.
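
To see which of those keys a given file can actually supply, it is enough to read the header row. This is a minimal sketch using only the standard library. available_keys is an illustrative helper rather than part of dandelion, and the candidate list simply repeats the column names mentioned above:

```python
import csv

# Column names mentioned in this thread as possible keys.
CANDIDATE_KEYS = ("sequence_alignment_aa", "sequence_alignment", "sequence", "junction")


def available_keys(tsv_path, candidates=CANDIDATE_KEYS):
    """Return the candidate column names that are present in the TSV header."""
    with open(tsv_path, newline="") as handle:
        # Only the first row is needed; an empty file yields an empty header.
        header = next(csv.reader(handle, delimiter="\t"), [])
    return [key for key in candidates if key in header]


# Example on the file written out earlier in the thread:
# available_keys("airr.tsv")
```

If sequence_alignment_aa is missing from the result, either fix the preprocessing as described or pass one of the keys that is present.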


saramoein372 commented on September 17, 2024

Thanks Kelvin.
I have BCR. I did the preprocessing. I have the filtered_contig_igblast_db-pass_genotyped.tsv.
From which step should I use this file for my analysis? I think that from some step onwards I have been using the wrong tsv file.
Should I use filtered_contig_igblast_db-pass_genotyped.tsv for all my analysis instead of "airr_rearrangement.tsv"?

Thanks,
Sara


zktuong commented on September 17, 2024

Yes, use filtered_contig_igblast_db-pass_genotyped.tsv.

airr_rearrangement.tsv comes from 10x's cellranger output and is missing many of the columns that dandelion uses by default. It's still usable but would require changing the default options. You can look at the API page for more information: https://sc-dandelion.readthedocs.io/en/latest/api.html


saramoein372 commented on September 17, 2024

Thanks Kelvin.

I got past that part. Currently, I am working through the Interoperability with scirpy.

Now I am blocked at this step:
irdatax = ddl.to_scirpy(vdj, transfer = True)
irdatax

Getting error:

ValueError: edges=True requires pp.neighbors to be run before.

Do you have any idea how to solve it?

I really appreciate your time.
Sara


zktuong commented on September 17, 2024

ValueError: edges=True requires pp.neighbors to be run before.

Something like sc.pp.neighbors(adata) should work.


saramoein372 commented on September 17, 2024

Thank you so much Kelvin. I could run most of the pipeline in the BCR visualization step. The problem is that my clone network plots are all empty!
I think my adata has a problem.
For vdj, I used:
file_location = '/Users/saramoein/Downloads/BCR1/filtered_contig_igblast_db-pass_genotyped.tsv'
vdj = ddl.read_10x_airr(file_location)

But for adata, I used what I had generated from the previous cellranger step, which is probably not correct.
Would you please tell me how to get the adata related to BCR, in order to make sure it works for BCR visualization?

Thanks,
Sara


saramoein372 commented on September 17, 2024

Just to complete my last message: I used "adata = sc.read_h5ad('/Users/saramoein/downloads/BCR1/adata2.h5ad')"
to get the adata, but that is probably not correct.


zktuong commented on September 17, 2024

Hmm, can you check if the cell barcodes are correct?

The indices in vdj.metadata and adata.obs should be the same. If not, the transfer would not happen properly.
Can you print out what they look like here?


saramoein372 commented on September 17, 2024

Thanks. I see that the vdj.metadata indices start with BCR (for example BCR_AAACCTGTCACTGGGC), but the adata ones have no "BCR" (for example AAACCTGCAAAGCAAT-1). So I should fix this part. But vdj.metadata has 773 rows, while adata.obs has 13800 rows. Can that be a problem?


zktuong commented on September 17, 2024

Should be fine. The BCR data can be a subset of the adata, but the barcodes must exist in the adata object.
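
A quick way to check that is to compare the two sets of barcodes directly. This is a minimal sketch in plain Python. barcode_overlap is an illustrative helper, and the second adata barcode in the example is made up to show a near-miss:

```python
def barcode_overlap(vdj_barcodes, adata_barcodes):
    """Summarise how many VDJ barcodes are also present in the AnnData object."""
    vdj_set = set(vdj_barcodes)
    adata_set = set(adata_barcodes)
    shared = vdj_set & adata_set
    return {
        "n_vdj": len(vdj_set),
        "n_adata": len(adata_set),
        "n_shared": len(shared),
        "missing_from_adata": sorted(vdj_set - adata_set),
    }


# With the barcode formats reported above, nothing matches yet:
report = barcode_overlap(
    ["BCR_AAACCTGTCACTGGGC"],
    ["AAACCTGCAAAGCAAT-1", "AAACCTGTCACTGGGC-1"],
)
print(report["n_shared"])  # 0

# In a real session:
# barcode_overlap(vdj.metadata.index, adata.obs_names)
```

The size difference between the two objects is not an issue by itself; what matters is that n_shared ends up equal (or close) to n_vdj once the names are harmonised.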


saramoein372 commented on September 17, 2024

Hi Kelvin,

I have a question which probably looks basic, but I am stuck and cannot solve it. I am trying to replace the 'BCR_' prefix in vdj.metadata with ''.
I am using the str.replace function, but it looks like the change is not permanent.

When I replace it, the index changes, but after any new operation on "vdj" I see the original indexes again.

Can I ask for your help with which command I should use to replace my index values: for example, replacing "BCR_TTTGGTTGTAGGCATG" with "TTTGGTTGTAGGCATG".

I used vdj.metadata.index = vdj.metadata.index.str.replace('BCR_',''). It works until I call a new function on vdj.

So it looks like my replace command does not work permanently.

Do you have any idea how to replace the index with other characters?


zktuong commented on September 17, 2024

Hi Sara,

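You should change vdj.data['cell_id'] rather than vdj.metadata.index. That should solve your problem.

I would do:

temp = vdj.data.copy()
# e.g. strip the 'BCR_' prefix asked about above; adapt this to your own barcodes
temp['cell_id'] = temp['cell_id'].str.replace('BCR_', '', regex=False)
new_vdj = ddl.Dandelion(temp)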

The reason is the .metadata slot updates from the .data slot with every function call from dandelion.

What I normally do is ensure that my cell barcodes align right at the beginning of the preprocessing, hence the function ddl.pp.format_fastas. If you're not starting from the realignment but from an AIRR table from somewhere else, then you have to alter the cell_id column before loading it into dandelion.


saramoein372 commented on September 17, 2024

Thanks Kelvin for your response. I used singularity for all of the preprocessing steps, and now I need to fix the cell ids in adata and vdj.
Although you kindly wrote the code for me, I am still confused.

Is it vdj['cell_id'] that I should match with adata['cell_id']?

I am confused because my vdj.data.cell_id values are
like "BCR_AAACCTGTCACTGGGC_contig_2 BCR_AAACCTGTCACTGGGC".

My adata.obs.index values are like "AAACCTGAGAACAATC-1".

How should I match the vdj and adata barcodes? I really appreciate your help. Thank you.


zktuong commented on September 17, 2024

I think the easiest solution is to change the index names in adata.obs_names first, then make sure that your vdj.data['cell_id'] matches adata.obs_names.

i think a simple fix would be to do the following:

# now, in your anndata object:
adata.obs_names = ['BCR_' + x.split('-')[0] for x in adata.obs_names]
# then do what i describe above for the vdj.data['cell_id']

If that doesn't fix it, when you run singularity run -B $PWD /path/to/sc-dandelion_latest.sif dandelion-preprocess, don't use the --meta option as the current default behavior is that it will tag with the sample names (folder names) as default, if more than one sample was provided.
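
Stripping the suffix can make two different cells share a name if the object holds more than one sample (for example barcodes ending in -1 and -2), so it may be worth checking for duplicates while renaming. This is a minimal sketch in plain Python. to_vdj_style and rename_checked are illustrative helpers, not dandelion functions, and the 'BCR_' prefix is taken from the barcodes quoted in this thread:

```python
from collections import Counter


def to_vdj_style(barcode, prefix="BCR_"):
    """Drop the '-1' style suffix and add the prefix, unless it is already there."""
    core = barcode.split("-")[0]
    return core if core.startswith(prefix) else prefix + core


def rename_checked(barcodes, prefix="BCR_"):
    """Rename all barcodes and fail loudly if two of them collapse to the same name."""
    renamed = [to_vdj_style(b, prefix) for b in barcodes]
    duplicates = sorted(name for name, n in Counter(renamed).items() if n > 1)
    if duplicates:
        raise ValueError(f"renaming produced duplicate barcodes: {duplicates[:5]}")
    return renamed


print(rename_checked(["AAACCTGAGAACAATC-1", "AAACCTGCAAAGCAAT-1"]))
# ['BCR_AAACCTGAGAACAATC', 'BCR_AAACCTGCAAAGCAAT']

# In a real session:
# adata.obs_names = rename_checked(adata.obs_names)
```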


saramoein372 commented on September 17, 2024

Thank you Kelvin for the great help!
Regarding adata for BCR, the file I currently have (sample_feature_bc_matrix.h5) was obtained from cellranger and includes all cell types, including BCR. Considering I am doing BCR analysis, should I first filter sample_feature_bc_matrix.h5 based on the BCR cell barcodes?


zktuong commented on September 17, 2024

I think you will find that the BCR coverage is not as high as the transcriptome data, so I would keep it all in. If there are no further questions on the original issue, I will close this, as the rest of your problems sound like general python/pandas queries.


saramoein372 commented on September 17, 2024

Thanks. Sure, close it.
