
meteore's People

Contributors

akanksha2511, cameron-jack, edueyras, zakayuen


meteore's Issues

Megalodon example: AssertionError: Alphabet (ACGT) and model number of modified bases (-41) do not agree.

Hi All,
I got the error below when running the following command. Could you suggest how to solve it? I am running on CPU, using the example data provided with METEORE. Thanks,
best.
command:
megalodon data/example/ --outputs mods --reference data/ecoli_k12_mg1655.fasta --mod-motif m CG 0 --write-mods-text --processes 10 --guppy-server-path /cluster/lrcfs/2397405/bin/ont-guppy-cpu-5.0.16/bin/guppy_basecall_server --guppy-params "-d /cluster/lrcfs/2397405/bin/rerio/basecall_models/ --num_callers 10" --guppy-timeout 400 --guppy-config res_dna_r941_min_modbases_5mC_CpG_v001.cfg --overwrite

Error:
[14:47:06] Running Megalodon version 2.3.5
[14:47:06] Loading guppy basecalling backend
[2021-12-02 14:47:08.803475] [0x00002b838f646700] [info] Connecting to server as ''
[2021-12-02 14:47:08.805161] [0x00002b838f646700] [info] Connected to server as ''. Connection id: 7db2bf17-16c4-4ae3-86bd-31537a63d4bf
[14:47:08] Loading reference
******************** WARNING: "mods" output requested, so "per_read_mods" will be added to outputs. ********************
[14:47:09] Loaded model calls canonical alphabet ACGT and modified bases m=5mC (alt to C)
Traceback (most recent call last):
File "/cluster/lrcfs/ftiras/ftiras_env/envs/nanoplot/bin/megalodon", line 11, in <module>
sys.exit(_main())
File "/cluster/lrcfs/ftiras/ftiras_env/envs/nanoplot/lib/python3.8/site-packages/megalodon/__main__.py", line 726, in _main
megalodon._main(args)
File "/cluster/lrcfs/ftiras/ftiras_env/envs/nanoplot/lib/python3.8/site-packages/megalodon/megalodon.py", line 1775, in _main
args, mods_info = parse_mod_args(
File "/cluster/lrcfs/ftiras/ftiras_env/envs/nanoplot/lib/python3.8/site-packages/megalodon/megalodon.py", line 1448, in parse_mod_args
mods_info = mods.ModInfo(
File "/cluster/lrcfs/ftiras/ftiras_env/envs/nanoplot/lib/python3.8/site-packages/megalodon/mods.py", line 2211, in __init__
assert (
AssertionError: Alphabet (ACGT) and model number of modified bases (-41) do not agree.

Stuck in nanopolish example

(meteore_nanopolish_env) [poultrylab1@pbsnode01 METEORE]$ snakemake -s Nanopolish nanopolish_results/example_nanopolish-freq-perCG.tsv --cores 10
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 10
Rules claiming more threads will be scaled down.
Conda environments: ignored
Job counts:
        count   jobs
        1       calculate_frequency
        1       call_methylation
        1       index
        1       minimap2
        1       samtools_index
        1       samtools_sort
        1       split_cpgs
        7

[Tue Jul 13 15:39:21 2021]
rule minimap2:
    input: data/ecoli_k12_mg1655.fasta, example.fastq
    output: nanopolish_results/mapped/example.bam
    log: nanopolish_results/mapped/example.log
    jobid: 6
    wildcards: sample=example
    threads: 6

[Tue Jul 13 15:39:21 2021]
rule index:
    input: data/example, example.fastq
    output: example.fastq.index, example.fastq.index.fai, example.fastq.index.gzi, example.fastq.index.readdb
    jobid: 5
    wildcards: sample=example

[readdb] indexing data/example
[readdb] num reads: 50, num reads with path to fast5: 50
[Tue Jul 13 15:39:21 2021]
Finished job 5.
1 of 7 steps (14%) done
[Tue Jul 13 15:43:34 2021]
Error in rule minimap2:
    jobid: 6
    output: nanopolish_results/mapped/example.bam
    log: nanopolish_results/mapped/example.log (check log file(s) for error message)

RuleException:
RemoteDisconnected in line 25 of /storage-04/chicken/ont_methylation/METEORE/Nanopolish:
Remote end closed connection without response
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 2208, in run_wrapper
  File "/storage-04/chicken/ont_methylation/METEORE/Nanopolish", line 25, in __rule_minimap2
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/urllib/request.py", line 222, in urlopen
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/urllib/request.py", line 525, in open
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/urllib/request.py", line 542, in _open
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/urllib/request.py", line 502, in _call_chain
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/urllib/request.py", line 1393, in https_open
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/urllib/request.py", line 1354, in do_open
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/http/client.py", line 1347, in getresponse
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/http/client.py", line 307, in begin
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/http/client.py", line 276, in _read_status
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 551, in _callback
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/concurrent/futures/thread.py", line 57, in run
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 537, in cached_or_run
  File "/storage-01/poultrylab1/yin/software/anaconda3/envs/mamba/envs/meteore_nanopolish_env/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 2239, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /storage-04/chicken/ont_methylation/METEORE/.snakemake/log/2021-07-13T153919.412649.snakemake.log

My snakemake version is 5.26.1.

Problems running the Nanopolish example: libcrypto.so.1.0.0:

Hi, @zakayuen,

I ran into a problem at the samtools_sort step after running the following command from your Nanopolish example:

snakemake -s Nanopolish nanopolish_results/example_nanopolish-freq-perCG.tsv --cores all

Installation itself seems to have worked fine; I tried both the .yml file and installation using mamba.

(meteore_nanopolish_env) nicolas@nicolas-Precision-7820-Tower:~/Programs/METEORE$ snakemake -s Nanopolish nanopolish_results/example_nanopolish-freq-perCG.tsv --cores all
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 40
Rules claiming more threads will be scaled down.
Conda environments: ignored
Job counts:
        count   jobs
        1       calculate_frequency
        1       call_methylation
        1       index
        1       minimap2
        1       samtools_index
        1       samtools_sort
        1       split_cpgs
        7

[Thu Sep 16 10:55:20 2021]
rule minimap2:
input: data/ecoli_k12_mg1655.fasta, example.fastq
output: nanopolish_results/mapped/example.bam
log: nanopolish_results/mapped/example.log
jobid: 6
wildcards: sample=example
threads: 6

[Thu Sep 16 10:55:20 2021]
rule index:
input: data/example, example.fastq
output: example.fastq.index, example.fastq.index.fai, example.fastq.index.gzi, example.fastq.index.readdb
jobid: 5
wildcards: sample=example

[readdb] indexing data/example
[readdb] num reads: 50, num reads with path to fast5: 50
[Thu Sep 16 10:55:20 2021]
Finished job 5.
1 of 7 steps (14%) done
[Thu Sep 16 10:55:21 2021]
Finished job 6.
2 of 7 steps (29%) done

[Thu Sep 16 10:55:21 2021]
rule samtools_sort:
input: nanopolish_results/mapped/example.bam
output: nanopolish_results/mapped/example.sorted.bam
jobid: 3
wildcards: sample=example

samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
[Thu Sep 16 10:55:21 2021]
Error in rule samtools_sort:
jobid: 3
output: nanopolish_results/mapped/example.sorted.bam
shell:
samtools sort -o nanopolish_results/mapped/example.sorted.bam nanopolish_results/mapped/example.bam
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/nicolas/Programs/METEORE/.snakemake/log/2021-09-16T105520.230639.snakemake.log

Best,
Nicolas

Issue with downloading E.coli Nanopore data

Hi,
I downloaded the E. coli nanopore data required to generate dataset1 from the European Nucleotide Archive (ENA) under accession PRJEB13021 (ERR1676719 for the negative control and ERR1676720 for the positive control) using wget. According to the wget logs, the files were fully downloaded.

wget commands :
wget -c --tries=0 ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR167/ERR1676720/ecoli_er2925.pcr_MSssI.r9.timp.061716.tar.gz
wget -c --tries=0 ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR167/ERR1676719/ecoli_er2925.pcr.r9.timp.061716.tar.gz

But when decompressing the downloaded files with tar, I get the error 'gzip: stdin: invalid compressed data--format violated' and the extraction aborts.

tar commands :
tar xvzf ecoli_er2925.pcr.r9.timp.061716.tar.gz
tar xvzf ecoli_er2925.pcr_MSssI.r9.timp.061716.tar.gz

This happened with both files, the negative as well as the positive control.

Also, the md5 values of the original and the downloaded files differ, confirming a problem with the downloaded file.
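The md5 comparison mentioned above can be scripted. A minimal Python sketch (the expected checksum is a placeholder; the real value would come from the md5 ENA publishes for each run):

```python
import hashlib

def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """MD5 hex digest of a file, read in chunks so large tarballs
    don't have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical usage -- "<expected-md5>" is a placeholder for the
# checksum published by ENA for the run:
# assert md5sum("ecoli_er2925.pcr.r9.timp.061716.tar.gz") == "<expected-md5>"
```

If the digests disagree, re-downloading (or resuming with wget -c) is the only fix; tar cannot recover a corrupted gzip stream.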

Is there a better way to download these files? Or, if possible, could you provide the fast5 files from dataset1 that you used for your analysis?

Thank you,
Regards,
Onkar

combination_model_train.py doesn't work!

Installing the requirement.txt with pip works fine, but the script does not:

python combination_model_train.py -d example_results/deepsignal_results/example_deepsignal-perRead-score.tsv -n example_results/nanopolish_results/example_nanopolish-perRead-score.tsv -g example_results/guppy_results/example_guppy-perRead-score.tsv -c 3 -o output_models
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:30: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
method='lar', copy_X=True, eps=np.finfo(np.float).eps,
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:167: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
method='lar', copy_X=True, eps=np.finfo(np.float).eps,
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:284: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
eps=np.finfo(np.float).eps, copy_Gram=True, verbose=0,
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:862: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
eps=np.finfo(np.float).eps, copy_X=True, fit_path=True,
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:1101: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
eps=np.finfo(np.float).eps, copy_X=True, fit_path=True,
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:1127: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
eps=np.finfo(np.float).eps, positive=False):
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:1362: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
max_n_alphas=1000, n_jobs=None, eps=np.finfo(np.float).eps,
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:1602: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
max_n_alphas=1000, n_jobs=None, eps=np.finfo(np.float).eps,
/home/miniconda3/lib/python3.8/site-packages/sklearn/linear_model/least_angle.py:1738: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
eps=np.finfo(np.float).eps, copy_X=True, positive=False):
/home/miniconda3/lib/python3.8/site-packages/sklearn/decomposition/online_lda.py:29: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
EPS = np.finfo(np.float).eps
/home/miniconda3/lib/python3.8/site-packages/sklearn/ensemble/gradient_boosting.py:32: DeprecationWarning: np.bool is a deprecated alias for the builtin bool. To silence this warning, use bool by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.bool_ here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
from ._gradient_boosting import predict_stages
/home/miniconda3/lib/python3.8/site-packages/sklearn/ensemble/gradient_boosting.py:32: DeprecationWarning: np.bool is a deprecated alias for the builtin bool. To silence this warning, use bool by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.bool_ here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
from ._gradient_boosting import predict_stages
combination_model_train.py:54: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
data_file["Pos"][mask]=data_file["Pos"][mask]-1
Traceback (most recent call last):
File "combination_model_train.py", line 96, in <module>
X,y=combine_methods(val)
File "combination_model_train.py", line 64, in combine_methods
combine_file=reduce(lambda left,right: pd.merge(left, right, how='inner',on=["ID","Pos","Label"]), dfs)
File "combination_model_train.py", line 64, in <lambda>
combine_file=reduce(lambda left,right: pd.merge(left, right, how='inner',on=["ID","Pos","Label"]), dfs)
File "/home/miniconda3/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 74, in merge
op = _MergeOperation(
File "/home/miniconda3/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 668, in __init__
) = self._get_merge_keys()
File "/home/miniconda3/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 1033, in _get_merge_keys
right_keys.append(right._get_label_or_level_values(rk))
File "/home/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 1684, in _get_label_or_level_values
raise KeyError(key)
KeyError: 'Label'

The output folder is not created. Any ideas, please? I would really like to test this pipeline on real data, but I am stuck on your "toy" example :-(
The other script, combination_model_prediction.py, fails in the same way.

Thanks for your time,
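For reference, the failing call in the traceback is the inner merge on ["ID", "Pos", "Label"]. A sketch that validates the inputs before merging (column names taken from the traceback; another issue in this list notes that the per-read score files from the example workflow do not carry a Label column, which would have to be added as the 0/1 ground truth before training):

```python
from functools import reduce

import pandas as pd

REQUIRED = ["ID", "Pos", "Label"]  # merge keys from the traceback above

def check_and_merge(dfs):
    """Inner-join the per-tool score tables, but fail with a clear
    message if any input lacks one of the merge keys (e.g. 'Label')."""
    for i, df in enumerate(dfs):
        missing = [c for c in REQUIRED if c not in df.columns]
        if missing:
            raise ValueError(f"input table {i} is missing columns: {missing}")
    return reduce(
        lambda left, right: pd.merge(left, right, how="inner", on=REQUIRED),
        dfs,
    )
```

As a side note, the SettingWithCopyWarning earlier in the log comes from the chained indexing data_file["Pos"][mask] = ...; writing it as data_file.loc[mask, "Pos"] -= 1 avoids the warning.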

weighted vs. unweighted average

Hi,

I have a question on how you calculate the mean after accumulating the CpG sites, for example in the file run_megalodon.R:
df <- df[,list(Methyl_freq = mean(Methyl_freq), Cov = sum(Cov)), list(Chr,Pos)]
Can you explain why you use the plain mean, without weighting by the number of reads supporting each methylation frequency? I would rather use the weighted mean (giving more weight to a methylation frequency supported by higher coverage), as in the following example:
df <- df[,list(Methyl_freq = sum(Methyl_freq*Cov)/sum(Cov), Cov = sum(Cov)), list(Chr,Pos)]
Or am I missing some reason for which it is better to use the unweighted mean?
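In pandas terms, the two aggregations compare as follows (a sketch; column names follow the R code above):

```python
import pandas as pd

def site_frequencies(df: pd.DataFrame, weighted: bool = True) -> pd.DataFrame:
    """Collapse per-record methylation calls to one row per (Chr, Pos).

    weighted=True is the coverage-weighted mean proposed above,
    sum(Methyl_freq * Cov) / sum(Cov); weighted=False is the plain
    mean currently used in run_megalodon.R."""
    tmp = df.assign(wsum=df["Methyl_freq"] * df["Cov"])
    out = tmp.groupby(["Chr", "Pos"], as_index=False).agg(
        Cov=("Cov", "sum"),
        wsum=("wsum", "sum"),
        plain=("Methyl_freq", "mean"),
    )
    out["Methyl_freq"] = out["wsum"] / out["Cov"] if weighted else out["plain"]
    return out[["Chr", "Pos", "Methyl_freq", "Cov"]]
```

With two records at one site (freq 1.0 at coverage 9, freq 0.0 at coverage 1), the weighted mean gives 0.9 while the plain mean gives 0.5, which illustrates the difference the question is about.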

[BUG] Incorrect score in script/format_megalodon.R

Hi, thank you for this great tool!

I think I noticed a tiny mistake in the format_megalodon.R script.

In line 15, the df is reduced to six columns. In line 16, a new column "Score" is introduced, making a 7th column.
However, line 17 keeps columns 1, 2, 3, 4, and 6, and line 18 renames that sixth column to "Score" while it is in fact "can_log_prob":

df <- df %>% select(1,2,4,3,5,6)
df$Score <- df$mod_log_prob - df$can_log_prob
df <- df %>% select(1,2,3,4,6)
colnames(df) <- c("ID", "Chr", "Pos", "Strand", "Score")

This therefore just renames can_log_prob to "Score" instead of using the calculated score, right?
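A pandas sketch of the fix the issue suggests: compute the score, then select the output columns by name rather than by position, so this kind of off-by-one cannot happen (column names taken from the R script quoted above):

```python
import pandas as pd

def add_score(df: pd.DataFrame) -> pd.DataFrame:
    """Compute Score = mod_log_prob - can_log_prob and keep columns
    by NAME, avoiding the positional slip that retains can_log_prob
    instead of the freshly appended Score column."""
    out = df.copy()
    out["Score"] = out["mod_log_prob"] - out["can_log_prob"]
    return out[["ID", "Chr", "Pos", "Strand", "Score"]]
```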

combination model

What is the benefit of training one's own combination model? What is the difference or advantage compared with your combination model, which is already trained?
And the same question for the default model versus the optimized model for the combination. I searched your documentation but could not find an explanation.

Thanks,

Issue with downloading E.coli Nanopore data

Hi, Professor!
I downloaded the E. coli nanopore data from the European Nucleotide Archive (ENA) under accession PRJEB13021 (ERR1676719 for the negative control and ERR1676720 for the positive control) using Xunlei. The files were fully downloaded.
But when decompressing the downloaded files with tar, I get errors like 'gzip: stdin: invalid compressed data--format violated', 'tar: Skipping to next header', and 'tar: Exiting with failure status due to previous errors'.
tar commands :
tar -xvzf ecoli_er2925.pcr.r9.timp.061716.tar.gz
tar -xvzf ecoli_er2925.pcr_MSssI.r9.timp.061716.tar.gz

This happened with both files, the negative as well as the positive control.

Is there a better way to download these files? Or can you give me some suggestions to solve this?

combination_model_train.py Label and Chr fields

Hi,

Just wanted to ask whether combination_model_train.py could be updated to include Chr. Also, when I tried to run it I got errors related to a Label field. The readme says training can be done on the per-read and per-site output from the tools, but that output doesn't include Label, so I'm confused about what I should be using as input. I'm working on plants and would like to see how the results compare if I train with random forest versus using the provided models.

Thanks!

Error in rule calculate_frequency some suggestions

Hi,

I encountered an error in calculate_frequency with R 4.0 when running the demo, specifically a missing-value error on the data frame. Two small edits to the run_nanopolish.R script made the demo run for me. First, read the input as df <- read.table(args[1], header = TRUE, sep="\t", stringsAsFactors = TRUE); without stringsAsFactors = TRUE (no longer the default since R 4.0), the conversions df$Chr <- as.character(levels(df$Chr))[df$Chr] and df$Strand <- as.character(levels(df$Strand))[df$Strand] return missing values for those columns. Second, the calls df_m <- count(df[df$Log.like.ratio > 0, ], c("Chr", "Pos", "Strand")) and df_unm <- count(df[df$Log.like.ratio < 0, ], c("Chr", "Pos", "Strand")) need the namespace-qualified plyr::count() to pick up the correct count function.

Thanks,

module loading error - combination_model_prediction.py

I wonder if this error is related to the Python or scikit-learn version with which the saved models were created.

$python3 combination_model_prediction.py -i files_in -m default -o results/combined_default_model.tsv

combination_model_prediction.py:36: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
data_file["Pos"][mask]=data_file["Pos"][mask]-1
Traceback (most recent call last):
File "combination_model_prediction.py", line 77, in <module>
main(mp,combine_file)
File "combination_model_prediction.py", line 44, in main
loaded_model = joblib.load(open(mp, 'rb'))
File "/usr/local/lib/python3.8/dist-packages/joblib/numpy_pickle.py", line 575, in load
obj = _unpickle(fobj)
File "/usr/local/lib/python3.8/dist-packages/joblib/numpy_pickle.py", line 504, in _unpickle
obj = unpickler.load()
File "/usr/lib/python3.8/pickle.py", line 1210, in load
dispatch[key[0]](self)
File "/usr/lib/python3.8/pickle.py", line 1526, in load_global
klass = self.find_class(module, name)
File "/usr/lib/python3.8/pickle.py", line 1577, in find_class
__import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.ensemble.forest'
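This looks like the classic pickle/module-layout mismatch: the model was serialized with an older scikit-learn, where the class lived under sklearn.ensemble.forest, while newer releases moved those modules to private names (sklearn.ensemble._forest), so unpickling fails. Pinning scikit-learn to the version the models were saved with is the usual fix; a sketch that at least turns the failure into a readable hint (using the stdlib pickle for illustration -- joblib.load in the script hits the same unpickling code path):

```python
import pickle

def load_model(path: str):
    """Load a pickled model, converting the module-rename failure
    into an actionable error message."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except ModuleNotFoundError as err:
        raise RuntimeError(
            f"{path} was pickled under a different scikit-learn module "
            "layout; install the scikit-learn version the models were "
            "saved with (see the repository's requirement file)"
        ) from err
```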

tombo snakemake

Hi,

I have some problem with tombo.
I did:

Create an environment with Snakemake installed

mamba create -c conda-forge -c bioconda -n meteore_tombo_env snakemake

Activate

conda activate meteore_tombo_env

Install all required packages using pip

pip install ont-tombo wiggelen ont-fast5-api

but pip install ont-tombo gives many errors; the last one is:
ERROR: Command errored out with exit status 1: /home/delphine.naquin/miniconda3/envs/meteore_tombo_env/bin/python3.9 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-yhtty2py/mappy_7e9f9ec6bb6b4fac86f9c74eea7e444c/setup.py'"'"'; file='"'"'/tmp/pip-install-yhtty2py/mappy_7e9f9ec6bb6b4fac86f9c74eea7e444c/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-zx4rt7nt/install-record.txt --single-version-externally-managed --compile --install-headers /home/xx/miniconda3/envs/meteore_tombo_env/include/python3.9/mappy Check the logs for full command output.

And if I launch snakemake -s Tombo tombo_results/example_tombo-freq-perCG.tsv --cores all,
of course I obtain:
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 40
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       combine_freq_and_cov
        1       detect_modification
        1       output_browser_files
        1       wig_to_tsv_for_cov
        1       wig_to_tsv_for_mod
        5
Select jobs to execute...

[Wed Feb 17 14:10:47 2021]
rule detect_modification:
input: data/example, data/ecoli_k12_mg1655.fasta
output: example.CpG.tombo.stats, example.CpG.tombo.per_read_stats
jobid: 3
wildcards: sample=example

/bin/bash: tombo: command not found
...

Can you help me, please? It works for the Nanopolish and DeepSignal snakemake workflows...
Thanks,

mamba env create guppy doesn't work

I have a problem with this command; can you help me again?

mamba env create -f guppy.yml

Encountered problems while solving.
Problem: package r-plyr-1.8.6-r36h0357c0b_1 requires r-base >=3.6,<3.7.0a0, but none of the providers can be installed
Problem: package r-dplyr-1.0.2-r36h0357c0b_0 requires r-base >=3.6,<3.7.0a0, but none of the providers can be installed
Problem: package r-data.table-1.13.0-r36h0eb13af_0 requires r-base >=3.6,<3.7.0a0, but none of the providers can be installed

Thanks

per-site prediction from combined model format

Hello,

Is there an easy way to convert the per-site prediction from the combined model to a bedgraph? I am a bit lost since the positions are over the entire genome rather than chromosomal coordinates.
Thank you.
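If the combined output really does report cumulative genome-wide offsets (as the question suggests), mapping back to chromosomal coordinates only requires the chromosomes in the same order used to flatten the positions. A sketch (chromosome names and lengths here are hypothetical, and the flattening order is an assumption):

```python
import bisect

def to_chrom_coords(offset, chrom_sizes):
    """Map a 0-based genome-wide offset back to (chromosome, position),
    given an ordered list of (name, length) pairs matching the order
    used when the coordinates were flattened."""
    starts, names, total = [], [], 0
    for name, length in chrom_sizes:
        starts.append(total)
        names.append(name)
        total += length
    if not 0 <= offset < total:
        raise ValueError(f"offset {offset} outside genome of size {total}")
    i = bisect.bisect_right(starts, offset) - 1
    return names[i], offset - starts[i]
```

From (chromosome, position) it is then straightforward to emit bedGraph lines of the form "chrom start end value".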

How to use `Combined model (multiple linear regression)`

Hi, @zakayuen,

After reading your article, it seems REG is the best combined model, so I chose to run it to combine the example results. But I can't understand the doc at https://github.com/comprna/METEORE#combined-model-multiple-linear-regression-usage.

The example usage below would train the REG model on Nanopolish and DeepSignal outputs from the "mix1" data set, and apply this model to the data in "mix2" for the same tools. Inputs and outputs rely on the ordering of arguments to remain consistent.

python meteore_reg.py --train set1_nanopolish.tsv set1_deepsignal.tsv \
      --tool_names nanopolish deepsignal --train_desc mix1 --test_desc mix2 \
      --test set2_nanopolish.tsv set2_deepsignal.tsv --testIsTraining \
      --trunc_min -50 -5 --trunc_max 50 5 --filter 0.1

Does that mean I have to train a new model based on ground-truth data? Or can I use the training data already generated by your team?

What is the format for --train and --test? May I use example_nanopolish-perRead-score.tsv and example_deepsignal-perRead-score.tsv as --train and --test?

Error in Nanopolish Snakemake

Hi,

While trying to run the Nanopolish snakemake with the example data I get the following error:

RuleException:
WorkflowError in line 25 of /METEORE/Nanopolish:
HTTPError: HTTP Error 404: Not Found
File "/METEORE/Nanopolish", line 25, in __rule_minimap2
File "/services/tools/anaconda3/4.4.0/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

The problem is with the wrapper: "0.66.0/bio/minimap2/aligner"

I am using the 0.14.0. version of Nanopolish.

How could I solve this?

Thank you in advance

Nanopolish snakemake pipeline installation error

Hi,
I followed the instructions for installing Miniconda3 on Linux, then for installing the Nanopolish snakemake pipeline. But at the last instruction:
mamba install -c bioconda nanopolish samtools r-data.table r-dplyr r-plyr
I get this error:

Looking for: ['nanopolish', 'samtools', 'r-data.table', 'r-dplyr', 'r-plyr', 'numpy']

bioconda/linux-64        Using cache
bioconda/noarch          Using cache
pkgs/r/noarch            [====================] (00m:00s) No change
pkgs/main/noarch         [====================] (00m:00s) No change
pkgs/r/linux-64          [====================] (00m:00s) No change
pkgs/main/linux-64       [====================] (00m:00s) No change
Encountered problems while solving.
Problem: nothing provides numpy 1.10* needed by biopython-1.66-np110py27_0

Do you know how to solve this?

Thank you!

Fail with Nanopolish pipeline

Hi, I had already run the Nanopolish pipeline successfully. Today I wanted to try it again, after removing the output files from the last test, but it failed at the minimap2 step:

(meteore_nanopolish_env) root@iZbp1g75dkmfozg1lmq072Z:~/lyl/METEORE# snakemake -s Nanopolish nanopolish_results/example_nanopolish-freq-perCG.tsv --cores all
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Conda environments: ignored
Job counts:
        count   jobs
        1       calculate_frequency
        1       call_methylation
        1       index
        1       minimap2
        1       samtools_index
        1       samtools_sort
        1       split_cpgs
        7
Select jobs to execute...

[Mon Dec 14 19:33:14 2020]
rule index:
input: data/example, example.fastq
output: example.fastq.index, example.fastq.index.fai, example.fastq.index.gzi, example.fastq.index.readdb
jobid: 6
wildcards: sample=example

[readdb] indexing data/example
[readdb] num reads: 50, num reads with path to fast5: 50
[Mon Dec 14 19:33:14 2020]
Finished job 6.
1 of 7 steps (14%) done
Select jobs to execute...

[Mon Dec 14 19:33:14 2020]
rule minimap2:
input: data/ecoli_k12_mg1655.fasta, example.fastq
output: nanopolish_results/mapped/example.bam
log: nanopolish_results/mapped/example.log
jobid: 4
wildcards: sample=example

[Mon Dec 14 19:33:23 2020]
Error in rule minimap2:
jobid: 4
output: nanopolish_results/mapped/example.bam
log: nanopolish_results/mapped/example.log (check log file(s) for error message)

RuleException:
URLError in line 25 of /root/lyl/METEORE/Nanopolish:
<urlopen error [Errno 111] Connection refused>
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2318, in run_wrapper
File "/root/lyl/METEORE/Nanopolish", line 25, in __rule_minimap2
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 214, in urlopen
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 523, in open
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 632, in http_response
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 555, in error
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 494, in _call_chain
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 747, in http_error_302
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 517, in open
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 534, in _open
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 494, in _call_chain
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 1385, in https_open
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/urllib/request.py", line 1345, in do_open
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 560, in _callback
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/concurrent/futures/thread.py", line 52, in run
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 546, in cached_or_run
File "/root/anaconda3/envs/meteore_nanopolish_env/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2349, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /root/lyl/METEORE/.snakemake/log/2020-12-14T193314.025215.snakemake.log
`
I have no idea how to deal with this. Thanks in advance!!
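
The traceback points at a urlopen call inside the Snakefile's minimap2 rule, so the node most likely cannot reach the network (firewall or proxy). One hedged workaround, assuming a standard minimap2 + samtools pipeline matching the rule's declared inputs and outputs, is to produce the BAM manually and rerun snakemake so the completed rule is skipped:

```shell
# Manual stand-in for the failing rule (command reconstructed from the rule's
# inputs/outputs shown in the log; the flags are assumptions, not METEORE's
# exact recipe)
minimap2 -ax map-ont data/ecoli_k12_mg1655.fasta example.fastq \
    | samtools sort -o nanopolish_results/mapped/example.bam
samtools index nanopolish_results/mapped/example.bam
```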

Can Nanopolish snakemake call cpggpc pattern?

Hi, I noticed that Nanopolish snakemake code.
snakemake -s Nanopolish nanopolish_results/example_nanopolish-freq-perCG.tsv --cores all
Can I choose the pattern (CpG/GpC/cpggpc) in this Snakemake workflow, or does it only work for calling CpG?

Best wishes,
Kirtio
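
For reference, the METEORE Snakemake rule invokes nanopolish with the CpG model only; nanopolish itself selects the motif through its -q/--methylation option, so other motifs can be called outside the Snakefile, assuming your nanopolish build ships the corresponding model (the joint cpggpc model historically required a dedicated nanopolish branch):

```shell
# Hedged sketch: call GpC methylation directly with nanopolish (file names are
# taken from the example workflow; the -q value must name a model your
# nanopolish build actually provides)
nanopolish call-methylation -t 8 -q gpc \
    -r example.fastq \
    -b nanopolish_results/mapped/example.bam \
    -g data/ecoli_k12_mg1655.fasta > nanopolish_results/example_gpc_methylation.tsv
```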

Problem running DeepMod on example data!!!

Hi,
I am running DeepMod-0.1.3 on the example data provided, but it shows errors and gives no output.
Below is the log file from the DeepMod run:

Nanopore sequencing data analysis is resourece-intensive and time consuming.
Some potential strong recommendations are below:
If your reference genome is large as human genome and your Nanopore data is huge,
It would be faster to run this program parallelly to speed up.
You might run different input folders of your fast5 files and
give different output names (--FileID) or folders (--outFolder)
A good way for this is to run different chromosome individually.

         Current directory: example_data/METEORE-1.0.0/data
                  outLevel: 2
                   wrkBase: example
                    FileID: ecoli_deepmod
                 outFolder: ecoli_deepmod/
                 recursive: 1
          files_per_thread: 1000
                   threads: 15
                windowsize: 21
                  alignStr: minimap2
               SignalGroup: simple
                      move: False
               basecall_1d: Basecall_1D_000
          basecall_2strand: BaseCalled_template
                    ConUnk: True
               outputlayer: 
                      Base: C
               mod_cluster: 0
                   predDet: 1
                       Ref: ecoli_k12_mg1655.fasta
                      fnum: 7
                    hidden: 100
                   modfile: DeepMod-0.1.3/train_deepmod/rnn_sinmodC_P100wd21_f7ne1u0_4/mod_train_sinmodC_P100wd21_f3ne1u0
                    region: [[None, None, None]]

Instructions for updating:
Use standard file APIs to check for files with this prefix.
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch114_read6906_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch294_read732_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch177_read479_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch279_read13169_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch354_read2006_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch230_read20744_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch172_read1102_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch158_read435_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch262_read814_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch330_read2145_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch324_read570_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch217_read1495_strand1.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch258_read2664_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch294_read356_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch118_read2334_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch412_read8784_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch269_read2204_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch433_read26717_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch118_read526_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch381_read12341_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch298_read2348_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch88_read4943_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch252_read2417_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch366_read1770_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch159_read630_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch324_read1332_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch313_read2657_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch241_read2385_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch434_read696_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch139_read4507_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch175_read80_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch436_read2992_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch311_read1440_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch430_read511_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch422_read940_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch375_read1018_strand1.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch270_read545_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch193_read10923_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch431_read800_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch224_read1419_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch409_read716_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch418_read3292_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch138_read698_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch228_read1212_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch162_read1065_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch142_read2358_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch291_read161080_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch443_read162_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch375_read1189_strand.fast5
Error!!! No events data in example/kelvin_20160617_FN_MN17519_sequencing_run_sample_id_74930_ch205_read3886_strand.fast5
[M::mm_idx_gen::0.128*1.18] collected minimizers
[M::mm_idx_gen::0.159*1.52] sorted minimizers
[M::main::0.159*1.52] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.181*1.46] mid_occ = 12
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.190*1.44] distinct minimizers: 838542 (98.18% are singletons); average occurrences: 1.034; average spacing: 5.352
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax map-ont ecoli_k12_mg1655.fasta /tmp/tmp19jlkpxt.fa
[M::main] Real time: 0.194 sec; CPU: 0.277 sec; Peak RSS: 0.382 GB
Cur Prediction consuming time 1 for 0 0
Error information for different fast5 files:
No events data 50
Per-read Prediction consuming time 8
Find: ecoli_deepmod//ecoli_deepmod 0 rnn.pred.ind
[]
Genomic-position Detection consuming time 0

I checked the fast5 files using the h5ls -r command. The fastq and event datasets can be found in the /Analyses/Basecall_1D_000/BaseCalled_template/ group, but DeepMod is not able to find them. Can you comment on what is happening here? Is this a fast5 file problem, or am I missing some step?

Thank you

Regards
Onkar
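
The "No events data" message means DeepMod found no per-base Events table under the basecall group it is reading. Newer Guppy versions stopped writing Events tables and record a Move table instead, which is consistent with the "move: False" line in the log above. A hedged sketch, assuming your DeepMod build supports move tables (confirm the exact flag with DeepMod.py detect --help) and reusing the paths from the log:

```shell
# Tell DeepMod to use the Guppy Move table instead of the (absent) Events table;
# the --move flag name mirrors the "move" option shown in DeepMod's own log and
# is an assumption about this DeepMod version
python DeepMod/bin/DeepMod.py detect \
    --wrkBase example --FileID ecoli_deepmod --outFolder ecoli_deepmod \
    --Ref ecoli_k12_mg1655.fasta \
    --modfile DeepMod-0.1.3/train_deepmod/rnn_sinmodC_P100wd21_f7ne1u0_4/mod_train_sinmodC_P100wd21_f3ne1u0 \
    --move True
```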

Guppy snakemake rocksdb error

Hi, first of all, thanks for sharing your work!

I'm trying to run the snakemake guppy example without success.

This is the command I executed and I'm getting an error related to rocksdb:
snakemake -s Guppy guppy_results/example_guppy-freq-perCG.tsv

ImportError: /opt/anaconda/envs/guppy_cpg_snakemake/lib/python3.6/site-packages/rocksdb/_rocksdb.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZNK7rocksdb24AssociativeMergeOperator12PartialMergeERKNS_5SliceES3_S3_PNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPNS_6LoggerE

I saw a similar comment in another issue #5, so I installed different versions from source (5.3.6, 5.11.3, 7.3.1, 7.4.0), but even so I wasn't lucky. Maybe the problem is here, in how I am installing rocksdb.

This is the message when I delete the rocksdb dependency in the conda env:

Traceback (most recent call last):
  File "guppy_results/gcf52ref/scripts/extract_methylation_fast5.py", line 180, in <module>
    mdb = MethylDB(args.mod_data)
  File "guppy_results/gcf52ref/scripts/extract_methylation_fast5.py", line 39, in __init__
    import rocksdb
  File "/opt/anaconda/envs/guppy_cpg_snakemake/lib/python3.6/site-packages/rocksdb/__init__.py", line 1, in <module>
    from ._rocksdb import *
ImportError: librocksdb.so.5.3: cannot open shared object file: No such file or directory

Error in rule samtools_index_and_extract_methylation_from_fast5:
    jobid: 3
    output: guppy_results/mapped/example.bam.bai

RuleException:
CalledProcessError in line 17 of METEORE/Guppy:
Command ' set -euo pipefail;  samtools index guppy_results/mapped/example.bam && python guppy_results/gcf52ref/scripts/extract_methylation_fast5.py -p 10 guppy_results/workspace/*.fast5 ' returned non-zero exit status 1.
  File "METEORE/Guppy", line 17, in __rule_samtools_index_and_extract_methylation_from_fast5
  File "/opt/anaconda/envs/guppy_cpg_snakemake/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Removing output files of failed job samtools_index_and_extract_methylation_from_fast5 since they might be corrupted:
guppy_results/mapped/example.bam.bai

I tried on two different computers:
Ubuntu 20.04.4 LTS
Guppy CPU Version 6.1.3

Ubuntu 22.04
Guppy GPU Version 6.1.7

I think there is something related to Anaconda and linked libraries. I tried reinstalling everything several times, changing LD_LIBRARY_PATH, and using both the conda package and the source package...

I'm still trying, but any hints or ideas are more than welcome.

Thanks in advance
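
The undefined-symbol import error usually means the python-rocksdb extension was compiled against a different librocksdb ABI than the one the loader finds at runtime, so mixing a pip-built extension with a hand-built system librocksdb is fragile. A hedged fix (the package and channel names are assumptions about what conda-forge provides) is to let conda install a matched pair inside the environment:

```shell
# Replace any pip-built python-rocksdb with a conda-forge build whose
# librocksdb ships alongside it (package/channel names are assumptions)
conda activate guppy_cpg_snakemake
pip uninstall -y python-rocksdb
conda install -c conda-forge python-rocksdb
python -c "import rocksdb; print(rocksdb.__file__)"   # should now import cleanly
```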

Deepsignal1 snakemake file - error with example data

Hello,

Thanks for creating this handy pipeline! I tried running the software with your example data; it ran fine with Nanopolish, but I am encountering an error when I try running Deepsignal1. I created the environment following your instructions and the software installed fine. I also downloaded and extracted the model files into the data directory. However, I get an error:

`
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 calculate_frequency
1 call_modification
2

[Wed May 4 15:38:22 2022]
rule call_modification:
input: deepsignal_results/example_deepsignal-feature.tsv
output: deepsignal_results/example_deepsignal-prob.tsv
jobid: 1
wildcards: sample=example

Traceback (most recent call last):
  File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/bin/deepsignal", line 8, in <module>
    sys.exit(main())
  File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/site-packages/deepsignal/deepsignal.py", line 424, in main
    args.func(args)
  File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/site-packages/deepsignal/deepsignal.py", line 46, in main_call_mods
    from .call_modifications import call_mods
  File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/site-packages/deepsignal/call_modifications.py", line 10, in <module>
    import tensorflow as tf
  File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
  File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 52, in <module>
    from tensorflow.core.framework.graph_pb2 import *
  File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/site-packages/tensorflow/core/framework/graph_pb2.py", line 6, in <module>
    from google.protobuf import descriptor as _descriptor
  File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/site-packages/google/protobuf/descriptor.py", line 47, in <module>
    from google.protobuf.pyext import _message
ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/site-packages/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so)
[Wed May 4 15:38:23 2022]
Error in rule call_modification:
jobid: 1
output: deepsignal_results/example_deepsignal-prob.tsv

RuleException:
CalledProcessError in line 16 of /mnt/lustre/groups/biol-ralsto-2019/SOFTWARE/METEORE/Deepsignal1:
Command ' set -euo pipefail; deepsignal call_mods --input_path deepsignal_results/example_deepsignal-feature.tsv --is_gpu no --nproc 10 --model_path data/model.CpG.R9.4_1D.human_hx1.bn17.sn360.v0.1.7+/bn_17.sn_360.epoch_9.ckpt --result_file deepsignal_results/example_deepsignal-prob.tsv ' returned non-zero exit status 1.
File "/mnt/lustre/groups/biol-ralsto-2019/SOFTWARE/METEORE/Deepsignal1", line 16, in __rule_call_modification
File "/users/mms565/scratch/conda/envs/meteore_deepsignal_env1/lib/python3.6/concurrent/futures/thread.py", line 55, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/lustre/groups/biol-ralsto-2019/SOFTWARE/METEORE/.snakemake/log/2022-05-04T153822.867032.snakemake.log
`
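
For what it's worth, GLIBCXX_3.4.21 missing from /lib64/libstdc++.so.6 means the system libstdc++ predates the one protobuf's compiled extension was built against. Two hedged workarounds (the conda-forge package name is an assumption, not a METEORE-documented fix):

```shell
# Option 1: ship a newer libstdc++ inside the environment
#   conda install -c conda-forge libstdcxx-ng
# Option 2: make the dynamic loader find the conda env's libstdc++ before
# the system copy (the miniconda3 fallback path is illustrative)
export LD_LIBRARY_PATH="${CONDA_PREFIX:-$HOME/miniconda3}/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"
```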

Module Not Found Error: import in call_modification_frequency.py

I am currently trying to run your 'call_modification_frequency.py' script on deepsignal data I have. This is the error message it produces:

Traceback (most recent call last):
  File "call_modification_frequency.py", line 10, in <module>
    from txt_formater import ModRecord
ModuleNotFoundError: No module named 'txt_formater'

I cannot find any module with this name online.

Thanks in advance!
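
txt_formater is not a PyPI package; txt_formater.py is a helper script that sits next to call_modification_frequency.py in the METEORE checkout, so the import only resolves when that directory is on Python's module search path. A minimal sketch of the mechanism with a stand-in module (the /tmp path and the one-line class are illustrative only):

```shell
# Create a stand-in txt_formater.py and show that PYTHONPATH makes it importable
mkdir -p /tmp/meteore_demo
printf 'class ModRecord:\n    pass\n' > /tmp/meteore_demo/txt_formater.py
PYTHONPATH=/tmp/meteore_demo python3 -c "from txt_formater import ModRecord; print('ok')"
```

Running call_modification_frequency.py from inside the METEORE script directory, or with PYTHONPATH pointing at that directory, should resolve the import the same way.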

segmentation fault while running the combination_model_prediction (megalodon - nanopolish)

Hey,
I only recently found METEORE. I ran Megalodon and Nanopolish separately, but I used the METEORE scripts to generate the perRead-score.tsv files for Megalodon and Nanopolish. The tsv files are big (69G and 31G, respectively). I tried using only the reads called in common by Megalodon and Nanopolish, but it still failed.
Do you think I should split the input files or do you have a better suggestion?

best,
Ligia
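
Splitting is a reasonable first try: a segfault on a 69G input often just means the process exhausted memory. A toy sketch of header-preserving chunking with coreutils (file names and the -l chunk size are illustrative; for the real file raise -l to millions of lines, and if the downstream step groups lines per read, sort by read ID first so one read never straddles two chunks):

```shell
# Build a tiny demo TSV, then split it into header-preserving chunks that can
# be fed to combination_model_prediction one at a time
printf 'ID\tScore\nread1\t0.9\nread1\t0.1\nread2\t0.7\nread3\t0.2\n' > demo-perRead-score.tsv
head -n 1 demo-perRead-score.tsv > header.tsv
tail -n +2 demo-perRead-score.tsv | split -l 2 - chunk_
for f in chunk_*; do
    cat header.tsv "$f" > "part_${f#chunk_}.tsv"
    rm "$f"
done
ls part_*.tsv   # part_aa.tsv and part_ab.tsv, each starting with the header
```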

guppy snakemake rocksdb error

Hi,

Thanks for developing this pipeline.

I encountered error running snakemake -s Guppy guppy_results/example_guppy-freq-perCG.tsv

It seems to be an error from importing rocksdb; it says undefined symbol.

I installed rocksdb from source using cmake and make, then make install INSTALL_PATH=/usr, outside the conda environment.


Another request: would it be possible to indicate the conda version, Python version, R version, and all other prerequisites for each workflow? I basically encounter issues in every one just by following the tutorial, even inside the conda environment. It has been a nightmare to debug for a non-advanced user like me.

Thank you,
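
Two hedged diagnostics before rebuilding yet again: refresh the loader cache after make install, and check which librocksdb the compiled extension actually resolves (the .so glob is an assumption about the wheel layout):

```shell
# Refresh the dynamic loader cache so the freshly installed /usr librocksdb is seen
sudo ldconfig
# Locate the compiled extension and list the librocksdb it links against
pkgdir=$(python -c "import rocksdb, os; print(os.path.dirname(rocksdb.__file__))")
ldd "$pkgdir"/_rocksdb*.so | grep -i rocksdb
```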
