artic-network / artic-ncov2019

ARTIC nanopore protocol for nCoV2019 novel coronavirus

License: Creative Commons Attribution 4.0 International

Shell 1.72% Python 98.28%

artic-ncov2019's Introduction

artic-ncov2019

Initial implementation of an ARTIC bioinformatics platform for nanopore sequencing of nCoV2019 novel coronavirus.

artic-ncov2019's Issues

Oligo length in primer pool

IDT DNA makes primer pools with a minimum oligo size of 40 bases.
The oligos created by Primal Scheme are all shorter than this. Could you suggest a vendor
in the USA who can make this pool?
Thanks
Venkata

Max of 400X

Hi!
We currently use ARTIC bioinformatics workflow to analyze raw nanopore SARS-CoV2 reads. We achieve a maximum of 400x coverage across regions with no overlapping amplicons for each sample. Is it normal to obtain a similar result for each sample?
Is there a limit of depth that we can change in the pipeline?
artic_example.pdf

Thanks for your answer,

Caroline

Medaka VS Nanopolish - workflows comparison

Hi,
I basecalled SP1 data with Guppy v3.5.2 using high-accuracy model, ran artic guppyplex and then ran both Nanopolish-based and Medaka-based workflows for comparing them. I got about 7 differences over the whole genome length in the consensus sequences.
The blast alignment has been split into shorter alignments due to the Ns introduced by the coverage mask, but these are the Identities of the local alignments:

Identities = 17287/17291 (99%), Gaps = 0/17291 (0%)
Identities = 5895/5897 (99%), Gaps = 0/5897 (0%)
Identities = 4953/4954 (99%), Gaps = 0/4954 (0%)
Identities = 520/520 (100%), Gaps = 0/520 (0%)
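
As a quick check of the "about 7 differences" figure, the identity counts from the four local alignments can be summed. This is only a sketch that re-adds the numbers quoted above; it does not parse the BLAST report itself.

```python
# Identity counts copied from the four local alignments quoted above:
# (matching bases, aligned bases)
alignments = [(17287, 17291), (5895, 5897), (4953, 4954), (520, 520)]

matches = sum(m for m, _ in alignments)
aligned = sum(n for _, n in alignments)
mismatches = aligned - matches
identity_pct = 100 * matches / aligned

print(f"{mismatches} mismatches over {aligned} aligned bases "
      f"({identity_pct:.3f}% identity)")
```

The total only covers the aligned stretches, so bases hidden by the coverage mask are not counted.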

And this is the full blast report: Nanopolish_Medaka_comparison_SP1.txt.
Do you have an SP1 reference sequence available (e.g. sequenced with Illumina) to determine which of the two workflows is more accurate? Secondly, I guess the default Medaka model is being used in the medaka workflow (r941_min_high_g351 in medaka 0.12.1); is there a parameter for selecting a different model?

Thanks in advance,
Simone

EDIT: on closer inspection of the differences, I noticed that these are Ns present in one sequence but not in the other, so they are not genuine differences.

Adapters not removed from SP1 reads

Hi,
I tried out artic pipeline v1.0.0 on SP1 test data using the following instructions, after performing basecalling:

artic guppyplex --min-length 400 --max-length 700 --directory basecalling/ --prefix SP1
mv SP1_.fastq SP1.fastq
artic minion --normalise 200 --threads 4 --scheme-directory primer_schemes --read-file SP1.fastq --fast5-directory SP1-fast5-raw --sequencing-summary basecalling/sequencing_summary.txt nCoV-2019/V1 SP1

I was not completely sure about the nCoV-2019/V1 version of the protocol; could you please confirm the specified version is correct? After the pipeline finished running successfully, I loaded the SP1.primertrimmed.rg.sorted.bam file into IGV. I was expecting not to see any adapters or PCR primers, but based on the soft-clipping, it looks like those have not been removed. Is this a bug, or is it because the specified primer scheme does not match the actual one?

Thanks,
Simone

porechop does not install with gcc 4.8.5 envs

My default gcc for my OS is 4.8.5. This matters because when I went to use the yaml to create the conda environment for the artic protocol, artic's version of porechop did not install. I didn't realize that 1) artic was using their specific version of porechop, and 2) that it didn't install because there was no error message.

When I went to install artic's version of porechop in the environment, I came across this error:
error: [Errno 2] No such file or directory: 'porechop/cpp_functions.so'

I now knew that it was a compiling issue (needs >gcc 5) and added this to my environment:

conda install -c omgarcia gcc-6
pip3 install git+https://github.com/artic-network/[email protected]

Additionally, I realized that bcftools=1.9 and seqtk=1.3 didn't install in the conda environment either. I only noticed this when I checked each dependency one by one after finding the porechop issue.

It'd be helpful if there was a version check for each required package.

Artic minion sometimes fails consensus generation step

In one of the samples we're testing at CGR, no consensus.fasta is generated when running 'artic minion'

The alignment stages run fine, but the issue appears to arise during variant calling by medaka. The error seems to stem from a single variant at position 9514 bp (A>G), covered by nCoV-2019_31_LEFT - nCoV-2019_31_RIGHT (9204-9226 → 9557-9585).

This variant is called by both runs of Medaka, and both are subsequently merged into the single merged.vcf file (though not collapsed by position, so it is present twice in the VCF). This would be fine if both copies of the variant passed filtering, but one passes and one fails (A pretty rare event, I suspect!). Therefore, the failed position is masked in the reference during artic_mask, and then when bcftools consensus is run, the command fails with an error stating that the reference allele in the vcf (A) does not match the reference base (now an N).
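
The situation described above, where the same variant appears twice with different filter outcomes, could be detected before masking with a check along these lines. This is a minimal sketch, not code from the pipeline: the record tuples, the `conflicting_sites` helper, and the second example variant are hypothetical, and real input would have to be parsed from the merged VCF first.

```python
from collections import defaultdict

# Hypothetical records: (chrom, pos, ref, alt, passed_filter).
# Position 9514 A>G mirrors the case described above; the 241 record is made up.
records = [
    ("MN908947", 9514, "A", "G", True),
    ("MN908947", 9514, "A", "G", False),
    ("MN908947", 241, "C", "T", True),
]

def conflicting_sites(records):
    """Return variants whose duplicate records disagree on filter status."""
    statuses = defaultdict(set)
    for chrom, pos, ref, alt, passed in records:
        statuses[(chrom, pos, ref, alt)].add(passed)
    # A conflict means the same variant was seen both passing and failing.
    return sorted(site for site, seen in statuses.items() if seen == {True, False})

print(conflicting_sites(records))
```

Any site reported here would be both masked to N and applied as a variant, which is the combination that makes bcftools consensus reject the reference allele.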

I have seen this occur in one other sample too, so it is not an isolated case.

I've attached a small test case illustrating this, if you need to replicate the issue. The run.sh script shows the artic command to run, and the input test.fastq file is the set of reads mapping 1kb +/- the variant causing this problem.

Apologies if this is not a bug, but a mistake at my end - I'm still getting to grips with the code (which is really great - especially the very recent update which speeded runs up enormously). Thanks in advance for any help, and hope you're keeping well!

Sam

EDIT: Forgot to mention that the run was with V2 primers. I tried running artic with both V2 and V3 primer schemes, and this error occurred with both. Also, I'm happy to write a workaround if you think one is needed.

Mean read quality calculation in guppyplex.py

In guppyplex.py you have a formula for mean read calculation:

def get_read_mean_quality(record):
    return -10 * log10((10 ** (pd.Series(record.letter_annotations["phred_quality"]) / -10)).mean())

Although this is technically correct if one wants the mean on the probability scale, aren't these scores meant to be averaged on the log scale (Phred scale)? Averaging the error probabilities biases the value heavily towards the lowest qualities. For example, the quality scores [10,10,10,10,10,10,10,10,2,1] average 8.3 when the Phred scores are averaged directly, but your calculation gives 6.53.
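
The two ways of averaging can be compared side by side. The sketch below restates the guppyplex.py formula in plain Python (no pandas) next to a direct arithmetic mean; the function names are made up for illustration.

```python
from math import log10

def mean_phred_arithmetic(quals):
    """Average the Phred scores directly (log scale)."""
    return sum(quals) / len(quals)

def mean_phred_via_error_prob(quals):
    """Convert to error probabilities, average those, convert back to Phred."""
    mean_error = sum(10 ** (q / -10) for q in quals) / len(quals)
    return -10 * log10(mean_error)

quals = [10] * 8 + [2, 1]
print(round(mean_phred_arithmetic(quals), 2))      # 8.3
print(round(mean_phred_via_error_prob(quals), 2))  # 6.53
```

The second value is lower because the two low-quality bases dominate the mean error probability.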

None of my reads pass your default filter of 7, for example:
[4, 7, 6, 10, 3, 6, 14, 3, 3, 11, 5, 7, 5, 3, 3, 11, 6, 11, 2, 4, 4, 2, 5, 2, 3, 3, 7, 10, 3, 15, 15, 4, 3, 4, 13, 4, 4, 15, 2, 5, 8, 10, 3, 4, 3, 3, 2, 4, 5, 5, 5, 5, 2, 2, 5, 3, 3, 6, 4, 3, 2, 3, 9, 2, 5, 9, 4, 3, 4, 5, 5, 11, 10, 2, 4, 9, 2, 2, 2, 3, 5, 4, 4, 3, 10, 7, 3, 6, 5, 5, 3, 6, 10, 4, 4, 4, 4, 3, 8, 3, 9, 5, 9, 2, 7, 6, 3, 7, 4, 4, 3, 4, 2, 4, 4, 11, 4, 6, 2, 3, 3, 4, 6, 3, 4, 6, 4, 3, 5, 4, 2, 2, 4, 4, 3, 7, 2, 7, 3, 7, 8, 8, 4, 2, 8, 3, 4, 3, 4, 2, 2, 3, 10, 3, 3, 2, 7, 5, 8, 11, 3, 4, 4, 1, 2, 2, 4, 6, 2, 5, 5, 2, 3, 4, 3, 8, 2, 3, 4, 3, 4, 8, 6, 13, 13, 6, 10, 20, 17, 8, 4, 6, 7, 8, 5, 4, 3, 4, 4, 3, 3, 6, 3, 7, 6, 6, 7, 5, 3, 3, 5, 2, 3, 10, 6, 5, 3, 4, 5, 5, 7, 4, 4, 13, 7, 2, 3, 5, 2, 7, 10, 3, 4, 5, 5, 4, 10, 7, 4, 3, 8, 4, 2, 2, 1, 4, 5, 15, 6, 4, 3, 4, 3, 3, 7, 10, 4, 4, 8, 6, 5, 2, 3, 3, 3, 8, 6, 5, 6, 6, 10, 7, 11, 10, 11, 10, 8, 6, 8, 4, 6, 2, 4, 7, 8, 2, 4, 6, 3, 12, 9, 4, 10, 10, 4, 8, 4, 3, 3, 9, 12, 5, 12, 1, 3, 6, 2, 3, 1, 2, 3, 3, 2, 7, 9, 3, 9, 13, 9, 3, 8, 6, 7, 2, 8, 7, 2, 9, 7, 3, 2, 5, 3, 4, 2, 3, 2, 3, 3, 4, 4, 4, 2, 7, 2, 3, 7, 8, 3, 4, 4, 2, 5, 7, 5, 4, 4, 3, 7, 5, 7, 5, 12, 13, 3, 14, 12, 6, 8, 8, 12, 5, 4, 5, 7, 5, 8, 4, 6, 11, 8, 16, 17, 12, 3, 6, 4, 3, 3, 1, 5, 14, 20, 13, 7, 4, 3, 2, 6, 3, 2, 5, 7, 10, 3, 2, 7, 3, 6, 5, 4, 10, 5, 4, 5, 2, 3, 2, 3, 9, 3, 4, 3, 5, 4, 6, 8, 8, 10, 4, 3, 1, 4, 2, 5, 3, 5, 3, 4, 3, 5, 4]

In the meantime, this is my quality tab from MinKNOW:

Is this intended?

how to get my consensus sequence using artic-network MINion work flow using metagenomic data

Hello,

I used a metagenomic approach to sequence the SARS-CoV2 genome, and the data were very good, with 100% coverage and 422x depth of coverage.
Now I want to assemble my reads and get the consensus sequence and the variant file.
I want to use the workflow described here, but I can't adapt the command to work with my data.
I did the basecalling and the demultiplexing, and I have the pass fastq files.
I want to start from the artic minion command. I tried to run the command with my data but got this error.
"
(artic-ncov2019) lab123@DESKTOP-2AI9TES:/mnt/d/Mohammad/Nanopore/Analysis/covid-19$ artic minion --normalise 200 --threads 4 --read-file /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_*.fastq --fast5-directory /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fast5_pass/barcode01/ --sequencing-summary /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/sequencing_summary_FAL01681_f475dc98.txt
usage: artic [-h] [-v]
{extract,basecaller,demultiplex,minion,gather,guppyplex,filter,rampart,export,run}
...
artic: error: unrecognized arguments: /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_100.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_101.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_102.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_103.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_104.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_105.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_106.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_107.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_108.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_109.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_11.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_110.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_12.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_13.fastq 
/mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_14.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_15.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_16.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_17.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_18.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_19.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_2.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_20.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_21.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_22.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_23.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_24.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_25.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_26.fastq 
/mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_27.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_28.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_29.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_3.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_30.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_31.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_32.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_33.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_34.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_35.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_36.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_37.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_38.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_39.fastq 
/mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_4.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_40.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_41.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_42.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_43.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_44.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_45.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_46.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_47.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_48.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_49.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_5.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_50.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_51.fastq 
/mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_52.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_53.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_54.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_55.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_56.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_57.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_58.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_59.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_6.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_60.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_61.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_62.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_63.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_64.fastq 
/mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_65.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_66.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_67.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_68.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_69.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_7.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_70.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_71.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_72.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_73.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_74.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_75.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_76.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_77.fastq 
/mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_78.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_79.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_8.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_80.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_81.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_82.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_83.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_84.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_85.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_86.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_87.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_88.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_89.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_9.fastq 
/mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_90.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_91.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_92.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_93.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_94.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_95.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_96.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_97.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_98.fastq /mnt/c/data/covid-19/covid-19/1/20201102_1452_MN31812_FAL01681_d9c4ed35/fastq_pass/barcode01/FAL01681_pass_barcode01_f475dc98_99.fastq
"
How could I fix it?
As for the "scheme-directory", I don't have one because I didn't use ARTIC network primers. I used the SISPA A/B method.
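
The "unrecognized arguments" message lists every file after the first, which suggests the shell expanded the *.fastq glob before artic saw it, so --read-file received one path and the rest were left over. One possible way around this is to merge the per-chunk files into a single FASTQ and pass that instead. The sketch below assumes plain-text FASTQ files that each end with a newline; `merge_fastq` is a hypothetical helper, not part of the artic tools.

```python
import shutil
from pathlib import Path

def merge_fastq(fastq_dir, merged_path, pattern="*.fastq"):
    """Concatenate every FASTQ matching `pattern` in `fastq_dir` into one file.

    Returns the number of files that were merged.
    """
    merged_path = Path(merged_path)
    # Collect inputs first, skipping the output file in case it already
    # exists in the same directory and matches the pattern.
    sources = sorted(
        p for p in Path(fastq_dir).glob(pattern)
        if p.resolve() != merged_path.resolve()
    )
    with merged_path.open("wb") as out:
        for src in sources:
            with src.open("rb") as handle:
                shutil.copyfileobj(handle, out)
    return len(sources)
```

The merged file can then be given as the single value of --read-file. The artic guppyplex command shown in an earlier issue takes a --directory of reads and a --prefix for its output, which may serve the same purpose.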

barcodes removed during basecalling

What about ONT runs that use the demultiplexing and barcoding trimming option when starting a run?

Issue with coverage == 20 and low-freq variants in consensus sequence

Hi,
I am running ARTIC pipeline v1.2.1, and I found a possible issue when an amplicon's coverage is exactly 20. In that case, Nanopolish does not call the variant (the nanopolish variants --min-candidate-depth parameter defaults to 20), while artic_make_depth_mask does not mask the amplicon (probably because coverage is >= 20). As a result, the consensus is left unmasked and the variant is erroneously not introduced. It may be better to require a minimum coverage strictly greater than 20 to avoid such edge cases. What do you think?
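
The edge case can be pictured as two thresholds that leave a gap at exactly 20. This is a toy model of the behaviour as reported in this issue, not the actual Nanopolish or artic_make_depth_mask code; the `classify` function and its comparison operators are assumptions made for illustration.

```python
MIN_DEPTH = 20

def classify(depth, min_depth=MIN_DEPTH):
    """Toy model: how a position at a given depth would be treated."""
    masked = depth < min_depth      # reported mask behaviour: kept if depth >= 20
    callable_ = depth > min_depth   # reported caller behaviour: needs depth > 20
    if masked:
        return "masked"
    if callable_:
        return "callable"
    return "unmasked but not callable"

for depth in (19, 20, 21):
    print(depth, classify(depth))
```

Under these assumptions, only a depth of exactly 20 falls between the two rules, which matches the case described above.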
Thanks,
Simone

Scheme reference file not found:

When I run this command: (artic minion --normalise 200 --threads 4 --scheme-directory ~/artic-ncov2019/primer_schemes --read-file run_name_barcode03.fastq --fast5-directory path_to_fast5 --sequencing-summary path_to_sequencing_summary.txt nCoV-2019/V3 samplename):

I get this error: (Scheme reference file not found: /Users/aroobalhumaidy/artic-ncov2019/nCoV-2019/V3/nCoV-2019.reference.fasta)
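
Note that the path in the error message lacks the primer_schemes component that was passed to --scheme-directory. A small check like the one below can confirm which scheme files exist before running the pipeline. It is a sketch that assumes the layout seen in the paths quoted in these issues (scheme directory, then scheme name, then version, containing NAME.reference.fasta and NAME.scheme.bed); both helper functions are hypothetical.

```python
from pathlib import Path

def expected_scheme_files(scheme_directory, scheme):
    """Build the reference and BED paths for a scheme given as 'name/version'."""
    name, version = scheme.strip("/").split("/")
    base = Path(scheme_directory).expanduser() / name / version
    return [base / f"{name}.reference.fasta", base / f"{name}.scheme.bed"]

def missing_scheme_files(scheme_directory, scheme):
    """Return the expected scheme files that are not present on disk."""
    return [p for p in expected_scheme_files(scheme_directory, scheme)
            if not p.is_file()]

# Example call with the values from the command above:
# missing_scheme_files("~/artic-ncov2019/primer_schemes", "nCoV-2019/V3")
```

An empty list means both files were found at the expected location.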

Bug gather.py, testing with the provided "simulated_reads"

Hi, I'm trying to reproduce the protocol following the detailed guidelines. After running:
artic gather --min-length 400 --max-length 700 --prefix testing --directory ~/artic-ncov2019/simulated_reads/
I get the error:

Traceback (most recent call last):
  File "/home/ariel/anaconda3/envs/artic-ncov2019/bin/artic", line 8, in <module>
    sys.exit(main())
  File "/home/ariel/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/pipeline.py", line 116, in main
    args.func(parser, args)
  File "/home/ariel/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/pipeline.py", line 32, in run_subtool
    submodule.run(parser, args)
  File "/home/ariel/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/gather.py", line 94, in run
    pd.concat(dfs).to_csv(summaryfh, sep="\t", index=False)
ValueError: No objects to concatenate

(the same happens with real data)

Thanks in advance,
A.

which version to use

May I ask what the difference is between this pipeline and https://github.com/artic-network/fieldbioinformatics? It seems the latter is updated more frequently, but this version is the one used in the ARTIC network SOP https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html

Any suggestions? Thanks.

AssertionError: error: readgroup not found in provided primer scheme (1)

Hi,
I'm testing the pipeline and everything goes well until :
artic_plot_amplicon_depth --primerScheme /home/linux/programmes/artic-ncov2019/primer_schemes/ZaireEbola/V1/ZaireEbola.scheme.bed --sampleID ebov-mayinga --outFilePrefix ebov-mayinga ebov-mayinga*.depths

Traceback (most recent call last):
  File "/home/linux/miniconda3/envs/artic-ncov2019/bin/artic_plot_amplicon_depth", line 10, in <module>
    sys.exit(main())
  File "/home/linux/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/plot_amplicon_depth.py", line 143, in main
    go(args)
  File "/home/linux/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/plot_amplicon_depth.py", line 76, in go
    rg)
AssertionError: error: readgroup not found in provided primer scheme (1)

Command failed:artic_plot_amplicon_depth --primerScheme /home/linux/programmes/artic-ncov2019/primer_schemes/ZaireEbola/V1/ZaireEbola.scheme.bed --sampleID ebov-mayinga --outFilePrefix ebov-mayinga ebov-mayinga*.depths

How can I solve this?
Thanks

how to change min reads depth for consensus assembly?

According to the documentation, any position that is not covered by at least 20 reads is marked as low coverage and changed to "N". If I want to tweak this parameter, say lower it to 15, which file and line should I change in artic? Thank you for any help.

Artic MinION pipeline - Nanopolish with fast5 in subfolder

Hi,
the new versions of MinKNOW allow demultiplexing with the option "requiring barcodes at both ends", which is what we need for the artic pipeline.
However, recently not just the fastq files but also the fast5 files are placed in subfolders divided by barcode,
e.g. path_to_fast5/barcode01/*.fast5; path_to_fast5/barcode02/*.fast5 etc.

In the artic pipeline in the following step:

artic minion --normalise 200 --threads 4 --scheme-directory ~/artic-ncov2019/primer_schemes --read-file run_name_barcode03.fastq --fast5-directory path_to_fast5 --sequencing-summary path_to_sequencing_summary.txt nCoV-2019/V3 samplename

Do we have to point to the fast5 folder or to e.g. the fast5/barcode01 folder? Or is there some kind of -r (recursive) option?

I don't really get how nanopolish matches the fast5 and the fastq and I am worried polishing will fail if the path is not set the right way.

Thanks!

Create environment with Conda

Hi I tried to install the artic-ncov2019 workflow as was stated on the website (https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html)
but when I tried the command
conda env create -f environment.yml

I always get the following:

Collecting package metadata (repodata.json): failed

CondaHTTPError: HTTP 500 INTERNAL SERVER ERROR for url <https://conda.anaconda.org/artic-network/linux-64/repodata.json>
Elapsed: 00:14.503821
CF-RAY: 61721b244fd6dfdb-FRA

A remote server error occurred when trying to retrieve this URL.

A 500-type error (e.g. 500, 501, 502, 503, etc.) indicates the server failed to
fulfill a valid request.  The problem may be spurious, and will resolve itself if you
try your request again.  If the problem persists, consider notifying the maintainer
of the remote server.

What is wrong? Updating conda itself (conda update conda) works without a problem, so the server error seems to occur only with the artic environment.yml.

scheme download failed

Rampart Export Reads for sample

When I try the "Export Reads" for a sample from the rampart webpage, I get the following error:

[warning] pipeline (Export reads) finished with exit code 1. Error messages:
[warning] AttributeError in line 40 of /home/orto01r/anaconda3/envs/artic-ncov2019/lib/node_modules/artic-rampart/default_protocol/pipelines/bin_to_fastq/Snakefile:
'collections.OrderedDict' object has no attribute 'read'
  File "/home/orto01r/anaconda3/envs/artic-ncov2019/lib/node_modules/artic-rampart/default_protocol/pipelines/bin_to_fastq/Snakefile", line 40, in
  File "/home/orto01r/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/yaml/__init__.py", line 162, in safe_load
  File "/home/orto01r/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/yaml/__init__.py", line 112, in load
  File "/home/orto01r/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/yaml/loader.py", line 34, in __init__
  File "/home/orto01r/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/yaml/reader.py", line 85, in __init__
  File "/home/orto01r/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/yaml/reader.py", line 124, in determine_encoding
  File "/home/orto01r/anaconda3/envs/artic-ncov2019/lib/python3.6/site-packages/yaml/reader.py", line 178, in update_raw

I'm on Ubuntu 18, using Mozilla Firefox. The visualisations seem to have worked fine, and there are plenty of reads for each of the samples.

primer BED file isn't in IVAR format

https://andersen-lab.github.io/ivar/html/manualpage.html

They need a score in column 5 and a strand in column 6 (counting from 1)?

Puerto  28  52  400_1_out_L 60  +
Puerto  482 504 400_1_out_R 60  -
Puerto  359 381 400_2_out_L 60  +
Puerto  796 818 400_2_out_R 60  -
Puerto  658 680 400_3_out_L*    60  +
Puerto  1054    1076    400_3_out_R*    60  -

You have

MN908947        30      54      nCoV-2019_1_LEFT        nCoV-2019_1
MN908947        385     410     nCoV-2019_1_RIGHT       nCoV-2019_1
MN908947        320     342     nCoV-2019_2_LEFT        nCoV-2019_2
MN908947        704     726     nCoV-2019_2_RIGHT       nCoV-2019_2
MN908947        642     664     nCoV-2019_3_LEFT        nCoV-2019_1
MN908947        1004    1028    nCoV-2019_3_RIGHT       nCoV-2019_1
MN908947        943     965     nCoV-2019_4_LEFT        nCoV-2019_2
MN908947        1312    1337    nCoV-2019_4_RIGHT       nCoV-2019_2
MN908947        1242    1264    nCoV-2019_5_LEFT        nCoV-2019_1
MN908947        1623    1651    nCoV-2019_5_RIGHT       nCoV-2019_1
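The difference between the two layouts above can be sketched as a small conversion: the ARTIC scheme rows have five columns (chrom, start, end, primer name, pool), while the iVar example has six (chrom, start, end, primer name, score, strand). The function below is a minimal sketch, not an official converter. It assumes the strand can be inferred from `_LEFT` / `_RIGHT` in the primer name and uses `60` as a placeholder score copied from the iVar example; `artic_bed_to_ivar` is a hypothetical name.

```python
def artic_bed_to_ivar(lines, score=60):
    """Convert 5-column ARTIC scheme BED lines to 6-column BED lines.

    Input columns:  chrom, start, end, primer name, pool
    Output columns: chrom, start, end, primer name, score, strand
    """
    out = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end, name, _pool = line.split()[:5]
        # Assumption: LEFT primers are on the + strand, RIGHT primers on the - strand.
        if "_LEFT" in name:
            strand = "+"
        elif "_RIGHT" in name:
            strand = "-"
        else:
            raise ValueError(f"cannot infer strand from primer name: {name}")
        out.append("\t".join([chrom, start, end, name, str(score), strand]))
    return out

scheme = [
    "MN908947        30      54      nCoV-2019_1_LEFT        nCoV-2019_1",
    "MN908947        385     410     nCoV-2019_1_RIGHT       nCoV-2019_1",
]
for row in artic_bed_to_ivar(scheme):
    print(row)
```

Note that the pool column is dropped in this sketch; anything downstream that needs pool information would have to get it from the original file.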

No variant.tab output

I don’t have the variant output. Was there any step that I missed?

Error when running the MinION pipeline command.

Hello,
Whenever I run the MinION pipeline command, I get an error that I am unable to decipher and therefore cannot troubleshoot.
Below is the exact error I am getting:

Command Used:

artic minion --normalise 200 --scheme-directory ~/artic-ncov2019/primer_schemes/ --read-file Aug24_sample1.fastq --fast5-directory ../../../fast5_skip/ --sequencing-summary ../../../Basecalling_skipped/sequencing_summary.txt ~/artic-ncov2019/primer_schemes/nCoV-2019/V3/ Sample1

Error message:

Traceback (most recent call last):
File "/home/bio/miniconda3/envs/artic-ncov2019/bin/artic", line 10, in
sys.exit(main())
File "/home/bio/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/pipeline.py", line 216, in main
args.func(parser, args)
File "/home/bio/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/pipeline.py", line 35, in run_subtool
submodule.run(parser, args)
File "/home/bio/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/minion.py", line 23, in run
scheme_name, scheme_version = args.scheme.split('/')
ValueError: too many values to unpack (expected 2)

N.B: Please note that the artic environment was getting stuck at solving environment with the 64-bit python 3.6 version of miniconda. Therefore it was installed via the latest version of miniconda3 (with Py3.8). I am unsure if this is contributing to the error. Also, please do excuse the hectic directory arrangement in the command.

Thank you very much.
Sincerely, Georgi Merhi.
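The traceback points at `scheme_name, scheme_version = args.scheme.split('/')`, so the positional scheme argument apparently has to contain exactly one `/`, as in `nCoV-2019/V3` (the form used together with `--scheme-directory` in other commands in these issues), rather than a full path. The sketch below mimics that one line to show why a path fails; `split_scheme` is a hypothetical stand-in, and the absolute path is only an illustration of what the shell would pass after expanding `~`.

```python
def split_scheme(scheme):
    """Mimic the unpacking shown in the traceback: exactly one '/' is expected."""
    scheme_name, scheme_version = scheme.split('/')
    return scheme_name, scheme_version

# Works: one name and one version.
print(split_scheme("nCoV-2019/V3"))

# Fails: a full path splits into many parts, so the two-way unpacking raises.
try:
    split_scheme("/home/bio/artic-ncov2019/primer_schemes/nCoV-2019/V3/")
except ValueError as err:
    print("ValueError:", err)
```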

Creating conda environment hangs on "Solving environment: |"

Solved. Sorry, it was a network issue at my institution; I was working over Wi-Fi.

align_trim.py error using V3 scheme bed file

After updating to the V3 primers, if we use V3/nCoV-2019.scheme.bed, the align_trim step fails with the following error:
line 96, in find_primer
closest = min([(abs(p['start'] - pos), p['start'] - pos, p) for p in bed if p['direction'] == direction], key=itemgetter(0))
ValueError: min() arg is an empty sequence
But if we change the BED file to V1/nCoV-2019.scheme.bed or V3/nCoV-2019.bed, no error occurs.
So, is there something wrong in my pipeline?
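The `ValueError` in the quoted line comes from calling `min()` on an empty list, which happens when no entry in the parsed BED has the requested `direction`. Below is a minimal sketch of the same lookup with an explicit guard, assuming each primer is a dict with `start` and `direction` keys as in the quoted line. The guard and its message are additions for illustration only, not the behaviour of align_trim.py.

```python
from operator import itemgetter

def find_primer(bed, pos, direction):
    """Return (distance, signed offset, primer) for the nearest primer start."""
    candidates = [
        (abs(p['start'] - pos), p['start'] - pos, p)
        for p in bed
        if p['direction'] == direction
    ]
    if not candidates:
        # Illustrative guard: report the cause instead of failing inside min().
        raise ValueError(f"no primers with direction {direction!r} in scheme")
    return min(candidates, key=itemgetter(0))

# Toy scheme containing only '+' primers.
bed = [{'start': 30, 'direction': '+'}, {'start': 320, 'direction': '+'}]
print(find_primer(bed, 100, '+')[2])
```

With a message like this, a BED file whose direction column is missing or parsed differently would be easier to spot than with the bare `min()` error.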

minion consensus fail: fasta sequence does not match REF allele at...

Hi all,

When I run minion for a specific sample I get an error that says:

Command failed:bcftools consensus -f results/minion/barcode72.preconsensus.fasta results/minion/barcode72.pass.vcf.gz -m results/minion/barcode72.coverage_mask.txt -o results/minion/barcode72.consensus.fasta

When I look further along in the file capturing the stderr and stdout I see this:

<...>
2021-04-13 01:14:12 Calling initial genotypes using pair-HMM realignment...
Note: the --sample option not given, applying all records regardless of the genotype
The fasta sequence does not match the REF allele at MN908947.3:16175:
   .vcf: [CT] <- (REF)
   .vcf: [C] <- (ALT)
   .fa:  [CN]TCAAGGTATTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAATTATGTCTTTACTGGTTATCGTGTAACTAAAAACAGTAAAGTACAAATAGGAGAGTACACCTTTGAAAAAGGTGACTATGGTGATGCTGTTGTTTACCGAGGTACAACAACTTACAAATTAAATGTTGGTGATTATTTTGTGCTGACATCACATACAGTAATGCCATTAAGTGCACCTACACTAGTGCCACAAGAGCACTATGTTAGAATTACTGGCTTATACCCAACACTCAATATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGGTTGGTATGCAAAAGTATTCTACACTCCAGGGACCACCTGGTACTGGTAAGAGTCATTTTGCTATTGGCCTAGCTCTCTACTACCCTTCTGCTCGCATAGTGTATACAGCTTGCTCTCATGCCGCTGTTGATGCACTATGTGAGAAGGCATTAAAATATTTGCCTATAGATAAATGTAGTAGAATTATACCTGCACGTGCTCGTGTAGAGTGTTTTGATAAATTCAAAGTGAATTCAACATTAGAACAGTATGTCTTTTGTACTGTAAATGCATTGCCTGAGACGACAGCAGATATAGTTGTCTTTGATGAAATTTCAATGGCCACAAATTATGATTTGAGTGTTGTCAATGCCAGATTACGTGCTAAGCACTATGTGTACATTGGCGACCCTGCTCAATTACCTGCACCACGCACATTGCTAACTAAGGGCACACTAGAACCAGAATATTTCAATTCAGTGTGTAGACTTATGAAAACTATAGGTCCAGACATGTTCCTCGGAACTTGTCGGCGTTGTCCTGCTGAAATTGTTGACACTGTGAGTGCTTTGGTTTATGATAATANGCTTAAAGCACATAAAGACAAATCAGCTCAATGCTTTAAAATGTTTTATAAGGGTGTTATCACGCATGATGTTTCATCTGCAATTAACAGGCCACAAATAGGCGTGGTAAGAGAATTCCTTACACGTAACCCTGCTTGGAGAAAAGCTGTCTTTATTTCACCTTATAATTCACAGAATGCTGTAGCCTCAAAGATTTTGGGACTACCAACTCAAACTGTTGATTCATCACAGGGCTCAGAATATGACTATGTCATATTCACTCAAACCACTGAAACAGCTCACTCTTGTAATGTAAACAGATTTAATGTTGCTATTACCAGAGCAAAAGTAGGCATACTTTGCATAATGTCTGATAGAGACCTTTATGACAAGTTGCAATTTACAAGTCTTGAAATTCCACGTAGGAATGTGGCAACTTTACAAGCTGAAAATGTAACAGGACTCTTTAAAGATTGTAGTAAGGTAATCACTGGGTTACATCCTACACAGGCACCTACACACCTCAGTGTTGACACTAAATTCAAAACTGAAGGTTTATGTGTTGACA
TACCTGGCATACCTAAGGACATGACCTATAGAAGACTCATCTCTATGATGGGTTTTAAAATGAATTATCAAGTTAATGGTTACCCTAACATGTTTATCACCCGCGAAGAAGCTATAAGACATGTACGTGCATGGATTGGCTTCGATGTCGAGGGGTGTCATGCTACTAGAGAAGCTGTTGGTACCAATTTACCTTTACAGCTAGGTTTTTCTACAGGTGTTAACCTAGTTGCTGTACCTACAGGTTATGTTGATACACCTAATAATACAGATTTTTCCAGAGTTAGTGCTAAACCACCGCCTGGAGATCAATTTAAACACCTCATACCACTTATGTACAAAGGACTTCCTTGGAATGTAGTGCGTATAAAGATTGTACAAATGTTAAGTGACACACTTAAAAATCTCTCTGACAGAGTCGTATTTGTCTTATGGGCACATGGCTTTGAGTTGACATCTATGAAGTATTTTGTGAAAATAGGACCTGAGCGCACCTGTTGTCTATGTGATAGACGTGCCACATGCTTTTCCACTGCTTCAGACACTTATGCCTGTTGGCATCATTCTATTGGATTTGATTACGTCTATAATCCGTTTATGATTGATGTTCAACAATGGGGTTTTACAGGTAACCTACAAAGCAACCATGATCTGTATTGTCAAGTCCATGGTAATGCACATGTAGCTAGTTGTGATGCAATCATGACTAGGTGTCTAGCTGTCCACGAGTGCTTTGTTAAGCGTGTTGACTGGACTATTGAATATCCTATAATTGGTGATGAACTGAAGATTAATGCGGCTTGTAGAAAGGTTCAACACATGGTTGTTAAAGCTGCATTATTAGCAGACAAATTCCCAGTTCTTCACGACATTGGTAACCCTAAAGCTATTAAGTGTGTACCTCAAGCTGATGTAGAATGGAAGTTCTATGATGCACAGCCTTGTAGTGACAAAGCTTATAAAATAGAAGAATTATTCTATTCTTATGCCACACATTCTGACAAATTCACAGATGGTGTATGCCTATTTTGGAATTGCAATGTCGATAGATATCCTGCTAATTCCATTGTTTGTAGATTTGACACTAGAGTGCTATCTAACCTTAACTTGCCTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCTGTTATTGATTTATTACTTGATGATTTTGTTGAAATAATAAAATCCCAAGATTTATCTGTAGTTTCTAAGGTTGTCAAAGTGACTATTGACTATACAGAAATTTCATTTATGCTTTGGTGTAAAGATGGCCATGTAGAAACATTTTACCCAAAATTACAATCTAGTCAAGCGTGGCAACCGGGTGTTGCTATGCCTAATCTTTACAAAATGCAAAGAATGCTATTAGAAAAGTGTGACCTTCAAAATTATGGTGATAGTGCAACATTACCTAAAGGCATAATGATGAATGTCGCAAAATATACTCAACTGTGTCAATATTTAAACACATTAACATTAGCTGTACCCTATAATATGAGAGTTATACATTTTGGTGCTGGTTCTGATAAAGGAGTTGCACCAGGTACAGCTGTTTTAAGACAGTGGTTGCCTACGGGTACGCTGCTTGTCGATTCAGATCTTAATGACTTTGTCTCTGATGCAGATTCAACTTTGATTGGTGATTGTGCAACTGTACATACAGCTAATAAATGGGATCTCATTATTAGTGATATGTACGACCCTAAGACTAAAAATGTTACAAAAGAAAATGACTCTAAAGAGGGTTTTTTCACTTACATTTGTGGGTTTATACAACAAAAGCTAGCTCTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAAAGTTTTCAGATCCTCAGTTTTACATTCAACTCAGGACTTGTTCTTACCTTTCTTTTCCAATGTTACTTGGTTCCATGCTATACATGTCTCTGGGACCAATGGTACTAAGAGGTTTGATAACCCTGTCCTACCATTTAATGATGGTGTTTATTTTGCTTCCACTGAGAAGTCTAACATAATAAGAGGCTGGATTTTTGGTACTACTTTAGATTCGAAGACCCAGTCCCTACTTATTGTTAATAACGCTACTAATGTTGTTATTAAAGTCTGTGAATTTCAATTTTGTAATGATCCATTTTTGGGTGTTTATTACCACAAAAACAACAAAAGTTGGATGGAAAGTGAGTTCAGAGTTTATTCTAGTGCGAATAATTGCACTTTTGAATATGTCTCTCAGCCTTTTCTTATGGACCTTGAAGGAAAACAGGGTAATTTCAAAAATCTTAGGGAATTTGTGTTTAAGAATATTGATGGTTATT
TTAAAATATATTCTAAGCACACGCCTATTAATTTAGTGCGTGATCTCCCTCAGGGTTTTTCGGCTTTAGAACCATTGGTAGATTTGCCAATAGGTATTAACATCACTAGGTTTCAAACTTTACTTGCTTTACATAGAAGTTATTTGACTCCTGGTGATTCTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTAGATTTCCTAATATTACAAACTTGTGCCCTTTTGGTGAAGTTTTTAACGCCACCAGATTTGCATCTGTTTATGCTTGGAACAGGAAGAGAATCAGCAACTGTGTTGCTGATTATTCTGTCCTATATAATTCCGCATCATTTTCCACTTTTAAGTGTTATGGAGTGTCTCCTACTAAATTAAATGATCTCTGCTTTACTAATGTCTATGCAGATTCATTTGTAATTAGAGGTGATGAAGTCAGACAAATCGCTCCAGGGCAAACTGGAAAGATTGCTGATTATAATTATAAATTACCAGATGATTTTACAGGCTGCGTTATAGCTTGGAATTCTAACAATCTTGATTCTAAGGTTGGTGGTAATTATAATTACCTGTATAGATTGTTTAGGAAGTCTAATCTCAAACCTTTTGAGAGAGATATTTCAACTGAAATCTATCAGGCCGGTAGCACACCTTGTAATGGTGTTGAAGGTTTTAATTGTTACTTTCCTTTACAATCATATGGTTTCCAACCCACTAATGGTGTTGGTTACCAACCATACAGAGTAGTAGTACTTTCTTTTGAACTTCTACATGCACCAGCAACTGTTTGTGGACCTAAAAAGTCTACTAATTTGGTTAAAAACAAATGTGTCAATTTCAACTTCAATGGTTTAACAGGCACAGGTGTTCTTACTGAGTCTAACAAAAAGTTTCTGCCTTTCCAACAATTTGGCAGAGACATTGCTGACACTACTGATGCTGTCCGTGATCCACAGACACTTGAGATTCTTGACATTACACCATGTTCTTTTGGTGGTGTCAGTGTTATAACACCAGGAACAAATACTTCTAACCAGGTTGCTGTTCTTTATCAGGATGTTAACTGCACAGAAGTCCCTGTTGCTATTCATGCAGATCAACTTACTCCTACTTGGCGTGTTTATTCTACAGGTTCTAATGTTTTTCAAACACGTGCAGGCTGTTTAATAGGGGCTGAACATGTCAACAACTCATATGAGTGTGACATACCCATTGGTGCAGGTATATGCGCTAGTTATCAGACTCAGACTAATTCTCCTCGGCGGGCACGTAGTGTAGCTAGTCAATCCATCATTGCCTACACTATGTCACTTGGTGCAGAAAATTCAGTTGCTTACTCTAATAACTCTATTGCCATACCCANAAATTTTACTATTAGTGTTACCACAGAAATTCTACCAGTGTCTATGACCAAGACATCAGTAGATTGTACAATGTACATTTGTGGTGATTCAACTGAATGCAGCAATCTTTTGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAGAGACCTCATTTGTGCACAAAAGTTTAACGGCCTTACTGTTTTGCCACCTTTGCTCACAGAT
GAAATGATTGCTCAATACACTTCTGCACTGTTAGCGGGTACAATCACTTCTGGTTGGACCTTTGGTGCAGGTGCTGCATTACAAATACCATTTGCTATGCAAATGGCTTATAGGTTTAATGGTATTGGAGTTACACAGAATGTTCTCTATGAGAACCAAAAATTGATTGCCAACCAATTTAATAGTGCTATTGGCAAAATTCAAGACTCACTTTCTTCCACAGCAAGTGCACTTGGAAAACTTCAAGATGTGGTCAACCAAAATGCACAAGCTTTAAACACGCTTGTTAAACAACTTAGCTCCAATTTTGGTGCAATTTCAAGTGTTTTAAATGATATCCTTTCACGTCTTGACAAAGTTGAGGCTGAAGTGCAAATTGATAGGTTGATCACAGGCAGACTTCAAAGTTTGCAGACATATGTGACTCAACAATTAATTAGAGCTGCAGAAATCAGAGCTTCTGCTAATCTTGCTGCTACTAAAATGTCAGAGTGTGTACTTGGACAATCAAAAAGAGTTGATTTTTGTGGAAAGGGCTATCATCTTATGTCCTTCCCTCAGTCAGCACCTCATGGTGTAGTCTTCTTGCATGTGACTTATGTCCCTGCACAAGAAAAGAACTTCACAACTGCTCCTGCCATTTGTCATGATGGAAAAGCACACTTTCCTCGTGAAGGTGTCTTTGTTTCAAATGGCACACACTGGTTTGTAACACAAAGGAATTTTTATGAACCACAAATCATTACTACAGACAACACATTTGTGTCTGGTAACTGTGATGTTGTAATAGGAATTGTCAACAACACAGTTTATGATCCTTTGCAACCTGAATTAGACTCATTCAAGGAGGAGTTAGATAAATATTTTAAGAATCATACATCACCAGATGTTGATTTAGGTGACATCTCTGGCATTAATGCTTCAGTTGTAAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTTGCCAAGAATTTAAATGAATCTCTCATCGATCTCCAAGAACTTGGAAAGTATGAGCAGTATATAAAATGGCCATGGTACATTTGGCTAGGTTTTATAGCTGGCTTGATTGCCATAGTAATGGTGACAATTATGCTTTGCTGTATGACCAGTTGCTGTAGTTGTCTCAAGGGCTGTTGTTCTTGTGGATCCTGCTGCAAATTTGATGAAGACGACTCTGAGCCAGTGCTCAAAGGAGTCAAATTACATTACACATAAACGAACTTATGGATTTGTTTATGAGAATCTTCACAATTGGAACTGTAACTTTGAAGCAAGGTGAAATCAAGGATGCTACTCCTTCAGATTTTGTTCGCGCTACTGCAACGATACCGATACAAGCCTCACTCCCTTTCGGATGGCTTATTGTTGGCGTTGCACTTCTTGCTGTTTTTCAGAGCGCTTCCAAAATCATAACCCTCAAAAAGAGATGGCAACTAGCACTCTCCAAGGGTGTTCACTTTGTTTGCAACTTGCTGTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTATTTCTGAACATGACTACCAGATTGGTGGTTATACTGAAAAATGGGAATCTGGAGTAAAAGACTGTGTTGTATTACACAGTTACTTCACTTCAGACTATTACCAGCTGTACTCAACTCAATTGAGTACAGACACTGGTGTTGAACATGTTACCTTCTTCATCTACAATAAAATTGTTGATGAGCCTGAAGAACATGTCCAAATTCACACAATCGACGGTTCATCCGGAGTTGTTAAT
CCAGTAATGGAACCAATTTATGATGAACCGACGACGACTACTAGCGTGCCTTTGTAAGCACAAGCTGATGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAGACAGGTACGTTAATAGTTAATAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTTACACTAGCCATCCTTACTGCGCTTCGATTGTGTGCGTACTGCTGCAATATTGTTAACGTGAGTCTTGTAAAACCTTCTTTTTACGTTTACTCTCGTGTTAAAAATCTGAATTCTTCTAGAGTTCCTGATCTTCTGGTCTAAACGAACTAAATATTATATTAGTTTTTCTGTTTGGAACTTTAATTTTAGCCATGGCAGATTCCAACGGTACTATTACCGTTGAAGAGCTTAAAAAGCTCCTTGAACAATGGAACCTAGTAATAGGTTTCCTATTCCTTACATGGATTTGTCTTCTACAATTTGCCTATGCCAACAGGAATAGGTTTTTGTATATAATTAAGTTAATTTTCCTCTGGCTGTTATGGCCAGTAACTTTAGCTTGTTTTGTGCTTGCTGCTGTTTACAGAATAAATTGGATCACCGGTGGAATTGCTATCGCAATGGCTTGTCTTGTANGCTTGATGTGGCTCAGCTACTTCATTGCTTCTTTCAGACTGTTTGCGCGTACGCGTTCCATGTGGTCATTCAATCCAGAAACTAACATTCTTCTCAACGTGCCACTCCATGGCACTATTCTGACCAGACCGCTTCTAGAAAGTGAACTCGTAATCGGAGCTGTGATCCTTCGTGGACATCTTCGTATTGCTGGACACCATCTAGGACGCTGTGACATCAAGGACCTGCCTAAAGAAATCACTGTTGCTACATCACGAACGCTTTCTTATTACAAATTGGGAGCTTCGCAGCGTGTAGCAGGTGACTCAGGTTTTGCTGCATACAGTCGCTACAGGATTGGCAACTATAAATTAAACACAGACCATTCCAGTAGCAGTGACAATATTGCTTTGCTTGTACAGTAAGTGACAACAGATGTTTCATCTCGTTGACTTTCAGGTTACTATAGCAGAGATATTACTAATTATTATGAGGACTTTTAAAGTTTCCATTTGGAATCTTGATTACATCATAAACCTCATAATTAAAAATTTATCTAAGTCACTAACTGAGAATAAATATTCTCAATTAGATGAAGAGCAACCAATGGAGATTGATTAAACGAACATGAAAATTATTCTTTTCTTGGCACTGATAACACTCGCTACTTGTGAGCTTTATCACTACCAAGAGTGTGTTAGAGGTACAACAGTACTTTTAAAAGAACCTTGCTCTTCTGGAACATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTTCAGTNCATCGATATCGGTAATTATACAGTTTCCTGTTTACCTTTTACAATTAATTGC
CAGGAACCTAAATTGGGTAGTCTTGTAGTGCGTTGTTCGTTCTATGAAGACTTTTTAGAGTATCATGACGTTCGTGTTGTTTTAGATTTCATCTAAACGAACAAACTAAAATGTCTGATAATGGACCCCAAAATCAGCGAAATGCACCCCGCATTACGTTTGGTGGACCCTCAGATTCAACTGGCAGTAACCAGAATGGAGAACGCAGTGGGGCGCGATCAAAACAACGTCGGCCCCAAGGTTTACCCAATAATACTGCGTCTTGGTTCACCGCTCTCACTCAACATGGCAAGGAAGACCTTAAATTCCCTCGAGGACAAGGCGTTCCAATTAACACCAATAGCAGTCCAGATGACCAAATTGGCTACTACCGAAGAGCTACCAGACGAATTCGTGGTGGTGACGGTAAAATGAAAGATCTCAGTCCAAGATGGTATTTCTACTACCTAGGAACTGGGCCAGAAGCTGGACTTCCCTATGGTGCTAACAAAGACGGCATCATATGGGTTGCAACTGAGGGAGCCTTGAATACACCAAAAGATCACATTGGCACCCGCAATCCTGCTAACAATGCTGCAATCGTGCTACAACTTCCTCAAGGAACAACATTGCCAAAAGGCTTCTACGCAGAAGGGAGCAGAGGCGGCAGTCAAGCCTCTTCTCGTTCCTCATCACGTAGTCGCAACAGTTCAAGAAATTCAACTCCAGGCAGCAGTAGGGGAACTTCTCCTGCTAGAATGGCTGGCAATGGCGGTGATGCTGCTCTTGCTTTGCTGCTGCTTGACAGATTGAACCAGCTTGAGAGCAAAATGTCTGGTAAAGGCCAACAACAACAAGGCCAAACTGTCACTAAGAAATCTGCTGCTGAGGCTTCTAAGAAGCCTCGGCAAAAACGTACTGCCACTAAAGCATACAATGTAACACAAGCTTTCGGCAGACGTGGTCCAGAACAAACCCAAGGAAATTTTGGGGACCAGGAACTAATCAGACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAATTTGCCCCCAGCGCTTCAGCGTTCTTCGGAATGTCGCGCATTGGCATGGAAGTCACACCTTCGGGAACGTGGTTGACCTACACAGGTGCCATNAAATTGGATGACAAAGATCCAAATTTCAAAGATCAAGTCATTTTGCTGAATAAGCATATTGACGCATACAAANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
<...>

When I look at the alignment files, it does seem like there is a variant here (indel?). I am not exactly sure how to interpret this message. I have gone through a number of samples and this is the first time I have seen this error. Any suggestions? My env.yml is below

name: artic-ncov2019
channels:
  - artic-network
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=1_gnu
  - absl-py=0.11.0=py36h5fab9bb_0
  - appdirs=1.4.4=pyh9f0ad1d_0
  - args=0.1.0=py36h9f0ad1d_1003
  - artic=1.2.1=py_0
  - artic-porechop=0.3.2pre=py36hf1ae8f4_1
  - artic-tools=0.2.6=hee4d88c_0
  - astor=0.8.1=pyh9f0ad1d_0
  - attrs=20.3.0=pyhd3deb0d_0
  - bcftools=1.10.2=h4f4756c_3
  - biopython=1.78=py36h8f6f2f9_1
  - blas=2.14=openblas
  - boost-cpp=1.70.0=h7b93d67_3
  - brotlipy=0.7.0=py36h8f6f2f9_1001
  - bwa=0.7.17=hed695b0_7
  - bzip2=1.0.8=h7f98852_4
  - c-ares=1.17.1=h36c2ea0_0
  - ca-certificates=2020.12.5=ha878542_0
  - certifi=2020.12.5=py36h5fab9bb_1
  - cffi=1.14.5=py36hc120d54_0
  - chardet=4.0.0=py36h5fab9bb_1
  - click=7.1.2=pyh9f0ad1d_0
  - clint=0.5.1=py_1
  - coloredlogs=15.0=py36h5fab9bb_0
  - colormath=3.0.0=py_2
  - configargparse=1.3=pyhd8ed1ab_0
  - cryptography=3.4.4=py36hc39840e_0
  - cycler=0.10.0=py_2
  - datrie=0.8.2=py36h8c4c3a4_1
  - decorator=4.4.2=py_0
  - docutils=0.16=py36h5fab9bb_3
  - eigen=3.3.9=h4bd325d_1
  - freetype=2.10.4=h0708190_1
  - future=0.18.2=py36h5fab9bb_3
  - gast=0.4.0=pyh9f0ad1d_0
  - gitdb=4.0.5=pyhd8ed1ab_1
  - gitpython=3.1.13=pyhd8ed1ab_0
  - google-pasta=0.2.0=pyh8c360ce_0
  - grpcio=1.35.0=py36h8e87921_0
  - gsl=2.6=he838d99_2
  - h5py=2.7.1=py36_1
  - hdf5=1.8.18=3
  - htslib=1.10.2=hd3b49d5_1
  - humanfriendly=9.1=py36h5fab9bb_0
  - icu=67.1=he1b5a44_0
  - idna=2.10=pyh9f0ad1d_0
  - importlib-metadata=3.7.0=py36h5fab9bb_0
  - importlib_metadata=3.7.0=hd8ed1ab_0
  - iniconfig=1.1.1=pyh9f0ad1d_0
  - intervaltree=3.0.2=py_0
  - isa-l=2.30.0=ha770c72_2
  - jinja2=2.11.3=pyh44b312d_0
  - jpeg=9d=h36c2ea0_0
  - jsonschema=3.2.0=py_2
  - k8=0.2.5=he513fc3_0
  - keras-applications=1.0.8=py_1
  - keras-preprocessing=1.1.2=pyhd8ed1ab_0
  - kiwisolver=1.3.1=py36h605e78d_1
  - krb5=1.17.2=h926e7f8_0
  - lcms2=2.12=hddcbb42_0
  - ld_impl_linux-64=2.35.1=hea4e1c9_2
  - libblas=3.8.0=14_openblas
  - libcblas=3.8.0=14_openblas
  - libcurl=7.71.1=hcdd3856_8
  - libdeflate=1.6=h516909a_0
  - libedit=3.1.20191231=he28a2e2_2
  - libev=4.33=h516909a_1
  - libffi=3.3=h58526e2_2
  - libgcc-ng=9.3.0=h2828fa1_18
  - libgfortran=3.0.0=1
  - libgfortran-ng=7.5.0=h14aa051_18
  - libgfortran4=7.5.0=h14aa051_18
  - libgomp=9.3.0=h2828fa1_18
  - libiconv=1.16=h516909a_0
  - liblapack=3.8.0=14_openblas
  - liblapacke=3.8.0=14_openblas
  - libnghttp2=1.43.0=h812cca2_0
  - libopenblas=0.3.7=h5ec1e0e_6
  - libpng=1.6.37=h21135ba_2
  - libprotobuf=3.15.2=h780b84a_0
  - libssh2=1.9.0=hab1572f_5
  - libstdcxx-ng=9.3.0=h6de172a_18
  - libtiff=4.2.0=hdc55705_0
  - libuv=1.40.0=h7f98852_0
  - libwebp-base=1.2.0=h7f98852_0
  - llvm-meta=7.0.0=0
  - longshot=0.4.1=h80880c6_0
  - lz4-c=1.9.3=h9c3ff4c_0
  - lzstring=1.0.4=py_1001
  - mappy=2.17=py36h955c1b8_2
  - markdown=3.3.3=pyh9f0ad1d_0
  - markupsafe=1.1.1=py36h8f6f2f9_3
  - matplotlib-base=3.3.2=py36h5ffbc53_0
  - medaka=1.0.3=py36hbecb4b7_1
  - minimap2=2.17=hed695b0_3
  - more-itertools=8.7.0=pyhd8ed1ab_0
  - multiqc=1.9=py_1
  - muscle=3.8.1551=hc9558a2_5
  - nanopolish=0.13.2=he3b7ca5_2
  - ncurses=6.2=h58526e2_4
  - networkx=2.5=py_0
  - nodejs=15.2.1=h914e61d_0
  - numpy=1.16.1=py36h99e49ec_1
  - numpy-base=1.16.1=py36h2f8d375_1
  - olefile=0.46=pyh9f0ad1d_1
  - ont-fast5-api=3.3.0=py_0
  - openmp=7.0.0=h2d50403_0
  - openssl=1.1.1j=h7f98852_0
  - packaging=20.9=pyh44b312d_0
  - pandas=0.25.3=py36hb3f55d8_0
  - parasail-python=1.2.4=py36hd181a71_0
  - perl=5.32.0=h36c2ea0_0
  - pigz=2.5=h27826a3_0
  - pillow=8.1.0=py36ha6010c0_2
  - pip=21.0.1=pyhd8ed1ab_0
  - pluggy=0.13.1=py36h5fab9bb_4
  - progressbar33=2.4=py_0
  - protobuf=3.15.2=py36hc4f0c31_0
  - psutil=5.8.0=py36h8f6f2f9_1
  - py=1.10.0=pyhd3deb0d_0
  - pycparser=2.20=pyh9f0ad1d_2
  - pyfaidx=0.5.9.2=pyh3252c3a_0
  - pyopenssl=20.0.1=pyhd8ed1ab_0
  - pyparsing=2.4.7=pyh9f0ad1d_0
  - pyrsistent=0.17.3=py36h8f6f2f9_2
  - pysam=0.16.0.1=py36h4c34d4e_1
  - pysocks=1.7.1=py36h5fab9bb_3
  - pytest=6.2.2=py36h5fab9bb_0
  - python=3.6.13=hffdb5ce_0_cpython
  - python-dateutil=2.8.1=py_0
  - python-isal=0.5.0=py36h8f6f2f9_0
  - python_abi=3.6=1_cp36m
  - pytz=2021.1=pyhd8ed1ab_0
  - pyvcf=0.6.8=py36h9f0ad1d_1002
  - pyyaml=5.4.1=py36h8f6f2f9_0
  - rampart=1.2.0=0
  - ratelimiter=1.2.0=py_1002
  - readline=8.0=he28a2e2_2
  - requests=2.25.1=pyhd3deb0d_0
  - samtools=1.10=h2e538c0_3
  - scipy=1.5.1=py36h2d22cac_0
  - setuptools=49.6.0=py36h5fab9bb_3
  - simplejson=3.17.2=py36h8f6f2f9_2
  - six=1.15.0=pyh9f0ad1d_0
  - smmap=3.0.5=pyh44b312d_0
  - snakemake-minimal=5.8.1=py_0
  - sortedcontainers=2.3.0=pyhd8ed1ab_0
  - spectra=0.0.11=py_1
  - sqlite=3.34.0=h74cdb3f_0
  - tar=1.34=ha1f6473_0
  - tensorboard=1.14.0=py36_0
  - tensorflow=1.14.0=hc3e5e64_0
  - tensorflow-base=1.14.0=py36hc3e5e64_0
  - tensorflow-estimator=1.14.0=py36h5ca1d4c_0
  - termcolor=1.1.0=py_2
  - tk=8.6.10=h21135ba_1
  - toml=0.10.2=pyhd8ed1ab_0
  - tornado=6.1=py36h8f6f2f9_1
  - tqdm=4.57.0=pyhd8ed1ab_0
  - typing_extensions=3.7.4.3=py_0
  - urllib3=1.26.3=pyhd8ed1ab_0
  - werkzeug=1.0.1=pyh9f0ad1d_0
  - whatshap=0.18=py36h6bb024c_0
  - wheel=0.36.2=pyhd3deb0d_0
  - wrapt=1.12.1=py36h8f6f2f9_3
  - xopen=1.1.0=py36h5fab9bb_1
  - xz=5.2.5=h516909a_1
  - yaml=0.2.5=h516909a_0
  - zipp=3.4.0=py_0
  - zlib=1.2.11=h516909a_1010
  - zstd=1.4.8=ha95c52a_1

Thanks for the help in advance.
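Reading the log above, `bcftools consensus` reports that the VCF expects `CT` at MN908947.3:16175 while the pre-consensus FASTA has `CN` there, i.e. the second base of the deletion's REF allele appears to have been masked to `N`. The sketch below reproduces that comparison on a toy sequence. The function name and the toy data are hypothetical, and the snippet only shows what is being compared; it is not a fix for the pipeline.

```python
def ref_matches(sequence, pos, ref):
    """Check whether `ref` equals the sequence at 1-based position `pos`."""
    start = pos - 1  # VCF positions are 1-based, Python slices are 0-based
    return sequence[start:start + len(ref)].upper() == ref.upper()

# Toy sequences, not real data: position 5 is 'C', position 6 is 'T' or masked.
unmasked = "AAAACTTCAAGG"
masked = "AAAACNTCAAGG"

print(ref_matches(unmasked, 5, "CT"))  # REF allele agrees with the sequence
print(ref_matches(masked, 5, "CT"))    # masked base breaks the match
```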

A fix for new artic's failure, medaka and h5py version not compatible

Hi @will-rowe ,

We have been seeing errors like the one below in the newest artic pipeline:

WARNING: Potential variant VCF contains contig b'MN908947.3' not found in BAM contigs.
error: Error reading potential variants VCF file.
caused by: Error accessing tid from chrom2tid data structure

We also found another unresolved issue reporting the same problem in the artic community: #53

After looking into it, we found that the problem is caused by an incompatible version of the h5py module used by medaka. The h5py version (3.1.0 in our case) is not compatible with medaka 1.0.3 and needs to be changed back to 2.7.1 to fix the issue. Otherwise the CHROM column in the medaka output VCF will be b'MN908947.3', which causes the pipeline to fail.

pip install h5py==2.7.1

Hopefully this helps. The best way to solve it might be to pin the h5py version in the artic environment file.

Regards,
Shaokang
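The `b'MN908947.3'` in the warning looks like the text form of a Python `bytes` value, which would fit a library handing back `bytes` where `str` was expected. The snippet illustrates that effect in plain Python. Whether this is exactly what happens inside medaka is an assumption; the variable names are for illustration only.

```python
contig = b"MN908947.3"  # a contig name held as bytes rather than str

# str() formats the bytes object itself, so the b'...' wrapper ends up in the text.
as_text_wrong = str(contig)

# decode() converts the content to a str, giving the plain contig name.
as_text_right = contig.decode("ascii")

print(as_text_wrong)
print(as_text_right)
```

A contig written the first way no longer matches the name in the BAM header, which is consistent with the "not found in BAM contigs" warning quoted above.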

Unable to install via conda or from source; stalls at setup.

I feel like it must be something really simple, but I just can't get the pipeline to install: I get either conflicting-dependency messages or errors such as the ones below. Sorry to bother!

CONDA Install error
(base) C:\Users\marcy>conda install -c bioconda artic
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: |
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

error from source when installing pysam

Installed c:\users\marcy\miniconda3\lib\site-packages\artic-1.2.1-py3.8.egg
Processing dependencies for artic==1.2.1
Searching for pysam
Reading https://pypi.org/simple/pysam/
Downloading https://files.pythonhosted.org/packages/99/5a/fc440eb5fffb5346e61a38b49991aa552e4b8b31e8493a101d2833ed1e19/pysam-0.16.0.1.tar.gz#sha256=d428a9768691d5ea3c28cc52a949c920ae691aa4c110a8b7328dc4d165ef1ad6
Best match: pysam 0.16.0.1
Processing pysam-0.16.0.1.tar.gz
Writing C:\Users\marcy\AppData\Local\Temp\easy_install-4f2jrqf_\pysam-0.16.0.1\setup.cfg
Running pysam-0.16.0.1\setup.py -q bdist_egg --dist-dir C:\Users\marcy\AppData\Local\Temp\easy_install-4f2jrqf_\pysam-0.16.0.1\egg-dist-tmp-qwc20qzo

pysam: no cython available - using pre-compiled C

pysam: htslib mode is shared

pysam: HTSLIB_CONFIGURE_OPTIONS=None

'.' is not recognized as an internal or external command,
operable program or batch file.
'.' is not recognized as an internal or external command,
operable program or batch file.

pysam: htslib configure options: None

error: [WinError 2] The system cannot find the file specified

guppyplex not found

Probably a stupid error of mine. I installed artic from GitHub a couple of weeks ago and am trying to run the tutorial from https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html on my Ubuntu 16 server, with guppy and ONT software present.

Should I replace:
artic guppyplex --min-length 400 --max-length 700 --directory output_directory/barcode03 --prefix run_name

with some other command?

Is my conda env outdated?

Thanks

artic -h
usage: artic [-h] [-v]
             {extract,basecaller,demultiplex,minion,gather,filter,rampart,run}
             ...

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         Installed Artic version

[sub-commands]:
  {extract,basecaller,demultiplex,minion,gather,filter,rampart,run}
    extract             Create an empty poredb database
    basecaller          Display basecallers in files
    demultiplex         Run demultiplex
    minion              Run demultiplex
    gather              Gather up demultiplexed files
    filter              Filter FASTQ files by length
    rampart             Interactive prompts to start RAMPART
    run                 Process an entire run folder interactively
(artic-ncov2019) u0002316@gbw-s-pacbio01:/data/covid19/multiplex_run8_24_pool5

$ artic -v
artic 1.0.0
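The help output above lists no `guppyplex` sub-command, and `artic -v` reports 1.0.0, so the installed version appears to predate the command used in the SOP. A small sketch that pulls the sub-command list out of such a help text follows; `list_subcommands` is a hypothetical helper, and the parsing assumes the `{a,b,c}` usage format shown above.

```python
import re

def list_subcommands(help_text):
    """Extract sub-command names from the first {a,b,c} group in a help text."""
    match = re.search(r"\{([^}]*)\}", help_text)
    return match.group(1).split(",") if match else []

# Usage lines copied from the help output above.
help_text = """usage: artic [-h] [-v]
             {extract,basecaller,demultiplex,minion,gather,filter,rampart,run}
             ...
"""
commands = list_subcommands(help_text)
print("guppyplex" in commands)
```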

How do I adapt pipeline to add additional primer pairs to the ARTIC scheme (V3)?

I amplified SARS-CoV-2 positive samples with the ARTIC scheme V3.
Additionally I amplified three longer amplicons (2.5 kb) to cover regions of low coverage with the ARTIC scheme.
I tried to create a version of the primer schemes that include the primers for the long amplicons. However, they are filtered out in the align_trim step.

Using customized 'Primer' for ONT sequencing data

Hi,

I have a question about adapting the pipeline to use a customised primer scheme.
In the 'Run the MinION pipeline' step, what kind of data do I need to provide for my customised primers in order to replace the 'nCoV-2019/V3' parameter?

Thank you in advance,
Chantisa

Nanopolish version doesn't exist on MacOS

We currently require nanopolish=0.12.5 but on MacOS the most recently available version is 0.11.3.

Unsure what the best solution is:

Mandate that the repo is Linux only
Allow 0.11.3 (nanopolish=0.11.3|0.12.5)
Update the available MacOS version to 0.12.5 (in which case I'll make an issue there)

artic_plot_amplicon_depth crash in batch

I'm loving the amplicon coverage plots, great for debugging sequencing and assemblies.

However, I get this error when I run the pipeline in a non-interactive shell:

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Is this an issue with port forwarding?
Can I block the call to artic_plot_amplicon_depth?
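The error says Qt cannot start its "xcb" (X11) platform plugin, which is typical when no display is available, as in a batch job. One possible workaround, sketched under assumptions, is to run the pipeline with environment variables that request non-interactive rendering. `QT_QPA_PLATFORM=offscreen` is a Qt setting and `MPLBACKEND=Agg` is a Matplotlib setting; whether the plotting step honours them has not been verified here, and `headless_env` is a hypothetical helper.

```python
import os

def headless_env(base=None):
    """Return a copy of the environment with display-free plotting settings."""
    env = dict(os.environ if base is None else base)
    env["QT_QPA_PLATFORM"] = "offscreen"  # Qt: do not try to open an X11 window
    env["MPLBACKEND"] = "Agg"             # Matplotlib: file-only raster backend
    return env

# Demonstration on a small made-up environment.
env = headless_env({"PATH": "/usr/bin"})
print(env["QT_QPA_PLATFORM"], env["MPLBACKEND"])
```

The returned dict can be passed as the `env` argument of `subprocess.run` when launching the pipeline command from a batch script.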

artic_plot_amplicon_depth cmd missing

Hi,
I had issues with artic_plot_amplicon_depth (AssertionError: error: readgroup not found in provided primer scheme (1)), I tried to get the new primer scheme with artic-tools. But I had the same error.

I was using artic 1.1.3, so I updated to artic 1.2.1, and now I can't find the plotting cmd anymore!

Is the artic_plot_amplicon_depth deprecated in the new artic version? Is there a new cmd to plot amplicons coverage?

Thank you in advance for the help,

Best
E.

Rampart & Barcoding in MinKNOW

We are about to start our 1st nCov run using the ARTIC protocol. We plan to use RAMPART to monitor the run. Do we need to have barcoding on or off in MinKNOW? If we have it on the fastq_pass folder will have sub folders for barcode01 barcode02 etc - will that still work with RAMPART?

Thanks!

Option to change minimum variant frequencies which pass filter

I am working with wastewater samples and would like to identify lower frequency variants, but the code which calls the nanopolish variant command is in the -artic=1.1.3 dependency of the environment and I'm not sure how to fork or edit that myself. Is this possible?
I found the line (170) in minion.py in the fieldbioinformatics repository (the -m flag is what I would like to change), but I'm not sure whether that is compatible or up to date with the ncov2019 repository.
Thanks!

primer list not found

The link to get the primer list for SARS-CoV-2 does not work:
https://github.com/artic-network/artic-ncov2019/tree/master/primer_schemes/nCoV-2019/V3

nanopolish index still running after 24 hours

Hi all,

We are sequencing MinION flowcells using the ARTIC amplicons and are beginning to test the ARTIC workflow as described on this page:

ARTIC SOP 2019

Our data are sequenced as a pool of 96 samples (including some negative and positive controls), basecalled with Guppy, and index split using the approach outlined in the SOP above.

Likely as expected, the guppyplex command is efficient and runs in just a minute or so on each barcode. This gives us 96 filtered fastq files with read counts ranging from 0 to 60,000.

When I run the minion command like this (one job per barcode):
artic minion --normalise 200 --threads 12 --scheme-directory ~/bin/artic-ncov2019/primer_schemes/ --read-file plex_FAP90847_AAGGTTACACAAACCCTGGACAAG_pass_concat.fastq --fast5-directory fast5 --sequencing-summary sequencing_summary.txt nCoV-2019/V3 plex_FAP90847_AAGGTTACACAAACCCTGGACAAG_pass_concat

It fires up the jobs without error, but after the first few minutes the output doesn't seem to be updated, though the jobs don't crash or finish. I can see that the nanopolish index command is running and registering on the CPUs of my server, and there doesn't appear to be any iowait on the nodes, suggesting there isn't a network bottleneck slowing things down.

Within the first couple of minutes of the nanopolish index commands firing up I get files like these:

-rw-r--r-- 1 rcorbett users    0 Mar 13 13:08 plex_FAP90847_ACCACTGCCATGTATCAAAGTACG_pass_concat.minion.log.txt
-rw-r--r-- 1 rcorbett users 7954 Mar 13 13:08 plex_FAP90847_ACCACTGCCATGTATCAAAGTACG_pass_concat.fastq.index
-rw-r--r-- 1 rcorbett users    7 Mar 13 13:08 plex_FAP90847_ACCACTGCCATGTATCAAAGTACG_pass_concat.fastq.index.gzi
drwxr-sr-x 2 rcorbett users    4 Mar 13 13:08 .
-rw-r--r-- 1 rcorbett users 2962 Mar 13 13:08 plex_FAP90847_ACCACTGCCATGTATCAAAGTACG_pass_concat.fastq.index.fai

The jobs have run for another 24 hours without any more output going to the files. However, I see that nanopolish index has been running on my nodes and using CPU throughout.

Do you have any ideas how I should look into what is slowing this down? I have samples that have been analyzing for over 24 hours. Is this expected? None of the samples that have gotten on the cluster have finished.

I have used nanopolish index on entire PromethION flowcells outside of the ARTIC workflow and it runs in a few hours, so I expected to be able to index the MinION data much faster. Perhaps running multiple samples in parallel is known to create issues with nanopolish index?

thanks for all your insight!

nanopolish index - warning: detected invalid summary file entries

Hi,
I am working with a run that has 160 fast5 files. While running
artic minion --normalise 200 --threads 4 --scheme-directory ~/artic-ncov2019/primer_schemes --read-file run3_barcode11.fastq --fast5-directory /2020/May_2020/Run3_fast5 --sequencing-summary /2020/May_2020/Run3_fast5/sequencing_summary.txt nCoV-2019/V3 sample-name
I get the following warning
warning: detected invalid summary file entries for 36 of 82 fast5 files
These files will be indexed without using the summary file, which is slow.
I have two questions:

That folder actually contains 160 fast5 files. Why are only 82 files being indexed?
Will the indexing fail if I get this warning?
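
One way to see which files are involved is to compare the fast5 file names on disk with the names listed in the sequencing summary. The following is only a minimal sketch: the function name is hypothetical, and the summary column name ("filename") is an assumption that may differ between basecaller versions, so check the header of your own sequencing_summary.txt. It does not reproduce what nanopolish index does internally.

```python
import csv
from pathlib import Path


def compare_fast5_to_summary(fast5_dir, summary_path, column="filename"):
    """Return (on_disk_only, summary_only) sets of fast5 file names.

    `column` is an assumption: it is the summary column expected to hold
    the fast5 file name, and may be named differently in your file.
    """
    # Collect fast5 file names recursively; only base names are compared.
    on_disk = {p.name for p in Path(fast5_dir).rglob("*.fast5")}

    in_summary = set()
    with open(summary_path, newline="") as handle:
        # The sequencing summary is assumed to be tab-separated with a header row.
        for row in csv.DictReader(handle, delimiter="\t"):
            name = row.get(column)
            if name:
                in_summary.add(Path(name).name)

    # Files never mentioned in the summary, and summary entries with no file.
    return on_disk - in_summary, in_summary - on_disk
```

Called with the --fast5-directory and --sequencing-summary paths from the command above, it returns the files present on disk but absent from the summary, and the reverse.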

--skip-nanopolish and --dry-run

I am looking to get the pipeline up and running before we sequence data. I have downloaded some publicly available fastqs to try it out.
The medaka part works fine; however, for artic minion I cannot provide the required fast5 files, as I have not found a dataset.

So I tried a --dry-run like so:

artic minion --normalise 200 --threads 4 --scheme-directory ~/primer_schemes/ --read-file ~/reads/$sample.fastq --dry-run nCoV-2019/V3 $sample

Output: "Must specify FAST5 directory and sequencing summary for nanopolish mode."

Is it correct that artic minion always needs fast5 files? When I tried --skip-nanopolish, I got the same output as above.

Are those two non-functioning flags or am I doing something wrong?
Thanks for advising!

Test Datasets

I just posted this same question to the fieldbioinformatics issues board as well, not sure which place is better to query!

Hello,

I was wondering if you knew of any good publicly available datasets for the V3 ARTIC tiling amplicon sequencing of hCoV-19. Ideally I would love to have a test dataset showing each of the variants of concern (UK, South Africa, and Brazil) along with the Wuhan strain.

I've tried looking through GISAID and SRA, but as far as I can tell GISAID only supplies the preassembled genomes in a strange format and I need the raw fast5 or fastq files. And SRA is just very challenging to search to get exactly the type of library/sequence set you need that has enough metadata to inform the analysis.

I apologize if there are datasets somewhere already, but I'm somewhat frantically trying to figure out how to do this ARTIC analysis before teaching a course on it that begins on Monday. We sequenced synthetic saliva that contained RNA for the Wuhan strain and three variants of concern, but for some reason the variant calling is coming up with nothing and I'm struggling to figure out if it's our data or if it's something going wrong in the pipeline.

I would sincerely appreciate any help or a pointer to appropriate public datasets that have done the ARTIC V3 tiling amplicon approach and have worked with the standard SOP bioinformatics pipeline.

Sincerely,
Katie

bcftools overlapping variants issue

Hey there.

Getting an error on the last step in the medaka workflow

bcftools consensus -f PQPR059970.preconsensus.fasta PQPR059970.pass.vcf.gz -m PQPR059970.coverage_mask.txt -o PQPR059970.consensus.fasta

It gives me the following

Note: the --sample option not given, applying all records regardless of the genotype
The fasta sequence does not match the REF allele at MN908947.3:6312:
   .vcf: [C]
   .vcf: [A] <- (ALT)
   .fa:  [N]AAAACCAGTTGAAACATCAAAT.......rest of the reference........

This seems to relate to these 2 issues from bcftools
samtools/bcftools#961
samtools/bcftools#600

These were fixed in the latest GitHub builds rather than in the 1.9 release.

I'm in the middle of compiling bcftools from source to give this a test to see if it fixes it.

I'll let you know how that goes.

Cheers
James
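
The error above reports that the FASTA holds an N at a position where the VCF expects REF allele C. As a rough way to list every such position before running bcftools consensus, here is a minimal sketch. The function name is hypothetical, and it assumes the preconsensus sequence has already been read into a string and the VCF records reduced to (position, REF) pairs; parsing the files is not shown. It only reports mismatches and does not work around the bcftools behaviour.

```python
def ref_mismatches(sequence, records):
    """Return (pos, vcf_ref, fasta_bases) for records whose REF differs from the FASTA.

    sequence: the FASTA sequence for one contig, as a plain string.
    records:  iterable of (pos, ref) pairs, with pos 1-based as in VCF.
    """
    mismatches = []
    for pos, ref in records:
        # VCF positions are 1-based; Python slices are 0-based.
        start = pos - 1
        fasta_bases = sequence[start:start + len(ref)]
        # Compare case-insensitively, since FASTA files may be soft-masked.
        if fasta_bases.upper() != ref.upper():
            mismatches.append((pos, ref, fasta_bases))
    return mismatches
```

Any position returned with fasta_bases equal to "N" corresponds to the kind of mismatch shown in the message above.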

Missing license in git repository

I'm not sure if it is missing for a reason, but the SOP suggests a Creative Commons Attribution 4.0 International License. If that is the same license for the git repository, it would be nice to have that in the repository. If it's helpful, I made a pull request #8 with the CC-BY-4.0 legal text.

Add reference.gff for use by ivar

A GFF file containing the open reading frames (ORFs) has not been provided. Amino acid translation will not be done.

Having that file ready in the SARS-CoV-2 ARTIC folder would be handy.

Invalid BED file column 5

Column 5 of the following BED files is not valid: nCoV-2019.scheme.bed and nCoV-2019.bed. Column 5 is supposed to be an integer score in the interval [0,1000]. As a result, operations such as converting the file to BigBed format (using bedToBigBed) for visualisation in genome browsers fail for these files.
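
The constraint described here can be checked with a few lines of Python. This is a minimal sketch with a hypothetical function name, assuming tab-separated BED lines; it reports offending lines and does not decide what the replacement score should be.

```python
import re


def invalid_bed_scores(lines):
    """Return (line_number, value) pairs whose column 5 is not an integer in [0, 1000]."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and the header-style lines that BED files may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # this line has no score column to check
        score = fields[4]
        # Accept only plain ASCII digits, then check the numeric range.
        if not (re.fullmatch(r"[0-9]+", score) and int(score) <= 1000):
            problems.append((number, score))
    return problems
```

Passing an open file handle for one of the scheme BED files returns the line numbers and values that a strict BED parser would be expected to reject.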

Medaka enabled pipeline exits with longshot failure

Hi Will/Nick!

I'm having an issue similar to this one.

I recently updated artic-ncov2019 from v.1.0.0 to v.1.1.3. Now when I run:

artic minion --medaka --normalise 200 --threads 72 --scheme-directory ~/artic-ncov2019/primer_schemes --read-file simulated_reads.tgz nCoV-2019/V3 test

The pipeline exits with the error:

Command failed:longshot -P 0 -F -A --no_haps --bam test.primertrimmed.rg.sorted.bam --ref ~/artic-ncov2019/primer_schemes/nCoV-2019/V3/nCoV-2019.reference.fasta --out test.longshot.vcf --potential_variants test.merged.vcf.gz

There also seems to be an upstream error:

WARNING: Potential variant VCF contains contig b'MN908947.3' not found in BAM contigs.
error: Error reading potential variants VCF file.
caused by: Error accessing tid from chrom2tid data structure

My env is loaded with longshot=0.4.1, which was the dependency for artic-ncov2019/environment-medaka.yml in v.1.0.0. However, I see that several medaka-related dependencies are no longer listed in the new artic-ncov2019/environment.yml. I'm unsure how to troubleshoot this one further. I have attached the full output log and package list:
packages.txt
log.txt

RAMPART version 1.0

Hey,

RAMPART version 1.0 can't do remote browser.
Can we get the yml file updated to run version 1.1.0?
I've got RAMPART running in its own env from that repo, running on the protocols from this repo, and it's working really well. I just removed the RAMPART line from the setup yml file.

Cheers,

reference.fasta

I'm running into this issue for the first time and I'm not sure how to address it:

[FATAL] Unable to find required file, references.fasta, for pipeline, 'Annotate reads'

Any suggestions?

ambiguity sequences in assembly

I know artic can generate "N" as an ambiguity code. But will the artic pipeline generate other IUPAC ambiguity codes in the assembly? I guess the answer is no, right? Thanks for any help.
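
Whatever the pipeline's behaviour, a given consensus can be checked directly. A minimal sketch, with a hypothetical function name, that counts N and the other IUPAC nucleotide ambiguity codes in FASTA text; it says nothing about how or why such codes were produced.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes other than N.
IUPAC_AMBIGUITY = set("RYSWKMBDHV")


def count_ambiguity_codes(fasta_text):
    """Count N and other IUPAC ambiguity codes in the sequence lines of FASTA text."""
    counts = Counter()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue  # header line, not sequence
        for base in line.strip().upper():
            if base == "N" or base in IUPAC_AMBIGUITY:
                counts[base] += 1
    return dict(counts)
```

If the returned dictionary has no key other than "N", the consensus contains no other ambiguity codes.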

NameError: name 'barcode_directory' is not defined

Hi,
I tried the new version of the pipeline on SP1 data, as I saw there had been some updates.
However, the first command for artic guppyplex:
artic guppyplex --min-length 400 --max-length 700 --directory basecalling/ --prefix SP1
fails with the following error:

Traceback (most recent call last):
  File "/home/simone/miniconda3/envs/artic-ncov2019/bin/artic", line 8, in <module>
    sys.exit(main())
  File "/home/simone/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/pipeline.py", line 216, in main
    args.func(parser, args)
  File "/home/simone/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/pipeline.py", line 35, in run_subtool
    submodule.run(parser, args)
  File "/home/simone/miniconda3/envs/artic-ncov2019/lib/python3.6/site-packages/artic/guppyplex.py", line 36, in run
    fastq_outfn = "%s_%s.fastq" % (args.prefix, os.path.basename(barcode_directory))
NameError: name 'barcode_directory' is not defined

Are there any issues in the command I am using? Thanks in advance,
Simone

Primer names (alt suffix)

Hello
I'm a little confused about the difference between primers that share the same number but carry the "alt" suffix. Should I use these alt primers instead of the original ones (nCoV-2019_89_LEFT_alt2 instead of nCoV-2019_89_LEFT, for example)?

Thank you

Leonardo
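
To see which amplicons have alt primers in a scheme, the primer names can be grouped by amplicon number. This is a minimal sketch with hypothetical function names; the naming pattern is inferred only from the two names quoted above and is an assumption about the rest of the scheme. It lists the primers and does not say which ones should be ordered or pooled.

```python
import re
from collections import defaultdict

# Pattern inferred from names such as nCoV-2019_89_LEFT and nCoV-2019_89_LEFT_alt2.
PRIMER_NAME = re.compile(
    r"^(?P<scheme>.+)_(?P<amplicon>\d+)_(?P<side>LEFT|RIGHT)(?:_(?P<alt>alt\d*))?$"
)


def parse_primer_name(name):
    """Split a primer name into (scheme, amplicon, side, alt); alt is None if absent."""
    match = PRIMER_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised primer name: {name}")
    return (
        match.group("scheme"),
        int(match.group("amplicon")),
        match.group("side"),
        match.group("alt"),
    )


def amplicons_with_alts(names):
    """Map amplicon number to the sorted alt primer names found for it."""
    alts = defaultdict(list)
    for name in names:
        _, amplicon, _, alt = parse_primer_name(name)
        if alt is not None:
            alts[amplicon].append(name)
    return {amplicon: sorted(found) for amplicon, found in alts.items()}
```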