
Comments (5)

MikkelSchubert avatar MikkelSchubert commented on August 23, 2024 1

Hi,

I'm afraid that I'll have to ask you to clarify what you mean by internal barcodes in this context, as I am a bit rusty on the terminology.

Cheers

from adapterremoval.

jfy133 avatar jfy133 commented on August 23, 2024 1

Hi Mikkel,

To be able to measure barcode hopping on some machines, people have started ligating very short (~6-7bp) 'barcodes' directly onto the extracted DNA molecules, prior to adapter+index ligation.


Figure 1 of https://www.biorxiv.org/content/10.1101/179028v3.full.pdf

So in principle this request would involve:

  1. the initial removal of adapters,
  2. a new second pass of removal, to remove a second user-specified sequence.

As far as I know people typically only use a single barcode per sample out of a pool of maybe 12 barcodes. I guess if a user specifies these as a list (like with --adapter-list), this would be sufficient.

I guess in principle one could use the --identify-adapters functionality, but this doesn't actually do the trimming. The user should also already know the actual barcode, so for 'precision' it would make sense to let them specify it explicitly.

Let me know if this is not clear...

Edit: to clarify, as the barcodes are sample-specific, you would have to allow the user to specify them as a list of possible barcodes in pipeline contexts (such as eager).


apeltzer avatar apeltzer commented on August 23, 2024

Hi Mikkel!

I've asked the requester(s) to provide some insights for this :-)


MikkelSchubert avatar MikkelSchubert commented on August 23, 2024

Thank you for the detailed explanation!

Unless I am misunderstanding something, barcodes of this type are already supported via the demultiplexing functionality. This is enabled when the user provides a table of sample names and barcodes with the --barcode-list option, such as this one:

sample_1 ATGCGGA TGAATCT
sample_2 ATGGATT ATAGTGA
sample_7 CAAAACT TCGCTGC

The first column is used in output filenames, the second specifies the P7 barcode, and the third (optional) column specifies the P5 barcode. AdapterRemoval uses the barcodes to map reads/read pairs to samples, at which point the barcodes are removed from the 5' end of each read. After that, adapter trimming is carried out using per-sample query sequences generated by merging the opposing barcode with the adapter sequence, so that both are trimmed from the reads.
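For readers who want to see the logic spelled out, here is a rough Python sketch of the steps described above. It is not AdapterRemoval's actual implementation. All function names are hypothetical, barcodes are matched exactly (no mismatches allowed), and it assumes that "merging" means prepending the reverse complement of the opposing barcode to the adapter sequence.

```python
from typing import List, NamedTuple, Optional, Tuple


class Sample(NamedTuple):
    name: str                  # first column, used in output filenames
    barcode_1: str             # second column, expected at the 5' end of read 1
    barcode_2: Optional[str]   # optional third column, expected at the 5' end of read 2


def parse_barcode_table(text: str) -> List[Sample]:
    """Parse a whitespace-separated table of sample names and barcodes."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        barcode_2 = fields[2] if len(fields) > 2 else None
        samples.append(Sample(fields[0], fields[1], barcode_2))
    return samples


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def demultiplex_pair(
    samples: List[Sample], read_1: str, read_2: str
) -> Optional[Tuple[str, str, str]]:
    """Assign a read pair to a sample and strip the barcodes from the 5' ends.

    Returns (sample name, trimmed read 1, trimmed read 2), or None if no
    sample matches. Exact matching only; this is a simplification.
    """
    for sample in samples:
        if not read_1.startswith(sample.barcode_1):
            continue
        trimmed_1 = read_1[len(sample.barcode_1):]
        if sample.barcode_2 is None:
            return sample.name, trimmed_1, read_2
        if read_2.startswith(sample.barcode_2):
            return sample.name, trimmed_1, read_2[len(sample.barcode_2):]
    return None


def adapter_queries(sample: Sample, adapter_1: str, adapter_2: str) -> Tuple[str, str]:
    """Build per-sample adapter query sequences.

    Assumption: a read that runs past the insert first reads through the
    reverse complement of the barcode on the opposite end, then the adapter.
    """
    prefix_1 = reverse_complement(sample.barcode_2) if sample.barcode_2 else ""
    prefix_2 = reverse_complement(sample.barcode_1)
    return prefix_1 + adapter_1, prefix_2 + adapter_2


if __name__ == "__main__":
    table = "sample_1 ATGCGGA TGAATCT\nsample_2 ATGGATT ATAGTGA\nsample_7 CAAAACT TCGCTGC\n"
    samples = parse_barcode_table(table)
    print(demultiplex_pair(samples, "ATGGATTACGTACGT", "ATAGTGATTTTGGGG"))
    print(adapter_queries(samples[0], "AGATCGGAAGAGC", "AGATCGGAAGAGC"))
```

The adapter sequence passed in the example is only a placeholder prefix for illustration. The combined sequences that AdapterRemoval actually uses are the ones listed in the per-sample settings files.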

There's a small example in the examples folder that you can run with

AdapterRemoval --file1 demux_1.fq --file2 demux_2.fq --basename output_demux --barcode-list barcodes.txt

It is also possible to just do the demultiplexing, if you want to do adapter trimming with a different trimmer. The combined barcode+adapter sequences are listed in the resulting settings files for each sample.

If I recall correctly, then it is currently possible to demultiplex using P7 barcodes or using P7 + P5 barcodes, but not P5 barcodes by themselves.

See here for more information:
https://adapterremoval.readthedocs.io/en/latest/examples.html#demultiplexing-and-adapter-trimming


jfy133 avatar jfy133 commented on August 23, 2024

Hi @MikkelSchubert,

OK, that does indeed sound like a possibility! I will investigate and see if we can get it to work as expected by the people who requested it; otherwise we will come back to you.

Cheers,

