moonwatcher commented on May 28, 2024

Hi Adrian

Thank you for considering Pheniqs for your project, and sorry for the late reply; I was on bonding leave with my firstborn.

Your config file had a few mistakes, but I think the main issue is that the sample decoder defaults to the passthrough algorithm (it just spits the reads back out as they came in, which is useful for quickly repackaging reads from one layout to another). If you specify "algorithm": "pamld" you will get the expected output.

Your transform directive was malformed: "transform": ["0:1:5"] should be "transform": { "token": [ "0:1:5" ] }. The validator actually did the right thing here and gave the following error message:

JSON directive validation error
Error description: Expected type object but actual is array
Path in document: /sample/transform
Document URL: testSample.json?format=json
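
The correction described above can be sketched in a few lines of Python. This is only an illustration: the helper name wrap_sample_transform is hypothetical and not part of Pheniqs, and it handles only the one mistake discussed here (a bare list where an object with a "token" key is expected).

```python
import copy
import json


def wrap_sample_transform(config):
    """Return a copy of config whose sample transform is an object.

    Hypothetical helper: if "sample" -> "transform" is a bare list of
    token patterns, wrap it as {"token": [...]}. Anything else is left
    untouched, and the input dictionary is never modified.
    """
    fixed = copy.deepcopy(config)
    sample = fixed.get("sample", {})
    transform = sample.get("transform")
    if isinstance(transform, list):
        sample["transform"] = {"token": transform}
    return fixed


# The malformed directive from the original config.
broken = {"input": ["test.fastq"], "sample": {"transform": ["0:1:5"]}}
print(json.dumps(wrap_sample_transform(broken), indent=4))
```

The validator remains the authority on what is well formed; a helper like this only saves retyping once the error message has pointed at /sample/transform.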

You also did not specify an output template directive, so the entire read will be emitted in the output. You probably wanted to trim the decoded barcode from the output, in which case you should add something like "template": { "transform": { "token": [ "0:1:5" ] } }.
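
To make the token pattern concrete, here is a minimal sketch of what "0:1:5" selects. It assumes the pattern reads as segment:start:end with Python-style half-open slicing, which is consistent with the token length of 4 that Pheniqs reports for this pattern. The function apply_token and the example read are made up for illustration and are not Pheniqs code.

```python
def apply_token(pattern, segments):
    """Extract the bases a token pattern selects from the input segments.

    Assumption: pattern is "segment:start:end", where start and end
    behave like a Python slice (start inclusive, end exclusive) and an
    empty field means "from the beginning" or "to the end".
    """
    parts = pattern.split(":")
    parts += [""] * (3 - len(parts))  # tolerate omitted trailing fields
    segment_index = int(parts[0])
    start = int(parts[1]) if parts[1] else None
    end = int(parts[2]) if parts[2] else None
    return segments[segment_index][start:end]


# A made-up 8 nt read: one leading base, a 4 nt barcode, then the rest.
read = ["GAAAATTT"]
print(apply_token("0:1:5", read))  # AAAA
```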

Notice that if you execute with --validate, pheniqs will tell you exactly what it plans to do, which often helps when debugging configuration "misunderstandings". For instance, after correcting your config to

{
    "input": [
        "test.fastq"
    ],
    "sample": {
        "transform": { "token": [ "0:1:5" ] },
        "codec": {
            "@FirstCode": { "barcode": [ "AAAA" ] },
            "@SecondCode": { "barcode": [ "CCCC" ] }
        }
    },
    "template": { "transform": { "token": [ "0:1:5" ] } }
}

executing pheniqs mux --config testSample.json --validate returns this report, in which you can see that the sample decoding algorithm is still passthrough.

Environment

    Base input URL                              /Users/lg/Desktop/ticket 34/my-example
    Base output URL                             /Users/lg/Desktop/ticket 34/my-example
    Platform                                    ILLUMINA
    Quality tracking                            disabled
    Filter incoming QC failed reads             disabled
    Filter outgoing QC failed reads             disabled
    Input Phred offset                          33
    Output Phred offset                         33
    Leading segment index                       0
    Default output format                       sam
    Default output compression                  unknown
    Default output compression level            5
    Feed buffer capacity                        2048
    Threads                                     8
    Decoding threads                            1
    HTSLib threads                              8

Input

    Input segment cardinality                   1

    Input segment No.0 : /Users/lg/Desktop/ticket 34/my-example/test.fastq?format=fastq

    Input feed No.0
        Type : fastq
        Compression : unknown
        Resolution : 1
        Phred offset : 33
        Platform : ILLUMINA
        Buffer capacity : 2048
        URL : /Users/lg/Desktop/ticket 34/my-example/test.fastq?format=fastq

Output transform

    Output segment cardinality                  1

    Token No.0
        Length        4
        Pattern       0:1:5
        Description   cycles 1 to 5 of input segment 0

    Assembly instruction
        Append token 0 of input segment 0 to output segment 0

Sample decoding

    Decoding algorithm                          passthrough
    Shannon bound                               1
    Segment cardinality                         1
    Nucleotide cardinality                      4

    Transform

        Token No.0
            Length        4
            Pattern       0:1:5
            Description   cycles 1 to 5 of input segment 0

        Assembly instruction
            Append token 0 of input segment 0 to output segment 0


    Barcode undetermined
        ID : undetermined
        PU : undetermined
        Segment No.0  : /dev/stdout?format=sam&compression=none

    Barcode @FirstCode
        ID : AAAA
        PU : AAAA
        Concentration : 0.495
        Barcode       : AAAA
        Segment No.0  : /dev/stdout?format=sam&compression=none

    Barcode @SecondCode
        ID : CCCC
        PU : CCCC
        Concentration : 0.495
        Barcode       : CCCC
        Segment No.0  : /dev/stdout?format=sam&compression=none

    Output feed No.0
        Type : sam
        Resolution : 1
        Phred offset : 33
        Platform : ILLUMINA
        Buffer capacity : 2048
        URL : /dev/stdout?format=sam&compression=none
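
If you want to check this automatically rather than by eye, a small script can pull the algorithm out of the report text. This is a sketch under assumptions: it takes the report as a string you have already captured, it relies on the "Decoding algorithm" line looking exactly as in the report above, and it returns only the first such line it finds. The function name decoding_algorithm is hypothetical.

```python
import re


def decoding_algorithm(report_text):
    """Return the value on the first "Decoding algorithm" line, or None.

    Assumes the plain-text layout shown in the validation report:
    the label, some padding spaces, then a single word.
    """
    match = re.search(
        r"^\s*Decoding algorithm\s+(\S+)\s*$", report_text, re.MULTILINE
    )
    return match.group(1) if match else None


# A few lines copied from the report above.
excerpt = """Sample decoding

    Decoding algorithm                          passthrough
    Shannon bound                               1
    Segment cardinality                         1
"""

algorithm = decoding_algorithm(excerpt)
if algorithm == "passthrough":
    print("sample decoder is passthrough; reads will not be classified")
```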

So the final config I suggest is

{
    "input": [
        "test.fastq"
    ],
    "sample": {
        "algorithm": "pamld",
        "transform": { "token": [ "0:1:5" ] },
        "codec": {
            "@FirstCode": { "barcode": [ "AAAA" ] },
            "@SecondCode": { "barcode": [ "CCCC" ] }
        }
    },
    "template": { "transform": { "token": [ "0:1:5" ] } }
}

which yields the expected output:

@HD	VN:1.0	SO:unknown	GO:query
@RG	ID:undetermined	PU:undetermined
@RG	ID:AAAA	BC:AAAA	PU:AAAA
@RG	ID:CCCC	BC:CCCC	PU:CCCC
@PG	ID:pheniqs	PN:pheniqs	CL:pheniqs mux --config testSample.json	VN:2.1.0-37-g684f02b7b3bfaec7040337884b7f13ed6eb3fd58
IDENAIFIER	76	*	0	0	*	*	0	0	AAAA	CCCC	RG:Z:AAAA	BC:Z:AAAA	QT:Z:CCCC	XB:f:7.90337e-05
IDENAIFIER	76	*	0	0	*	*	0	0	CCCC	CCCC	RG:Z:CCCC	BC:Z:CCCC	QT:Z:CCCC	XB:f:7.90337e-05
{
    "incoming": {
        "count": 2,
        "pf count": 2,
        "pf fraction": 1.0
    },
    "outgoing": {
        "count": 2,
        "pf count": 2,
        "pf fraction": 1.0
    },
    "sample": {
        "average classified confidence": 0.999920966315065,
        "average pf classified confidence": 0.999920966315065,
        "classified": [
            {
                "BC": "AAAA",
                "ID": "AAAA",
                "PU": "AAAA",
                "average confidence": 0.999920966315065,
                "average pf confidence": 0.999920966315065,
                "barcode": [
                    "AAAA"
                ],
                "concentration": 0.495,
                "count": 1,
                "estimated concentration": 0.5,
                "index": 1,
                "pf count": 1,
                "pf fraction": 1.0,
                "pf pooled classified fraction": 0.5,
                "pf pooled fraction": 0.5,
                "pooled classified fraction": 0.5,
                "pooled fraction": 0.5
            },
            {
                "BC": "CCCC",
                "ID": "CCCC",
                "PU": "CCCC",
                "average confidence": 0.999920966315065,
                "average pf confidence": 0.999920966315065,
                "barcode": [
                    "CCCC"
                ],
                "concentration": 0.495,
                "count": 1,
                "estimated concentration": 0.5,
                "index": 2,
                "pf count": 1,
                "pf fraction": 1.0,
                "pf pooled classified fraction": 0.5,
                "pf pooled fraction": 0.5,
                "pooled classified fraction": 0.5,
                "pooled fraction": 0.5
            }
        ],
        "classified count": 2,
        "classified fraction": 1.0,
        "classified pf fraction": 1.0,
        "count": 2,
        "index": 0,
        "pf classified count": 2,
        "pf classified fraction": 1.0,
        "pf count": 2,
        "pf fraction": 1.0,
        "unclassified": {
            "ID": "undetermined",
            "PU": "undetermined",
            "count": 0,
            "index": 0,
            "pf count": 0,
            "pf fraction": 0.0,
            "pf pooled fraction": 0.0,
            "pooled fraction": 0.0
        }
    }
}
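
The JSON report is easy to summarize with a script. The sketch below assumes the report has been captured separately from the SAM records and uses a trimmed copy of the report above, keeping only the fields it reads. The function name classified_counts is hypothetical.

```python
import json


def classified_counts(report):
    """Map each classified barcode ID to its read count.

    The unclassified reads are added under their own ID
    ("undetermined" in the report above).
    """
    sample = report["sample"]
    counts = {entry["ID"]: entry["count"] for entry in sample["classified"]}
    unclassified = sample["unclassified"]
    counts[unclassified["ID"]] = unclassified["count"]
    return counts


# Trimmed copy of the report above, limited to the fields used here.
report_text = """
{
    "sample": {
        "classified": [
            { "ID": "AAAA", "count": 1, "average confidence": 0.999920966315065 },
            { "ID": "CCCC", "count": 1, "average confidence": 0.999920966315065 }
        ],
        "classified count": 2,
        "unclassified": { "ID": "undetermined", "count": 0 }
    }
}
"""

report = json.loads(report_text)
for barcode_id, count in classified_counts(report).items():
    print(barcode_id, count)
```

With the two-read test file, each barcode receives one read and nothing is left undetermined, matching "classified count": 2 in the report.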

Please let us know if there are any other issues you are experiencing.

Regards,

L.
