Hi Adrian
Thank you for considering Pheniqs for your project, and sorry for the late reply; I was on bonding leave with my firstborn.
Your config file had a few mistakes, but I think the main issue is that the sample decoder defaults to the passthrough algorithm, which just emits the reads as they came in and is used for quickly repackaging reads from one layout to another. If you specify "algorithm": "pamld" you will get the expected output.
Your transform directive was malformed: "transform": ["0:1:5"] should be "transform": { "token": [ "0:1:5" ] }. The validator actually did the right thing here and gave the following error message:
JSON directive validation error
Error description: Expected type object but actual is array
Path in document: /sample/transform
Document URL: testSample.json?format=json
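To make the two problems above concrete, here is a minimal Python sketch of a pre-flight check you could run on a config before handing it to Pheniqs. It is not part of Pheniqs and does not reproduce its validator; the function name `check_config` and the message wording are made up for illustration. It only checks the two things discussed here: that a transform directive is an object, and that the sample decoder names an algorithm.

```python
import json

# Map Python types to the JSON type names used in the validator's message.
JSON_TYPE = {dict: "object", list: "array", str: "string"}


def check_config(config):
    """Return a list of problems found in a Pheniqs-style config dict.

    Hypothetical helper: it covers only the two mistakes discussed above.
    """
    problems = []
    for section in ("sample", "template"):
        block = config.get(section)
        if not isinstance(block, dict):
            continue
        transform = block.get("transform")
        if transform is not None and not isinstance(transform, dict):
            actual = JSON_TYPE.get(type(transform), type(transform).__name__)
            problems.append(
                f"/{section}/transform: expected object but actual is {actual}"
            )
    sample = config.get("sample")
    if isinstance(sample, dict) and "algorithm" not in sample:
        problems.append("/sample/algorithm: not set, decoder will pass reads through")
    return problems


malformed = json.loads('{"sample": {"transform": ["0:1:5"]}}')
for problem in check_config(malformed):
    print(problem)
```

Running it on the malformed fragment reports both the array-instead-of-object transform and the missing algorithm; a config with "algorithm": "pamld" and an object-valued transform produces no messages.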
You also did not specify an output template directive, so the entire read will be emitted on output. You probably wanted to trim the actual decoded barcode from the output, in which case you should add something like "template": { "transform": { "token": [ "0:1:5" ] } }.
Notice that if you execute with --validate, Pheniqs will tell you exactly what it plans to do, which often helps when debugging configuration "misunderstandings". For instance, after correcting your config to
{
    "input": [
        "test.fastq"
    ],
    "sample": {
        "transform": { "token": [ "0:1:5" ] },
        "codec": {
            "@FirstCode": { "barcode": [ "AAAA" ] },
            "@SecondCode": { "barcode": [ "CCCC" ] }
        }
    },
    "template": { "transform": { "token": [ "0:1:5" ] } }
}
executing pheniqs mux --config testSample.json --validate returns this report, where you can see that the sample decoding algorithm is passthrough.
Environment
Base input URL /Users/lg/Desktop/ticket 34/my-example
Base output URL /Users/lg/Desktop/ticket 34/my-example
Platform ILLUMINA
Quality tracking disabled
Filter incoming QC failed reads disabled
Filter outgoing QC failed reads disabled
Input Phred offset 33
Output Phred offset 33
Leading segment index 0
Default output format sam
Default output compression unknown
Default output compression level 5
Feed buffer capacity 2048
Threads 8
Decoding threads 1
HTSLib threads 8
Input
Input segment cardinality 1
Input segment No.0 : /Users/lg/Desktop/ticket 34/my-example/test.fastq?format=fastq
Input feed No.0
Type : fastq
Compression : unknown
Resolution : 1
Phred offset : 33
Platform : ILLUMINA
Buffer capacity : 2048
URL : /Users/lg/Desktop/ticket 34/my-example/test.fastq?format=fastq
Output transform
Output segment cardinality 1
Token No.0
Length 4
Pattern 0:1:5
Description cycles 1 to 5 of input segment 0
Assembly instruction
Append token 0 of input segment 0 to output segment 0
Sample decoding
Decoding algorithm passthrough
Shannon bound 1
Segment cardinality 1
Nucleotide cardinality 4
Transform
Token No.0
Length 4
Pattern 0:1:5
Description cycles 1 to 5 of input segment 0
Assembly instruction
Append token 0 of input segment 0 to output segment 0
Barcode undetermined
ID : undetermined
PU : undetermined
Segment No.0 : /dev/stdout?format=sam&compression=none
Barcode @FirstCode
ID : AAAA
PU : AAAA
Concentration : 0.495
Barcode : AAAA
Segment No.0 : /dev/stdout?format=sam&compression=none
Barcode @SecondCode
ID : CCCC
PU : CCCC
Concentration : 0.495
Barcode : CCCC
Segment No.0 : /dev/stdout?format=sam&compression=none
Output feed No.0
Type : sam
Resolution : 1
Phred offset : 33
Platform : ILLUMINA
Buffer capacity : 2048
URL : /dev/stdout?format=sam&compression=none
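The report lists the token 0:1:5 with a length of 4. That is what you would get if the pattern is read as segment:start:end with a zero-based, end-exclusive range, like a Python slice. The sketch below only illustrates that reading. It assumes three colon-separated fields, and the helper name `apply_token` and the example read are made up; it is not how Pheniqs implements tokens.

```python
def apply_token(pattern, segments):
    """Extract a token such as "0:1:5" from a list of read segments.

    Assumes the form segment:start:end with Python-slice semantics;
    an empty start or end field means "from the beginning" or "to the end".
    """
    index, start, end = pattern.split(":")
    segment = segments[int(index)]
    start = int(start) if start else None
    end = int(end) if end else None
    return segment[start:end]


# Hypothetical single-segment read, not taken from test.fastq.
read = ["GAAAATTTT"]
token = apply_token("0:1:5", read)
print(token, len(token))  # AAAA 4
```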
So the final config I suggest is
{
    "input": [
        "test.fastq"
    ],
    "sample": {
        "algorithm": "pamld",
        "transform": { "token": [ "0:1:5" ] },
        "codec": {
            "@FirstCode": { "barcode": [ "AAAA" ] },
            "@SecondCode": { "barcode": [ "CCCC" ] }
        }
    },
    "template": { "transform": { "token": [ "0:1:5" ] } }
}
which yields the expected output:
@HD VN:1.0 SO:unknown GO:query
@RG ID:undetermined PU:undetermined
@RG ID:AAAA BC:AAAA PU:AAAA
@RG ID:CCCC BC:CCCC PU:CCCC
@PG ID:pheniqs PN:pheniqs CL:pheniqs mux --config testSample.json VN:2.1.0-37-g684f02b7b3bfaec7040337884b7f13ed6eb3fd58
IDENAIFIER 76 * 0 0 * * 0 0 AAAA CCCC RG:Z:AAAA BC:Z:AAAA QT:Z:CCCC XB:f:7.90337e-05
IDENAIFIER 76 * 0 0 * * 0 0 CCCC CCCC RG:Z:CCCC BC:Z:CCCC QT:Z:CCCC XB:f:7.90337e-05
{
    "incoming": {
        "count": 2,
        "pf count": 2,
        "pf fraction": 1.0
    },
    "outgoing": {
        "count": 2,
        "pf count": 2,
        "pf fraction": 1.0
    },
    "sample": {
        "average classified confidence": 0.999920966315065,
        "average pf classified confidence": 0.999920966315065,
        "classified": [
            {
                "BC": "AAAA",
                "ID": "AAAA",
                "PU": "AAAA",
                "average confidence": 0.999920966315065,
                "average pf confidence": 0.999920966315065,
                "barcode": [
                    "AAAA"
                ],
                "concentration": 0.495,
                "count": 1,
                "estimated concentration": 0.5,
                "index": 1,
                "pf count": 1,
                "pf fraction": 1.0,
                "pf pooled classified fraction": 0.5,
                "pf pooled fraction": 0.5,
                "pooled classified fraction": 0.5,
                "pooled fraction": 0.5
            },
            {
                "BC": "CCCC",
                "ID": "CCCC",
                "PU": "CCCC",
                "average confidence": 0.999920966315065,
                "average pf confidence": 0.999920966315065,
                "barcode": [
                    "CCCC"
                ],
                "concentration": 0.495,
                "count": 1,
                "estimated concentration": 0.5,
                "index": 2,
                "pf count": 1,
                "pf fraction": 1.0,
                "pf pooled classified fraction": 0.5,
                "pf pooled fraction": 0.5,
                "pooled classified fraction": 0.5,
                "pooled fraction": 0.5
            }
        ],
        "classified count": 2,
        "classified fraction": 1.0,
        "classified pf fraction": 1.0,
        "count": 2,
        "index": 0,
        "pf classified count": 2,
        "pf classified fraction": 1.0,
        "pf count": 2,
        "pf fraction": 1.0,
        "unclassified": {
            "ID": "undetermined",
            "PU": "undetermined",
            "count": 0,
            "index": 0,
            "pf count": 0,
            "pf fraction": 0.0,
            "pf pooled fraction": 0.0,
            "pooled fraction": 0.0
        }
    }
}
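If you want to check a report like the one above without reading it by eye, a few lines of Python are enough. This is only a sketch: `summarize` is a made-up helper, and the embedded report is a trimmed copy of the one above that keeps just the keys the helper reads.

```python
import json

# Trimmed copy of the report above; only the keys used below are kept.
report_text = """
{
    "sample": {
        "classified": [
            {"ID": "AAAA", "count": 1},
            {"ID": "CCCC", "count": 1}
        ],
        "unclassified": {"ID": "undetermined", "count": 0},
        "count": 2
    }
}
"""


def summarize(report):
    """Return (read group ID, read count, fraction of all reads) per barcode."""
    sample = report["sample"]
    total = sample["count"]
    rows = []
    for entry in sample["classified"] + [sample["unclassified"]]:
        fraction = entry["count"] / total if total else 0.0
        rows.append((entry["ID"], entry["count"], fraction))
    return rows


for read_group, count, fraction in summarize(json.loads(report_text)):
    print(f"{read_group}\t{count}\t{fraction:.2f}")
```

A non-zero count on the undetermined row is the first thing to look for when reads are not landing in the barcodes you expect.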
Please let us know if there are any other issues you are experiencing.
Regards,
L.