So I have a new idea for identifying the parent lineage.
England/MILK-3796834/2022 is an XM recombinant, with regions predicted by sc2rf:
44:17410|Omicron/21K
21618:29510|Omicron/21L
From nextclade, the mutations by region are as follows:
44:17410
C241T,C2470T,A2832G,C3037T,T5386G,G8393A,C10029T,C10449A,A11537G,C12513T,T13195C,C14408T,C15240T
21618:29510
C21618T,G21987A,T22200G,G22578A,C22674T,T22679C,C22686T,A22688G,G22775A,A22786C,G22813T,T22882G,G22992A,C22995A,A23013C,A23040G,A23055G,A23063T,T23075C,A23403G,C23525T,T23599G,C23604A,C23854A,G23948T,A24424T,T24469A,C25000T,C25416T,C25584T,C26060T,C26270T,C26577G,G26709A,C26858T,A27259C,G27382C,A27383T,T27384C,C27807T,A28271T,C28311T,G28487A,G28881A,G28882A,G28883C,A29510C
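The split of mutations by region can be reproduced with a few lines of Python. This is only a sketch: the function names `mutation_position` and `group_by_region` are made up for illustration, the region coordinates are copied from the sc2rf output above and treated as inclusive, and only simple substitutions such as `C241T` are handled (no deletions or insertions).

```python
import re

# Regions as reported by sc2rf above: (start, end, clade).
# Assumption: both coordinates are inclusive.
REGIONS = [(44, 17410, "Omicron/21K"), (21618, 29510, "Omicron/21L")]


def mutation_position(mutation: str) -> int:
    """Extract the genomic position from a substitution such as 'C241T'."""
    match = re.fullmatch(r"[ACGT](\d+)[ACGT]", mutation)
    if match is None:
        raise ValueError(f"not a nucleotide substitution: {mutation!r}")
    return int(match.group(1))


def group_by_region(mutations, regions=REGIONS):
    """Map 'start:end' to the mutations whose position falls in that region."""
    grouped = {f"{start}:{end}": [] for start, end, _ in regions}
    for mutation in mutations:
        pos = mutation_position(mutation)
        for start, end, _ in regions:
            if start <= pos <= end:
                grouped[f"{start}:{end}"].append(mutation)
                break  # regions do not overlap, so stop at the first hit
    return grouped


# A handful of the mutations listed above, in mixed order.
print(group_by_region(["C241T", "C21618T", "C14408T", "A29510C"]))
```

Mutations that fall outside every region (for example, between 17410 and 21618) are silently dropped in this sketch.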
And if we query these mutations in cov-spectrum with LAPIS...
Parent 1 | 44:17410
Parent 1 is most likely BA.1.1.10 (72%, 647/894).
https://lapis.cov-spectrum.org/open/v1/sample/aggregated?fields=pangoLineage&nucMutations=C241T,C2470T,A2832G,C3037T,T5386G,G8393A,C10029T,C10449A,A11537G,C12513T,T13195C,C14408T,C15240T
{
"errors":[],
"info": {
"apiVersion":1,
"dataVersion":1656461191,
"deprecationDate":null,
"deprecationInfo":null,
"acknowledgement":null
},
"data":[
{"pangoLineage":"B.1.1","count":1},
{"pangoLineage":"BA.1.1.18","count":30},
{"pangoLineage":"BA.1.1.12","count":3},
{"pangoLineage":"BA.1.1.10","count":647},
{"pangoLineage":"BA.1.1","count":186},
{"pangoLineage":"BA.1","count":27}
]
}
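To show where the 72% comes from, here is a small sketch that ranks the lineages in the response above by their share of the total count. The `data` entries are copied from that response; the helper name `lineage_proportions` is hypothetical.

```python
import json

# The "data" array from the LAPIS response above, pasted verbatim.
response = json.loads("""
{"data": [
  {"pangoLineage": "B.1.1", "count": 1},
  {"pangoLineage": "BA.1.1.18", "count": 30},
  {"pangoLineage": "BA.1.1.12", "count": 3},
  {"pangoLineage": "BA.1.1.10", "count": 647},
  {"pangoLineage": "BA.1.1", "count": 186},
  {"pangoLineage": "BA.1", "count": 27}
]}
""")


def lineage_proportions(data):
    """Return (lineage, count, proportion) tuples, largest count first."""
    total = sum(row["count"] for row in data)
    ranked = sorted(data, key=lambda row: row["count"], reverse=True)
    return [
        (row["pangoLineage"], row["count"], row["count"] / total)
        for row in ranked
    ]


for lineage, count, proportion in lineage_proportions(response["data"]):
    print(f"{lineage}\t{count}\t{proportion:.1%}")
```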
Parent 2 | 21618:29510
Parent 2 is most likely BA.2 (83%, 38/46). However, there is only one runner-up, BA.2.12.1 (17%, 8/46), and it falls within BA.2.
https://lapis.cov-spectrum.org/open/v1/sample/aggregated?fields=pangoLineage&nucMutations=C21618T,G21987A,T22200G,G22578A,C22674T,T22679C,C22686T,A22688G,G22775A,A22786C,G22813T,T22882G,G22992A,C22995A,A23013C,A23040G,A23055G,A23063T,T23075C,A23403G,C23525T,T23599G,C23604A,C23854A,G23948T,A24424T,T24469A,C25000T,C25416T,C25584T,C26060T,C26270T,C26577G,G26709A,C26858T,A27259C,G27382C,A27383T,T27384C,C27807T,A28271T,C28311T,G28487A,G28881A,G28882A,G28883C,A29510C
{
"errors": [],
"info":{
"apiVersion":1,
"dataVersion":1656461191,
"deprecationDate":null,
"deprecationInfo":null,
"acknowledgement":null
},
"data":[
{"pangoLineage":"BA.2","count":38},
{"pangoLineage":"BA.2.12.1","count":8}
]
}
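Both query URLs above follow the same pattern and differ only in the mutation list, so they could be generated with a helper along these lines. The endpoint and parameter names are taken from the URLs in this post; `lapis_query_url` is a hypothetical name, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Endpoint exactly as it appears in the query URLs above.
LAPIS_AGGREGATED = "https://lapis.cov-spectrum.org/open/v1/sample/aggregated"


def lapis_query_url(mutations):
    """Build the aggregated-by-lineage query URL for a list of mutations."""
    params = {
        "fields": "pangoLineage",
        "nucMutations": ",".join(mutations),
    }
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{LAPIS_AGGREGATED}?{urlencode(params, safe=',')}"


print(lapis_query_url(["C241T", "C2470T", "A2832G"]))
```

Passing the full mutation list for a region should reproduce the corresponding URL shown above.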
Resolving
There are a few options for resolving these proportions into a single parent lineage:
- Exclude lineages by a hard cut-off (<1%, <10%, etc.)
- Take the highest proportion lineage.
- Consider lineages in descending order, and report a lineage if it is a sub-lineage of the one with the highest proportion.
| Lineage | Count | Proportion | Note |
| --- | --- | --- | --- |
| BA.1.1.10 | 647 | 72% | Report |
| BA.1.1 | 186 | 21% | Not sub-lineage |
| BA.1.1.18 | 30 | 3% | Not sub-lineage |
| BA.1 | 27 | 3% | Not sub-lineage |
| BA.1.1.12 | 3 | <1% | Exclude |
| B.1.1 | 1 | <1% | Exclude |

| Lineage | Count | Proportion | Note |
| --- | --- | --- | --- |
| BA.2 | 38 | 83% | Ignore, has sub-lineage |
| BA.2.12.1 | 8 | 17% | Report, is sub-lineage |
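One way to combine the cut-off with the sub-lineage rule is sketched below. It is an interpretation of the options above, not a settled design: `resolve_parent` and `is_sublineage` are hypothetical names, the 1% cut-off is just one of the thresholds mentioned, and the sub-lineage test is a plain name-prefix check that does not expand Pango aliases (so it would not recognise BA.1 as a descendant of B.1.1, for instance).

```python
def is_sublineage(child: str, parent: str) -> bool:
    """Name-prefix check only; Pango aliases are NOT expanded (assumption)."""
    return child.startswith(parent + ".")


def resolve_parent(counts: dict, min_proportion: float = 0.01) -> str:
    """Pick one lineage from {lineage: count} using the rules sketched above."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    # Hard cut-off: drop lineages below the minimum proportion.
    kept = [name for name, count in ranked if count / total >= min_proportion]
    if not kept:
        raise ValueError("no lineage passes the cut-off")
    # Start from the most frequent lineage, then walk down the ranking and
    # move to a lineage whenever it is a sub-lineage of the current pick.
    reported = kept[0]
    for name in kept[1:]:
        if is_sublineage(name, reported):
            reported = name
    return reported


# Counts copied from the two LAPIS responses above.
parent_1 = {"B.1.1": 1, "BA.1.1.18": 30, "BA.1.1.12": 3,
            "BA.1.1.10": 647, "BA.1.1": 186, "BA.1": 27}
parent_2 = {"BA.2": 38, "BA.2.12.1": 8}

print(resolve_parent(parent_1))  # BA.1.1.10
print(resolve_parent(parent_2))  # BA.2.12.1
```

When several sub-lineages of the top hit pass the cut-off, this sketch follows the most frequent one first, which is a choice that may need revisiting.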