Comments (12)
Re first point, it seems to me that we can do exactly the same as this:
except rather than doing matrix multiplication (e.g. in A_0 \cdot T) we take the max (rather than sum) of the component-wise products. We also record this index.
We also need someplace to store this information. We could have another Smooshish type that holds this information and does not have viterbi_ be the same as marginal_, as it is now.
I don't understand your second point-- the Viterbi path for the whole thing is the best path among those for the various chains, right?
from linearham.
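A minimal sketch of the max-instead-of-sum idea from the comment above, assuming for simplicity that A_0 is a row vector and T is a matrix stored as nested lists. The function names `max_product` and `sum_product` are hypothetical and are not taken from the linearham code base.

```python
def sum_product(a, t):
    """Ordinary vector-matrix product: out[j] = sum_i a[i] * t[i][j]."""
    return [sum(a[i] * t[i][j] for i in range(len(a))) for j in range(len(t[0]))]


def max_product(a, t):
    """Viterbi analogue: out[j] = max_i a[i] * t[i][j], plus the maximizing i."""
    best, argbest = [], []
    for j in range(len(t[0])):
        # Component-wise products that the marginal version would sum over.
        products = [a[i] * t[i][j] for i in range(len(a))]
        i_star = max(range(len(a)), key=products.__getitem__)
        best.append(products[i_star])
        argbest.append(i_star)  # recorded so the best path can be traced back
    return best, argbest


# Toy numbers (made up for illustration).
a0 = [0.25, 0.5]
t = [[1.0, 0.0],
     [0.25, 0.75]]

marginal = sum_product(a0, t)
viterbi, viterbi_idx = max_product(a0, t)
```

Here `viterbi_idx` plays the role of the recorded index mentioned above; where it would actually be stored (for example in a separate Smooshish-like type) is left open.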
Agree to the first point! Maybe even store it in NTInsertion class, but never initialize it if we want the marginal prob?
To your second comment, yes. But if we have tons of chains because there are tons of SW matches, then doing a final max over all of them won't be cheap? It's a minor thing if we have few matches per read (as our example files have). The sketch I have for how to do proper Viterbi is not clean, so practically, if you feel the number of SW matches is usually small, then it shouldn't hurt to do that final max.
(Personally I'd rather not change so much of what we've already done for a proper viterbi if the gains aren't so high)
> Agree to the first point! Maybe even store it in NTInsertion class, but never initialize it if we want the marginal prob?
Yep!
For your second point, I'm not sure what you are proposing as an alternate means of doing viterbi.
It would involve keeping track of not only max transition points but also max germline genes. Right now we only do it over transition points.
It seems to me that we can't get any more efficient than using the Chain structure, right? And at the end of the Chain inferences we will have the Viterbi probabilities. From there it's just taking a max over a vector.
(Pretty sure I'm missing your point here...)
It's just that the max over all VDJ chains could be expensive because it's a max over (#V) * (#D) * (#J) elements, which could be costly depending on how many V's, D's, and J's there are, yeah?
Yes, but are you proposing a means of getting around that?
yes, but A) you seem ok with max'ing over all VDJ chains? and B) it might drastically change the code structure so do we care enough to do it?
You really have me curious, but I can guess what you're proposing by your comments above. It seems like for this round of coding we can focus on getting the existing branch merged and closing the issues as they stand, which AFAIK just requires more tests.
Then we can consider optimizing both the marginal and the viterbi calculations with more specialized code?
(BTW, partis will only serve I think 3 V's 5 D's and 4 J's, which is 60 combinations max)
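To make the cost of the final max concrete, here is a rough sketch that enumerates every V/D/J combination and takes the max of a per-chain Viterbi probability. The gene names, the `best_chain` helper, and the toy probabilities are all made up for illustration; in practice each probability would come out of the corresponding Chain computation.

```python
from itertools import product


def best_chain(chain_viterbi_prob, v_genes, d_genes, j_genes):
    """Return the best (V, D, J) combination, its probability, and the chain count."""
    combos = list(product(v_genes, d_genes, j_genes))  # (#V) * (#D) * (#J) chains
    best = max(combos, key=chain_viterbi_prob)
    return best, chain_viterbi_prob(best), len(combos)


# Hypothetical gene labels matching the 3 V / 5 D / 4 J case.
v_genes = ["V1", "V2", "V3"]
d_genes = ["D1", "D2", "D3", "D4", "D5"]
j_genes = ["J1", "J2", "J3", "J4"]

# Stand-in for the per-chain Viterbi probabilities (toy values).
toy_probs = {("V2", "D4", "J1"): 0.5, ("V1", "D1", "J1"): 0.25}


def toy_chain_prob(combo):
    return toy_probs.get(combo, 0.0)


best, prob, n_chains = best_chain(toy_chain_prob, v_genes, d_genes, j_genes)
```

With 3 V's, 5 D's and 4 J's the max runs over 60 entries, so the final step is a single pass over a short vector.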
we are going bayesian! viterbi unnecessary!