Comments (13)

AndreasAZiegler commented on June 12, 2024

Thanks a lot for the explanation. Your way of explaining it is way better than in the textbook I have.

maxmouchet commented on June 12, 2024

Hi,

In your case, the emission matrix O gives the probabilities P(Y_t = k | Z_t = i) = O_{ik}.
To model that using Distributions.jl you can use a Categorical distribution (one for each state).
It is parametrized by a probability vector where element i is the probability of symbol i.

See the examples below.
Hope this helps!

Maxime

julia> using Distributions

help?> Categorical
search: Categorical ncategories

  Categorical(p)

  A Categorical distribution is parameterized by a probability vector p (of length K).

  P(X = k) = p[k],  for k = 1, 2, ..., K.

using Distributions
using HMMBase

# Initial Probabilities
a = [0.3, 0.2, 0.5]

# Transition Matrix
A = [0 0.2 0.8; 1 0 0; 0.4 0.4 0.2]

# Observations Distributions (for each state)
B1 = Categorical([0, 0.5, 0.5])
B2 = Categorical([1, 0, 0])
B3 = Categorical([0.1, 0.2, 0.7])

# Example: probability of observing O3 in state 1
@show pdf(B1, 3)
# pdf(B1, 3) = 0.5

# Build the HMM
hmm = HMM(a, A, [B1, B2, B3])

# Observations
Yt = [1, 3, 1]

# Question c)
# `loglikelihood` uses the forward algorithm to compute the sequence probability.
@show exp(loglikelihood(hmm, Yt))
# exp(loglikelihood(hmm, Yt)) = 0.03374000000000003

# Question d)
@show viterbi(hmm, Yt)
# viterbi(hmm, Yt) = [2, 1, 2]

AndreasAZiegler commented on June 12, 2024

Thanks a lot for your quick response and the example. It would have taken me hours to figure this out on my own.

The output of the Viterbi algorithm matches the result I got by hand calculation (the solution got the second state wrong).

However, using forward, backward, and (log)likelihood I don't manage to get the same numbers as in my hand calculation (here I think the solution did not make any mistake). Can you guess where the difference comes from?

maxmouchet commented on June 12, 2024

Sorry, I had not seen the correction in the PDF.

In my example I used the transition matrix provided in question a) without verifying it.
But indeed it does not correspond to the transition diagram (P_{12} should be 0.8, not 0.2; there should be no self-transition in S3; etc.).

If we use the transition matrix that corresponds to the diagram (the one provided in the correction), we get the right results :)

I will also document this use case (building an HMM from an emission matrix) to make it easier for users.

using Distributions
using HMMBase

# Initial Probabilities
a = [0.3, 0.2, 0.5]

# Transition Matrix
A = [0 0.8 0.2; 0.6 0 0.4; 0.0 1.0 0.0]
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

# Observations Distributions (for each state)
B1 = Categorical([0, 0.5, 0.5])
B2 = Categorical([1, 0, 0])
B3 = Categorical([0.1, 0.2, 0.7])

# Build the HMM
hmm = HMM(a, A, [B1, B2, B3])

# Observations
Yt = [1, 3, 1]

# Question c)
@show exp(loglikelihood(hmm, Yt))
# exp(loglikelihood(hmm, Yt)) = 0.10519999999999999 ≈ 0.1052

# Question d)
@show viterbi(hmm, Yt)
# viterbi(hmm, Yt) = [2, 3, 2]

maxmouchet commented on June 12, 2024

Also, if you get different results in the forward/backward algorithm, this is because in HMMBase (and in most forward/backward implementations) the values are scaled so that they sum to 1 at each time step (for numerical stability).

forward(hmm, Yt)[1]
# 3×3 Array{Float64,2}:
#  0.0       0.8       0.2
#  0.517241  0.0       0.482759
#  0.0       0.988593  0.0114068
# Correction is unscaled. HMMBase is scaled.
# t = 1
# Unscaled: 0 0.2      0.05      (sum = 0.25)
# Scaled:   0 0.2/0.25 0.05/0.25
#      =    0 0.8      0.2

# t = 2
# Unscaled: 0.06       0 0.056      (sum = 0.116)
# Scaled:   0.06/0.116 0 0.056/0.116
#      =    0.517241   0 0.482759
# ...

AndreasAZiegler commented on June 12, 2024

Thanks again for the thorough explanation. I also read that scaling is important to avoid numerical issues, especially in larger HMMs. Since you state that HMMBase also scales, I assume there is no way to recover the unscaled values, or is there?

maxmouchet commented on June 12, 2024

There is currently no way.
I'll try to add this possibility in a future version, at least for educational purposes.
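
In the meantime, here is a minimal plain-Julia sketch of the unscaled forward recursion (this is not HMMBase API; all names are mine) that reproduces the unscaled values above:

# Sketch of the textbook (unscaled) forward recursion, not HMMBase internals.
# alpha[t, j] = P(Y_1, ..., Y_t, Z_t = j)
a = [0.3, 0.2, 0.5]                     # initial probabilities
A = [0 0.8 0.2; 0.6 0 0.4; 0.0 1.0 0.0] # transition matrix (corrected)
B = [0 0.5 0.5; 1.0 0 0; 0.1 0.2 0.7]   # emission matrix, B[j, k] = P(Y_t = k | Z_t = j)
Yt = [1, 3, 1]

T, K = length(Yt), length(a)
alpha = zeros(T, K)
alpha[1, :] = a .* B[:, Yt[1]]
for t in 2:T
    # alpha[t, j] = (sum_i alpha[t-1, i] * A[i, j]) * B[j, Yt[t]]
    alpha[t, :] = (A' * alpha[t-1, :]) .* B[:, Yt[t]]
end

alpha
# 3×3 Array{Float64,2}:
#  0.0   0.2    0.05
#  0.06  0.0    0.056
#  0.0   0.104  0.0012

sum(alpha[end, :])
# 0.1052, the sequence likelihood from question c)

Dividing each row of alpha by its sum gives back the scaled values returned by HMMBase.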

maxmouchet commented on June 12, 2024

I've added a new constructor so that you can build an HMM directly from an emission matrix (instead of building Categorical distributions by hand).

Using your example:

a = [0.3, 0.2, 0.5]
A = [0 0.8 0.2; 0.6 0 0.4; 0.0 1.0 0.0]
B = [0 0.5 0.5; 1.0 0 0; 0.1 0.2 0.7]
hmm = HMM(a, A, B)
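
As a quick sanity check (assuming this constructor is equivalent to the explicit Categorical version above), the earlier results should be unchanged:

Yt = [1, 3, 1]

@show exp(loglikelihood(hmm, Yt))
# expected: ≈ 0.1052

@show viterbi(hmm, Yt)
# expected: [2, 3, 2]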

AndreasAZiegler commented on June 12, 2024

Wow, pretty cool. That definitely saves some typing.

AndreasAZiegler commented on June 12, 2024

@maxmouchet I did the Viterbi calculation by hand and don't get the same result at t=2 as your implementation. The implementation takes state 3 with p=0.056 over state 1 with p=0.06. Is there some numerical issue, or do I misunderstand the Viterbi algorithm?

maxmouchet commented on June 12, 2024

In the Viterbi algorithm the state sequence is inferred backwards, starting from the last state.
Your computation for V_1 is correct but for V_2 you also need to record the ancestors.
That is, argmax_i P_{ij} V_1(i), or more generally, A[t,j] = argmax_i P_{ij} V_{t-1}(i).

For V_2(1), P_{i1} V_1(i) is maximized when i = 2, so the ancestor is 2.
For V_2(2), P_{i2} V_1(i) is maximized when i = 3, so the ancestor is 3.
For V_2(3), P_{i3} V_1(i) is maximized when i = 2, so the ancestor is 2.
And so on for V_3(.).

At the end you should get the following ancestors:

A[2,1] = 2
A[2,2] = 3
A[2,3] = 2
A[3,1] = _
A[3,2] = 3
A[3,3] = 1

Finally, to find the state sequence, we start from z_T = argmax_j V_T(j) and backtrack with z_t = A[t+1, z_{t+1}].
Here z_3 = 2, z_2 = A[3,2] = 3, and z_1 = A[2,3] = 2.
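
For reference, here is a minimal plain-Julia sketch of this recursion (not HMMBase internals; I name the ancestor array anc to avoid clashing with the transition matrix A):

# Viterbi: V[t, j] = probability of the best path ending in state j at time t,
# anc[t, j] = the ancestor state that achieves it.
a = [0.3, 0.2, 0.5]
A = [0 0.8 0.2; 0.6 0 0.4; 0.0 1.0 0.0]
B = [0 0.5 0.5; 1.0 0 0; 0.1 0.2 0.7]
Yt = [1, 3, 1]

T, K = length(Yt), length(a)
V = zeros(T, K)
anc = zeros(Int, T, K)
V[1, :] = a .* B[:, Yt[1]]
for t in 2:T, j in 1:K
    # anc[t, j] = argmax_i A[i, j] * V[t-1, i]; when every candidate is 0
    # (as for anc[3, 1], the "_" above), argmax arbitrarily returns 1
    anc[t, j] = argmax(A[:, j] .* V[t-1, :])
    V[t, j] = A[anc[t, j], j] * V[t-1, anc[t, j]] * B[j, Yt[t]]
end

# Backtrack from the most likely final state
z = zeros(Int, T)
z[end] = argmax(V[end, :])
for t in (T-1):-1:1
    z[t] = anc[t+1, z[t+1]]
end

z
# [2, 3, 2]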

maxmouchet commented on June 12, 2024

A good introduction to HMMs is Rabiner's tutorial: https://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/tutorial%20on%20hmm%20and%20applications.pdf.

maxmouchet commented on June 12, 2024

I'm closing this issue for now. Feel free to reopen if you have another question.
