Comments (13)

AndreasAZiegler commented on June 12, 2024

Thanks a lot for the explanation. Your way of explaining it is way better than in the textbook I have.

maxmouchet commented on June 12, 2024

Hi,

In your case, the emission matrix O gives the probabilities P(Y_t = k | Z_t = i) = O_{ik}.
To model that using Distributions.jl you can use a Categorical distribution (one for each state).
It is parametrized by a probability vector where element i is the probability of symbol i.

See the examples below.
Hope this helps!

Maxime

julia> using Distributions

help?> Categorical
search: Categorical ncategories

  Categorical(p)

  A Categorical distribution is parameterized by a probability vector p (of length K).

  P(X = k) = p[k],  for k = 1, 2, ..., K.

using Distributions
using HMMBase

# Initial Probabilities
a = [0.3, 0.2, 0.5]

# Transition Matrix
A = [0 0.2 0.8; 1 0 0; 0.4 0.4 0.2]

# Observations Distributions (for each state)
B1 = Categorical([0, 0.5, 0.5])
B2 = Categorical([1, 0, 0])
B3 = Categorical([0.1, 0.2, 0.7])

# Example: probability of observing O3 in state 1
@show pdf(B1, 3)
# pdf(B1, 3) = 0.5

# Build the HMM
hmm = HMM(a, A, [B1, B2, B3])

# Observations
Yt = [1, 3, 1]

# Question c)
# `loglikelihood` uses the forward algorithm to compute the sequence probability.
@show exp(loglikelihood(hmm, Yt))
# exp(loglikelihood(hmm, Yt)) = 0.03374000000000003

# Question d)
@show viterbi(hmm, Yt)
# viterbi(hmm, Yt) = [2, 1, 2]

AndreasAZiegler commented on June 12, 2024

Thanks a lot for your quick response and the example. It would have taken me hours to figure this out on my own.

The output of the Viterbi algorithm matches the result I got by hand calculation (the solution got the second state wrong).

However, using forward, backward, and (log)likelihood I don't manage to get the same numbers as in my hand calculation (here I think the solution did not make any mistake). Can you guess where the difference comes from?

maxmouchet commented on June 12, 2024

Sorry, I had not seen the correction in the PDF.

In my example I used the transition matrix provided in question a) without verifying it.
But indeed it does not correspond to the transition diagram (P_{12} should be 0.8, not 0.2; there should be no self-transition in S3; etc.).

If we use the transition matrix that corresponds to the diagram (the one provided in the correction), we get the right results :)

I will also document this use case (building an HMM from an emission matrix) to make it easier for users.

using Distributions
using HMMBase

# Initial Probabilities
a = [0.3, 0.2, 0.5]

# Transition Matrix
A = [0 0.8 0.2; 0.6 0 0.4; 0.0 1.0 0.0]
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

# Observations Distributions (for each state)
B1 = Categorical([0, 0.5, 0.5])
B2 = Categorical([1, 0, 0])
B3 = Categorical([0.1, 0.2, 0.7])

# Build the HMM
hmm = HMM(a, A, [B1, B2, B3])

# Observations
Yt = [1, 3, 1]

# Question c)
@show exp(loglikelihood(hmm, Yt))
# exp(loglikelihood(hmm, Yt)) = 0.10519999999999999 ≈ 0.1052

# Question d)
@show viterbi(hmm, Yt)
# viterbi(hmm, Yt) = [2, 3, 2]

maxmouchet commented on June 12, 2024

Also, if you get different results in the forward/backward algorithm, this is because in HMMBase (and in most forward/backward implementations) the values are scaled so that they sum to 1 at each time step (for numerical stability).

forward(hmm, Yt)[1]
# 3×3 Array{Float64,2}:
#  0.0       0.8       0.2
#  0.517241  0.0       0.482759
#  0.0       0.988593  0.0114068
# Correction is unscaled. HMMBase is scaled.
# t = 1
# Unscaled: 0 0.2      0.05      (sum = 0.25)
# Scaled:   0 0.2/0.25 0.05/0.25
#      =    0 0.8      0.2

# t = 2
# Unscaled: 0.06       0 0.056      (sum = 0.116)
# Scaled:   0.06/0.116 0 0.056/0.116
#      =    0.517241   0 0.482759
# ...

AndreasAZiegler commented on June 12, 2024

Thanks again for the thorough explanation. I also read that scaling is important to avoid numerical issues, especially in larger HMMs. Since you state that HMMBase also scales, I assume there is no way to recover the unscaled values, or is there?

maxmouchet commented on June 12, 2024

There is currently no way.
I'll try to add this possibility in a future version, at least for educational purposes.
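
In the meantime, here is a minimal plain-Julia sketch of the unscaled forward recursion (this is not HMMBase API; all names are mine) that reproduces the unscaled values above:

# Sketch of the textbook (unscaled) forward recursion, not HMMBase internals.
# alpha[t, j] = P(Y_1, ..., Y_t, Z_t = j)
a = [0.3, 0.2, 0.5]                     # initial probabilities
A = [0 0.8 0.2; 0.6 0 0.4; 0.0 1.0 0.0] # transition matrix (corrected)
B = [0 0.5 0.5; 1.0 0 0; 0.1 0.2 0.7]   # emission matrix, B[j, k] = P(Y_t = k | Z_t = j)
Yt = [1, 3, 1]

T, K = length(Yt), length(a)
alpha = zeros(T, K)
alpha[1, :] = a .* B[:, Yt[1]]
for t in 2:T
    # alpha[t, j] = (sum_i alpha[t-1, i] * A[i, j]) * B[j, Yt[t]]
    alpha[t, :] = (A' * alpha[t-1, :]) .* B[:, Yt[t]]
end

alpha
# 3×3 Array{Float64,2}:
#  0.0   0.2    0.05
#  0.06  0.0    0.056
#  0.0   0.104  0.0012

sum(alpha[end, :])
# 0.1052, the sequence likelihood from question c)

Dividing each row of alpha by its sum gives back the scaled values returned by HMMBase.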

maxmouchet commented on June 12, 2024

I've added a new constructor so that you can build an HMM directly from an emission matrix (instead of building Categorical distributions by hand).

Using your example:

a = [0.3, 0.2, 0.5]
A = [0 0.8 0.2; 0.6 0 0.4; 0.0 1.0 0.0]
B = [0 0.5 0.5; 1.0 0 0; 0.1 0.2 0.7]
hmm = HMM(a, A, B)
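
As a quick sanity check (assuming this constructor is equivalent to the explicit Categorical version above), the earlier results should be unchanged:

Yt = [1, 3, 1]

@show exp(loglikelihood(hmm, Yt))
# expected: ≈ 0.1052

@show viterbi(hmm, Yt)
# expected: [2, 3, 2]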

AndreasAZiegler commented on June 12, 2024

Wow, pretty cool. That definitely saves some typing.

AndreasAZiegler commented on June 12, 2024

@maxmouchet I did the Viterbi calculation by hand and don't get the same result at t=2 as your implementation. The implementation takes state 3 with p=0.056 over state 1 with p=0.06. Is there some numerical issue, or do I misunderstand the Viterbi algorithm?

maxmouchet commented on June 12, 2024

In the Viterbi algorithm the state sequence is inferred backwards, starting from the last state.
Your computation for V_1 is correct but for V_2 you also need to record the ancestors.
That is, argmax_i P_{ij} V_1(i), or more generally, A[t,j] = argmax_i P_{ij} V_{t-1}(i).

For V_2(1), P_{i1} V_1(i) is maximized when i = 2, so the ancestor is 2.
For V_2(2), P_{i2} V_1(i) is maximized when i = 3, so the ancestor is 3.
For V_2(3), P_{i3} V_1(i) is maximized when i = 2, so the ancestor is 2.
And so on for V_3(.).

At the end you should get the following ancestors:

A[2,1] = 2
A[2,2] = 3
A[2,3] = 2
A[3,1] = _
A[3,2] = 3
A[3,3] = 1

Finally, to find the state sequence, we start from z_T = argmax_j V_T(j) and backtrack with z_t = A[t+1, z_{t+1}].
Here z_3 = 2, z_2 = A[3,2] = 3, and z_1 = A[2,3] = 2.
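
For reference, here is a minimal plain-Julia sketch of this recursion (not HMMBase internals; I name the ancestor array anc to avoid clashing with the transition matrix A):

# Viterbi: V[t, j] = probability of the best path ending in state j at time t,
# anc[t, j] = the ancestor state that achieves it.
a = [0.3, 0.2, 0.5]
A = [0 0.8 0.2; 0.6 0 0.4; 0.0 1.0 0.0]
B = [0 0.5 0.5; 1.0 0 0; 0.1 0.2 0.7]
Yt = [1, 3, 1]

T, K = length(Yt), length(a)
V = zeros(T, K)
anc = zeros(Int, T, K)
V[1, :] = a .* B[:, Yt[1]]
for t in 2:T, j in 1:K
    # anc[t, j] = argmax_i A[i, j] * V[t-1, i]; when every candidate is 0
    # (as for anc[3, 1], the "_" above), argmax arbitrarily returns 1
    anc[t, j] = argmax(A[:, j] .* V[t-1, :])
    V[t, j] = A[anc[t, j], j] * V[t-1, anc[t, j]] * B[j, Yt[t]]
end

# Backtrack from the most likely final state
z = zeros(Int, T)
z[end] = argmax(V[end, :])
for t in (T-1):-1:1
    z[t] = anc[t+1, z[t+1]]
end

z
# [2, 3, 2]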

maxmouchet commented on June 12, 2024

A good introduction to HMMs is Rabiner's tutorial: https://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/tutorial%20on%20hmm%20and%20applications.pdf.

maxmouchet commented on June 12, 2024

I'm closing this issue for now. Feel free to reopen if you have another question.
