The number of paths needed to calculate $\alpha$
increases
exponentially as the length of the observation sequence
increases, but the
$\alpha$'s at time t-1 give the probability of reaching each
state through all previous paths, and we can therefore define the
$\alpha$'s at time t
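in terms of those at time t-1. Stepping from time t to time t+1, and
writing $a_{ij}$ for the probability of moving from state i to state j
and $b_j(o_{t+1})$ for the probability that state j produces the
observation seen at time t+1:

$$\alpha_{t+1}(j) = b_j(o_{t+1}) \sum_i \alpha_t(i)\, a_{ij}$$
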
Thus we calculate each probability as the product of the
appropriate observation probability (that is, the probability that state j
produced what is actually seen at time t+1) with the sum of the
probabilities of reaching that state at that time; this latter sum
comes from the transition probabilities together with the $\alpha$'s from the
preceding stage.
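
The recursion described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names, the two-state model and all of its probabilities are made up for the example, and the first step (initial state probability times observation probability) is the usual initialisation, assumed here because it is not stated in this passage. A brute-force sum over every state path is included only to show that the recursion gives the same total while doing far less work.

```python
from itertools import product


def forward(initial, transition, emission, observations):
    """Return one list of partial probabilities (alphas) per time step."""
    n_states = len(initial)
    # Assumed initialisation: initial probability times observation probability.
    alphas = [[initial[j] * emission[j][observations[0]] for j in range(n_states)]]
    for obs in observations[1:]:
        prev = alphas[-1]
        # Observation probability for state j times the summed probability
        # of reaching j from every state i at the preceding stage.
        alphas.append([
            emission[j][obs] * sum(prev[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ])
    return alphas


def brute_force(initial, transition, emission, observations):
    """Sum the probability of every state path; exponential in sequence length."""
    n_states = len(initial)
    total = 0.0
    for path in product(range(n_states), repeat=len(observations)):
        p = initial[path[0]] * emission[path[0]][observations[0]]
        for t in range(1, len(observations)):
            p *= transition[path[t - 1]][path[t]] * emission[path[t]][observations[t]]
        total += p
    return total


# Hypothetical two-state model with two possible observations (0 and 1).
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]  # transition[i][j]: state i to state j
emission = [[0.5, 0.5], [0.1, 0.9]]    # emission[j][o]: state j produces o
observations = [0, 1, 1, 0]

alphas = forward(initial, transition, emission, observations)
sequence_probability = sum(alphas[-1])
```

For a sequence of length T over N states, the brute-force version visits N^T paths, whereas the recursion does on the order of N^2 multiplications per time step; summing the final alphas gives the probability of the whole observation sequence.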