Understanding Markov Chains by Comparing a “First Order Sequence Model” and a “Second Order Sequence Model”

Paul Xiong
Apr 27, 2023

--

First, let’s assume we have a book and count how often the following two sentences appear in it:

  • Check whether the battery ran down please.
  • Check whether the program ran please.

In total they appear 100 times: the first sentence 40 times and the second 60 times.

Our goal is to predict the next word after the word “ran”.

  • First Order Sequence: look only at “ran”, so the next word could be “down” (probability 0.4) or “please” (probability 0.6). (The original figure walked through three steps, 1–3, for reading these probabilities off the counts.)
  • Second Order Sequence: look at the two preceding words. Given “battery ran”, the next word is “down” (probability 1); given “program ran”, it is “please” (probability 1). Compared with the 0.4/0.6 split above, the prediction is now certain. A minimal sketch of both models follows this list.
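To make this concrete, here is a minimal Python sketch. The toy corpus, variable names, and helper function are illustrative assumptions, not from the original post; the corpus simply repeats the two sentences with the counts given above. Counting transitions recovers exactly the probabilities quoted in the list:

```python
from collections import Counter, defaultdict

# Hypothetical corpus matching the article's counts:
# 40 copies of the first sentence, 60 of the second.
corpus = (
    ["check whether the battery ran down please".split()] * 40
    + ["check whether the program ran please".split()] * 60
)

# First-order model: next-word counts given one preceding word.
first_order = defaultdict(Counter)
# Second-order model: next-word counts given two preceding words.
second_order = defaultdict(Counter)

for sentence in corpus:
    for i in range(len(sentence) - 1):
        first_order[sentence[i]][sentence[i + 1]] += 1
    for i in range(len(sentence) - 2):
        second_order[(sentence[i], sentence[i + 1])][sentence[i + 2]] += 1

def probabilities(counter):
    # Normalize raw counts into a probability distribution.
    total = sum(counter.values())
    return {word: count / total for word, count in counter.items()}

print(probabilities(first_order["ran"]))                  # {'down': 0.4, 'please': 0.6}
print(probabilities(second_order[("battery", "ran")]))    # {'down': 1.0}
print(probabilities(second_order[("program", "ran")]))    # {'please': 1.0}
```

Note how the second-order model disambiguates “ran” completely: conditioning on one extra word of context collapses the 0.4/0.6 split into two certain predictions.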

--


Written by Paul Xiong

Predicting the next word (token) is what powers ChatGPT, while predicting the next photo (embedding) forms the foundation of ImageGPT.
