Bayes’ Theorem in Plain English
Simplest explanation of Bayes’ Theorem
--
Assume a hypothetical machine learning model has been used to predict a class value (class 0 or class 1) for each row of a data set, as shown in Table 1 below.
This is a binary classification problem: the table has 10 rows in total, and each row has an exact value Ye and a predicted value Yp, both of which are either 0 or 1.
Let’s start by defining some basic probabilities (n = 10):
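In terms of row counts, these probabilities can be written as below. The symbol N(·), meaning the number of rows of Table 1 that satisfy a condition, is notation assumed here and is not taken from the original.

$$
P(Y_e = 1) = \frac{N(Y_e = 1)}{n}, \qquad
P(Y_e = 0) = \frac{N(Y_e = 0)}{n}, \qquad
P(Y_p = 1) = \frac{N(Y_p = 1)}{n}, \qquad
P(Y_p = 0) = \frac{N(Y_p = 0)}{n}
$$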
where
- P(Ye = 1) is the probability that the exact value Ye is 1
- P(Ye = 0) is the probability that the exact value Ye is 0
- P(Yp = 1) is the probability that the predicted value Yp is 1
- P(Yp = 0) is the probability that the predicted value Yp is 0
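The four probabilities above can be estimated by simple counting. The following is a minimal sketch, not the author's code: the two label lists are hypothetical values made up for illustration (they are not the contents of Table 1), and the helper name `marginal` is an assumption.

```python
# Hypothetical labels for illustration only; NOT the values from Table 1.
y_exact = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # Ye, the exact (true) class
y_pred = [1, 0, 0, 1, 1, 1, 1, 0, 1, 0]   # Yp, the predicted class


def marginal(labels, value):
    """Return the fraction of rows whose label equals `value`."""
    return sum(1 for y in labels if y == value) / len(labels)


n = len(y_exact)  # 10 rows, matching n = 10 in the text
p_ye1 = marginal(y_exact, 1)
p_ye0 = marginal(y_exact, 0)
p_yp1 = marginal(y_pred, 1)
p_yp0 = marginal(y_pred, 0)

print(f"P(Ye = 1) = {p_ye1}, P(Ye = 0) = {p_ye0}")
print(f"P(Yp = 1) = {p_yp1}, P(Yp = 0) = {p_yp0}")
```

Because every row belongs to exactly one class, P(Ye = 1) + P(Ye = 0) and P(Yp = 1) + P(Yp = 0) should each equal 1, which is a quick way to check the counts.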
Bayes’ Theorem for Class 1
Now let us focus on class 1. From Table 1 above, we define the following conditional probabilities:
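Written as ratios of row counts, the two conditional probabilities are as follows. As an assumption on notation, N(·) denotes the number of rows of Table 1 that satisfy the stated condition.

$$
P(Y_p = 1 \mid Y_e = 1) = \frac{N(Y_p = 1,\; Y_e = 1)}{N(Y_e = 1)}, \qquad
P(Y_e = 1 \mid Y_p = 1) = \frac{N(Y_p = 1,\; Y_e = 1)}{N(Y_p = 1)}
$$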
where
- P(Yp = 1|Ye = 1) is the conditional probability that the predicted value Yp = 1 given that the exact value Ye = 1
- P(Ye = 1|Yp = 1) is the conditional probability that the exact value Ye = 1 given that the predicted value Yp = 1
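These two conditional probabilities answer different questions, and in general they are not equal. The sketch below estimates both by restricting the data to the rows where the conditioning event holds. The label lists are hypothetical (not the values in Table 1), and the helper name `conditional` is an assumption.

```python
# Hypothetical labels for illustration only; NOT the values from Table 1.
y_exact = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # Ye, the exact (true) class
y_pred = [1, 0, 0, 1, 1, 1, 1, 0, 1, 0]   # Yp, the predicted class


def conditional(target, target_value, given, given_value):
    """Estimate P(target = target_value | given = given_value) by counting rows."""
    # Keep only the rows where the conditioning event is true.
    matching = [t for t, g in zip(target, given) if g == given_value]
    if not matching:
        raise ValueError("the conditioning event never occurs in the data")
    return sum(1 for t in matching if t == target_value) / len(matching)


# Of the rows that are truly class 1, what fraction was predicted as class 1?
p_yp1_given_ye1 = conditional(y_pred, 1, y_exact, 1)

# Of the rows predicted as class 1, what fraction is truly class 1?
p_ye1_given_yp1 = conditional(y_exact, 1, y_pred, 1)

print(f"P(Yp = 1 | Ye = 1) = {p_yp1_given_ye1:.3f}")
print(f"P(Ye = 1 | Yp = 1) = {p_ye1_given_yp1:.3f}")
```

In classification terms, the first quantity is the recall for class 1 and the second is the precision for class 1.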
Putting these together, we have
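The equation can be reconstructed from the definitions above. Both conditional probabilities describe the same joint event (Ye = 1 and Yp = 1), so the joint probability can be factored in two ways, and equating the two factorizations gives Bayes’ theorem for class 1:

$$
P(Y_e = 1,\; Y_p = 1) = P(Y_p = 1 \mid Y_e = 1)\, P(Y_e = 1) = P(Y_e = 1 \mid Y_p = 1)\, P(Y_p = 1)
$$

$$
P(Y_e = 1 \mid Y_p = 1) = \frac{P(Y_p = 1 \mid Y_e = 1)\, P(Y_e = 1)}{P(Y_p = 1)}
$$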
Let Ye = 1 be event A and Yp = 1 be event B. We can then rewrite the equation above as
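In this notation, Bayes’ theorem takes its familiar general form:

$$
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
$$

As a sanity check, the sketch below computes P(A|B) twice, once by direct counting and once from the right-hand side of the formula, and confirms that the two agree. The labels are hypothetical values chosen for illustration, not the contents of Table 1.

```python
import math

# Hypothetical labels for illustration only; NOT the values from Table 1.
y_exact = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # Ye, the exact (true) class
y_pred = [1, 0, 0, 1, 1, 1, 1, 0, 1, 0]   # Yp, the predicted class

a = [ye == 1 for ye in y_exact]  # event A: Ye = 1
b = [yp == 1 for yp in y_pred]   # event B: Yp = 1
n = len(a)

n_a = sum(a)                                     # rows where A holds
n_b = sum(b)                                     # rows where B holds
n_ab = sum(1 for x, y in zip(a, b) if x and y)   # rows where both hold

p_a = n_a / n
p_b = n_b / n
p_b_given_a = n_ab / n_a

# Left-hand side: P(A|B) by direct counting.
p_a_given_b_direct = n_ab / n_b

# Right-hand side: P(B|A) * P(A) / P(B).
p_a_given_b_bayes = p_b_given_a * p_a / p_b

assert math.isclose(p_a_given_b_direct, p_a_given_b_bayes)
print(f"direct: {p_a_given_b_direct:.4f}, via Bayes: {p_a_given_b_bayes:.4f}")
```

The agreement is exact up to floating-point rounding, because both sides reduce to the same ratio of counts, N(A and B) / N(B).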
Bayes’ Theorem for Class 0
For class 0, we can define the following conditional probabilities: