# Probabilistic Inference with Naive Bayes

Saturday, 23 January 2016 · 21 min read · machine-learningToday, let’s take a look at Naive Bayes from the ground up, and work through a working implementation in Javascript.

For one of my side projects, I’m working on a feature that lets you perform simple data analysis on small datasets via the browser. One use case is the automatic assignment of categories to a set of items (i.e. supervised learning.)

Since users’ datasets are small (N<60), we want to keep our model as simple as possible so it won’t overfit. This usually means making simple assumptions about the distribution the data comes from.

There are a handful of machine learning classifiers we can use: Naive Bayes, k-Nearest Neighbour, and linear SVM. I also considered using Decision Trees and MLP which we can visualize on the browser using client-side visualization libraries.

Naive Bayes is a simple probabilistic classifier that works despite the independence assumption, and has historically been used in document classification.

## Intro to Bayes Theorem

I’m going to assume you know some basic probabilities.

As a refresher, here are a few definitions:

- Let
`A`

and`B`

be events that can happen with some probability `P(A)`

: probability that A happens`P(A|B)`

: probability that A happens, given that B happens (conditional probability)

According to Bayes Theorem:

That is, the probability `P(A|B)`

of `A`

happening given `B`

is equal to the probability `P(B|A)`

of `B`

happening given `A`

AND the probability `P(A)`

of `A`

happening OVER the probability `P(B)`

of `B`

happening.

To clarify this idea further, let’s work through an example:

A patient takes a lab test and the result comes back positive. The test returns a correct positive result in only 98% of the cases in which the disease is actually present, and a correct negative result in only 97% of the cases in which the disease is not present. Also, .008 of the entire population have this cancer.

For a new patient that lab test returns a positive result, should he be diagnosed as having cancer or not?

Based on the above, we have the following probabilities:

```
P(cancer) = 0.008 P(!cancer) = 0.992
P(+|cancer) = 0.98 P(-|cancer) = 0.02
P(+|!cancer) = 0.03 P(-|cancer) = 0.97
```

We can then calculate the conditional probabilities of patients having cancer:

```
P(cancer|+) = P(+|cancer)P(cancer) / P(+)
P(!cancer|+) = P(+|!cancer)P(!cancer) / P(+)
P(+|cancer)P(cancer) = 0.0078
P(+|!cancer)P(!cancer) = 0.0298
P(cancer|+) = 0.0078 / (0.0078 + 0.0298) = 0.21
P(!cancer|+) = 0.79
```

Since `P(!cancer|+) > P(cancer|+)`

, we conclude that if a new patient gets a positive test result, he has a higher likelihood of having cancer.

Bayesian learning algorithms calculate

explicit probabilitiesfor hypotheses, which typically requires initial knowledge of many probabilities. What this means is that you’ll need labeled data beforehand (supervised learning.)

## Supervised Learning with Naive Bayes

The above example works for a single attribute: a test result. However, in real-world cases we’ll often need more attributes in order to improve the accuracy of our predictions. That’s where supervised learning comes in.

A Naive Bayes classifier is a simple classification method that is based on Bayes’ Theorem and the assumption of conditional independence.

That is:

```
P(A ^ B | Y) = P(A | Y) P(B | Y)
```

Now, let’s say our training data looks as follows:

```
a1, ..., aN, Vj
```

Where `ai`

is an attribute at the `i`

th position, and `Vj`

is the `j`

th class label in the dataset. According to our conditional independence assumption:

`P(a1, ..., aN|Vj) = P(a1|Vj) P(a2|Vj) ... P(aN|Vj)`

According to Bayes’ Theorem, we’ll then be able to find:

```
P(Vj|a1, ..., aN)
```

which is the probability that a given sample of attributes `a1, ..., AN`

belongs to a class `Vj`

. This is the end result of our classification!

Notice that Naive Bayes is an

eagerlearner: it calculates all the probabilities beforehand in a preprocessing step, making classification time close to zero. This is in contrast tolazylearners such as k-nearest neighbours with less training time and more classification time.

## Naive Bayes Learner Implementation

Given the following training data, let’s build a Naive Bayes classifier to answer the following supervised learning question: is today a good day to `Play`

outside?

Outlook | Temperature | Humidity | Wind | Play? |
---|---|---|---|---|

Sunny | 85 | 85 | Weak | No |

Sunny | 80 | 90 | Strong | No |

Overcast | 83 | 86 | Weak | Yes |

Rain | 70 | 96 | Weak | Yes |

Rain | 68 | 80 | Weak | Yes |

Rain | 65 | 70 | Strong | No |

Overcast | 64 | 65 | Strong | Yes |

Sunny | 72 | 95 | Weak | No |

Sunny | 69 | 70 | Weak | Yes |

Rain | 75 | 80 | Weak | Yes |

Sunny | 75 | 70 | Strong | Yes |

Overcast | 72 | 90 | Strong | Yes |

Overcast | 81 | 75 | Weak | Yes |

Rain | 71 | 91 | Strong | No |

As a recap, we want to calculate all the individual probabilities

`P(ai | Vj)`

, so that we can calculate`P(Vj | ai..aN)`

## Implementation

The full implementation can be seen on Github: node-bayes

## Handling numeric attributes

One consideration we have to take into account is how we can handle numerical attributes such as `Temperature`

and `Humidity`

in our dataset.

We calculate the probability of a yet unseen numerical value through a normal distribution, specifically a calculated mean & standard deviation of the sample values for the column.

## Node.js module

I’ve written a Node.js module you can use to perform the above training and classification steps:

You can view the source on Github. Pull requests are welcome!

## In Python

In the Python data analysis ecosystem, there are a breadth of libraries you can use that contains implementations of Naive Bayes, such as `GaussianNB()`

from `sklearn`

:

## In Closing

Naive Bayes is a simple but powerful supervised learning algorithm. Let me know what you think! Pull Requests to `node-bayes`

are welcome.

*📬* **Get updates straight to your inbox.**

Subscribe to my newsletter to make sure you don't miss anything.