Bias Variance Tradeoff

This is my first Knitr document. Knitr lets the user combine R code and text in a single formatted document.

I wanted to have an accessible example that illustrates the bias variance tradeoff.


An illustration of the Bias Variance Tradeoff


by Gene Leynes
http://geneorama.com/
http://www.linkedin.com/in/geneleynes

Summary

The Bias Variance Tradeoff is an important concept in machine learning. It helps you evaluate which model will work best.

When most people think of fitting a model, something like this comes to mind:
[Figure: a straight line fit through a scatter of points]

That is, you basically just draw the best straight line through some points. This paradigm makes it hard to imagine what someone would mean by “model selection”.
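To make the "best straight line" idea concrete, here is a minimal sketch of an ordinary least squares fit. It is written in Python rather than R, the five data points are made up for illustration, and `fit_line` is a hypothetical helper name, not something from the original document.

```python
# Ordinary least squares for a single straight line: y = intercept + slope * x.
# The data points below are invented for illustration only.

def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-movement of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```

With only a slope and an intercept to choose, there is exactly one "best" line for a given data set, which is why model selection does not seem like much of a question here.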

The bias variance problem arises when you start to use nonlinear models that don't have to follow straight lines.

If you consider this data fit with two different smoothing parameters:
[Figure: the same data fit with two different smoothing parameters, shown side by side]

you can get a sense of the problem.

Intuitively, the plot on the left seems to do a better job of representing the information contained in the data. However, the model on the right has absolutely no error on the points it was fit to.

This is the bias variance tradeoff.
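The same effect can be reproduced with a rough sketch. Everything in it is an assumption made for illustration: the data is a synthetic noisy sine curve, and a Gaussian kernel smoother stands in for whatever smoother produced the original plots. The `bandwidth` argument plays the role of the smoothing parameter, and `kernel_smooth`, `mse`, and `noisy_sine` are hypothetical helper names.

```python
import math
import random

def kernel_smooth(x0, xs, ys, bandwidth):
    """Gaussian-kernel weighted average of ys, evaluated at x0."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def mse(eval_x, eval_y, fit_x, fit_y, bandwidth):
    """Mean squared error at the eval points of a smoother fit on (fit_x, fit_y)."""
    squared = [(kernel_smooth(x, fit_x, fit_y, bandwidth) - y) ** 2
               for x, y in zip(eval_x, eval_y)]
    return sum(squared) / len(squared)

random.seed(0)  # fixed seed so the run is repeatable

def noisy_sine(x):
    # Assumed data-generating process: a sine curve plus Gaussian noise.
    return math.sin(x) + random.gauss(0, 0.3)

# Points used to fit the smoother, and fresh points in between them.
train_x = [0.2 * i for i in range(32)]
train_y = [noisy_sine(x) for x in train_x]
new_x = [x + 0.1 for x in train_x]
new_y = [noisy_sine(x) for x in new_x]

results = {}
for bandwidth in (0.3, 0.02):  # smooth fit vs. nearly interpolating fit
    results[bandwidth] = (
        mse(train_x, train_y, train_x, train_y, bandwidth),
        mse(new_x, new_y, train_x, train_y, bandwidth),
    )
    print(f"bandwidth={bandwidth}: "
          f"training MSE={results[bandwidth][0]:.4f}, "
          f"new-data MSE={results[bandwidth][1]:.4f}")
```

The narrow bandwidth drives the error on the fitted points to essentially zero, like the plot on the right, because each prediction is dominated by the point itself. How the two fits compare on the fresh points in any single run depends on the noise and the seed, so the printed numbers are worth reading rather than assuming.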
