Friday, August 29, 2008

Elith, J., Leathwick, J. R. & Hastie, T. 2008 A working guide to boosted regression trees. Journal of Animal Ecology 77, 802-813.

A straightforward user's guide to a machine learning method known as boosted regression trees (BRTs). BRTs are useful data mining tools: they learn the relationship between a set of predictor variables and an output. For example, a field ecologist might want to know which environmental factors predict the presence of a species. I have also used BRTs for sensitivity analysis on a simulation model with lots of parameters. Compared with conventional regression approaches (GLMs or GAMs), BRTs (and machine learning methods in general) deal well with non-linear responses and higher-order interactions, without these having to be specified in advance.

The paper is written for ecologists, so it explains BRTs in a way that doesn't require any mathematical ability (which is why I like it) or knowledge of machine learning. Best of all, there is a fairly straightforward R package for BRTs, called gbm.
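
The boosting idea itself is small enough to sketch in a few lines. What follows is a toy illustration in plain Python, not the gbm package and not code from the paper. It assumes squared-error loss and uses one-split trees ("stumps"). The function names (fit_stump, fit_brt, predict), the two predictors and the hump-shaped response are all made up for the example. Each round fits a stump to the current residuals and adds a shrunken copy of it to the model, which is how a sum of very simple trees ends up tracing a non-linear response.

```python
# Toy boosted regression stumps with squared-error loss (illustration only).
# All names and data here are hypothetical; this is not the gbm package.

def fit_stump(xs, residuals):
    """Return (feature, threshold, left_mean, right_mean) for the single
    split that minimises squared error, or None if no split is possible."""
    best = None
    for j in range(len(xs[0])):
        values = sorted(set(row[j] for row in xs))
        # Candidate thresholds lie halfway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(xs, residuals) if row[j] <= t]
            right = [r for row, r in zip(xs, residuals) if row[j] > t]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return None if best is None else best[1:]


def fit_brt(xs, ys, n_trees=500, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals
    and add a shrunken version of it to the running prediction."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        if stump is None:
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for row, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, row):
    """Sum the shrunken contributions of every stump for one observation."""
    base, stumps, lr = model
    return base + sum(lr * (lm if row[j] <= t else rm)
                      for j, t, lm, rm in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Made-up data: the response peaks at an intermediate value of the first
# predictor (a hump-shaped curve); the second predictor is just along for the ride.
xs = [[float(i), float(i % 3)] for i in range(11)]
ys = [25.0 - (row[0] - 5.0) ** 2 for row in xs]

model = fit_brt(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = mse(ys, [mean_y] * len(ys))            # predict the mean everywhere
boosted_mse = mse(ys, [predict(model, row) for row in xs])
print("baseline MSE:", baseline_mse)
print("boosted MSE:", boosted_mse)
```

The learning rate and number of trees play the same role here as the shrinkage and tree-count settings that the paper spends much of its time on: smaller steps need more trees. A real analysis would also hold data back for validation rather than judging the fit on the training points, as this sketch does.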
