AAAI Publications, Twenty-Fifth AAAI Conference on Artificial Intelligence

Incorporating Boosted Regression Trees into Ecological Latent Variable Models
Rebecca A. Hutchinson, Li-Ping Liu, Thomas G. Dietterich

Last modified: 2011-08-04


Important ecological phenomena are often observed indirectly. Consequently, probabilistic latent variable models provide an important tool, because they can include explicit models of the ecological phenomenon of interest and the process by which it is observed. However, existing latent variable methods rely on hand-formulated parametric models, which are expensive to design and require extensive preprocessing of the data. Nonparametric methods (such as regression trees) automate these decisions and produce highly accurate models. However, existing tree methods learn direct mappings from inputs to outputs, so they cannot be applied to latent variable models. This paper describes a methodology for integrating nonparametric tree methods into probabilistic latent variable models by extending functional gradient boosting. The approach is presented in the context of occupancy-detection (OD) modeling, where the goal is to model the distribution of a species from imperfect detections. Experiments on 12 real and 3 synthetic bird species compare standard and tree-boosted OD models (latent variable models) with standard and tree-boosted logistic regression models (without latent structure). All methods perform similarly when predicting the observed variables, but the OD models learn better representations of the latent process. Most importantly, tree-boosted OD models learn the best latent representations when nonlinearities and interactions are present.
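To make the latent structure concrete, here is a minimal sketch of the per-site likelihood in a textbook occupancy-detection model. It is not taken from the paper: it assumes the common formulation in which a site is occupied with some probability, each visit to an occupied site detects the species independently, and there are no false detections at unoccupied sites. The function and parameter names are hypothetical. In a fitted model, the occupancy and detection probabilities would come from learned functions of site and visit covariates (for example, logistic regression or boosted trees); here they are passed in as plain numbers.

```python
import math


def od_site_log_likelihood(occupancy_prob, detection_probs, detections):
    """Log-likelihood of one site's detection history (illustrative sketch).

    occupancy_prob  -- assumed probability that the site is occupied (latent)
    detection_probs -- assumed per-visit detection probabilities if occupied
    detections      -- observed 0/1 detection for each visit
    """
    if len(detection_probs) != len(detections):
        raise ValueError("one detection probability is needed per visit")

    # Probability of the observed history if the site is occupied.
    p_if_occupied = 1.0
    for d, y in zip(detection_probs, detections):
        p_if_occupied *= d if y == 1 else (1.0 - d)

    # Assumption: no false positives, so an unoccupied site can only
    # produce an all-zero detection history.
    p_if_unoccupied = 0.0 if any(detections) else 1.0

    # Marginalize over the latent occupancy state.
    likelihood = (occupancy_prob * p_if_occupied
                  + (1.0 - occupancy_prob) * p_if_unoccupied)
    return math.log(likelihood) if likelihood > 0.0 else float("-inf")


def posterior_occupancy(occupancy_prob, detection_probs, detections):
    """Posterior probability that the site is occupied, given the history."""
    if any(detections):
        return 1.0  # any detection implies occupancy under the assumption above
    p_miss_all = 1.0
    for d in detection_probs:
        p_miss_all *= (1.0 - d)
    numerator = occupancy_prob * p_miss_all
    denominator = numerator + (1.0 - occupancy_prob)
    return numerator / denominator if denominator > 0.0 else 0.0
```

For example, `posterior_occupancy(0.5, [0.5, 0.5], [0, 0])` asks how likely a site is to be occupied after two visits with no detections. Under these assumptions the answer stays above zero, which is the sense in which a model without latent structure can confuse "not detected" with "absent".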
