Abstract
Daniell Toth and John L. Eltinge (2010) "Building Consistent Regression Trees from Complex Sample Data"
In the past several years the statistical literature has developed a wide range of methods for
the construction of regression trees and other estimators based on the recursive partitioning
of a sample. Many prospective applications involve data collected through a complex sample
design. At present, however, relatively little is known regarding the properties of these
methods under complex designs. This paper proposes a method for incorporating information
about the complex sample design when building a regression tree using a recursive partitioning
algorithm. Sufficient conditions are established which guarantee asymptotic design
L2 consistency of these regression trees as an estimator for an arbitrary regression function.
The proposed method is illustrated with Occupational Employment Statistics establishment
survey data linked to Quarterly Census of Employment and Wages payroll data of the Bureau
of Labor Statistics. Performance of the nonparametric estimator is investigated through a
simulation study based on this example.
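
As a rough illustration of how design information might enter a recursive partitioning step, the sketch below searches for one split on a single covariate using design-weighted sums. It is not the authors' algorithm. It assumes, purely for illustration, that the design is summarized by positive weights w (for example, inverse inclusion probabilities) and that each node is estimated by a weighted (Hajek-type) mean. The function name weighted_best_split and the parameter min_leaf are hypothetical.

```python
from typing import List, Optional, Tuple


def weighted_best_split(
    x: List[float], y: List[float], w: List[float], min_leaf: int = 1
) -> Optional[Tuple[float, float, float, float]]:
    """Find the threshold on x that minimizes the design-weighted SSE.

    Assumption (not from the paper): w holds positive design weights.
    Returns (threshold, left_mean, right_mean, weighted_sse), or None
    if no admissible split exists.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    ws = [w[i] for i in order]

    # Weighted totals over the whole node.
    tot_w = sum(ws)
    tot_wy = sum(wi * yi for wi, yi in zip(ws, ys))
    tot_wyy = sum(wi * yi * yi for wi, yi in zip(ws, ys))

    best = None
    lw = lwy = lwyy = 0.0  # running weighted sums for the left child
    for k in range(n - 1):
        lw += ws[k]
        lwy += ws[k] * ys[k]
        lwyy += ws[k] * ys[k] * ys[k]
        if xs[k] == xs[k + 1]:
            continue  # cannot split between tied covariate values
        if k + 1 < min_leaf or n - (k + 1) < min_leaf:
            continue  # enforce a minimum number of sample units per child
        rw, rwy, rwyy = tot_w - lw, tot_wy - lwy, tot_wyy - lwyy
        # Weighted SSE of a child around its weighted mean:
        # sum(w*y^2) - (sum(w*y))^2 / sum(w)
        sse = (lwyy - lwy * lwy / lw) + (rwyy - rwy * rwy / rw)
        if best is None or sse < best[3]:
            threshold = (xs[k] + xs[k + 1]) / 2.0
            best = (threshold, lwy / lw, rwy / rw, sse)
    return best


if __name__ == "__main__":
    # Toy data: the first unit carries three times the weight of the others.
    print(weighted_best_split([1, 2, 3, 4], [0, 2, 10, 10], [3, 1, 1, 1]))
```

Applying the same search recursively to each child would grow a tree. The minimum node size used here is only a stand-in and should not be read as the sufficient conditions for consistency that the paper establishes.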