Confidence intervals for the random forest generalization error

Abstract

We show that the byproducts of the standard training process of a random forest yield not only the well-known, almost computationally free out-of-bag point estimate of the model's generalization error, but also open a direct path to confidence intervals for the generalization error that avoid data splitting and model retraining. Besides their low computational cost, these confidence intervals are shown through simulations to have good coverage and an appropriate rate of width shrinkage as the training sample size grows.
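The idea above can be illustrated with a minimal sketch: reuse the out-of-bag predictions produced as a byproduct of training to get a point estimate of the generalization error, then form a simple normal-approximation interval over the per-observation OOB errors. This is only an illustration of the general approach, not the paper's exact interval construction; the dataset, forest size, and the 1.96 normal quantile are assumptions for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data and a forest trained with OOB bookkeeping enabled;
# no separate test set and no retraining are needed.
X, y = make_classification(n_samples=500, random_state=0)
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)

# Each row of oob_decision_function_ aggregates votes only from trees
# whose bootstrap sample did not contain that observation.
oob_pred = np.argmax(rf.oob_decision_function_, axis=1)
errors = (oob_pred != y).astype(float)

point = errors.mean()  # OOB point estimate of the generalization error
se = errors.std(ddof=1) / np.sqrt(len(errors))
ci = (point - 1.96 * se, point + 1.96 * se)  # approximate 95% interval
```

Note that everything here comes from a single training run: the confidence interval is built from quantities already computed while fitting the forest.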

Keywords

Random forests; Generalization error; Out-of-bag estimation; Confidence interval; Bootstrapping
Journal title

Pattern Recognition Letters
Language

English

CNPQ Knowledge Area

APPLIED SOCIAL SCIENCES
