Context tree selection: A unifying view


Autoria(s): GARIVIER, A.; LEONARDI, F.
Contribuinte(s)

UNIVERSIDADE DE SÃO PAULO

Data(s)

19/04/2012

19/04/2012

2011

Resumo

Context tree models have been introduced by Rissanen in [25] as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics. The present paper investigates non-asymptotic properties of two popular procedures of context tree estimation: Rissanen's algorithm Context and penalized maximum likelihood. First showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning overestimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The under-estimation properties rely on classical hypotheses for processes of infinite memory. These results improve on and generalize the bounds obtained in Duarte et al. (2006) [12], Galves et al. (2008) [18], Galves and Leonardi (2008) [17], Leonardi (2010) [22], refining asymptotic results of Buhlmann and Wyner (1999) [4] and Csiszar and Talata (2006) [9]. (C) 2011 Elsevier B.V. All rights reserved.

MacLinC project[Proc. USP 11.1.9367.1.5]

Fapesp[2009/09411-8]

USP-COFECUB[2009.1.820.45.8]

CNPq[302162/2009-7]

Identificador

STOCHASTIC PROCESSES AND THEIR APPLICATIONS, v.121, n.11, p.2488-2506, 2011

0304-4149

http://producao.usp.br/handle/BDPI/16663

10.1016/j.spa.2011.06.012

http://dx.doi.org/10.1016/j.spa.2011.06.012

Idioma(s)

eng

Publicador

ELSEVIER SCIENCE BV

Relação

Stochastic Processes and their Applications

Direitos

closedAccess

Copyright ELSEVIER SCIENCE BV

Palavras-Chave #Algorithm Context #Penalized maximum likelihood #Model selection #Variable length Markov chains #Bayesian information criterion #Deviation inequalities #PROBABILISTIC SUFFIX TREES #CONSISTENCY #CHAINS #MEMORY #ORDER #MDL #Statistics & Probability
Tipo

article

original article

publishedVersion