Publications

A Statistical Learning Perspective of Genetic Programming

Nur Merve Amil, Nicolas Bredeche, Christian Gagné, Sylvain Gelly, Marc Schoenauer and Olivier Teytaud

Abstract - This paper proposes a theoretical analysis of Genetic Programming (GP) from the perspective of statistical learning theory, a well grounded mathematical toolbox for machine learning. By computing the Vapnik-Chervonenkis dimension of the family of programs that can be inferred by a speciﬁc setting of GP, it is proved that a parsimonious ﬁtness ensures universal consistency. This means that the empirical error minimization allows convergence to the best possible error when the number of test cases goes to inﬁnity. However, it is also proved that the standard method consisting in putting a hard limit on the program size still results in programs of inﬁnitely increasing size in function of their accuracy. It is also shown that cross-validation or hold-out for choosing the complexity level that optimizes the error rate in generalization also leads to bloat. So a more complicated modiﬁcation of the ﬁtness is proposed in order to avoid unnecessary bloat while nevertheless preserving universal consistency.

Bibtex:

@inproceedings{Bredeche773,
author    = { Nur Merve Amil and Nicolas Bredeche and Christian Gagné and Sylvain Gelly and Marc Schoenauer and Olivier Teytaud },
title     = { A Statistical Learning Perspective of Genetic Programming },
booktitle = { Proc. of European Conference on Genetic Programming (EuroGP 2009) },
year      = { 2009 },
location  = { Tubingen, Germany }
}

Last modification: 2009/04/15 by cgagne

		Home \| About \| People \| Research \| Publications \| Events \| Profile
		©2002-. Computer Vision and Systems Laboratory. All rights reserved