Most of this book is about the mechanics of GP and its practical use for solving problems. In fact, as will become clear in Chapter 12 , GP has been remarkably successful as a problem-solving and engineering tool. One might wonder how this is possible, given that GP is a non-deterministic algorithm, and as a result its behaviour varies from run to run. It is also a complex adaptive system which sometimes shows intricate and unexpected behaviours (such as bloat). Thus it is only natural to be interested in GP from the scientific point of view. That is, we want to understand why can GP solve problems, how it does it, what goes wrong when it cannot, what are the reasons for certain undesirable behaviours, what can we do to get rid of them without introducing new (and perhaps even less desirable) problems, and so on.
GP is a search technique that explores the space of computer programs. The search for solutions to a problem starts from a group of points (random programs) in this search space. Those points that are above average quality are then used to generate a new generation of points through crossover, mutation, reproduction and possibly other genetic operations. This process is repeated over and over again until a stopping criterion is satisfied. If we could visualise this search, we would often find that initially the population looks like a cloud of randomly scattered points, but that, generation after generation, this cloud changes shape and moves in the search space. Because GP is a stochastic search technique, in different runs we would observe different trajectories. If we could see regularities, these might provide us with a deep understanding of how the algorithm is searching the program space for the solutions, and perhaps help us see why GP is successful in finding solutions in certain runs and unsuccessful in others. Unfortunately, it is normally impossible to exactly visualise the program search space due to its high dimensionality and complexity, making it that much harder to understand.
An alternative approach to better understanding the dynamics of GP is to study mathematical models of evolutionary search. There are a number of cases where this approach has been very successful in illuminating some of the fundamental processes and biases in GP systems. In this chapter we will review several theoretical approaches to understanding GP, including mathematical models of GP (Section 11.1 ), analyses of the structure of GP search spaces (Section 11.2 ), and the use of theory to understand and combat the chronic problem of bloat in a principled fashion (Section 11.3 ).