Contents Top Previous Next

5.1 Constructing the Initial Population

Koza's ramped half-and-half method is the most common way of creating the initial GP population (cf. Section  2.2 , page  28 ). However, there are several other ways of constructing a collection of random trees. In Section  5.1.2 we will briefly consider an unexpected impact of population initialisation. There has also been some work with non-random or informed starting points (cf. Section  5.1.3 ).

5.1.1 Uniform Initialisation

The shape of the initial trees can be lost within a few generations (more on this below). However, a good start given by the initial population can still be crucial to the success of a GP run. In general, there are an infinite number of possible computer programs. This means that it is impossible to search them uniformly. Therefore, any method used to create the initial population will have a bias. For example, ramped half-and-half tends to create bushy trees. Such trees have a higher proportion of solutions to symmetric problems, such as parity. Conversely, the smallest solution to the Sante Fe ant trail-following problem is more randomly shaped (Langdon and Poli1998a). This is partly why ramped half-and-half is very poor at finding programs which can navigate the Sante Fe trail. Another reason is that many of the programs generated by ramped half-and-half (with standard parameters) are simply too small. Chellapilla ( 1997a) claims good results when the size of the initial trees was more tightly controlled.

Iba (1996a) and Bohm and Geyer-Schulz (1996) report methods to precisely sample trees uniformly based on Alonso's bijective algorithm (Alonso and Schott1995). Although this algorithm has been criticised (Luke2000) for being computationally expensive, it can be readily used in practice. Langdon ( 2000) introduced the ramped uniform initialisation which extends Alonso's bijective algorithm by allowing the user to specify a range of initial tree sizes. It then generates equal numbers of random trees for each length in the chosen range. (C++ code can be obtained from ftp://cs.ucl.ac.uk/genetic/gp-code/rand_tree.cc.)

With these more "uniform" initialisations, most trees are asymmetric with some leaves very close to the root of the tree. This is quite different from the trees generated by ramped half-and-half which are on average some distance from the root. Uniform sampling may be better in problems where the desired solutions are asymmetric with some leaves being much more important than others. For example, in data mining it is common to look for solutions with a few dominant variables (which may be close to the root node) whilst other variables are of little or no interest and may be some distance from the root (or indeed not present in the tree). On the other hand, problems like multiplexer or parity require all the inputs to be used and are of similar importance. Bushier trees may be better at solving such problems.

5.1.2 Initialisation may Affect Bloat

Crossover has a strong preference for creating a very non-uniform distributions of tree sizes (Poli, Langdon, and Dignum2007). Crossover generates very short programs much more often than longer ones. Selection can only partially combat this tendency. Typically, crossover will totally rearrange the size and shape of the initial trees within a few generations. As discussed in Section  11.3.1 (page  224 ), the excessive sampling of short programs appears to be an important cause of bloat (the uncontrolled growth of programs during GP runs, which will be described in more detail in Section  11.3 , page  224 onwards). It has been shown (Dignum and Poli2007) that when the initial population is created with the size distribution preferred by crossover (see Section  11.3.1 ), bloat is more marked. The distribution has a known mathematical formula (it is a Lagrange distribution of the second kind), but in practice it can be created by simply performing multiple rounds of crossover on a population created in the traditional way before the GP run starts. This is known as Lagrange initialisation. These findings suggest that initialisation methods which tend to produce many short programs may in fact induce bloat sooner than methods that produce distributions more skewed towards larger programs.

5.1.3 Seeding

The most common way of starting a GP run from an informed non-random point is seeding the initial population with an individual which, albeit not a solution, is thought to be a good starting point. Such a seed may have been produced by an earlier GP run or perhaps constructed by the user (Aler, Borrajo, and Isasi2002Holmes1995Hsu and Gustafson2001Langdon and Nordin2000Langdon and Treleaven1997?). However, Marek, Smart, and Martin (2002) reported that hand written programs may not be robust enough to prosper in an evolving population.

One point to be careful of is that such a seed individual is liable to be much better than randomly created trees. Thus, its descendants may take over the population within a few generations. So, under evolution the seeded population is initially liable to lose diversity rapidly. Furthermore, depending upon the details of the selection scheme used, a single seed individual may have some chance of being removed from the population. Both problems are normally dealt with by filling the whole population with either identical or mutated copies of the seed. This method creates a low diversity initial population in a controlled way, thereby avoiding the initial uncontrolled loss of diversity associated with single seeds. Furthermore, with many copies of the seed, few selection methods will have much chance of removing all copies of the seed before they are able to create children. Diversity preserving techniques, such as multi-objective GP (e.g., (Parrott, Li, and Ciesielski2005), (Setzkorn2005) and Chapter  9 ), demes (Langdon1998) (see Section  10.3 ), fitness sharing (Goldberg1989) and the use of multiple seed trees, might also be good cures for the problems associated with the use of a single seed. In any case, the diversity of the population should be monitored to ensure that there is significant mixing of different initial trees.

Contents Top Previous Next