As discussed in Section 3.2.1, most GP systems require type consistency, where all subtrees return data of the same type. This ensures that the output of any subtree can be used as one of the inputs to any node. The basic subtree crossover operator shuffles tree components entirely at random, so universal type compatibility ensures that crossover cannot lead to incompatible connections between nodes. It is also required to stop mutation from producing illegal programs.
An implicit assumption underlying this approach is that all combinations of structures are equally likely to be useful. In many cases, however, we know in advance that there are constraints on the structure of the solution, or we have strong suspicions about the likely form solutions will take. In this section, we will look at several different systems that use tools such as types and grammars to bias or constrain search with the primary aim of increasing the chance of finding a suitable program.
A problem domain might be naturally represented with multiple types. This suggests that the functions used by GP and their arguments will not necessarily be all of the same type. This can often be addressed through creative definitions of functions and implicit type conversion. For example, the Odin system (Holmes and Barclay, 1996) defines operations on inappropriate types to return a new fail object. These are handled by introducing a binary fatbar that returns its first argument unless it is fail, in which case it returns its second argument.
This sort of approach may not always be desirable. For example, if a key goal is to evolve solutions that can be easily understood or analysed, then one might prefer a GP system that is constrained structurally or via a type system, since these often generate results that are more comprehensible (Haynes, Wainwright, Sen, and Schoenefeld, 1995; Langdon, 1998, page 126). Similarly, if there is domain knowledge that strongly suggests a particular syntactic constraint on the solution, then ignoring that constraint may make it much harder to find a solution.
We will focus here on three different approaches to constraining the syntax of the evolved expression trees in GP: simple structure enforcement (Section 6.2.1), strongly typed GP (Section 6.2.2) and grammar-based constraints (Section 6.2.3). Finally, we consider the advantages and disadvantages of syntactic and type constraints and their biases (Section 6.2.4).
6.2.1 Enforcing Particular Structures
If a particular structure is believed or known to be important then one can modify the GP system to require that all individuals have that structure (Koza, 1992). For example, if a problem is believed to have (or require) a periodic solution, one might want to consider constraining the search to solutions of the form a × sin(b × t). By allowing a and b to evolve freely but keeping the rest of the structure fixed, one could restrict GP to evolving expressions that are periodic. Syntax restrictions can also be used to make GP follow sensible engineering practices. For example, we might want to ensure that loop control variables appear in the correct parts of for loops and nowhere else (Langdon, 1998, page 126).
Enforcing a user-specified structure on the evolved solutions can be implemented in a number of ways. One could ensure that all the initial individuals have the structure of interest (for example, generating random subtrees for a and b while fixing the rest) and then constrain crossover and mutation so that they do not alter any of the fixed regions of a tree. An alternative approach would be to evolve the various (sub)components separately. One could evolve pairs of trees (a,b) (like ADFs). Alternatively, one could have two separate populations, one of which is used to evolve candidates for a while the other is evolving candidates for b.
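To make the first option concrete, here is a minimal Python sketch of enforcing the fixed template a × sin(b × t): an individual carries only the two free subtrees a and b, so no genetic operation can ever disturb the surrounding structure. The function and terminal sets (and all names) are illustrative, not from any particular GP system.

```python
import math
import random

# Illustrative primitive sets for the free subtrees a and b.
FUNCTIONS = [("+", 2), ("*", 2)]
TERMINALS = ["t", 1.0, 2.0]

def random_subtree(depth):
    """Grow a random expression tree for one of the free slots (a or b)."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    op, arity = random.choice(FUNCTIONS)
    return (op,) + tuple(random_subtree(depth - 1) for _ in range(arity))

def make_individual(max_depth=3):
    """An individual is just the pair (a, b); the template is implicit."""
    return (random_subtree(max_depth), random_subtree(max_depth))

def evaluate(subtree, t):
    """Interpret a subtree at time t."""
    if subtree == "t":
        return t
    if isinstance(subtree, float):
        return subtree
    op, left, right = subtree
    l, r = evaluate(left, t), evaluate(right, t)
    return l + r if op == "+" else l * r

def run(individual, t):
    """Evaluate the full, fixed template a * sin(b * t)."""
    a, b = individual
    return evaluate(a, t) * math.sin(evaluate(b, t) * t)

ind = make_individual()
print(run(ind, 0.5))
```

Because crossover and mutation would act only on the (a, b) pair, every individual in every generation keeps the periodic form by construction.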
A form of constraint-directed search in GP was also proposed in (??) to help GP focus on more promising areas of the space.
6.2.2 Strongly Typed GP
Since constraints are often driven by or expressed using a type system, a natural approach is to incorporate types and their constraints into the GP system (Montana, 1995). In strongly typed GP, every terminal has a type, and every function has types for each of its arguments and a type for its return value. The process that generates the initial random expressions, and all the genetic operators, are implemented so as to ensure that they do not violate the type system's constraints.
Returning to the if example from Section 3.2.1 (page 55), we might have an application with both numeric and Boolean terminals (e.g., get_speed and is_food_ahead). We might then have an if function that takes three arguments: a test (Boolean), the value to return if the test is true, and the value to return if the test is false. Assuming that the second and third values are numbers, the output of the if is also going to be numeric. If we choose the test argument as a crossover point in the first parent, then the subtree (excised from the second parent) to insert must have a Boolean output. That is, we must find either a function which returns a Boolean or a Boolean terminal in the other parent tree to be the root of the subtree which we will insert into the new child. Conversely, if we choose either the second or third argument as a crossover point in the first parent, then the inserted subtree must be numeric. In all three cases, given that both parents are type correct, restricting the second crossover point in this way ensures the child will also be type correct.
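A hedged sketch of this type-aware crossover, in the style of strongly typed GP, is shown below. The terminal names (is_food_ahead, get_speed) follow the running example; the node encoding and everything else are illustrative. The key step is restricting the second crossover point to subtrees with the same return type as the first.

```python
import copy
import random

def node(name, rtype, *children):
    """A tree node tagged with its return type."""
    return {"name": name, "type": rtype, "children": list(children)}

def all_subtrees(tree):
    yield tree
    for child in tree["children"]:
        yield from all_subtrees(child)

def typed_crossover(parent1, parent2, rng=random):
    """Choose a crossover point in parent1, then restrict the choice in
    parent2 to subtrees returning the SAME type, so the child is
    guaranteed to remain type correct."""
    child = copy.deepcopy(parent1)
    point = rng.choice(list(all_subtrees(child)))
    compatible = [s for s in all_subtrees(parent2)
                  if s["type"] == point["type"]]
    if not compatible:                 # no legal swap for this point
        return child
    graft = copy.deepcopy(rng.choice(compatible))
    point.update(graft)                # splice the graft in place
    return child

# (if is_food_ahead then get_speed else 2): Boolean test, numeric branches.
p1 = node("if", "num",
          node("is_food_ahead", "bool"),
          node("get_speed", "num"),
          node("2", "num"))
p2 = node("not", "bool", node("is_food_ahead", "bool"))

child = typed_crossover(p1, p2)
```

Since p2 contains only Boolean-valued subtrees, a numeric point in p1 finds no compatible graft and is left unchanged, while the Boolean test may be replaced; either way the child stays type correct.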
This basic approach to types can be extended to more complex type systems, including simple generics (Montana, 1995), multi-level type systems (Haynes, Schoenefeld, and Wainwright, 1996), fully polymorphic types (Olsson, 1994), and polymorphic higher-order type systems (?).
6.2.3 Grammar-based Constraints
Another natural way to express constraints is via grammars, and these have been used in GP in a variety of ways (Gruau, 1996; Hoai, McKay, and Abbass, 2003; O'Neill and Ryan, 2003; ?; ?). Many of these simply use a grammar as a means of expressing the kinds of constraints discussed above in Section 6.2.1. For example, one could enforce the structure of the periodic function a × sin(b × t) using a grammar such as the following:

    tree ::= E × sin(E × t)
    E    ::= var | (E op E)
    op   ::= + | − | × | ÷
    var  ::= x | y | z                                    (6.3)
Each line in this grammar is known as a rewrite rule or a production rule. Elements that cannot be rewritten are known as the terminals of the grammar, while symbols that appear on the left-hand side of a rule are known as non-terminal symbols. In this sort of system, the grammar is typically used to ensure that the initial population is made up of legal "grammatical" programs. The grammar is also used to guide the operations of the genetic operators. Thus we need to keep track not only of the program itself, but also of the syntax rules used to derive it.
What actually is evolved in a grammar-based GP system depends on the particular system. Whigham, for example, evolved derivation trees, which effectively are a hierarchical representation of which rewrite rules must be applied, and in which order, to obtain a particular program. Figure 6.2 shows an example of a derivation tree representing a grammatical program with respect to the grammar in Equation (6.3). In this system, crossover is restricted to swapping only subtrees derived from a common non-terminal symbol in the grammar. So, for example, a subtree rooted at an E node could be replaced by another also rooted at an E, while an E-rooted subtree could not be replaced by an op-rooted one.
Figure 6.2: Example individual (a derivation tree) that might be evolved in Whigham's grammar-based GP system (?) if the grammar in Equation (6.3) was used. Rectangles represent non-terminal symbols of the grammar.
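This crossover restriction is easy to implement: collect the subtrees of each parent that are rooted at the same non-terminal and pair them up. The sketch below uses an illustrative encoding ((nonterminal, [children]) tuples for internal nodes, plain strings for grammar terminals) rather than any particular system's representation.

```python
def subtrees_rooted_at(tree, nonterminal):
    """Collect every subtree whose root is the given nonterminal."""
    found = []
    if isinstance(tree, tuple):
        label, children = tree
        if label == nonterminal:
            found.append(tree)
        for child in children:
            found.extend(subtrees_rooted_at(child, nonterminal))
    return found

def compatible_pairs(parent1, parent2, nonterminal):
    """All legal crossover pairings for one nonterminal, e.g. 'E'."""
    return [(a, b)
            for a in subtrees_rooted_at(parent1, nonterminal)
            for b in subtrees_rooted_at(parent2, nonterminal)]

# Two derivation trees: p1 derives "x", p2 derives "( y + z )".
p1 = ("E", [("var", ["x"])])
p2 = ("E", ["(", ("E", [("var", ["y"])]), ("op", ["+"]),
            ("E", [("var", ["z"])]), ")"])

# The E-rooted subtree of p1 may be exchanged with any of the three
# E-rooted subtrees of p2, but never with an op-rooted one.
pairs = compatible_pairs(p1, p2, "E")
print(len(pairs))   # 3
```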
The actual program represented by a derivation tree can be obtained by reading out the leaves of the tree one by one from left to right. For the derivation tree in Figure 6.2, for example, this read-out produces the program the tree encodes.
However, for efficiency reasons, in an actual implementation it is not convenient to extract the program represented by a derivation tree in this way. This is because programs need to be executed in order to evaluate their fitness, and this flat program representation often requires further transformations before execution. It is, therefore, common to directly convert a derivation tree into a standard GP tree.
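As a small illustration, the left-to-right read-out is just a depth-first traversal of the derivation tree's leaves. The tree encoding below is our own (internal nodes are (nonterminal, [children]) tuples, leaves are strings); the example tree derives the expression (x + y) under a grammar with non-terminals E, op and var.

```python
# A derivation tree for "( x + y )": E -> ( E op E ), E -> var, etc.
E1 = ("E", ["(",
            ("E", [("var", ["x"])]),
            ("op", ["+"]),
            ("E", [("var", ["y"])]),
            ")"])

def leaves(node):
    """Depth-first, left-to-right read-out of the derivation tree's leaves."""
    if isinstance(node, str):        # a terminal symbol of the grammar
        yield node
        return
    _, children = node
    for child in children:
        yield from leaves(child)

program = " ".join(leaves(E1))
print(program)   # ( x + y )
```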
Grammar-based GP approaches can be extended by incorporating concepts from computational linguistics. For example, McKay and colleagues used tree-adjoining grammars (TAGs) (Joshi and Schabes, 1997) to design new genetic representations and operators that respect grammatical constraints while allowing new types of structural modifications (Hoai and McKay, 2004; Hoai et al., 2003; Hoai, McKay, Essam, and Hao, 2005).
Another major grammar-based approach is grammatical evolution (GE) (O'Neill and Ryan, 2003; Ryan, Collins, and O'Neill, 1998). GE does not use trees; instead, it represents individuals as variable-length sequences of integers (cf. Equation (6.4)) which are interpreted in the context of a user-supplied grammar.
For each rule in the grammar, the alternatives on the right-hand side are numbered from 0 upwards. In the example grammar in Equation (6.3) above, the first rule has only one option on the right-hand side, so this would be numbered 0. The second rule has two options, which would be numbered 0 and 1. The third rule has four options, which would be numbered 0 to 3. Finally, the fourth rule has three options, numbered 0 to 2. To create a program from a GE individual, one uses the values in the individual to "choose" which alternative to take in the production rules. For example, suppose a GE individual is represented by the sequence
    (39, 7, …, 94)                                        (6.4)
then we start with 39 and the first syntax rule, tree. However, tree has no alternatives, so we move on to 7 and rule E. Now E has two alternatives, and 7 is used (via the modulus operation) to choose between them. More of the translation process is given in Figure 6.3.

In this example we did not need to use all the numbers in the sequence to generate a complete program. Indeed the last integer, 94, was not used. In general, "extra" genetic material is simply ignored. More problematic is when a sequence is "too short" in the sense that the end of the sequence is reached before the translation process is complete. There are a variety of options in this case, including failure (assigning this individual the worst possible fitness) and wrapping (continuing the translation process, moving back to the front of the numeric sequence). Grammatical evolution has been very successful and is widely used.
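The whole genotype-to-phenotype mapping can be sketched in a few lines. The grammar below mirrors the four rules discussed above (one alternative for tree, two for E, four for op, three for var), written with ASCII operators; the codon sequences, and the names decode and max_wraps, are purely illustrative. Note one assumption: implementations differ on whether a codon is consumed when a rule has only a single alternative; here every expansion consumes one codon.

```python
# Each nonterminal maps to its list of alternatives (lists of symbols).
GRAMMAR = {
    "tree": [["E", "*", "sin(", "E", "*", "t", ")"]],
    "E":    [["var"], ["(", "E", "op", "E", ")"]],
    "op":   [["+"], ["-"], ["*"], ["/"]],
    "var":  [["x"], ["y"], ["z"]],
}

def decode(codons, start="tree", max_wraps=2):
    """Map a GE integer sequence to a program string.  The leftmost
    nonterminal is always expanded next, choosing among its alternatives
    with codon % (number of alternatives)."""
    symbols = [start]
    i, wraps, output = 0, 0, []
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:          # a terminal of the grammar: emit it
            output.append(sym)
            continue
        if i == len(codons):            # genome exhausted: wrap or fail
            wraps += 1
            if wraps > max_wraps:
                return None             # "too short": assign worst fitness
            i = 0
        alternatives = GRAMMAR[sym]
        choice = alternatives[codons[i] % len(alternatives)]
        i += 1
        symbols = list(choice) + symbols
    return "".join(output)

print(decode([39, 7, 0, 1, 3, 0, 0, 0, 2]))   # (y/x)*sin(z*t)
```

Trailing codons that are never consulted are simply ignored, and setting max_wraps=0 turns wrapping off so that an exhausted genome fails immediately.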
6.2.4 Advantages and Disadvantages of Constraints
One of the common arguments in favour of constraint systems like types and grammars is that they limit the search space by restricting the kind of structures that can be constructed. While this is true, it can come at a price.
An expressive type system typically requires more complex machinery to support it: it is often more difficult to generate type-correct individuals in the initial population or during mutation, and more difficult to find crossover points that do not violate the type system. In an extreme case like constructive type theory (?), the type system is so powerful that it can completely express the formal specification of the program. Thus, any program/expression having this type is guaranteed to meet that specification. In GP this would mean that all the members of the initial population would need to be solutions to the problem, putting all the problem-solving burden on the initialisation phase and removing the need for any evolution at all! Even without such extreme constraints, it has often been found necessary to develop additional machinery in order to efficiently generate an initial population that satisfies the constraints (Montana, 1995; Ratle and Sebag, 2000; Schoenauer and Sebag, 2001; ?).
As a rule, systems that focus on syntactic constraints (such as grammar-based systems) require less machinery than those that focus on semantic constraints (such as type systems), since it is typically easier to satisfy the syntactic constraints in a mechanistic fashion. For example, grammar-based systems, such as grammatical evolution and the various TAG-based systems, are typically simple to initialise, and mutation and crossover need to enforce few, if any, constraints on the new child. The work (and the bias) in these systems is much more in the choice of the grammar, and once it has been designed, there is often little additional work required of the practitioner or the GP system to enforce the implied constraints.
While a constraint system may limit the search space in valuable ways (Ratle and Sebag, 2000) and can improve performance on interesting problems (Hoai, McKay, and Essam, 2006), there is no general guarantee that constraint systems will make the evolutionary search process easier. There is no broad assurance that a constraint will increase the density of solutions or (perhaps more importantly) approximate solutions. Also, while there are cases where constraint systems smooth the search landscape (Hoai et al., 2006), it is also possible for constraint systems to make the search landscape more rugged by preventing genetic operations from creating intermediate forms on potentially valuable evolutionary paths. In the future, it might be useful to extend solution density studies such as those summarised in (Langdon and Poli, 2002) to the landscapes generated by constraint systems in order to better understand the impact of these constraints on the underlying search spaces.
In summary, while types, grammars, and other constraint systems can be powerful tools, all such systems carry biases. One therefore needs to be careful to explore the biases introduced by the constraints and not simply assume that they are beneficial.