Contents Top Previous Next

3.2 Step 2: Function Set

The function set used in GP is typically driven by the nature of the problem domain. In a simple numeric problem, for example, the function set may consist of merely the arithmetic functions (+, -, *, /). However, all sorts of other functions and constructs typically encountered in computer programs can be used. Table  3.1 shows a sample of some of the functions one sees in the GP literature. Sometimes the primitive set includes specialised functions and terminals which are designed to solve problems in a specific problem domain. For example, if the goal is to program a robot to mop the floor, then the function set might include such actions as move, turn, and swish-the-mop.

Table 3.1: Examples of primitives in GP function and terminal sets.

Function Set

Kind of PrimitiveExample(s)

Arithmetic+, *, /
Mathematicalsin, cos, exp
Boolean AND, OR, NOT
 .. . .. .


Terminal Set

Kind of PrimitiveExample(s)

Variablesx, y
Constant values3, 0.45
0-arity functionsrand, go_left

3.2.1 Closure

For GP to work effectively, most function sets are required to have an important property known as closure (Koza1992), which can in turn be broken down into the properties of type consistency and evaluation safety.

Type consistency is required because subtree crossover (as described in Section  2.4 ) can mix and join nodes arbitrarily. As a result it is necessary that any subtree can be used in any of the argument positions for every function in the function set, because it is always possible that subtree crossover will generate that combination. It is thus common to require that all the functions be type consistent, i.e., they all return values of the same type, and that each of their arguments also have this type. For example +, -, *, and / can can be defined so that they each take two integer arguments and return an integer. Sometimes type consistency can be weakened somewhat by providing an automatic conversion mechanism between types. We can, for example, convert numbers to Booleans by treating all negative values as false, and non-negative values as true. However, conversion mechanisms can introduce unexpected biases into the search process, so they should be used with care.

The type consistency requirement can seem quite limiting but often simple restructuring of the functions can resolve apparent problems. For example, an if function is often defined as taking three arguments: the test, the value to return if the test evaluates to true and the value to return if the test evaluates to false. The first of these three arguments is clearly Boolean, which would suggest that if can't be used with numeric functions like +. This, however, can easily be worked around by providing a mechanism to convert a numeric value into a Boolean automatically as discussed above. Alternatively, one can replace the 3-input if with a function of four (numeric) arguments a,b,c,d. The 4-input if implements "If a < b then return value c otherwise return value d".

An alternative to requiring type consistency is to extend the GP system. Crossover and mutation might explicitly make use of type information so that the children they produce do not contain illegal type mismatches. When mutating a legal program, for example, mutation might be required to generate a subtree which returns the same type as the subtree it has just deleted. This is discussed further in Section  6.2 .

The other component of closure is evaluation safety. Evaluation safety is required because many commonly used functions can fail at run time. An evolved expression might, for example, divide by 0, or call MOVE_FORWARD when facing a wall or precipice. This is typically dealt with by modifying the normal behaviour of primitives. It is common to use protected versions of numeric functions that can otherwise throw exceptions, such as division, logarithm, exponential and square root. The protected version of a function first tests for potential problems with its input(s) before executing the corresponding instruction; if a problem is spotted then some default value is returned. Protected division (often notated with %) checks to see if its second argument is 0. If so, % typically returns the value 1 (regardless of the value of the first argument).1 Similarly, in a robotic application a MOVE_AHEAD instruction can be modified to do nothing if a forward move is illegal or if moving the robot might damage it.

An alternative to protected functions is to trap run-time exceptions and strongly reduce the fitness of programs that generate such errors. However, if the likelihood of generating invalid expressions is very high, this can lead to too many individuals in the population having nearly the same (very poor) fitness. This makes it hard for selection to choose which individuals might make good parents.

One type of run-time error that is more difficult to check for is numeric overflow. If the underlying implementation system throws some sort of exception, then this can be handled either by protection or by penalising as discussed above. However, it is common for implementation languages to ignore integer overflow quietly and simply wrap around. If this is unacceptable, then the GP implementation must include appropriate checks to catch and handle such overflows.

3.2.2 Sufficiency

There is one more property that primitives sets should have: sufficiency. Sufficiency means it is possible to express a solution to the problem at hand using the elements of the primitive set.2 Unfortunately, sufficiency can be guaranteed only for those problems where theory, or experience with other methods, tells us that a solution can be obtained by combining the elements of the primitive set.

As an example of a sufficient primitive set consider {AND, OR, NOT, x1, x2, ..., xN}. It is always sufficient for Boolean induction problems, since it can produce all Boolean functions of the variables x1, x2, ..., xN. An example of insufficient set is {+, -, *, /, x, 0, 1, 2}, which is unable to represent transcendental functions. The function exp(x), for example, is transcendental and therefore cannot be expressed as a rational function (basically, a ratio of polynomials), and so cannot be represented exactly by any combination of {+, -, *, /, x, 0, 1, 2}. When a primitive set is insufficient, GP can only develop programs that approximate the desired one. However, in many cases such an approximation can be very close and good enough for the user's purpose. Adding a few unnecessary primitives in an attempt to ensure sufficiency does not tend to slow down GP overmuch, although there are cases where it can bias the system in unexpected ways.

3.2.3 Evolving Structures other than Programs

There are many problems where solutions cannot be directly cast as computer programs. For example, in many design problems the solution is an artifact of some type: a bridge, a circuit, an antenna, a lens, etc. GP has been applied to problems of this kind by using a trick: the primitive set is set up so that the evolved programs construct solutions to the problem. This is analogous to the process by which an egg grows into a chicken. For example, if the goal is the automatic creation of an electronic controller for a plant, the function set might include common components such as integrator, differentiator, lead, lag, and gain, and the terminal set might contain reference, signal, and plant output. Each of these primitives, when executed, inserts the corresponding device into the controller being built. If, on the other hand, the goal is to synthesise analogue electrical circuits, the function set might include components such as transistors, capacitors, resistors, etc. See Section  6.3 for more information on developmental GP systems.

Contents Top Previous Next