Contents Top Previous Next

2.2 Initialising the Population

Like in other evolutionary algorithms, in GP the individuals in the initial population are typically randomly generated. There are a number of different approaches to generating this random initial population. Here we will describe two of the simplest (and earliest) methods (the full and grow methods), and a widely used combination of the two known as Ramped half-and-half.

In both the full and grow methods, the initial individuals are generated so that they do not exceed a user specified maximum depth. The depth of a node is the number of edges that need to be traversed to reach the node starting from the tree's root node (which is assumed to be at depth 0). The depth of a tree is the depth of its deepest leaf (e.g., the tree in Figure  2.1 has a depth of 3). In the full method (so named because it generates full trees, i.e. all leaves are at the same depth) nodes are taken at random from the function set until the maximum tree depth is reached. (Beyond that depth, only terminals can be chosen.) Figure  2.3 shows a series of snapshots of the construction of a full tree of depth 2. The children of the * and / nodes must be leaves or otherwise the tree would be too deep. Thus, at both steps t = 3, t = 4, t = 6 and t = 7 a terminal must be chosen (x, y, 1 and 0, respectively).


Figure 2.3: Creation of a full tree having maximum depth 2 using the full initialisation method (t = time).

Although, the full method generates trees where all the leaves are at the same depth, this does not necessarily mean that all initial trees will have an identical number of nodes (often referred to as the size of a tree) or the same shape. This only happens, in fact, when all the functions in the primitive set have an equal arity. Nonetheless, even when mixed-arity primitive sets are used, the range of program sizes and shapes produced by the full method may be rather limited. The grow method, on the contrary, allows for the creation of trees of more varied sizes and shapes. Nodes are selected from the whole primitive set (i.e., functions and terminals) until the depth limit is reached. Once the depth limit is reached only terminals may be chosen (just as in the full method). Figure  2.4 illustrates this process for the construction of a tree with depth limit 2. Here the first argument of the + root node happens to be a terminal. This closes off that branch preventing it from growing any more before it reached the depth limit. The other argument is a function (-), but its arguments are forced to be terminals to ensure that the resulting tree does not exceed the depth limit. Pseudocode for a recursive implementation of both the full and grow methods is given in Algorithm  2.1 .


Figure 2.4: Creation of a five node tree using the grow initialisation method with a maximum depth of 2 (t = time). A terminal is chosen at t = 2, causing the left branch of the root to be closed at that point even though the maximum depth had not been reached.

procedure: gen_rnd_expr(func_set,term_set,max_d,method)

1: if max_d = 0 or (method = grow and rand() < ----|termxset|----- |termxset|+|funcxset|) then

2: expr = choose_random_element( term_set )

3: else

4: func = choose_random_element( func_set )

5: for  i = 1 to arity(func) do

6: arg_i = gen_rnd_expr( func_set, term_set, max_d - 1, method );

7: end for

 8: expr = (func, arg_1, arg_2, ...);

9: end if

 10: return  expr

Notes: func_set is a function set, term_set is a terminal set, max_d is the maximum allowed depth for expressions, method is either full or grow, expr is the generated expression in prefix notation and rand() is a function that returns random numbers uniformly distributed between 0 and 1.

Algorithm 2.1: Pseudocode for recursive program generation with the full and grow methods.

Because neither the grow or full method provide a very wide array of sizes or shapes on their own, Koza ( 1992) proposed a combination called ramped half-and-half. Half the initial population is constructed using full and half is constructed using grow. This is done using a range of depth limits (hence the term "ramped") to help ensure that we generate trees having a variety of sizes and shapes.

While these methods are easy to implement and use, they often make it difficult to control the statistical distributions of important properties such as the sizes and shapes of the generated trees. For example, the sizes and shapes of the trees generated via the grow method are highly sensitive to the sizes of the function and terminal sets. If, for example, one has significantly more terminals than functions, the grow method will almost always generate very short trees regardless of the depth limit. Similarly, if the number of functions is considerably greater than the number of terminals, then the grow method will behave quite similarly to the full method. The arities of the functions in the primitive set also influence the size and shape of the trees produced by grow.3 Section  5.1 (page  98 ) describes other initialisation mechanisms which address these issues.

The initial population need not be entirely random. If something is known about likely properties of the desired solution, trees having these properties can be used to seed the initial population. This, too, will be described in Section  5.1 .

Contents Top Previous Next