Contents Top Previous Next

10.5 Geographically Distributed GP

Unless some type of synchronisation is imposed, the parallel forms of GP in which different parts of a population are evolved by different processing elements will not be running the same algorithm as the standard single-CPU version of GP. Therefore, almost certainly, different parallelisations will produce different answers. However, as we discussed in Section  10.3 , this is not necessarily a bad thing.

Parallelisation itself can bring benefits similar to those hypothesised in natural populations by ?. In particular, the population is often divided into semi-independent sub-populations called demes (Collins1992D'haeseleer and Bluming1994Langdon1998Popovici and De Jong2006). The flow of genetic material between demes is restricted by limiting the exchange of individuals between them. The limit can be on the number of individuals that are allowed to migrate per generation. Alternatively the demes may be considered to be arranged in a "geographical" topology that constrains which demes can trade individuals. For example, it may be that with limited migration between compute nodes, the evolved populations on adjacent nodes will diverge, and that this increased diversity may lead to better solutions. Fernandez, Tomassini, and Vanneschi (2003), for example, report that distributing individuals between subpopulations offers an advantage in terms of quality of solutions and computational effort.

When Koza first started using GP on a network of Transputers (Andre and Koza1996), Andre experimentally determined the best migration rate for their problem. He suggested Transputers arranged in an asynchronous 2-D toroidal square grid (such as the one in Figure  10.2 a) should exchange 2% of their population with their four neighbours.

PIC      PIC
(a)      (b)
Figure 10.2: Spatially structured GP populations. (a) Toroidal grid of demes, where each deme (a node) contains a sub-population, and demes periodically exchange a small group of high-fitness individuals using a grid of communication channels. (b) Fine-grained distributed GP, where each grid cell contains one individual and where the selection of a mating partner for the individual in the centre cell is performed by executing a tournament among randomly selected individuals (e.g., the individuals shaded) in its 3 × 3 neighbourhood.

Densely connected grids have been widely adopted in parallel GP. Usually they allow innovative partial solutions to spread quickly. However, the GA community reported better results from less connected topologies, such as arranging the compute nodes' populations in a ring, so that they could transfer genes only between themselves and their two neighbours (Stender1993). Potter ( 1997) argues in favour of spatial separation in populations and fine-grained distributed forms of GP (see Figure  10.2 b). ? gives some guidance on parallel genetic algorithms.

While many have looked enviously at Koza's 1000 node Beowulf cluster (Sterling1998) and other supercomputer realisations of GP (Bennett, Koza, Shipman, and Stiffelman1999Juille and Pollack1996), a supercomputer is often not necessary. Many businesses and research centres leave computers permanently switched on. During the night their computational resources tend to be wasted. This computing power can easily and efficiently be used to execute distributed GP runs overnight. Typically, GP does not demand a high performance bus to interconnect the compute nodes, and so existing office Ethernet networks are often sufficient. While parallel GP systems can be implemented using MPI (?) or PVM (Fernandez, Sanchez, Tomassini, and Gomez1999), the use of such tools is not necessary: simple Unix commands and port-to-port HTTP is sufficient (Poli, Page, and Langdon1999). The population can be split and stored on modest computers. With only infrequent interchange of parts of the population or fitness values little bandwidth is needed. Indeed a global population spread via the Internet (Chong and Langdon1999Draves2006Klein and Spector2007Langdon2005a), ą la seti@home, is perfectly feasible (see Figure  10.3 ).

Other parallel GPs include (Cheang, Leung, and Lee2006Folino, Pizzuti, and Spezzano2003Gustafson and Burke2006Klein and Spector2007Tanev, Uozumi, and Akhmetov2004).


Figure 10.3: A globally distributed GP system (Langdon2005a). The server is the centre of the star architecture, with the lines connecting it to users around the world. The users evolved snowflake patterns using a continuously evolving L-System, and their (subjective) preferences provided the fitness measure used to drive the system.

Contents Top Previous Next