### Genetic algorithms in Python

Although I’m fond of numerical optimization through gradients, there are times where a global optimization is much more powerful. For instance, I have to generate two sequences/combs that are orthogonal and whose autocorrelation is almost an impulse. The two combs have a fixed number of impulses, so it’s a perfect job for genetic algorithms.

Genetic algorithms are a global optimization technique. The parameters are encoded in a genome, and then successive populations are grown; the fittest individuals survive and produce new individuals. The usual implementation in Python is PyEvolve, a pure Python package that doesn’t depend on anything, except if you want to save and plot populations.

#### Creating a genome and a cost function

So my goal is to minimize cross-correlation and autocorrelation, save for the central value of the autocorrelation. I won’t use a fixed-size sequence; instead, I encode in the genome the distance between two consecutive impulses. This allows for a really small genome with a lot of possibilities.
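As a sketch of this encoding (with a hypothetical 4-gap chromosome), the gaps are turned into impulse positions with a cumulative sum, and the positions into a 0/1 time sequence:

```python
import numpy

# A chromosome stores the gaps between consecutive impulses,
# not the impulse positions themselves.
gaps = [3, 2, 4, 1]  # hypothetical 4-impulse comb

# The impulse positions are the cumulative sums of the gaps.
positions = numpy.cumsum(gaps)  # [3, 5, 9, 10]

# Build the corresponding time sequence: 1 at each impulse, 0 elsewhere.
comb = numpy.zeros(positions[-1] + 1)
comb[positions] = 1

print(comb)  # [0. 0. 0. 1. 0. 1. 0. 0. 0. 1. 1.]
```

Because only the gaps are stored, the genome stays small no matter how spread out the impulses are.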

For this purpose, I create a class named OrthogonalSequences:

```python
import numpy

class OrthogonalSequences(object):
    def __init__(self, size, number):
        self.size = size
        self.number = number
        self.genome_length = size * number

    @staticmethod
    def convert_slice_to_time(chromosome):
        """
        Converts a chromosome (list of gaps) into a time sequence
        """
        total = numpy.cumsum(chromosome)
        array = numpy.zeros(total[-1] + 1)
        array[total] = 1
        return array

    def evaluate(self, chromosomes):
        """
        Evaluates the cost (square of error) of the genome from its
        cross- and autocorrelations.
        """
        value = 0
        for i in range(self.number):
            slice_i = self.convert_slice_to_time(chromosomes[i * self.size:(i+1) * self.size])
            value += .1 * len(slice_i)  # small penalty for lengthy combs
            for j in range(i, self.number):
                slice_j = self.convert_slice_to_time(chromosomes[j * self.size:(j+1) * self.size])
                correlation = numpy.correlate(slice_i, slice_j, mode="full")
                if i == j:
                    # remove the central autocorrelation peak so that
                    # only the sidelobes are penalized
                    correlation[len(slice_i) - 1] -= self.size

                value += numpy.sum(correlation ** 2)

        return value
```

In fact, this class will only be used to evaluate a genome. As I have two combs (and more if you’d like), I need to pass this information to the evaluation function, and nothing is easier than Python for this. I’ve also added a small factor that adds a cost for lengthy combs.
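The `-= self.size` line in `evaluate` relies on a simple fact: at zero lag, the autocorrelation of a 0/1 comb always equals its number of impulses, so that central peak must be removed before penalizing the sidelobes. A quick check with an arbitrary toy comb:

```python
import numpy

# A comb with 4 impulses (positions chosen arbitrarily for illustration).
comb = numpy.zeros(11)
comb[[3, 5, 9, 10]] = 1

autocorr = numpy.correlate(comb, comb, mode="full")

# In "full" mode, zero lag sits at index len(comb) - 1, and for a 0/1 comb
# its value is simply the number of impulses.
print(autocorr[len(comb) - 1])  # 4.0
```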

#### Setting up the optimizer

Now, the only thing remaining is to set all objects and launch the optimization:

```python
if __name__ == "__main__":
    from pyevolve import G1DList, GSimpleGA, Consts
    import sys

    size = int(sys.argv[1]) if len(sys.argv) > 1 else 10
    size -= 1
    number = int(sys.argv[2]) if len(sys.argv) > 2 else 2
    rangemax = int(sys.argv[3]) if len(sys.argv) > 3 else 100
    generations = int(sys.argv[4]) if len(sys.argv) > 4 else 50
    population = int(sys.argv[5]) if len(sys.argv) > 5 else 100

    genome = G1DList.G1DList(number * size)
    sequences = OrthogonalSequences(size, number)
    ga = GSimpleGA.GSimpleGA(genome)

    genome.setParams(rangemin=1, rangemax=rangemax)
    genome.evaluator.set(sequences.evaluate)

    ga.setMinimax(Consts.minimaxType["minimize"])

    ga.setPopulationSize(population)
    ga.setElitism(True)
    ga.setGenerations(generations)
    ga.evolve(freq_stats=generations / 10)
    print ga.bestIndividual()
```

#### Results

Now the result is the following for 2 sequences of 10 impulses:

```
Gen. 1 (2.00%): Max/Min/Avg Fitness(Raw) [436.73(387.20)/317.81(349.20)/363.94(363.94)]
Gen. 5 (10.00%): Max/Min/Avg Fitness(Raw) [417.89(375.20)/340.39(345.20)/348.24(348.24)]
Gen. 10 (20.00%): Max/Min/Avg Fitness(Raw) [419.21(379.20)/339.65(345.20)/349.34(349.34)]
Gen. 15 (30.00%): Max/Min/Avg Fitness(Raw) [417.72(377.20)/341.16(345.20)/348.10(348.10)]
Gen. 20 (40.00%): Max/Min/Avg Fitness(Raw) [419.40(375.20)/337.80(345.20)/349.50(349.50)]
Gen. 25 (50.00%): Max/Min/Avg Fitness(Raw) [419.86(389.20)/341.55(345.20)/349.88(349.88)]
Gen. 30 (60.00%): Max/Min/Avg Fitness(Raw) [418.25(381.20)/341.41(345.20)/348.54(348.54)]
Gen. 35 (70.00%): Max/Min/Avg Fitness(Raw) [418.08(375.20)/340.08(345.20)/348.40(348.40)]
Gen. 40 (80.00%): Max/Min/Avg Fitness(Raw) [418.42(375.20)/339.53(345.20)/348.68(348.68)]
Gen. 45 (90.00%): Max/Min/Avg Fitness(Raw) [418.80(381.20)/340.76(345.20)/349.00(349.00)]
Gen. 50 (100.00%): Max/Min/Avg Fitness(Raw) [417.94(371.20)/338.92(345.20)/348.28(348.28)]
Total time elapsed: 2.478 seconds.
- GenomeBase
Score:                   345.200000
Fitness:                 338.919595

Slot [Evaluator] (Count: 1)
Name: evaluate
Slot [Initializator] (Count: 1)
Name: G1DListInitializatorInteger
Doc:  Integer initialization function of G1DList

This initializator accepts the *rangemin* and *rangemax* genome parameters.

Slot [Mutator] (Count: 1)
Name: G1DListMutatorSwap
Doc:  The mutator of G1DList, Swap Mutator
Slot [Crossover] (Count: 1)
Name: G1DListCrossoverSinglePoint
Doc:  The crossover of G1DList, Single Point

.. warning:: You can't use this crossover method for lists with just one element.

- G1DList
List size:       18
List:            [36, 48, 1, 11, 95, 37, 34, 38, 25, 10, 29, 64, 19, 11, 62, 68, 7, 15]
```

Here are the visual results for this optimization:

They are not perfect, far from it, but remember that these are combs with only positive impulses, which means you can’t get a perfect result. An almost perfect result would have the cross-correlation below 1 everywhere; with only a single value of 2, this one is still pretty good.
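We can sanity-check the printed list by decoding it back into the two combs and measuring the correlations (assuming the default parameters above, i.e. 9 gaps per comb):

```python
import numpy

# The best individual printed above, holding the two 9-gap chromosomes
# back to back (size 10 - 1 = 9 gaps per comb).
best = [36, 48, 1, 11, 95, 37, 34, 38, 25, 10, 29, 64, 19, 11, 62, 68, 7, 15]

def to_comb(gaps):
    """Decode a gap chromosome into a 0/1 impulse train."""
    positions = numpy.cumsum(gaps)
    comb = numpy.zeros(positions[-1] + 1)
    comb[positions] = 1
    return comb

comb1 = to_comb(best[:9])
comb2 = to_comb(best[9:])

# The autocorrelation peak equals the number of impulses (9)...
print(max(numpy.correlate(comb1, comb1, mode="full")))  # 9.0
# ...while the cross-correlation never exceeds 2, as claimed.
print(max(numpy.correlate(comb1, comb2, mode="full")))  # 2.0
```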

#### Before the end

The process is very fast for 2 sequences of 10 impulses. If you add more sequences and more impulses, the computation time will rise quickly, but PyEvolve now supports the multiprocessing module, which means it can run optimizations in parallel.

PyEvolve can do much more; I’ve only scratched the surface of its capabilities, but I hope this example will make you try this package.

## 2 thoughts on “Genetic algorithms in Python”

1. John G says:

You might also be interested in the PaGMO project at the European Space Agency, it has Python bindings. It allows for parallel optimisation with genetic algorithms using the island model.

http://sourceforge.net/apps/mediawiki/pagmo/index.php?title=Main_Page

2. Peter says:

Interesting post. I wonder how easy it would be to do the same with the (old and perhaps somewhat limited) Genetic Algorithm code in Biopython? We’d like someone to adopt the Bio.GA code ideally…
http://biopython.org/DIST/docs/api/Bio.GA-module.html
