Although I’m fond of numerical optimization through gradients, … there are some times where a global optimization is much more powerfull. For instance, I have to generate two sequences/combs that are orthogonal and for which their autocorrelation is almost an impulse. The two combs have a fixed number of impulse, so it’s a perfect job for genetic algorithms.
Genetic algorithms are a global optimization technique. The parameters are encoded in a genome, and then different populations are grown. The fittest individuals survive and give new individuals. The usual implementation in Python is PyEvolve, a pure Python package that isn’t depended on anything except if you want to save and plot populations.
Creating a genome and a cost function
So my goal is to minimize crosscorrelation and autocorrelation, safe for the middle value of autocorrelation. I won’t use a fixed-size sequence, I will encode in the genome the distance between two impulses. This will allow for a really small genome with a lot of possibilities.
For this purpose, I create a class named OrthogonalSequences:
class OrthogonalSequences(object):
def __init__(self, size, number):
self.size = size
self.number = number
self.genome_length = size * number
@staticmethod
def convert_slice_to_time(chromosome):
""""
Converts a chromosome in a time sequence
""""
total = numpy.cumsum(chromosome)
array = numpy.zeros(total[-1]+1)
array[total] = 1
array[0] = 1
return array
def evaluate(self, chromosomes):
""""
Evaluate the cost (square of error) of the genome by evaluating cross- and autocorrelation.
""""
value = 0
for i in range(self.number):
slice_i = self.convert_slice_to_time(chromosomes[i * self.size:(i+1) * self.size])
value += .1 * len(slice_i)
for j in range(i, self.number):
slice_j = self.convert_slice_to_time(chromosomes[j * self.size:(j+1) * self.size])
correlation = numpy.correlate(slice_i, slice_j, mode = "full")
if (i == j):
correlation[len(slice_i)-1] -= self.size
value += numpy.sum(correlation ** 2)
return value
In fact this class will only be used to evaluate a genome. As I have two combs (and more if you’d like), I need to pass this information to the evaluation function, and thus nothing is more easy than Python for this. I’ve also added a small factor that adds a cost for lengthy combs.
Setting up the optimizer
Now, the only thing remaining is to set all objects and launch the optimization:
if __name__ == "__main__":
from pyevolve import G1DList, GSimpleGA, Consts
import sys
size = int(sys.argv[1]) if len(sys.argv) > 1 else 10
size -= 1
number = int(sys.argv[2]) if len(sys.argv) > 2 else 2
rangemax = int(sys.argv[3]) if len(sys.argv) > 3 else 100
generations = int(sys.argv[4]) if len(sys.argv) > 4 else 50
population = int(sys.argv[5]) if len(sys.argv) > 5 else 100
genome = G1DList.G1DList(number*size)
sequences = OrthogonalSequences(size, number)
ga = GSimpleGA.GSimpleGA(genome)
genome.setParams(rangemin=1, rangemax=rangemax)
genome.evaluator.set(sequences.evaluate)
ga.setMinimax(Consts.minimaxType["minimize"])
ga.setPopulationSize(population)
ga.setElitism(True)
ga.setGenerations(generations)
ga.evolve(freq_stats = generations/10)
print ga.bestIndividual()
Results
Now the result is the following for 2 sequences of 10 impulses:
Gen. 1 (2.00%): Max/Min/Avg Fitness(Raw) [436.73(387.20)/317.81(349.20)/363.94(363.94)] Gen. 5 (10.00%): Max/Min/Avg Fitness(Raw) [417.89(375.20)/340.39(345.20)/348.24(348.24)] Gen. 10 (20.00%): Max/Min/Avg Fitness(Raw) [419.21(379.20)/339.65(345.20)/349.34(349.34)] Gen. 15 (30.00%): Max/Min/Avg Fitness(Raw) [417.72(377.20)/341.16(345.20)/348.10(348.10)] Gen. 20 (40.00%): Max/Min/Avg Fitness(Raw) [419.40(375.20)/337.80(345.20)/349.50(349.50)] Gen. 25 (50.00%): Max/Min/Avg Fitness(Raw) [419.86(389.20)/341.55(345.20)/349.88(349.88)] Gen. 30 (60.00%): Max/Min/Avg Fitness(Raw) [418.25(381.20)/341.41(345.20)/348.54(348.54)] Gen. 35 (70.00%): Max/Min/Avg Fitness(Raw) [418.08(375.20)/340.08(345.20)/348.40(348.40)] Gen. 40 (80.00%): Max/Min/Avg Fitness(Raw) [418.42(375.20)/339.53(345.20)/348.68(348.68)] Gen. 45 (90.00%): Max/Min/Avg Fitness(Raw) [418.80(381.20)/340.76(345.20)/349.00(349.00)] Gen. 50 (100.00%): Max/Min/Avg Fitness(Raw) [417.94(371.20)/338.92(345.20)/348.28(348.28)] Total time elapsed: 2.478 seconds. - GenomeBase Score: 345.200000 Fitness: 338.919595 Slot [Evaluator] (Count: 1) Name: evaluate Slot [Initializator] (Count: 1) Name: G1DListInitializatorInteger Doc: Integer initialization function of G1DList This initializator accepts the *rangemin* and *rangemax* genome parameters. Slot [Mutator] (Count: 1) Name: G1DListMutatorSwap Doc: The mutator of G1DList, Swap Mutator Slot [Crossover] (Count: 1) Name: G1DListCrossoverSinglePoint Doc: The crossover of G1DList, Single Point .. warning:: You can't use this crossover method for lists with just one element. - G1DList List size: 18 List: [36, 48, 1, 11, 95, 37, 34, 38, 25, 10, 29, 64, 19, 11, 62, 68, 7, 15]
Here are the visual results for this optimization:
![]() |
![]() |
![]() |
They are not perfect, far from it, but remember that they are only combs with positive impulses. This means that you can’t get a perfect result. An almost-perfect result would be that the crosscorrelation is below 1, but with only one value of 2, it’s still pretty good.
Before the end
The process is very fast for 2 sequences of 10 impulses. If you add more sequences and more impulses, the computation time will rise fastly, but PyEvolve has now the ability to use the multiprocessing module, which means it can do parallel optimizations now.
PyEvolve can do much more, I only barely scratch the surface of its capablities, but I hope this example will make you try this package.
You might also be interested in the PaGMO project at the European Space Agency, it has Python bindings. It allows for parallel optimisation with genetic algorithms using the island model.
http://sourceforge.net/apps/mediawiki/pagmo/index.php?title=Main_Page
Interesting post. I wonder how easy it would be to do the same with the (old and perhaps somewhat limited) Genetic Algorithm code in Biopython? We’d like someone to adopt the Bio.GA code ideally…
http://biopython.org/DIST/docs/api/Bio.GA-module.html