For the younger generation, Open Source software seems as certain as the sunrise. Even though I witnessed the emergence of Open Source, I often forget that there was a time when Linux didn’t exist. This recent history has brought us a lot, but we may have picked only some of this revolution’s fruit. Eric Raymond is one of the people behind this revolution, and he took some time to think about the changes it brought.
I’m pleased to announce a new version of scikits.optimization. The main focus of this iteration was to finish the usual unconstrained optimization algorithms.
- Fixes on the Simplex state implementation
- Added several Quasi-Newton steps (BFGS, rank 1 update…)
The scikit can be installed with pip/easy_install or downloaded from PyPI.
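To make the BFGS bullet concrete, here is a minimal sketch of the textbook BFGS update of an inverse Hessian approximation, in pure Python. It is not the scikits.optimization code or API: the function name `bfgs_inverse_update` and the list-of-rows matrix layout are assumptions made for illustration.

```python
def bfgs_inverse_update(H, s, y):
    """Return the BFGS update of an inverse Hessian approximation H.

    s is the step (x_new - x_old), y is the gradient change (g_new - g_old).
    H is a list of rows; s and y are lists of floats. Hypothetical helper,
    not part of any library.
    """
    n = len(s)
    ys = sum(yi * si for yi, si in zip(y, s))
    if ys <= 0.0:
        # Curvature condition failed: skip the update to keep H positive definite.
        return [row[:] for row in H]
    rho = 1.0 / ys
    # A = I - rho * s y^T
    A = [[(1.0 if i == j else 0.0) - rho * s[i] * y[j] for j in range(n)]
         for i in range(n)]
    # AH = A H
    AH = [[sum(A[i][k] * H[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    # H_new = A H A^T + rho * s s^T
    return [[sum(AH[i][k] * A[j][k] for k in range(n)) + rho * s[i] * s[j]
             for j in range(n)]
            for i in range(n)]


identity = [[1.0, 0.0], [0.0, 1.0]]
step = [1.0, 0.5]
grad_change = [2.0, 1.5]
H_new = bfgs_inverse_update(identity, step, grad_change)
```

The updated matrix satisfies the secant condition: multiplying `H_new` by `grad_change` gives back `step`, up to rounding.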
It has been a while, too long for sure, since my last update on this scikit. I’m pleased to announce that some algorithms are finally fixed as well as some tests.
- Fixed Polytope/Simplex/Nelder-Mead
- Fixed the Quadratic Hessian helper class
Additional tutorials will be available in the coming weeks.
Yes, because Cover Trees are sometimes too slow. In fact, I asked myself this question not because of the build time, but because of the search time when the data has a structure. Imagine what would happen if your data were more or less a regular grid. When I tried that, starting with a point at (0,0), then (1,0)…, the first node (0,0) had references to all the last points (9,9), (9,8)… I figured that this would be slower than a tree search, so I decided to give kd-trees a shot for this kind of search on a regular grid.
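As an illustration of the kind of search discussed here, the following is a minimal kd-tree sketch in pure Python, built on the same 10x10 regular grid. It is not the implementation from this post: the function names `build_kdtree` and `nearest` and the nested-tuple node layout are assumptions chosen to keep the example short.

```python
def build_kdtree(points, depth=0):
    """Recursively build a kd-tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest(node, target, depth=0, best=None):
    """Return (squared_distance, point) for the stored point closest to target."""
    if node is None:
        return best
    point, left, right = node
    dist = sum((a - b) ** 2 for a, b in zip(point, target))
    if best is None or dist < best[0]:
        best = (dist, point)
    axis = depth % len(target)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best


# Regular grid from (0,0) to (9,9), as in the experiment described above.
grid = [(x, y) for x in range(10) for y in range(10)]
tree = build_kdtree(grid)
squared_distance, closest = nearest(tree, (3.2, 6.9))
```

Splitting at the median keeps the tree balanced even on a regular grid, so a query only descends a logarithmic number of levels before pruning starts.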
I had to port to C++ a simplex/Nelder-Mead optimizer that I already had in Python. As with the Python version, I tried to be as generic and as efficient as possible, so the state is no longer a dictionary, but a simple structure.
I could have used the Numerical Recipes version, but the licence cost is not worth it, and the code is neither generic enough nor explained enough. Some of its design decisions are also questionable (one method = one responsibility).
I’ve looked on GitHub for a good C++ implementation of Cover Trees for nearest-neighbor search, but I didn’t find one. I may have overlooked some repositories, but in the end, implementing it myself wasn’t that difficult.
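To show what "a simple structure instead of a dictionary" can look like, here is a minimal Nelder-Mead sketch in Python. It is neither the C++ port nor the original Python optimizer: the `SimplexState` dataclass, the `nelder_mead` function, and the coefficients (reflection 1, expansion 2, inside contraction 0.5, shrink 0.5) are assumptions based on a common textbook variant.

```python
from dataclasses import dataclass


@dataclass
class SimplexState:
    """Plain structure holding the optimizer state (instead of a dictionary)."""
    vertices: list      # simplex vertices, each a list of floats
    values: list        # function value at each vertex
    iteration: int = 0

    def best(self):
        """Return (vertex, value) for the lowest function value."""
        i = min(range(len(self.values)), key=self.values.__getitem__)
        return self.vertices[i], self.values[i]


def nelder_mead(f, x0, step=1.0, ftol=1e-10, xtol=1e-6, max_iter=500):
    n = len(x0)
    vertices = [list(x0)]
    for i in range(n):
        v = list(x0)
        v[i] += step
        vertices.append(v)
    state = SimplexState(vertices, [f(v) for v in vertices])

    while state.iteration < max_iter:
        # Sort the vertices from best to worst.
        order = sorted(range(n + 1), key=lambda i: state.values[i])
        state.vertices = [state.vertices[i] for i in order]
        state.values = [state.values[i] for i in order]
        best, worst = state.vertices[0], state.vertices[-1]
        size = max(abs(a - b) for v in state.vertices[1:] for a, b in zip(v, best))
        if state.values[-1] - state.values[0] < ftol and size < xtol:
            break
        state.iteration += 1
        centroid = [sum(v[k] for v in state.vertices[:-1]) / n for k in range(n)]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_r = f(reflected)
        if f_r < state.values[0]:
            expanded = along(2.0)
            f_e = f(expanded)
            if f_e < f_r:
                state.vertices[-1], state.values[-1] = expanded, f_e
            else:
                state.vertices[-1], state.values[-1] = reflected, f_r
        elif f_r < state.values[-2]:
            state.vertices[-1], state.values[-1] = reflected, f_r
        else:
            contracted = along(-0.5)
            f_c = f(contracted)
            if f_c < state.values[-1]:
                state.vertices[-1], state.values[-1] = contracted, f_c
            else:
                # Shrink every vertex halfway towards the best one.
                state.vertices = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in state.vertices[1:]]
                state.values = [state.values[0]] + [f(v) for v in state.vertices[1:]]
    return state


result = nelder_mead(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2, [0.0, 0.0])
minimum, value = result.best()
```

Each branch of the loop handles exactly one move (reflection, expansion, contraction or shrink), which keeps the state structure the only thing shared between them.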
When faced with a new dataset, the issue is to find out how it should be analyzed. A lot of books address the theory, but this book gives practical clues on how to do it. Besides, it isn’t based on commercial tools like MATLAB, but on open source tools that can be freely downloaded from the Internet.
I’ve decided for once to read a novel about software. This book tells the story of Chandler, a piece of software that was a dream that didn’t quite come true.
Profiling comes in three different flavors. The first is emulation, where a processor’s behavior is emulated; the second is sampling, where the profiler samples the status of the program at regular intervals; and the third is instrumentation, where the profiler gets information when a subroutine is called and when it returns. As with the Heisenberg uncertainty principle, profiling changes the exact behavior of your program. This is something you have to remember when analyzing a profile.
Valgrind is an Open Source emulation profiler. It is freely available on standard Linux platforms. As it is an emulation, it is far slower than the actual program. This means that the time spent in I/O is underestimated. The advantage is that you get every detail of the memory behavior (cache misses, for instance). Valgrind does not emulate all processors, but you can tweak it to approximate your own.
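As a small illustration of the instrumentation flavor, Python’s standard `cProfile` module records every function call and return. The sketch below profiles a toy workload; the functions `slow_sum` and `workload` are made up for the example, and the timings will differ from one machine to another.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Toy function that does enough work to show up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    return [slow_sum(20000) for _ in range(5)]


# The profiler hooks into each call and return: this is instrumentation,
# so the measured program runs slower than it would unprofiled.
profiler = cProfile.Profile()
result = profiler.runcall(workload)

# Collect the report in a string, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists the number of calls per function, something a sampling profiler can only estimate, at the cost of a larger overhead on small, frequently called functions.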