### Dimensionality reduction: Isomap

Isomap is one of the “oldest” tools for dimensionality reduction. It aims at reproducing, in a Euclidean space, the geodesic distances measured on the manifold (geodesic distances are a property of Riemannian manifolds).

To approximate the geodesic distances, a graph is built in which an edge links two close points (K-nearest neighbors or Parzen windows can be used to choose the closest points), weighted by the Euclidean distance between them. Then a square matrix holding the shortest path length between every pair of points is computed with Dijkstra’s or the Floyd-Warshall algorithm. This relies on properties of distances and of Riemannian manifolds. The number of neighbors used to build the graph is generally chosen based on the estimated distances on the manifold.

Finally, a classical MDS procedure is performed on this distance matrix to get a set of coordinates.
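
The graph and shortest-path steps described above can be sketched in plain Python. This is only a minimal illustration, not the implementation discussed in this post: the function names `knn_graph` and `geodesic_distances` are made up for the example, the neighborhood is a symmetric K-nearest-neighbor graph, the shortest paths use Floyd-Warshall, and the final MDS step is left out.

```python
import math

def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph as a dense weight matrix."""
    n = len(points)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        # Sort the other points by Euclidean distance and keep the k closest.
        nearest = sorted((math.dist(points[i], points[j]), j)
                         for j in range(n) if j != i)[:k]
        for d, j in nearest:
            graph[i][j] = graph[j][i] = d
    return graph

def geodesic_distances(points, k):
    """Approximate geodesic distances with Floyd-Warshall shortest paths."""
    dist = knn_graph(points, k)
    n = len(dist)
    for via in range(n):
        for i in range(n):
            for j in range(n):
                candidate = dist[i][via] + dist[via][j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
    return dist

# Hypothetical data: 21 points sampled on a half circle of radius 1.
points = [(math.cos(t), math.sin(t))
          for t in (math.pi * s / 20 for s in range(21))]
geodesic = geodesic_distances(points, k=2)
```

The two ends of the half circle are at Euclidean distance 2, while `geodesic[0][20]` follows the curve and comes out close to π, which is the distance a classical MDS would then try to preserve.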

### More on manifold learning

I hope to present some results here in February, but for now here is what I’ve implemented so far:

• Isomap
• LLE
• Laplacian Eigenmaps
• Hessian Eigenmaps
• Diffusion Maps (in fact a variation of Laplacian Eigenmaps)
• Curvilinear Component Analysis (the reduction part)
• NonLinear Mapping (Sammon)
• My own technique (reduction, regression and projection)
• PCA (usual reduction, but robust projection with an a priori term)

The results I will show here are mainly comparisons of the reductions produced by each technique, keeping in mind that each technique has a specific field of application: LLE is not designed to preserve geodesic distances, whereas Isomap, NLM and my technique are.

### A new French book on scientific computing with Python

My first book, on Python for scientists, ships today. Although IT people can learn a lot of Python from it (mainly if they work in labs or research centers), scientists will be more interested, as it presents a viable alternative to Matlab: fast, efficient, and a real language with a large standard library.

After an introduction, the Python language is presented along with some of its main modules. The three central chapters are dedicated to Numpy, Scipy and Matplotlib. Each library tackles a specific problem: storing data, using it, or displaying it. Finally, the last chapter presents ways of speeding up Python with C or C++.

The link to my publisher : here

### Transforming a C++ vector into a Numpy array

This question was asked on the Scipy mailing-list last year (well, one week ago). Nathan Bell proposed a skeleton that I used to create an out typemap for SWIG.

```
%typemap(out) std::vector<double> {
    int length = $1.size();
    $result = PyArray_FromDims(1, &length, NPY_DOUBLE);
    memcpy(PyArray_DATA((PyArrayObject*)$result), &((*(&$1))[0]), sizeof(double)*length);
}
```

This typemap obviously uses Numpy, so don’t forget to initialize the module and to import it. The memcpy call contains a strange expression: `&((*(&$1))[0])` takes the address of the vector’s underlying array. Because the vector is wrapped by SWIG, one first has to reach the std::vector by dereferencing the SWIG wrapper; then one can get the first element of the vector and take its address.

Edit, May 2017: here is my most recent attempt at this.

```
%typemap(out) std::vector<float> {
    npy_intp length = $1.size();
    $result = PyArray_SimpleNew(1, &length, NPY_FLOAT);
    memcpy(PyArray_DATA((PyArrayObject*)$result), $1.data(), sizeof(float)*length);
}
```

### Creating a Python module with Scons and SWIG

Some time ago, I proposed an optional build for SWIG when the SWIG binary was not found on the system. Here I propose an enhancement: a new library builder that is registered in the environment env as PythonModule. It takes the same arguments as a classical SharedLibrary, but it performs some additional steps:

• It forces SWIG to create a Python wrapper (flag -python)
• It checks if SWIG is present at all
• It removes any prefix that the system would otherwise add (such as lib on Linux)
• On Windows and for Python >= 2.5, it changes the extension to pyd
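
The steps in this list can be sketched as a small helper that computes the overrides such a builder would apply. This is not the PythonModule builder itself, only an illustration: `swig_available` and `python_module_settings` are hypothetical names, while `SHLIBPREFIX`, `SHLIBSUFFIX` and `SWIGFLAGS` are existing Scons construction variables.

```python
import shutil

def swig_available(executable="swig"):
    """Return True if a SWIG binary can be found on the PATH."""
    return shutil.which(executable) is not None

def python_module_settings(platform, python_version):
    """Return construction-variable overrides for a Python extension module.

    `platform` follows sys.platform (e.g. "win32", "linux") and
    `python_version` is a tuple such as (2, 5).
    """
    settings = {
        "SWIGFLAGS": ["-python"],  # force SWIG to generate a Python wrapper
        "SHLIBPREFIX": "",         # drop the system prefix (e.g. "lib")
    }
    if platform == "win32" and python_version >= (2, 5):
        settings["SHLIBSUFFIX"] = ".pyd"  # extension expected on Windows
    return settings
```

In an SConstruct, such a dictionary could presumably be passed as keyword overrides to the shared library builder, after checking `swig_available()` to decide whether the SWIG step can run at all.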

### Manifold learning toolbox for Python

As I approach the end of my PhD, I will release my manifold learning code in a scikit (see this page) in a few weeks. For the moment, I don’t know which scikit will be used, but stay tuned…

The content of the scikit will be:

• Isomap
• LLE
• Laplacian eigenmaps
• Diffusion maps

### Enabling thread support in SWIG and Python

I spent some days searching the SWIG documentation for a way to release the GIL (Global Interpreter Lock) with SWIG. Some macros were defined in the generated code, but none of them was used anywhere.

In fact, I just had to enable thread support with an additional argument (-threads), and now every wrapped function releases the GIL before it is called. That does not fully satisfy me, though: some of my wrappers must hold the GIL while they run (see this item). So here are the features that can be used:

• nothread enables or disables the whole thread lock for a function
• nothreadblock enables or disables the block thread lock for a function
• nothreadallow enables or disables the allow thread lock for a function

When the whole thread lock is enabled, the GIL is acquired when entering the C wrapper function (with the macro SWIG_PYTHON_THREAD_BEGIN_BLOCK). It is then released just before the call to the wrapped function (with SWIG_PYTHON_THREAD_BEGIN_ALLOW), re-acquired once the call returns (SWIG_PYTHON_THREAD_END_ALLOW), and finally released when exiting the wrapper (SWIG_PYTHON_THREAD_END_BLOCK), after all Python result variables have been created and/or modified.
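
The order of these four macros can be pictured with a toy model in Python. This is only an illustration of the sequence described above, not SWIG’s generated code: `ToyGIL` and `call_wrapped` are made-up names, and the “lock” merely records its state changes.

```python
class ToyGIL:
    """Toy stand-in for the interpreter lock; it only records state changes."""

    def __init__(self):
        self.held = False
        self.trace = []

    def acquire(self, label):
        assert not self.held, "lock acquired twice"
        self.held = True
        self.trace.append(label)

    def release(self, label):
        assert self.held, "lock released while not held"
        self.held = False
        self.trace.append(label)

def call_wrapped(gil, c_function, *args):
    """Mimic the lock handling around one wrapped call."""
    gil.acquire("SWIG_PYTHON_THREAD_BEGIN_BLOCK")  # entering the wrapper
    # Python arguments would be converted here, with the lock held.
    gil.release("SWIG_PYTHON_THREAD_BEGIN_ALLOW")  # just before the C call
    raw = c_function(*args)                        # runs without the lock
    gil.acquire("SWIG_PYTHON_THREAD_END_ALLOW")    # the C call has returned
    result = raw                                   # Python results are built here
    gil.release("SWIG_PYTHON_THREAD_END_BLOCK")    # leaving the wrapper
    return result

gil = ToyGIL()
value = call_wrapped(gil, pow, 2, 5)
```

After the call, `gil.trace` lists the four macro names in the order given above, and the wrapped function only ever runs between the two ALLOW markers, that is, without the lock.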

### Using Scons to create Python modules with Visual Studio 2005

Starting with Visual Studio 2005, every executable or dynamic library must declare the libraries it uses in a manifest file. This manifest can be embedded in the executable or library, which is the best way to deal with it.

When using Scons, this embedding does not happen automatically. One has to override the SharedLibrary builder so that a post-action runs after the library is built:

```
def MSVCSharedLibrary(env, library, sources, **args):
    # Build the library with the original builder, then embed its manifest.
    cat = env.OriginalSharedLibrary(library, sources, **args)
    env.AddPostAction(cat, 'mt.exe -nologo -manifest ${TARGET}.manifest -outputresource:$TARGET;2')
    return cat

# Keep the stock builder under a new name and install the wrapper in its place.
env['BUILDERS']['OriginalSharedLibrary'] = env['BUILDERS']['SharedLibrary']
env['BUILDERS']['SharedLibrary'] = MSVCSharedLibrary
```

With this method, the manifest is embedded in every library, which is handy. The same can be done for the Program builder with the line:

`env.AddPostAction(cat, 'mt.exe -nologo -manifest ${TARGET}.manifest -outputresource:$TARGET;1')`
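
Since the two cases differ only in the builder name and the resource number (2 for a library, 1 for a program), one possible way to avoid writing the wrapper twice is a small factory. This is a sketch under the same assumptions as the code above; `MANIFEST_ACTION`, `embed_manifest` and `install` are hypothetical names, not part of Scons.

```python
MANIFEST_ACTION = ('mt.exe -nologo -manifest ${TARGET}.manifest '
                   '-outputresource:$TARGET;%d')

def embed_manifest(original_name, resource_id):
    """Return a builder replacement that embeds the manifest after building."""
    def builder(env, target, sources, **args):
        # Call the saved original builder, then attach the mt.exe post-action.
        nodes = getattr(env, original_name)(target, sources, **args)
        env.AddPostAction(nodes, MANIFEST_ACTION % resource_id)
        return nodes
    return builder

def install(env):
    """Wrap both builders: resource 2 for libraries, 1 for programs."""
    for name, resource_id in (('SharedLibrary', 2), ('Program', 1)):
        env['BUILDERS']['Original' + name] = env['BUILDERS'][name]
        env['BUILDERS'][name] = embed_manifest('Original' + name, resource_id)
```

Calling `install(env)` once in the SConstruct would then replace the two separate overrides.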

### Wrapping a C++ container in Python

When moving to Python, the big problem that arises is turning a Python array into the C++ container the team has used for years.

Let’s make some assumptions:

• there is a separation between the class containing the data and the class that uses the data (iterators, …)
• the containing class can be changed (policy or strategy pattern)

The first assumption follows from the single responsibility principle: the two classes have two distinct responsibilities. The first allocates the data space and gives simple access to it; the second provides the usual operations (assignment, comparison tests or iteration, for instance).

The second one is the heart of the wrapper. It makes it possible to change the way data is stored and accessed in a simple way.
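
The same separation can be illustrated in Python, as an analogy only: the real design lives in C++, and the class names `OwnedStorage`, `BorrowedStorage` and `Container` below are invented for the example. One class per storage policy holds the data, and a single container class provides the usual operations on top of whichever storage it is given.

```python
from array import array

class OwnedStorage:
    """Storage policy that allocates its own zero-filled buffer."""
    def __init__(self, size):
        self.data = array('d', [0.0] * size)

class BorrowedStorage:
    """Storage policy that reuses an existing buffer without copying it."""
    def __init__(self, buffer):
        self.data = buffer

class Container:
    """Usual operations (indexing, iteration, comparison) over any storage."""
    def __init__(self, storage):
        self._storage = storage

    def __len__(self):
        return len(self._storage.data)

    def __getitem__(self, index):
        return self._storage.data[index]

    def __setitem__(self, index, value):
        self._storage.data[index] = value

    def __iter__(self):
        return iter(self._storage.data)

    def __eq__(self, other):
        return list(self) == list(other)

# A buffer owned by someone else (think of memory coming from a Python array).
buffer = array('d', [1.0, 2.0, 3.0])
view = Container(BorrowedStorage(buffer))
view[1] = 5.0  # writes through to the borrowed buffer, no copy involved
```

Swapping `BorrowedStorage` for `OwnedStorage` changes where the data lives without touching `Container`, which is the point of the second assumption.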
