Dimensionality reduction: Isomap

Isomap is one of the “oldest” tools for dimensionality reduction. It aims at reproducing geodesic distances (geodesic distances are a property of Riemanian manifolds) on the manifold in an Euclidiean space.

To compute the approximated geodesic distances, a graph is created, an edge linking two close points (K-neighboors or Parzen windows can be used to choose the closest points) with its weight being the Euclidean distance between them. Then, a square matrix is computed with the shortest path between two points with a Dijkstra or Floyd-Warshall algorithm. This follows some distance and Riemanian manifolds properties. The number of points is generally chosen based on the estimated distance on the manifold.

Finally, an classical MDS procedure is performed to get a set of coordinates.
Continue reading Dimensionality reduction: Isomap

More on manifold learning

I hope to present here some result in February, but I’ll expose what I’ve implemented so far :

  • Isomap
  • LLE
  • Laplacian Eigenmaps
  • Hessian Eigenmaps
  • Diffusion Maps (in fact a variation of Laplacian Eigenmaps)
  • Curvilinear Component Analysis (the reduction part)
  • NonLinear Mapping (Sammon)
  • My own technique (reduction, regression and projection)
  • PCA (usual reduction, but robust projection with an a priori term)

The results I will show here are mainly reduction comparison between the techniques, knowing that each technique has a specific field of application : LLE is not made to respect the geodesic distances, Isomap, NLM and my technique are.

Buy Me a Coffee!
Other Amount:
Your Email Address:

A new French book on scientific computing with Python

Today ships my first book on Python for the scientists. Although IT people can learn a lot of Python with it (mainly if they are working in labs are research centers), scientists will be more interested as it presents a viable alternative to Matlab : fast, efficient, a real language with a large standard library.

After an introduction, the Python language is exposed as well as some main modules. The three central chapters are dedicated to Numpy, Scipy and Matplotlib. Each library tackles a specific problem, storing data, using it or display it. Finally, the last chapter exposes ways of speeding up Python with the use of C or C++.

The link to my publisher : here

Transforming a C++ vector into a Numpy array

This question was asked on the Scipy mailing-list last year (well, one week ago). Nathan Bell proposed a skeleton that I used to create an out typemap for SWIG.

  1. %typemap(out) std::vector<double> {
  2.     int length = $1.size();
  3.     $result = PyArray_FromDims(1, &amp;length, PyArray_DOUBLE);
  4.     memcpy(PyArray_DATA($result),&amp;((*(&amp;$1))[0]),sizeof(double)*length);
  5. }
  6. </double>

This typemap uses obviously Numpy, so don’t forget to initialize the module and to import it. Then there is a strange instruction in memcpy. &((*(&$1))[0]) takes the address of the array of the vector, but as it is wrapped by SWIG, one has to get to the std::vector by dereferencing the SWIG wrapper. Then one can get the first element in the vector and take the address.