It has been a while since my last post on manifold learning, and I still have a few things to talk about (unfortunately, this will be the last post of the dimensionality reduction series on my blog, as my current job no longer involves it). Once the multidimensional regression is done, it can be used to project new samples onto the modeled manifold, and to classify data.
Once the data set is reduced (see my first posts if you’re jumping on the bandwagon), there are several ways of mapping this reduced space to the original space:
- you can interpolate the data in the original space based on an interpolation in the reduced space, or
- you can approximate the mapping with a multidimensional function (B-splines, …)
With the first solution, if you map one of the reduced points used for training, you get the original point back. With the second solution, you only get a nearby point. If your data set is noisy, you should use the second solution, not the first. And if you are trying to compress data (lossy compression), you cannot use the first one: you need the original points to compute new interpolated points, so you are not compressing your data set at all.
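To make the difference concrete, here is a minimal NumPy sketch on a toy data set. Everything in it is an assumption made for illustration: the 1-D reduced coordinate `t`, the noisy 2-D curve `x`, and above all the cubic polynomial, which merely stands in for a smoother such as B-splines.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: 1-D reduced coordinates and noisy 2-D originals.
t = np.linspace(0.0, 1.0, 50)
clean = np.column_stack([np.cos(2 * t), np.sin(2 * t)])
x = clean + 0.05 * rng.standard_normal(clean.shape)


def interpolate(t_new):
    """First solution: linear interpolation between the stored training points."""
    return np.column_stack(
        [np.interp(t_new, t, x[:, j]) for j in range(x.shape[1])]
    )


# Second solution: least-squares cubic fit, one polynomial per original
# coordinate (a stand-in for B-splines, chosen only to keep the sketch short).
coeffs = np.polyfit(t, x, deg=3)  # shape (4, 2), highest power first


def approximate(t_new):
    """Evaluate the fitted polynomials; only `coeffs` is needed, not `x`."""
    powers = np.vander(np.atleast_1d(t_new), N=4)  # columns t^3, t^2, t, 1
    return powers @ coeffs


# Interpolation reproduces the (noisy) training points exactly ...
print(np.allclose(interpolate(t), x))
# ... whereas the approximation only comes close to them.
print(np.abs(approximate(t) - x).max())
```

Note that `interpolate` has to keep all of `x` around, whereas `approximate` only needs the 4 x 2 coefficient array, which is why only the second solution can serve as a lossy compression scheme.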
The solution I propose is based on an approximation with a set of piecewise linear models (each model maps a subspace of the reduced space to the original space). At the boundaries between the models, I do not enforce continuity, unlike hinging hyperplanes. Also unlike Projection Pursuit Regression and hinging hyperplanes, my mapping goes between the two full spaces, and not from the reduced space to a single coordinate of the original space. This will enable projection onto the manifold (which is another subject that will be discussed in another post).
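As a rough illustration of the idea (not my actual implementation), the sketch below fits one affine model per region by least squares, predicting all original coordinates at once. The way the reduced space is split, here nearest-center cells around hand-picked `centers`, is an assumption of the sketch, as are all function names; it also assumes every region contains at least a few training points.

```python
import numpy as np


def assign(reduced, centers):
    """Index of the nearest center for each reduced point (assumed partition)."""
    dists = np.linalg.norm(reduced[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def fit_piecewise_linear(reduced, original, centers):
    """Fit one affine map (reduced space -> full original space) per region."""
    labels = assign(reduced, centers)
    models = []
    for k in range(len(centers)):
        mask = labels == k
        # Append a column of ones so the model has an offset term.
        design = np.column_stack([reduced[mask], np.ones(mask.sum())])
        weights, *_ = np.linalg.lstsq(design, original[mask], rcond=None)
        models.append(weights)  # shape (reduced_dim + 1, original_dim)
    return models


def predict(reduced, centers, models):
    """Apply to each point the model of the region it falls in."""
    labels = assign(reduced, centers)
    design = np.column_stack([reduced, np.ones(len(reduced))])
    out = np.empty((len(reduced), models[0].shape[1]))
    for k, weights in enumerate(models):
        mask = labels == k
        out[mask] = design[mask] @ weights
    return out


# Toy data: a 1-D reduced coordinate mapped to a curve in 3-D.
t = np.linspace(0.0, 1.0, 200)[:, None]
original = np.column_stack([np.cos(3 * t[:, 0]), np.sin(3 * t[:, 0]), t[:, 0]])
centers = np.array([[0.125], [0.375], [0.625], [0.875]])

models = fit_piecewise_linear(t, original, centers)
pred = predict(t, centers, models)
print(np.abs(pred - original).max())
```

Each model is fitted independently, so nothing ties two neighbouring models together at their shared boundary, and each model outputs a full point of the original space rather than a single coordinate.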
I hope to present some results here in February, but here is what I’ve implemented so far:
- Laplacian Eigenmaps
- Hessian Eigenmaps
- Diffusion Maps (in fact a variation of Laplacian Eigenmaps)
- Curvilinear Component Analysis (the reduction part)
- NonLinear Mapping (Sammon)
- My own technique (reduction, regression and projection)
- PCA (usual reduction, but robust projection with an a priori term)
The results I will show here are mainly comparisons of the reductions obtained with each technique, keeping in mind that each technique has its own field of application: LLE is not designed to preserve geodesic distances, whereas Isomap, NLM and my technique are.
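For readers who have not met geodesic distances before, the sketch below shows the usual Isomap-style approximation: build a nearest-neighbor graph and take shortest-path lengths in it. It is a plain NumPy illustration with a hypothetical function name and a Floyd-Warshall loop that only suits small data sets; it is not the code behind the results.

```python
import numpy as np


def geodesic_distances(points, n_neighbors=5):
    """Approximate geodesic distances by shortest paths in a k-NN graph."""
    n = len(points)
    eucl = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)

    # Start with no edges (infinite cost) except from each point to itself.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)

    # Connect every point to its nearest neighbors (column 0 is the point itself).
    nearest = np.argsort(eucl, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    graph[rows, cols] = eucl[rows, cols]
    graph[cols, rows] = eucl[rows, cols]  # keep the graph symmetric

    # Floyd-Warshall: allow paths going through point k, for each k in turn.
    for k in range(n):
        graph = np.minimum(graph, graph[:, k, None] + graph[None, k, :])
    return graph


# Toy manifold: a half circle of radius 1 embedded in the plane.
theta = np.linspace(0.0, np.pi, 40)
points = np.column_stack([np.cos(theta), np.sin(theta)])

geo = geodesic_distances(points)
straight = np.linalg.norm(points[0] - points[-1])
print(straight, geo[0, -1])
```

Between the two ends of the half circle, the straight-line distance is the diameter, while the graph distance follows the arc and comes out close to its length. A technique that preserves geodesic distances tries to reproduce the second number in the reduced space, not the first.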
As I approach the end of my PhD, I will release my manifold learning code in a scikit (see this page) in a few weeks. For the moment, I don’t know which scikit will be used, but stay tuned…
The content of the scikit will be:
- Laplacian eigenmaps
- Diffusion maps