After a few general books on grid computing, I needed to change the subject of my reading a little. As Intel Threading Building Blocks has always intrigued me, I chose the associated book.
My manifold learning code was a Technology Preview in scikit.learn for some time. Now I can say that it is available (BSD license) and there should not be any obvious bugs left.
I’ve written a small tutorial. It is not a usual tutorial (there is a user tutorial and then what developers should know to enhance it), and some results of the techniques are shown on my blog. It provides the basic commands to start using the scikit yourself (reducing some data, projecting new points, …) as well as the exposed interface for enhancing the scikit.
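To give an idea of the reduce-and-project workflow the tutorial covers, here is a minimal numpy-only sketch. It uses plain PCA as a stand-in for the scikit's techniques, and the function names (`fit_pca`, `project`) are illustrative, not the scikit's actual interface.

```python
import numpy as np

def fit_pca(X, n_components):
    """Learn a linear reduction: returns (mean, components)."""
    mean = X.mean(axis=0)
    # SVD of the centered data gives the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X_new, mean, components):
    """Project points (including new, unseen ones) into the reduced space."""
    return (X_new - mean) @ components.T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
mean, comps = fit_pca(X, n_components=2)
Y = project(X, mean, comps)                             # reduce the data
y_new = project(rng.normal(size=(1, 5)), mean, comps)   # project a new point
```

The nonlinear techniques in the scikit replace the linear map with a learned embedding, but the two-step shape of the workflow (fit on training data, then map new samples) is the same.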
If you have any question, feel free to ask me, I will add the answers to the tutorial page so that everyone can benefit from it.
Feel free to contribute new techniques and additional tools as well; I cannot write them all! For instance, the scikit lacks a robust neighbor selection to avoid short-cuts in the manifold…
Peer-to-peer. In France, these words are unleashing a fight between legislators and developers. And this old book (old because it was written in 2001, and 7 years is old for a book on this topic) presented the issues debated in journals, blogs, and elsewhere in a new way.
At last, my article on manifold learning has been published and is accessible via doi.org (it was not the case last week, which is why I waited before publishing this post).
The journal is free, so you won’t have to pay to read it: Access to the EURASIP JASP article
I will publish additional figures here soon. The scikit is almost complete as well; I’m finishing the online tutorial for those who are interested in using it and/or enhancing it.
Today, I’m publishing a tutorial on two C++ profilers on my French website. The question I’m asking myself, and you, is: should I translate it?
If some of you are interested in my French tutorials, I may translate them from time to time, depending on their content (I don’t want to translate an article on Boost, for instance, as the documentation already provides everything). But I’ll do that only if people tell me “Go on”. So I’m all ears…
Since the beginning of this year, I have been trying to figure out what to do with my future. I’m still doing my PhD, but what could I do after that?
My current job is to find models for datasets.
A lot of datasets can be explained by a small number of parameters. For instance, identity photos of a single person can be explained by 3 translations and 3 rotations. That is what my algorithms did: find the parameters (or something close enough) and create a mapping between the parameter space and the original space.
During this research, I learnt what scientific computing is. I did not explore everything in this field, but I covered the basics. That’s where I found out about Python, but also C++ (which is the first language I really used). My thirst for information led me to read a lot of books on several subjects (architectural design and process, but also parallel computing and its different flavors). This led me to search for the job that would interest me the most.
So starting in September, I’ll move to Pau, a town in the south of France. This is where the biggest research center of Total S.A. is located. I will work on oil exploration.
Although the theory behind this is well known (acoustic wave propagation and the inverse problem), that does not mean research in this field is over. For instance, the computing power needed to solve these problems is enormous, so the implementations must be well thought out. And even if you manage to find a solution to your problem, you are not done. Total’s goal is not to see whether acoustic waves propagate fast in some places and slowly in others; its goal is to find oil and gas. So once one has an acoustic model, one must work with the geologists to see what the odds are that there is oil or gas. And that is also a big, interesting challenge.
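To hint at why the computing power matters, here is a deliberately tiny sketch of acoustic wave propagation: an explicit finite-difference scheme for the 1D wave equation u_tt = c² u_xx with constant velocity. Real seismic codes work in 3D with heterogeneous velocity models and absorbing boundaries, which is exactly where the cost explodes; every parameter below is illustrative.

```python
import numpy as np

nx, nt = 200, 400
dx, c = 1.0, 1.0
dt = 0.5 * dx / c                  # time step respecting the CFL condition
r2 = (c * dt / dx) ** 2

u_prev = np.zeros(nx)
u = np.zeros(nx)
u[nx // 2] = 1.0                   # initial impulse in the middle

for _ in range(nt):
    u_next = np.zeros(nx)
    # Second-order centered differences in both time and space.
    u_next[1:-1] = (2 * u[1:-1] - u_prev[1:-1]
                    + r2 * (u[2:] - 2 * u[1:-1] + u[:-2]))
    u_prev, u = u, u_next          # boundaries stay fixed at zero
```

Scaling this to a 3D grid with billions of cells, thousands of time steps, and one simulation per source position is what makes the implementation such a challenge.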
For those who were interested in manifold learning, don’t worry, I’m not finished presenting my research. I will go on with some new posts about the mapping between the two spaces and how it can be used to test new samples. The scikit is now almost available. I still have to finish the tutorial and check that everything is OK.
I hope I will be able to continue with other subjects on this blog; there is no reason I cannot. Although what I’ll be doing at Total is secret, there are a lot of fields I’d like to talk about.
I was looking for an introductory book on peer-to-peer (P2P) applications and their use in grid computing. Web services were a bonus, as they are something I don’t usually play with.
I noticed a few days ago that I mainly use one design pattern in my scientific (but not only scientific) code: the registry. How does it work? A registry is a list, dictionary, … of objects; applications add a new entry when needed, and a user can then tap into the registry to find the most adequate object for their purpose.
This book is different from the last two books I read. Indeed, it tackles a specific Python library, Twisted, and how to use it.
I’ve already given some answers in one of my first posts on manifold learning. Here I will give more complete results on the quality of the dimensionality reduction performed by the most well-known techniques.
First of all, my test is about respecting the geodesic distances in the reduced space. This is not possible for some manifolds, like a 2D Gaussian surface. I used the S-curve to create the test, as the curve is traversed at unit speed and thus the distances in the coordinate space (the one I used to create the S-curve) are the same as the geodesic distances on the manifold. My test measures the matrix (Frobenius) norm between the original coordinates and the computed ones, up to an affine transform of the latter.
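The "up to an affine transform" part of the measure can be computed with a least-squares fit, as in this numpy sketch (the data here is synthetic and illustrative, not the S-curve results themselves):

```python
import numpy as np

def affine_residual(X, Y):
    """Frobenius norm of X - (Y A + b) for the best-fit affine map Y -> X.

    X: original coordinates, Y: coordinates computed by the reduction.
    Appending a column of ones to Y lets a single least-squares solve
    recover both the linear part A and the offset b.
    """
    Y1 = np.hstack([Y, np.ones((Y.shape[0], 1))])
    coeffs, *_ = np.linalg.lstsq(Y1, X, rcond=None)
    return np.linalg.norm(X - Y1 @ coeffs)

# If the reduction is perfect up to an affine map, the residual is ~0.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                        # "original" coordinates
Y = X @ np.array([[2.0, 0.0], [0.0, 0.5]]) + 1.0    # an exact affine image of X
```

A technique that merely rotates, scales, or shifts the true parameters thus scores (near) zero, while genuine distortions of the geodesic structure show up as a large residual.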