Sometimes a C or C++ array structure must be used from Python, and it is always better to be able to use the underlying array directly for Numpy computations instead of copying it. For that purpose, Numpy proposes the array interface.

I will now present an efficient way to use SWIG to generate the array interface and expose the __array_struct__ property.
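The post works with __array_struct__ at the C level; as a rough illustration of the same protocol from the pure-Python side, here is a minimal sketch using the dictionary form of the interface, __array_interface__, on top of a ctypes buffer (the class name CArrayView is mine, not from the post):

    import ctypes
    import numpy as np

    class CArrayView(object):
        """Expose a raw ctypes buffer through the Numpy array interface."""
        def __init__(self, size):
            self._buffer = (ctypes.c_double * size)()
            self.size = size

        @property
        def __array_interface__(self):
            return {
                'shape': (self.size,),
                'typestr': np.dtype(np.float64).str,              # e.g. '<f8'
                'data': (ctypes.addressof(self._buffer), False),  # (pointer, read-only flag)
                'version': 3,
            }

    view = CArrayView(5)
    array = np.asarray(view)   # wraps the buffer, no copy is made
    array[:] = 1.

np.asarray() recognizes the property and builds an array sharing the memory; the C-level __array_struct__ described in the full post achieves the same thing without going through a Python dictionary.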

Read More

I’m trying to use the MKL with some programs and libraries, but I encountered something really strange, and I’m not alone.
First, what is i_free? According to Intel, it’s their way of handling memory allocation and deallocation: these symbols are only pointers to the actual memory functions, so that the user can decide whether to install a custom memory handler. Since version 10.0.3, Intel has changed this model, and that’s where the trouble begins.

Read More

As I have to parallelize some programs developed in my new lab, I monitor CPU usage during their execution. I do not usually need MPI to optimize them (although it is sometimes needed), only OpenMP, which means I can poll /proc/ to get instantaneous CPU and memory usage.

So I wrote a small script that anyone can use for this purpose. I’ll now explain how it works.
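The actual script is in the full post; as a minimal sketch of the idea (assuming a Linux /proc layout, and with made-up function names), the following polls a process’s CPU time and resident memory once per second:

    import os
    import sys
    import time

    CLK_TCK = os.sysconf('SC_CLK_TCK')   # clock ticks per second

    def cpu_seconds(pid):
        """Total CPU time (user + system) consumed by the process, in seconds."""
        with open('/proc/%d/stat' % pid) as f:
            data = f.read()
        # The command name is wrapped in parentheses and may contain spaces,
        # so split after the closing parenthesis; utime and stime are then
        # fields 11 and 12 (0-based).
        fields = data.rsplit(')', 1)[1].split()
        return (int(fields[11]) + int(fields[12])) / float(CLK_TCK)

    def rss_kb(pid):
        """Resident set size in kB, read from /proc/<pid>/status."""
        with open('/proc/%d/status' % pid) as f:
            for line in f:
                if line.startswith('VmRSS:'):
                    return int(line.split()[1])
        return 0

    if __name__ == '__main__':
        pid = int(sys.argv[1])
        previous = cpu_seconds(pid)
        while True:
            time.sleep(1.)
            current = cpu_seconds(pid)
            # 100% means one full core was busy during the last second
            print('%3.0f%% CPU, %d kB resident' % (100 * (current - previous), rss_kb(pid)))
            previous = current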

Read More

My manifold learning code was for some time a Technology Preview in scikit-learn. Now I can say that it is available (BSD license) and there should not be any obvious bug left.

I’ve written a small tutorial. It is not a usual tutorial (there is a user tutorial and then what developers should know to enhance it), and some results of the techniques are shown on my blog. It provides the basic commands to start using the scikit yourself (reducing some data, projecting new points, …) as well as the exposed interface to enhance the scikit.
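The interface described in the tutorial predates the current scikit-learn API, so the exact commands have changed since then; just to give a flavour of the two operations mentioned above (reducing data and projecting new points), here is what they look like with today’s sklearn.manifold.Isomap:

    from sklearn import datasets, manifold

    # A classical test case: unroll a swiss roll into two dimensions
    X, _ = datasets.make_swiss_roll(n_samples=1000)

    isomap = manifold.Isomap(n_neighbors=10, n_components=2)
    Y = isomap.fit_transform(X)        # reduce the data
    Y_new = isomap.transform(X[:5])    # project new points with the fitted model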

If you have any questions, feel free to ask me; I will add the answers to the tutorial page so that everyone can benefit from them.

Feel free to contribute new techniques and additional tools as well, I cannot write them all! For instance, the scikit lacks a robust neighbors selection to avoid short-cuts in the manifold…

The tutorial and the scikit-learn main page.


My favorite design pattern in Python


I noticed a few days ago that I mainly use one design pattern in my scientific (but not only scientific) code: the registry. How does it work? A registry is a list/dictionary/… of objects; applications add a new entry when they need to, and a user can then tap into the registry to find the most adequate object for their purpose.
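The full post goes into more detail; as a minimal Python sketch of the idea (the registry name FILTERS, the register_filter decorator and the filter classes are made up for illustration, not taken from the post):

    # A module-level registry: a plain dictionary mapping names to classes.
    FILTERS = {}

    def register_filter(name):
        """Class decorator adding an entry to the registry."""
        def decorator(cls):
            FILTERS[name] = cls
            return cls
        return decorator

    @register_filter('lowpass')
    class LowPassFilter(object):
        def apply(self, data):
            return data        # placeholder for the actual processing

    @register_filter('highpass')
    class HighPassFilter(object):
        def apply(self, data):
            return data        # placeholder for the actual processing

    def get_filter(name):
        """Client code taps into the registry to find the adequate object."""
        return FILTERS[name]()

    filt = get_filter('lowpass')

Each module that defines a new filter only has to register it; the code that uses the registry never needs to know the concrete classes in advance.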

Read More

In its March 2008 issue, IEEE Computer published a case study on large-scale parallel scientific code development. I’d like to comment on this article, a very good one in my opinion.

Five research centers were analyzed, or more precisely their development tools and processes. Each center does research in a particular domain, but they seem to share some Computational Fluid Dynamics basis.

Read More

In my lab, we frequently process huge amounts of data, and each run can take hours or days. The problem is that we don’t have a usable tool to do this.

Our legacy software is in C, and we plan on moving to Python in the coming weeks. We could use some commercial software, but it is not optimal.

This is where P2P comes into the game. We have a lot of unused computers, or dual cores that are not used even at 50%, because we are not trained in parallel computing (and we won’t be in the near future). By “we”, I mainly mean PhD students. Our background is signal or image processing, not Computer Science, and even less parallel computing. Those unused computers could be used for our computations, but this implies that a computer is only used if nobody works on it, that we only use what is available at a precise moment, and that some computers may be reclaimed by their users during the computations. That’s why P2P seems an elegant idea, as a grid computing tool.

P2P computation is not new in the lab (we developed P2P-MPI in Java, for instance), but for our team, it is. For the time being, I have not found much about the tools we could use, but the JXTA protocol seems a good start. I hope I will be able to say more about this subject in the near future.

My first book on Python for scientists ships today. Although IT people can learn a lot of Python with it (mainly if they are working in labs or research centers), scientists will be more interested, as it presents a viable alternative to Matlab: fast, efficient, a real language with a large standard library.

After an introduction, the Python language is presented, as well as some of its main modules. The three central chapters are dedicated to Numpy, Scipy and Matplotlib; each library tackles a specific problem: storing data, processing it, or displaying it. Finally, the last chapter presents ways of speeding up Python with the help of C or C++.

The link to my publisher: here

This question was asked on the Scipy mailing-list last year (well, one week ago). Nathan Bell proposed a skeleton that I used to create an out typemap for SWIG.

    %typemap(out) std::vector<double> {
        int length = $1.size();
        $result = PyArray_FromDims(1, &length, NPY_DOUBLE);
        memcpy(PyArray_DATA((PyArrayObject*)$result), &((*(&$1))[0]), sizeof(double)*length);
    }

This typemap obviously uses Numpy, so don’t forget to initialize the module and to import it. Then there is a strange expression in the memcpy call: &((*(&$1))[0]) takes the address of the array underlying the vector, but as $1 is wrapped by SWIG, one has to get to the actual std::vector by dereferencing the SWIG wrapper; one can then take the address of its first element.

Edit, May 2017: this is my most recent version of the typemap.

    %typemap(out) std::vector<float> {
        npy_intp length = $1.size();
        $result = PyArray_SimpleNew(1, &length, NPY_FLOAT);
        memcpy(PyArray_DATA((PyArrayObject*)$result), $1.data(), sizeof(float)*length);
    }
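On the Python side, the wrapped function then returns a regular Numpy array. A quick, hypothetical usage check (the module and function names, example and get_values, are made up for illustration):

    import numpy as np
    import example                      # hypothetical SWIG-generated module using the typemap above

    values = example.get_values()       # hypothetical wrapped function returning a std::vector<float>
    assert isinstance(values, np.ndarray)
    assert values.dtype == np.float32   # NPY_FLOAT maps to a 32-bit float array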