Tag Archives: Python

My favorite design pattern in Python

A few days ago, I noticed that most of my code, scientific or not, relies on one design pattern: the registry. How does it work? A registry is a list, dictionary or similar collection of objects. Applications add a new entry when needed, and a user can then query the registry to find the most suitable object for the task at hand.


Parallel computing in large-scale applications

In its March 2008 issue, IEEE Computers published a case study on the development of large-scale parallel scientific code. I’d like to comment on this article, which I found very good.

Five research centers were analyzed or, more precisely, their development tools and processes. Each center does research in a particular domain, but they all seem to share a Computational Fluid Dynamics basis.


Grid computing for Python

In my lab, we frequently process huge amounts of data, and each process can take hours or days. The problem is that we don’t have a usable tool to do this.

Our legacy software is written in C and we plan to move to Python in the coming weeks. We could use commercial software, but that is not optimal.

This is where P2P comes into play. We have a lot of unused computers, or dual-core machines that are not even used at 50%, because we are not trained in parallel computing (and we won’t be in the near future). By “we”, I mainly mean PhD students. Our background is signal or image processing, not Computer Science, and even less parallel computing. Those idle computers could run our computations, but this implies that a computer is only used when nobody is working on it, that we only use what is available at a given moment, and that some computers may be taken back by their users during the computations. That’s why P2P seems an elegant approach to grid computing.

P2P computation is not new in the lab (we developed P2P-MPI in Java, for instance), but it is new for our team. So far, I have not found much about the tools we could use, but the JXTA protocol seems a good start. I hope to be able to say more about this subject in the near future.

A new French book on scientific computing with Python

My first book, on Python for scientists, ships today. Although IT people can learn a lot about Python from it (mainly if they work in labs or research centers), scientists will be more interested, as it presents a viable alternative to Matlab: fast, efficient, a real language with a large standard library.

After an introduction, the book presents the Python language and some of its main modules. The three central chapters are dedicated to Numpy, Scipy and Matplotlib. Each library tackles a specific problem: storing data, processing it, or displaying it. Finally, the last chapter presents ways of speeding up Python with C or C++.

The link to my publisher : here

Transforming a C++ vector into a Numpy array

This question was asked on the Scipy mailing-list last year (well, one week ago). Nathan Bell proposed a skeleton that I used to create an out typemap for SWIG.

    %typemap(out) std::vector<double> {
        // One-dimensional array with as many elements as the vector
        int length = $1.size();
        $result = PyArray_FromDims(1, &length, NPY_DOUBLE);
        // Copy the vector's contiguous storage into the new array
        memcpy(PyArray_DATA((PyArrayObject*)$result), &((*(&$1))[0]), sizeof(double)*length);
    }

This typemap obviously relies on Numpy, so don’t forget to include its headers and to initialize it in the module (with import_array()). Then there is a strange expression in the memcpy call. &((*(&$1))[0]) takes the address of the vector's underlying array, but as the vector is wrapped by SWIG, one first has to get to the std::vector by dereferencing the SWIG wrapper. Then one can access the first element of the vector and take its address.

Edit, May 2017: here is my most recent attempt at this.

    %typemap(out) std::vector<float> {
        // npy_intp is the index type expected by PyArray_SimpleNew
        npy_intp length = $1.size();
        $result = PyArray_SimpleNew(1, &length, NPY_FLOAT);
        // Copy the vector's contiguous storage into the new array
        memcpy(PyArray_DATA((PyArrayObject*)$result), $1.data(), sizeof(float)*length);
    }

Creating a Python module with Scons and SWIG

Some time ago, I proposed an optional build for SWIG in case the SWIG binary was not found on the system. Here I propose an enhancement: a new library builder, registered in the environment env as PythonModule. It takes the same arguments as a classical SharedLibrary, but it performs some additional steps:

  • It forces SWIG to create a Python wrapper (flag -python)
  • It checks whether SWIG is present at all
  • It removes any prefix that the system would otherwise add (such as lib on Linux)
  • On Windows and for Python >= 2.5, it changes the extension to pyd
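The steps above could be sketched as follows. This is not the builder from the post: find_swig and python_module_overrides are hypothetical helpers that only compute the settings, while SWIGFLAGS, SHLIBPREFIX and SHLIBSUFFIX are the standard SCons construction variables they are meant to override.

```python
import shutil
import sys


def find_swig():
    """Return the path of the swig executable, or None if it is absent."""
    return shutil.which("swig")


def python_module_overrides(platform=sys.platform,
                            python_version=sys.version_info[:2]):
    """Construction-variable overrides for building a Python extension."""
    overrides = {
        "SWIGFLAGS": ["-python"],  # ask SWIG for a Python wrapper
        "SHLIBPREFIX": "",         # no "lib" prefix on the module file
    }
    # Since Python 2.5, extension modules on Windows use the .pyd suffix.
    if platform == "win32" and tuple(python_version) >= (2, 5):
        overrides["SHLIBSUFFIX"] = ".pyd"
    return overrides


# Assumed usage inside an SConstruct file (not executed here):
#
#     if find_swig() is not None:
#         env.SharedLibrary("_example", ["example.i"],
#                           **python_module_overrides())
```

Passing the overrides as keyword arguments keeps the environment untouched for the other targets of the build.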


Enabling thread support in SWIG and Python

For some days, I searched the SWIG documentation for a way to release the GIL (Global Interpreter Lock) from SWIG wrappers. The generated code defined some macros for this, but none of them was used anywhere.

In fact, I just had to enable thread support with an additional argument (-threads). Now every wrapped function releases the GIL before it is called, but that does not fully satisfy me: some of my wrappers must hold the GIL while they run (see this item). So here are the features that can be used:

  • nothread enables or disables the whole thread lock for a function:
    • %nothread activates the nothread feature
    • %thread disables the feature
    • %clearnothread clears the feature
  • nothreadblock enables or disables the block thread lock for a function:
    • %nothreadblock activates the nothreadblock feature
    • %threadblock disables the feature
    • %clearnothreadblock clears the feature
  • nothreadallow enables or disables the allow thread lock for a function:
    • %nothreadallow activates the nothreadallow feature
    • %threadallow disables the feature
    • %clearnothreadallow clears the feature

When the whole thread lock is enabled, the GIL is acquired when entering the C wrapper function (with the macro SWIG_PYTHON_THREAD_BEGIN_BLOCK). It is then released just before the call to the wrapped function (SWIG_PYTHON_THREAD_BEGIN_ALLOW), reacquired once the call returns (SWIG_PYTHON_THREAD_END_ALLOW) and finally released when exiting the wrapper (SWIG_PYTHON_THREAD_END_BLOCK), after all Python result variables have been created and/or modified.
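To see from the Python side why releasing the GIL around the call matters, here is a small analogy rather than SWIG code. time.sleep releases the GIL while it waits, so it stands in for a wrapped C function running between the two ALLOW macros; blocking_call and run_in_threads are hypothetical names.

```python
import threading
import time


def blocking_call(duration):
    # Stand-in for a wrapped C function: time.sleep releases the GIL
    # while waiting, so other Python threads can run in the meantime.
    time.sleep(duration)


def run_in_threads(n_threads, duration):
    """Run blocking_call in n_threads threads and return the elapsed time."""
    threads = [threading.Thread(target=blocking_call, args=(duration,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


# Four calls of 0.2 s overlap instead of running one after the other.
elapsed = run_in_threads(4, 0.2)
```

If the calls held the GIL for their whole duration, they would be serialized and the total time would be close to the sum of the four durations.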

Wrapping a C++ container in Python

When moving to Python, the one big problem that arises is the transformation of a Python array into the C++ container the team has used for years.

Let’s state some hypotheses:

  • there is a separation between the class containing the data and the class that uses the data (iterators, …)
  • the containing class can be changed (policy or strategy pattern)

The first hypothesis derives from the single responsibility principle: the two classes have two distinct responsibilities. The first allocates the data space and gives simple access to it; the second provides the usual operations (assignment, comparison tests or iteration, for instance).

The second one will be the heart of the wrapper. It makes it possible to change, in a simple way, how data is stored and accessed.
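The two hypotheses can be illustrated with a small Python analogy; the post itself concerns C++ classes, and every name below (ListStorage, ArrayStorage, Container) is hypothetical. The storage classes only allocate and give access to the data, while Container provides the usual operations on top of whichever storage policy it receives.

```python
from array import array


class ListStorage:
    """Storage policy backed by a Python list."""
    def __init__(self, size):
        self._data = [0.0] * size

    def __len__(self):
        return len(self._data)

    def get(self, index):
        return self._data[index]

    def set(self, index, value):
        self._data[index] = value


class ArrayStorage(ListStorage):
    """Storage policy backed by a contiguous buffer of C doubles."""
    def __init__(self, size):
        self._data = array("d", [0.0] * size)


class Container:
    """Usual operations, independent of how the data is stored."""
    def __init__(self, size, storage_policy=ListStorage):
        self._storage = storage_policy(size)

    def __len__(self):
        return len(self._storage)

    def __iter__(self):
        return (self._storage.get(i) for i in range(len(self)))

    def assign(self, values):
        for index, value in enumerate(values):
            self._storage.set(index, value)

    def __eq__(self, other):
        return list(self) == list(other)


a = Container(3, ListStorage)
b = Container(3, ArrayStorage)
a.assign([1, 2, 3])
b.assign([1, 2, 3])
same = (a == b)
```

Swapping the policy changes where the data lives without touching the code that iterates over or compares containers.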

Deformation fields with thin-plates

For my research, I had to create a set of smooth deformation fields where I knew which points were moved and by which amount.

I looked for a script but couldn’t find an appropriate one, let alone one in Python. So here is my own version, which interpolates a 1D, 2D or 3D deformation field from a set of points.

How does it work? It is based on Bookstein’s algorithm. The first step computes the coefficients of the smooth deformation field; these coefficients are then used to evaluate the field on a grid, which is returned.

The function to use is denseDeformationFieldFromSparse(), the arguments being size, the size of the desired grid, points, the locations where the deformation field is known, and displacements, the amount of displacement for each previously given point.
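A minimal sketch of this kind of interpolation, restricted to 2D for brevity and assuming NumPy and the thin-plate kernel U(r) = r² log r. This is not the code from the gist: tps_dense_field is a hypothetical name that only mirrors the size, points and displacements arguments described above.

```python
import numpy as np


def _kernel(r):
    """Thin-plate kernel U(r) = r^2 log(r), with U(0) = 0."""
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out


def tps_dense_field(size, points, displacements):
    """Interpolate a 2D displacement field on a grid of shape `size`."""
    points = np.asarray(points, dtype=float)                # (n, 2)
    displacements = np.asarray(displacements, dtype=float)  # (n, k)
    n = len(points)

    # Step 1: solve for the kernel weights w and the affine part a.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    affine = np.hstack([np.ones((n, 1)), points])           # (n, 3)
    system = np.zeros((n + 3, n + 3))
    system[:n, :n] = _kernel(dist)
    system[:n, n:] = affine
    system[n:, :n] = affine.T
    rhs = np.zeros((n + 3, displacements.shape[1]))
    rhs[:n] = displacements
    coeffs = np.linalg.solve(system, rhs)
    w, a = coeffs[:n], coeffs[n:]

    # Step 2: evaluate the field on every node of the grid.
    axes = np.meshgrid(np.arange(size[0]), np.arange(size[1]), indexing="ij")
    grid = np.stack(axes, axis=-1).reshape(-1, 2).astype(float)
    grid_dist = np.linalg.norm(grid[:, None, :] - points[None, :, :], axis=-1)
    grid_affine = np.hstack([np.ones((len(grid), 1)), grid])
    field = _kernel(grid_dist) @ w + grid_affine @ a
    return field.reshape(size[0], size[1], -1)


# Four fixed corners and a displaced center on a 5x5 grid.
points = [[0, 0], [0, 4], [4, 0], [4, 4], [2, 2]]
displacements = [[0, 0], [0, 0], [0, 0], [0, 0], [1, -1]]
field = tps_dense_field((5, 5), points, displacements)
```

The linear system requires distinct control points that are not all on one line; the sketch does not check for this.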

This code is given as is, but feel free to comment so that bugs can be ironed out (if there are bugs). It was tested with 1D, 2D and 3D test cases which can also be found on the gist.

Thanks to Bill Baxter for the distance function that was proposed on the numpy discussion list.