Thinking of good practices when developing with accelerators

With the end of the free lunch, manufacturers started to provide different processing units, and developers started to go parallel. It’s kind of back to the future, as accelerators existed long before today (the x87 FPU started as a coprocessor, for instance). Those accelerators were eventually integrated into the CPU, and so were their instruction sets.

Today’s accelerators are not there yet. The tools (code translators) are not ready, and the usual programming practices may not be adequate. The whole ecosystem will evolve and accelerators will change (GPUs are the main trend, but they will be different in a few years), so what you do today needs to be shaped with these changes in mind. How can this be done? Is it even possible?
Overview of TotalView, a parallel debugger

Some months ago, I attended a TotalView tutorial, thanks to my job. Now that I’ve actually used it to debug one of my parallel applications, I would like to share my experience with this fantastic tool.
First, TotalView is not only a parallel debugger available on several Linux and Unix platforms. It is also a memory checker (MemoryScape and the TotalView plugin) as well as a reverse debugger: you can roll back the execution of a program, even after it has crashed (at which point a standard debugger like GDB would be of no help).
The different faces of HPC

For each algorithm and program, some architectures are better suited than others. Some computations may need a lot of FLOPS, but FLOPS are not the only thing to consider. Communication and memory bandwidth and latency are as important as computational power, especially since memory speed and CPU speed are decoupled.
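
To make the FLOPS-versus-bandwidth point concrete, here is a minimal roofline-style sketch. The machine figures (100 GFLOPS peak, 20 GB/s memory bandwidth) and the function name are hypothetical, chosen only for illustration, and the model ignores caches and latency entirely.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Upper bound on performance: capped either by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Hypothetical machine (assumed numbers, not measurements).
PEAK_GFLOPS = 100.0
BANDWIDTH_GB_S = 20.0

# Double-precision vector addition c[i] = a[i] + b[i]:
# 1 flop for 24 bytes moved (two 8-byte loads, one 8-byte store).
vector_add = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, 1 / 24)

# A hypothetical kernel that performs 10 flops per byte moved.
dense_kernel = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, 10.0)

print(f"vector add bound:   {vector_add:.2f} GFLOPS")
print(f"dense kernel bound: {dense_kernel:.2f} GFLOPS")
```

Under these assumptions the vector addition is limited by memory bandwidth to well under 1% of the peak, while the second kernel is limited by the compute peak itself, which is why the same machine can suit one algorithm and not another.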

How to promote High Performance Computing?

I had this discussion with one of my Ph.D. advisors some months ago when we talked about correctly using the computers we had then (dual cores), and I had almost the same one in my new job here: applied maths (finite differences, signal processing, …) graduate students are not taught how to use current computers, so how could they develop an HPC program correctly?

I think it goes even further than that, and it will be a part of this post. What I see is that trainees and newly-hired people (to some extent myself included) lack a lot of basic Computer Science knowledge, and even IT knowledge.
