Nice title, surfing on the many-core hype, and with a practical approach! What more could one expect from a book on such an interesting subject?
Content and opinions
Let’s start with the table of contents. What the authors call many-core machines are GPUs. You won’t see anything else, like MPI or OpenMP. One of the central topics is also big data, but except for the fact that a GPU can process data faster than a CPU for some algorithms, it’s not really big data. I would call it large datasets, but not big data, as in terabytes of data!
OK, so let’s start with the core review of the book.
The book starts with an introduction on Machine Learning, what will be addressed, and the authors’ GPU library for Machine Learning. If you don’t know anything about GPUs, you will be given a short introduction, but it may not be enough for you to understand the difference between CPUs and GPUs, and why we don’t run everything on GPUs.
The next part tackles supervised learning. Each time, you get a general description of the algorithm, then the GPU implementation, and then results and discussion. The authors’ favorite algorithms seem to be neural networks, as everything is compared to them. They are the subject of the first chapter in this part, with the classic Back-Propagation algorithm and the Multiple BP algorithm. The authors also cover their own tool to automate training, which is nice, as training is usually the issue with neural networks. The next chapter handles missing data and the different kinds of missing data. The library handles it through an activation mechanism that seems to work. I don’t know how reliable the training of such a NN is, but the results are quite good, so the training system must work. Then we have Support Vector Machines, and curiously, it’s not the “usual” kernel that is mainly used in the book. Also, for once, you don’t get the full definition of a kernel, but in a way, you don’t need it for the purpose of this book. The last algorithm of this part is the Incremental Hypersphere Classifier. After a shorter presentation, the authors deal with the experimental setups, and they also chain IHC with SVM. All things considered, this is a small set of supervised learning algorithms, really a small subset of what you can find in scikit-learn.
The third part is about unsupervised and semi-supervised learning. The first algorithm is Non-Negative Matrix Factorization, presented as a non-linear algorithm (when it is obviously a linear one), which is quite strange. The semi-supervised approach is to divide the set of original features into subsets that have meaning together (eyes, mouth, nose…). Two different implementations are provided in the library, and presented with results in the book. The second chapter is about the current hype in neural networks: deep learning. After the presentation of the basic building block of Deep Belief Networks, the Restricted Boltzmann Machine, you jump directly to implementation and results. I have to say that the book doesn’t describe the architecture of DBNs properly, so it is quite difficult to know what DBNs actually are and how they can give such nice results.
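Going back to NMF for a moment, here is why I call it linear: the data matrix V is approximated by a plain matrix product W·H, with the only constraint that all entries stay non-negative. The sketch below is my own minimal pure-Python illustration using the classic multiplicative update rules for the squared error. It is not the code from the book or from the authors’ GPU library, and the names (`nmf`, `frobenius_error`, the toy matrix `V`) are made up for this example.

```python
import random


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W.H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix v (n x m) into w (n x rank) and h (rank x m).

    Uses multiplicative updates: every update multiplies an entry by a
    non-negative ratio, so w and h stay non-negative.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy non-negative data, invented for this example.
V = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 0.0, 1.0, 0.0],
    [3.0, 2.0, 5.0, 4.0],
]

W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

The reconstruction is just W·H, a linear combination of the rows of H with non-negative weights, which is what makes the “parts” (eyes, mouth, nose…) interpretation possible. The products in the update rules are also the kind of dense matrix work that maps well onto a GPU.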
The last part consists of the final chapter, a conclusion and a summary of the algorithms described in the book.
The book doesn’t display all the code that was developed. I certainly didn’t want it to do that, because the code is available on SourceForge. It was a good description of several machine learning algorithms ported to the GPU, with results and comparisons, but it didn’t live up to the expectations set by the book title and the introduction. A GPU is a kind of many-core machine, but we know that future many-core machines won’t be made of that kind of core only. It also falls short on the big data side.