While reading an article on Google's latest Deep Learning achievement, I was reminded of a discussion with former colleagues about replacing reservoir simulations with neural networks. At the time, I dismissed the idea as ridiculous, given the complexity of the task and the amount of training data it would require.
But now, Google seems to have done it. Or have they?
What did Google achieve?
First, let’s recap what Google created. They have a neural network that can predict rain 6 hours in advance, better and faster than traditional simulations. They start from a series of 2D radar images (30 images, taken 2 minutes apart) and predict the next few hours from these (one map for each hour). Their precision is also far better, roughly 25 times finer, because their grid has a higher resolution.
As this is neural network inference, the process is very fast: in 10 minutes, you have a forecast that would otherwise take hours (numbers from Google). Indeed, physics modeling is governed by many parameters, and on top of that there are constraints between grid size and time resolution (a finer grid forces a smaller time step), which makes computations even longer.
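To make the setup concrete, here is a toy sketch of the data layout, not Google's actual model: 30 radar frames taken 2 minutes apart, producing one 2D map per forecast hour. The grid size, frame count, and the naive per-pixel extrapolation "forecast" are all my own placeholders.

```python
import numpy as np

# Hypothetical grid size and the 30 radar frames (2 minutes apart)
H, W, T = 64, 64, 30
frames = np.random.rand(T, H, W)   # stand-in for radar reflectivity maps

# Simplest possible nowcast: linearly extrapolate the per-pixel trend.
trend = frames[-1] - frames[-2]    # change over the last 2 minutes
horizon_hours = 6
forecasts = np.stack([
    frames[-1] + trend * (60 // 2) * h   # h hours = 30 * h time steps ahead
    for h in range(1, horizon_hours + 1)
])
print(forecasts.shape)  # (6, 64, 64): one 2D map per forecast hour
```

A real model replaces the extrapolation line with a learned network, but the input/output shapes stay the same kind of thing: a stack of past frames in, one map per hour out.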
Is Deep Learning going to replace these physics models?
No, it is not. First of all, Google says so themselves. After 6 hours, you have more than just the local geography to take into account. This is especially true since Google only used the Northern US for training. This means that you can’t just take the last hour of European radar maps and hope to get valid results. With a physics model, this would work, as the geography is part of the model’s parameters.
Physics models use 3D data, which means that DL models could actually become better if they also used 3D data as input. But this would require far more data and far more computational power to train, and the new model would still be limited to just one area of the globe.
So what is going on?
Let’s take a step back and think. Neural networks are approximation functions. As such, the 2D network can be thought of as a proxy for the real world. Physics models can also be used to generate additional training data on top of the real ground truth we have. And we do expect proxy models to be Machine Learning models (think of linear regression, the simplest proxy model!) and to deliver results faster than physics-based models.
Now, we can always build a more complex proxy, but by definition, the proxy will require more data to be optimized. At its core, what the proxy model does, behind its black-box behavior, is try to learn what the white-box model knows: the physical behavior at one point.
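A minimal sketch of this proxy idea, with a stand-in "physics model" of my own choosing: a cheap regression learns to mimic the expensive function inside its training window, and falls apart outside it, which is exactly the limitation discussed here.

```python
import numpy as np

def physics_model(x):
    """Stand-in for an expensive white-box simulation."""
    return np.sin(x)

# Train the proxy only on a limited window, like the 6-hour horizon.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, np.pi, 200)
y_train = physics_model(x_train)

# The "black-box" proxy: a degree-5 polynomial regression.
proxy = np.poly1d(np.polyfit(x_train, y_train, deg=5))

inside = abs(proxy(1.0) - physics_model(1.0))            # within the window
outside = abs(proxy(3 * np.pi) - physics_model(3 * np.pi))  # far outside it
print(f"error inside: {inside:.4f}, error outside: {outside:.1f}")
```

Inside the training range the proxy is nearly exact; far outside it, the polynomial extrapolation diverges wildly, while the white-box model stays valid everywhere its physics applies.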
So the conclusion is clear: no Machine Learning model (whether Deep Learning or classic) can replace a physics-based model. Such models will be used more and more, because as proxies they can produce an answer very quickly, but they will always be limited to what their training data has constrained them to: a window of 6 hours, a specific place on Earth…