Due to the end of the free lunch, manufacturers started to provide differents processing units and developers started to go parallel. It’s kind of back to the future, as accelerators existed before today (the x87 FPU started as a coprocessor, for instance). If those accelerators were integrated into the CPU, their instruction set were also.
Today’s accelerators are not there yet. The tools are not ready yet (code translators) and usual programming practices may not be adequate. All the ecosystem will evolve, accelerators will change (GPUs are the main trend, but they will be different in a few years), so what you will do today needs to be shaped with these changes in mind. How is it possible to do so? Is it even possible?