As it turns out, NVIDIA just open sourced a product called Legate which does not just GPUs but distributed as well. Right now it supports NumPy and Pandas but perhaps they'll add others in the future. Just thought this might be up your alley since it works at a higher level than the glorified CUDA in the article.
https://github.com/nv-legate/legate.numpy
Disclaimer: I work on the project they used to do the distributed execution, but otherwise have no connection with Legate.
Edit: And this library was developed by a team managed by one of the original Copperhead developers, in case you're wondering.