Using software-defined networking to enforce partitions, speeding up failure detection.
Nice paper that summarizes the historical papers related to concurrency models.
“only way to stop is to crash. only way to start is to recover”
Another good Joe Armstrong talk.
Free ebook on debugging Erlang systems.
Algorithms for approximate consensus. Nodes can all asymptotically converge on (roughly) the same value with a very small amount of inter-node communication.
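To give a feel for asymptotic convergence with little communication, here is a minimal sketch of randomized pairwise gossip averaging. This is a stand-in chosen for illustration, not necessarily the algorithm from the linked work, which is not named here. The function names `gossip_average` and `spread`, the fixed seed, and the sample values are all assumptions. In each round only two nodes exchange a single number, yet the gap between the largest and smallest value shrinks toward zero.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Run randomized pairwise averaging (hypothetical helper, for illustration).

    Each round picks two distinct nodes at random and replaces both of
    their values with the pair's mean. Only those two nodes communicate.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    state = list(values)       # copy so the caller's list is untouched
    for _ in range(rounds):
        i, j = rng.sample(range(len(state)), 2)
        mean = (state[i] + state[j]) / 2.0
        state[i] = mean
        state[j] = mean
    return state


def spread(values):
    """Distance between the largest and smallest value across nodes."""
    return max(values) - min(values)


if __name__ == "__main__":
    start = [0.0, 10.0, 20.0, 30.0, 40.0]
    final = gossip_average(start, rounds=200)
    print("initial spread:", spread(start))
    print("final spread:  ", spread(final))
    print("final values:  ", final)
```

Pairwise averaging preserves the sum of all values, so the nodes drift toward the global mean rather than an arbitrary number. The nodes never agree exactly at a fixed deadline; they only get closer with each exchange, which is the "approximate" part.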