Project Magnet, providing push-based shuffle, now available in Apache Spark 3.2

LinkedIn · Oct. 20, 2021, 6:30 p.m.
Co-authors: Venkata Krishnan Sowrirajan and Min Shen

We are excited to announce that push-based shuffle (codenamed Project Magnet) is now available in Apache Spark as part of the 3.2 release. Since the SPIP vote on Project Magnet passed in September 2020, there has been a lot of interest in getting it into Apache Spark. As of March 2021, 100% of LinkedIn’s Spark workloads’ shuffle data is served through push-based shuffle. As we productionized push-based shuffle at LinkedIn, we also actively eng...