Category

Spark Scala

Goodbye MapReduce, Hello Spark

For those of you not familiar with Spark, it is a cluster computing framework developed at the AMPLab at UC Berkeley. Unlike MapReduce, which writes its data to disk between steps, Spark attempts to keep its intermediate data in memory, which can yield significant performance improvements. It is...

Staff Engineer