The Big-Data Tool Spark May Be Hotter Than Hadoop, But It Still Has Issues

Posted on: Jan 28, 2015

Hadoop is hot. But its kissing cousin Spark is even hotter.

Indeed, Spark is as hot now as Apache Hadoop was half a decade ago. Spawned at UC Berkeley’s AMPLab, Spark is a fast data processing engine that works in the Hadoop ecosystem as a replacement for MapReduce. It is designed to perform both batch processing (similar to MapReduce) and newer workloads such as streaming, interactive queries, and iterative algorithms of the kind commonly found in machine learning and graph processing.

San Francisco-based Typesafe, sponsor of a popular survey of Java developers I wrote about last year and the commercial backer of Scala, Play Framework, and Akka, recently surveyed developers about Spark. A total of 2,136 developers responded. Of the findings, three conclusions jump out:

Spark awareness and adoption are seeing hockey-stick-like growth. Google Trends confirms this. The survey shows that 71% of respondents have at least evaluation or research experience with Spark, and 35% are now using it or plan to use it.

Faster data processing and event streaming are the focus for enterprises. By far the most desirable features are Spark's vastly improved processing performance over MapReduce (over 78% mention this) and the ability to process event streams (over 66% mention this), which MapReduce cannot do.

 

Perceived barriers to adoption are not major blockers. When asked what's holding them back from the Spark revolution, respondents cited their own lack of experience with Spark and the need for more detailed documentation, especially for advanced application scenarios and performance tuning. They also mentioned Spark's perceived immaturity in general and its integration with other middleware, such as message queues and databases. Lack of commercial support, which is still spotty even from the Hadoop vendors, was also a concern. Finally, some respondents said that their organizations don't need big data solutions at this time.