What’s Next for Big Data? Thinking Beyond Hadoop with Elastic Object Storage

Posted on: May 26, 2017

In the last decade, some enterprises struggling with growing data volumes and shrinking big data talent pools saw the public cloud as a way to manage both challenges. Creating a data lake in the public cloud — pouring all the unstructured data into a single massive collection, then using analytics tools to “fish out” the data a business unit needs — initially seemed like a good idea because it was the path of least resistance. As frequently happens, however, that solution carried the seeds of its own problems.

Storing big data in the public cloud is expensive for users: while sending data to the cloud can be costly, pulling it back out is even more so. If they try to avoid this by expanding their on-premises Hadoop clusters with more data nodes, they incur higher costs of a different kind, because each data node bundles compute with storage, so buying nodes just for capacity means over-provisioning compute resources.
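
To make that asymmetry concrete, here is a minimal back-of-the-envelope sketch in Python. Every per-GB rate below is a hypothetical placeholder, not any provider’s actual pricing; the point is only the shape of the economics, where getting data in is cheap but reading the whole lake back out is not.

# Illustrative arithmetic only: the per-GB rates below are hypothetical
# placeholders, not any provider's actual pricing.
ingress_per_gb = 0.00          # hypothetical: loading data in is often free
egress_per_gb = 0.09           # hypothetical per-GB charge to pull data out
storage_per_gb_month = 0.02    # hypothetical at-rest charge per GB-month

data_tb = 500                  # size of a hypothetical data lake, in TB
data_gb = data_tb * 1024

initial_load = data_gb * ingress_per_gb       # getting data in is the easy part
monthly_storage = data_gb * storage_per_gb_month
one_full_readout = data_gb * egress_per_gb    # reading the whole lake back once

print(f"Loading {data_tb} TB in: ~${initial_load:,.0f}")
print(f"Keeping {data_tb} TB at rest: ~${monthly_storage:,.0f} per month")
print(f"Pulling {data_tb} TB back out once: ~${one_full_readout:,.0f}")

Under these assumed rates, a single full read-out costs several months’ worth of at-rest storage, which is exactly the trap that pushes users toward over-provisioned on-premises alternatives.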

Companies have discovered the notion of “data gravity.” As the quantity of data grows, it gains inertia: it becomes harder and more expensive to pull out of the cloud, and it changes as it goes through different iterations and transformations. As a result, organizations are trying to avoid moving data after it has been stored. They want it to be “hot” from an analytics perspective yet “cold” from a storage perspective. Unfortunately, traditional Hadoop deployments don’t give them that flexibility.
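
As an illustration of what that decoupling can look like, the sketch below uses PySpark with the Hadoop s3a connector to query data that rests in an S3-compatible object store instead of in HDFS on the compute nodes. The endpoint, bucket, and credentials are hypothetical placeholders, and the hadoop-aws connector jars are assumed to be on the cluster’s classpath; this is a sketch of the pattern, not a reference deployment.

# Minimal sketch: pointing a Spark job at S3-compatible object storage
# via the Hadoop s3a connector, so compute and storage scale separately.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("object-storage-analytics")
    # s3a settings; the endpoint could be any S3-compatible object store,
    # on-premises or in the cloud (all values here are placeholders)
    .config("spark.hadoop.fs.s3a.endpoint", "https://objectstore.example.com")
    .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY")   # placeholder
    .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY")   # placeholder
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# The data stays "cold" in the object store; only the query's working set
# flows through the compute cluster, which can be resized or torn down
# independently of where the data lives.
events = spark.read.parquet("s3a://datalake/events/")
daily = events.groupBy("event_date").count()
daily.show()

Because the data never has to be rehosted onto Hadoop data nodes, it can be “hot” for a query like this one while remaining “cold” and inexpensive at rest the remainder of the time.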

In addition, having many smaller data “swamps” only compounds the problem. Users end up with “Hadoop sprawl,” buying and managing many different Hadoop clusters specialized to handle different kinds of analytics – again incurring high costs, with the added complications of rigid, hardwired clusters and frequent duplication of the data.