Global Big Data Conference


10 Big Data Trends to Watch in 2019

Posted on: Jan 22, 2019

We seek ever more data for a good reason: it’s the commodity that fuels digital innovation. However, turning those huge data collections into actionable insight remains a difficult proposition. Organizations that find solutions to formidable data challenges will be better positioned to economically benefit from the fruits of digital innovation.

With that basic premise in mind, here are 10 trends in big data that forward-looking organizations should look out for in 2019:

1. Data Management Is Still Hard

The big idea behind big data analytics is fairly clear-cut: Find interesting patterns hidden in large amounts of data, train machine learning models to spot those patterns, and deploy those models into production to act on them automatically. Rinse and repeat as necessary.

However, putting that basic recipe into production is a lot harder than it looks. For starters, amassing data from different silos (see prediction #2) is difficult and requires ETL and database skills. Cleaning and labeling the data for machine learning training also takes a lot of time and money, particularly when deep learning techniques are used. And finally, putting such a system into production at scale in a secure and reliable fashion requires another set of skills entirely.

For these reasons, data management remains a big challenge, and data engineers will continue to be among the most sought-after personas on the big data team.

2. Data Silos Continue Proliferating

This is not a difficult prediction to make. During the Hadoop boom five years ago, we were entranced with the idea that we could consolidate all of our data – for both analytical and transactional workloads – onto a single platform.

That idea never really panned out, for a variety of reasons. The biggest challenge is that different data types have different storage requirements. Relational databases, graph databases, time-series databases, HDFS, and object stores all have their respective strengths and weaknesses. Developers can’t make the most of those strengths if they’ve crammed all their data into a one-size-fits-all data lake.

In some cases, amassing lots of data into a single place does make sense. Cloud data stores like S3, for instance, are providing companies with flexible and cost-effective storage, and Hadoop continues to be a cost-effective platform for unstructured data storage and analytics. But for most companies, these are simply additional silos that must be managed. They’re big and important silos, of course, but they’re not the only ones.

In the absence of a strong centralizing force, data silos will continue to proliferate. Get used to it.
