Hadoop Guru: How Hadoop Fits in IT Divisions

As a staging layer for analytics

In this scenario, the data is processed and filtered in Hadoop clusters, then fed downstream to a traditional large data warehouse, OLAP cubes or an in-memory analytics platform.

As supplemental storage for an enterprise data warehouse platform

You can use Hadoop clusters as an additional storage layer behind an enterprise data warehouse (EDW) or data mart. In fact, Hive is an example of a data warehouse infrastructure that’s built on top of Hadoop clusters.

As an acquisition and staging layer for unstructured content

This is one way companies are using Hadoop to filter through social media “noise” to find the good stuff that’s worth knowing about. To do that, you’ll need to couple Hadoop with a sentiment engine.
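As a rough illustration of that pairing, the sketch below keeps only posts that mention a brand and carry a non-neutral sentiment score. Everything in it is an assumption made for illustration: the word lists, the `sentiment_score` and `filter_noise` helpers, and the brand name are hypothetical, and a real sentiment engine would be far more sophisticated than this toy lexicon.

```python
import re

# Hypothetical toy lexicon standing in for a real sentiment engine.
POSITIVE = {"love", "great", "fast"}
NEGATIVE = {"hate", "slow", "broken"}


def sentiment_score(text):
    """Return (# positive words) - (# negative words) in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def filter_noise(posts, brand):
    """Keep posts that mention the brand and are not sentiment-neutral."""
    kept = []
    for post in posts:
        if brand.lower() not in post.lower():
            continue  # off-topic: treated as noise
        score = sentiment_score(post)
        if score != 0:
            kept.append((post, score))
    return kept
```

In a Hadoop job, the body of `filter_noise` would run as the map step over each chunk of posts, so the filtering scales out across the cluster.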

As an ETL tool

Hadoop doesn’t just handle large amounts of data; it’s also fast, because it spreads the work across many machines in parallel. Already, some companies are generating so much data in a day that their ETL (extract, transform, load) solutions need longer than 24 hours to process it. By using Hadoop and MapReduce to perform the ETL process, they’re able to significantly reduce the time it takes to process that data.
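To make the MapReduce idea concrete, here is a minimal sketch that simulates the map, shuffle and reduce phases locally in plain Python. It is not Hadoop's API: the `mapper`, `reducer` and `run_job` names and the `date,customer,amount` record layout are assumptions chosen for illustration. On a real cluster, the mapper and reducer would run in parallel on many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def mapper(line: str) -> Iterator[Tuple[str, float]]:
    """Extract + transform: parse 'date,customer,amount', skip bad rows."""
    parts = line.strip().split(",")
    if len(parts) != 3:
        return  # malformed row: wrong number of fields
    date, _customer, amount = parts
    try:
        yield date, float(amount)
    except ValueError:
        return  # malformed row: amount is not numeric


def reducer(key: str, values: Iterable[float]) -> Tuple[str, float]:
    """Aggregate: total amount per date, ready to load downstream."""
    return key, round(sum(values), 2)


def run_job(lines: Iterable[str]) -> Dict[str, float]:
    """Simulate map -> shuffle (group by key) -> reduce on one machine."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(k, v) for k, v in sorted(groups.items()))
```

Because each input line is mapped independently and each key is reduced independently, the same logic can be split across as many machines as the data requires, which is where the time savings come from.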

As an exploration engine

What’s cool about Hadoop in this situation is that you can add new data to existing data without having to reindex the entire cluster.

As an archive for historical data

Sometimes, you want to archive data, but you also want to be able to access it without the hassle of sending for and uploading the archives. Hadoop allows you to store large amounts of historical data without the tapes, giving you access to that data at any time.

As an enterprise search solution

If you really want to search all your enterprise data, build an indexing infrastructure on top of Hadoop. It scales easily, so it will grow as your data grows. Plus, thanks to the distributed parallel architecture, it’ll be fast, according to Cloudera.
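The core of such an indexing infrastructure is usually an inverted index, which maps each term to the documents that contain it. The sketch below builds one in memory; the `build_index` and `search` helpers and the document IDs are hypothetical, and a production setup would build the index with distributed jobs rather than a single loop.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents that contain every term in the query."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

Building the index is a natural fit for the map and reduce pattern: the map step emits (term, document ID) pairs and the reduce step collects the IDs for each term.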

As a data sandbox

Data warehouses are big but unwieldy, which means that if you want to put something in them, you need a plan. Hadoop is much more flexible, so some companies are using it to create a data sandbox where users can play with the data, and then, if they find something worthwhile, add that query to the data warehouse. This use case should appeal to any company striving to be more “data-driven.”

Of course, there are other great use cases that apply across many industries, including building a recommendation engine or using Hadoop to evaluate customer churn.

References: Post from Loraine Lawson

Wednesday, 11 September 2013
