Information Age: News, analysis & insight for IT & business leaders

The foundations of big data

18 April 2011  

How Google, Yahoo! and Facebook laid the groundwork for 'big data' analytics

Hadoop
In 2003 and 2004, Google published papers detailing its file system and its cluster computing algorithm. Software engineer Doug Cutting used these papers to create an open source framework for data-intensive, scalable, distributed computing, named Hadoop. The framework was given an additional boost in June 2009, when a prominent user, Yahoo!, publicly released the source code of the version of Hadoop it runs in its data centres. The next version of Apache Hadoop, expected later in 2011, aims to improve the utilisation, scheduling and management of resources.

MapReduce
A Google-patented software framework for distributed processing of large data sets on compute clusters. Hadoop MapReduce is an open source volunteer project under the Apache Software Foundation that was inspired by Google’s original paper.
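The programming model behind MapReduce can be illustrated on a single machine. The sketch below is a minimal Python illustration of the map, shuffle and reduce steps applied to a word count; it is not Google's or Hadoop's code, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical. A real framework would run the map and reduce steps in parallel across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence (a cluster would parallelise them)."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data on clusters"]))
```

Because each map call looks at only one record and each reduce call at only one key, the work can be divided among many machines without the steps needing to coordinate with one another.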

NoSQL
A set of non-relational database management systems, popular with many big data advocates. Cassandra is one example, and was developed by social networking giant Facebook to handle its Inbox search feature. It is designed to handle massive amounts of data spread out across many commodity servers while providing a highly available service with no single point of failure.
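One common way to spread data across many servers without a single point of failure is to place the servers on a hash ring and store each item on several consecutive servers. The Python sketch below illustrates that general idea only; it is not Cassandra's implementation, and the Ring class, the token function and the server names are all hypothetical.

```python
import bisect
import hashlib


def token(value):
    """Hash a string to an integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: one position per server, each key stored on several servers."""

    def __init__(self, nodes, replicas=3):
        if not nodes:
            raise ValueError("at least one server is required")
        self.replicas = min(replicas, len(nodes))
        # Sorted (position, server) pairs define the ring order.
        self.tokens = sorted((token(node), node) for node in nodes)

    def nodes_for(self, key):
        """Return the servers that hold copies of this key."""
        positions = [position for position, _ in self.tokens]
        # First server clockwise from the key's position, wrapping around.
        start = bisect.bisect_right(positions, token(key)) % len(self.tokens)
        return [
            self.tokens[(start + i) % len(self.tokens)][1]
            for i in range(self.replicas)
        ]


if __name__ == "__main__":
    ring = Ring(["server-a", "server-b", "server-c", "server-d"])
    print(ring.nodes_for("inbox:alice"))
```

Any server can work out where a key lives from the hash alone, so no central directory is needed, and losing one server still leaves the other copies of its data available.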

Hive
Another Facebook contribution to the big data toolset, Hive is a data warehouse infrastructure built on top of Hadoop. It provides tools for easy data extraction, transformation and loading (ETL), and imposes a structure on the data that makes it possible to query and analyse the large volumes of data stored in Hadoop files. It was originally designed to help Facebook deal with explosive growth in its multi-petabyte data warehouse.

