Data Science, Big Data and MapReduce

To start a journey into the world of Data Science and Analytics, it helps to understand the most touted terms in the field. The paper Data Science and Distributed Intelligence provides a good explanation of the often-used terms Big Data, Data Science and MapReduce. Here is my synopsis of these terms, based on a reading of this paper:

Big Data refers to the proliferation of large amounts of data held in huge databases, together with the very high-rate data streams that produce it. Examples include web server logs generated by user clicks, cell phone logs, Twitter feeds and Facebook comments. An interesting fact is that 90% of Big Data was produced in the last 2 years alone.

Data Science is a term that describes the process of gathering data, analyzing it and extracting information from it to produce a “data product”. Such data may involve large unstructured data sets, hence the term ‘Data Science’ is often used in conjunction with Big Data. Processing techniques for Big Data often build on the functional programming operations map and reduce, which gave rise to the MapReduce programming model, popularized by Google.

MapReduce is a programming model that helps mitigate the problem of processing Big Data and large data sets by breaking the processing down into smaller, more manageable chunks that can be worked on in parallel.
Hadoop is a popular open source implementation of MapReduce.
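
To make the idea concrete, here is a minimal single-machine sketch of the map, shuffle and reduce steps, using word counting as the example. It is not taken from the paper and it is not how Hadoop is invoked; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample click-log chunks are my own assumptions for illustration. In a real cluster each chunk would be mapped on a different machine.

```python
from collections import defaultdict
from functools import reduce


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all the values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(chunks):
    # Each chunk could be mapped independently; here we simply loop locally.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical log lines standing in for a large data set split into chunks.
chunks = ["click home click", "click cart", "home cart cart"]
counts = word_count(chunks)
print(counts)
```

The point of the split is that map_phase only ever sees one chunk and reduce_phase only ever sees one key, so both steps can be spread across many machines without coordinating with each other.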

One thought on “Data Science, Big Data and MapReduce”

  1. Very interesting read. I like that the description/definition of “data science” implies that these analytical tools can apply to all kinds of data, “big/unstructured” or otherwise. Many theoretical and applied statisticians are quick to point out that these analytical tools have been/are being used on “small/structured” data quite regularly. Perhaps this is thinly veiled turf protection? 🙂

    It’s shocking to hear that 90% of big data was generated in the past 2 years. We can only expect this trend to accelerate. As a fan of big data & data analytics, I’m intrigued by the exciting opportunities that await us – to pose and solve new problems.

    I look forward to reading and learning more from this blog. Thanks!
