MapReduce translations, from skyscrapers to Hadoop clusters

In our skyscraper analogy, the work the fire wardens did in getting the per-suite, per-platform handset count would be the Map step in our job, and the work the fire wardens on the 10th, 20th and 30th floors did, in calculating their final platform tallies, would be the Reduce step. A Map step and a Reduce step constitute a MapReduce job. Got that?

Let’s keep going. For the building, the collection of suite numbers and smartphone types in each would represent the keys and values in our input file. We split/partitioned that file into a smaller one for each floor which, just like the original input file, would have suite number as the key and smartphone platform data as the value. For each suite, our mappers (the fire wardens on each floor) created output data with the smartphone platform name as key and the count of handsets for that suite and platform as the value. So the mappers produce output which eventually becomes the Reduce step’s input.

(Full Story: MapReduce translations, from skyscrapers to Hadoop clusters)

Advertisements

No comments yet... Be the first to leave a reply!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: