Hadoop Sorts a Petabyte in 16.25 Hours and a Terabyte in 62 Seconds

We used Apache Hadoop to compete in Jim Gray’s Sort benchmark. The benchmark is a set of related benchmarks, each with its own rules. All of them measure the time to sort different numbers of 100-byte records, where the first 10 bytes of each record are the key and the rest is the value. The minute sort must finish end to end in less than a minute. The Gray sort must sort more than 100 terabytes and must run for at least an hour.
(Link: Hadoop Sorts a Petabyte in 16.25 Hours and a Terabyte in 62 Seconds)
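
To make the record format concrete, here is a minimal in-memory sketch in Python. It is not the Hadoop code used for the benchmark runs. It only shows what sorting 100-byte records on a 10-byte key looks like. The function names are hypothetical, and the byte-wise key comparison is an assumption, since the text above does not say how keys are ordered.

```python
from typing import List

RECORD_SIZE = 100  # bytes per record, as stated in the benchmark description
KEY_SIZE = 10      # leading bytes of each record that form the key


def split_records(data: bytes) -> List[bytes]:
    """Cut a byte string into fixed-size 100-byte records."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("input length is not a multiple of the record size")
    return [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]


def sort_records(data: bytes) -> bytes:
    """Return the records ordered by key (hypothetical helper).

    Assumption: keys compare byte by byte, which is how Python orders bytes.
    """
    records = split_records(data)
    # Only the first 10 bytes take part in the comparison; the value
    # (the remaining 90 bytes) travels along with its key.
    records.sort(key=lambda record: record[:KEY_SIZE])
    return b"".join(records)


def is_sorted(data: bytes) -> bool:
    """Check that keys appear in non-decreasing order (hypothetical helper)."""
    keys = [record[:KEY_SIZE] for record in split_records(data)]
    return all(a <= b for a, b in zip(keys, keys[1:]))
```

Calling `sort_records` on a buffer of concatenated records returns a buffer of the same length with the records reordered, and `is_sorted` can then confirm the result. A real benchmark run differs mainly in scale: the data does not fit in one machine's memory, so the sort has to be partitioned across many nodes.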
