We used Apache Hadoop to compete in Jim Gray’s Sort benchmark. Jim Gray’s sort benchmark is a set of related benchmarks, each with its own rules. All of the sort benchmarks measure the time to sort different numbers of 100-byte records. The first 10 bytes of each record are the key and the rest is the value. The minute sort must finish end to end in less than a minute. The Gray sort must sort more than 100 terabytes and must run for at least an hour.
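
To make the record layout concrete, here is a minimal in-memory sketch in Python of sorting 100-byte records by their 10-byte key. It is not the Hadoop implementation used for the benchmark. The function names (`split_records`, `sort_records`, `make_random_records`) are hypothetical, the random input stands in for real benchmark data, and comparing keys byte by byte is an assumption of this sketch.

```python
import os
from typing import List

RECORD_SIZE = 100  # bytes per record, as described above
KEY_SIZE = 10      # leading bytes of each record that form the key


def split_records(data: bytes) -> List[bytes]:
    """Cut a byte string into fixed-size 100-byte records."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("input length is not a multiple of the record size")
    return [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]


def sort_records(data: bytes) -> bytes:
    """Sort records by their first 10 bytes (assumed byte-wise ordering)."""
    records = split_records(data)
    # The remaining 90 bytes are the value and do not take part in the comparison.
    records.sort(key=lambda record: record[:KEY_SIZE])
    return b"".join(records)


def make_random_records(count: int) -> bytes:
    """Hypothetical helper: generate `count` random records for illustration."""
    return os.urandom(count * RECORD_SIZE)


if __name__ == "__main__":
    unsorted = make_random_records(1000)
    result = sort_records(unsorted)
    keys = [record[:KEY_SIZE] for record in split_records(result)]
    print("sorted:", keys == sorted(keys))
```

At benchmark scale the data does not fit in memory on one machine, so a real run distributes the records across many nodes. This sketch only shows the key/value split that every variant of the benchmark shares.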
(Link: Hadoop Sorts a Petabyte in 16.25 Hours and a Terabyte in 62 Seconds)


May 14, 2009
