Do you care more about Top Supercomputers in China and at the NSA, or Massive Clusters at Google, Facebook, Microsoft, and Amazon?

News reports say China now holds the world record for the fastest supercomputer.

The ten fastest supercomputers on the planet, in pictures

Chinese supercomputer clocks in at 33.86 petaflops to break speed record.

A Chinese supercomputer known as Tianhe-2 was today named the world's fastest machine, nearly doubling the previous speed record with its performance of 33.86 petaflops. Tianhe-2's ascendance was revealed in advance and was made official today with the release of the new Top 500 supercomputer list.

The media will gladly write about who has the biggest and most powerful supercomputer.

As a friend of mine who has worked on supercomputer data centers said, we realized we could reduce a lot of costs in the data center, because the supercomputer would often have weekly maintenance intervals, as well as monthly and quarterly ones. Components are constantly failing, and yes, there is a degree of isolation in the failures, but eventually you need to repair them, which can mean a complete shutdown. These shutdowns are when data center maintenance can be performed.

But at Google, Facebook, Microsoft, and Amazon there is no time to shut down services. Hundreds of thousands of servers need to run all the time.

Amazon submitted a supercomputer entry in 2011, and as of the June 2013 list it still ranks 127th.

List | Rank | System | Vendor | Total Cores | Rmax (TFlops) | Rpeak (TFlops) | Power (kW)
06/2013 | 127 | Amazon EC2 Cluster, Xeon 8C 2.60GHz, 10G Ethernet | Self-made | 17,024 | 240.1 | 354.1 |
11/2012 | 102 | Amazon EC2 Cluster, Xeon 8C 2.60GHz, 10G Ethernet | Self-made | 17,024 | 240.1 | 354.1 |
06/2012 | 72 | Amazon EC2 Cluster, Xeon 8C 2.60GHz, 10G Ethernet | Self-made | 17,024 | 240.1 | 354.1 |
11/2011 | 42 | Amazon EC2 Cluster, Xeon 8C 2.60GHz, 10G Ethernet | Self-made | 17,024 | 240.1 | 354.1 |
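
For anyone who wants to play with the numbers in these rows, here is a rough Python sketch. It only uses the Rmax, Rpeak, and rank figures listed above; the function names are my own, and the "efficiency" figure is simply Rmax divided by Rpeak, not anything Amazon has published.

```python
# Rmax and Rpeak (TFlops) as listed in the rows above for the Amazon EC2 cluster.
RMAX_TFLOPS = 240.1
RPEAK_TFLOPS = 354.1

# (list edition, rank) pairs from the same rows, oldest first.
RANK_HISTORY = [("11/2011", 42), ("06/2012", 72), ("11/2012", 102), ("06/2013", 127)]


def benchmark_efficiency(rmax, rpeak):
    """Fraction of theoretical peak (Rpeak) achieved on the benchmark (Rmax)."""
    return rmax / rpeak


def rank_drops(history):
    """Places lost between consecutive list editions."""
    return [later[1] - earlier[1] for earlier, later in zip(history, history[1:])]


if __name__ == "__main__":
    efficiency = benchmark_efficiency(RMAX_TFLOPS, RPEAK_TFLOPS)
    print(f"Efficiency: {efficiency:.1%}")
    print("Places lost per list:", rank_drops(RANK_HISTORY))
```

The same unchanged entry slid 85 places in a year and a half, which says more about how fast the list moves than about the cluster itself.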

Can you imagine if Google, Facebook, Microsoft, or Amazon put up their clusters as an entry?

Part of the advantage companies like Google have is that teams of people, led by engineers like Jeff Dean, think really hard about compute clusters. Here is a presentation Dean gave four years ago.

Google, Facebook, Microsoft, and Amazon are solving the problem of keeping supercomputer-class performance running 24x7x365. I think this type of innovation affects us much more than who has the fastest supercomputer, which requires hundreds of hours of downtime for maintenance.