James Hamilton analyzes Google’s Data Center publication.
A Small Window into Google's Data Centers
Google has long enjoyed a reputation for running efficient data centers. I suspect this reputation is largely deserved but, since it has been completely shrouded in secrecy, that’s largely been a guess built upon respect for the folks working on the infrastructure team rather than anything that’s been published. However, some of the shroud of secrecy was lifted last week and a few interesting tidbits were released in Google’s Commitment to Sustainable Computing.
On server design (Efficient Servers), the paper documents the use of high-efficiency power supplies and voltage regulators, and the removal of components not relevant in a service-targeted server design. A key point is the use of efficient, variable-speed fans. I’ve seen servers that spend as much as 60W driving the fans alone. Using high-efficiency fans running at the minimum speed necessary for the current heat load can bring big savings. An even better approach is employed by Rackable Systems in their ICE Cube Modular Data Center design (First Containerized Data Center Announcement) where they eliminate server fans entirely.
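To get a feel for the scale of those savings, here’s a minimal sketch assuming the common fan affinity law (fan power falls roughly with the cube of fan speed) and reusing the 60W per-server fan figure above. The specific speed points are illustrative assumptions, not numbers from the paper.

```python
# Rough sketch of variable-speed fan savings, assuming the fan affinity law
# (fan power scales roughly with the cube of fan speed) and the 60W per-server
# fan budget mentioned above. The speed points below are illustrative only.

FULL_SPEED_FAN_POWER_W = 60.0   # fan power at 100% speed (from the example above)

def fan_power_at_speed(speed_fraction: float) -> float:
    """Estimate fan power at a given speed fraction using the cube law."""
    return FULL_SPEED_FAN_POWER_W * speed_fraction ** 3

for speed in (1.0, 0.8, 0.6, 0.4):
    print(f"{speed:.0%} speed -> {fan_power_at_speed(speed):5.1f} W")
# 100% -> 60.0 W, 80% -> 30.7 W, 60% -> 13.0 W, 40% -> 3.8 W
```

Even at 60% speed, the cube law suggests the fans draw only around a fifth of their full-rated power, which is why running at the minimum necessary speed matters so much.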
The parts I liked about James’s post: #1 – he discusses water.
It’s good to see water conservation brought up beside energy efficiency. It’s the next big problem for our industry and the consumption rates are prodigious. To achieve efficiency, most centers have cooling towers, which allow them to avoid the use of energy-intensive direct-expansion chillers except under unusually hot and humid conditions. This is great news from an energy efficiency perspective, but cooling towers consume water in two significant ways. The first is evaporative loss, which is hard to avoid in wet tower designs (other, less water-intensive designs exist) and must be replaced with make-up water. The second is caused by the first: as water evaporates from the closed system, the concentrations of dissolved solids and other contaminants present in the supply water continue to rise. These high concentrations are periodically dumped from the system to protect it, and this dumped water is referred to as blow-down water. Between make-up and blow-down water, a medium-sized, 10MW facility built to current industry conventions can go through ¼ to ½ million gallons of water a day.
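As a rough illustration of where numbers like that come from, here’s a back-of-envelope sketch. The PUE of 1.7, the 3 cycles of concentration, and the assumption that all facility heat is rejected evaporatively are my assumptions, not figures from the paper; only the 10MW facility size comes from the text above.

```python
# Back-of-envelope sketch of cooling tower water use for the 10MW example above.
# Assumptions (mine, not the paper's): all facility heat is rejected by
# evaporation, a conventional-facility PUE of ~1.7, and 3 cycles of concentration.

IT_LOAD_W = 10e6                 # 10 MW of IT load (from the example above)
PUE = 1.7                        # assumed conventional-facility PUE
LATENT_HEAT_J_PER_KG = 2.26e6    # latent heat of vaporization of water
CYCLES_OF_CONCENTRATION = 3      # assumed; sets the blow-down rate
LITERS_PER_GALLON = 3.785

heat_rejected_w = IT_LOAD_W * PUE                    # total heat to the towers
evap_kg_per_s = heat_rejected_w / LATENT_HEAT_J_PER_KG
evap_l_per_day = evap_kg_per_s * 86_400              # 1 kg of water ~ 1 liter
blowdown_l_per_day = evap_l_per_day / (CYCLES_OF_CONCENTRATION - 1)
total_gal_per_day = (evap_l_per_day + blowdown_l_per_day) / LITERS_PER_GALLON

print(f"~{total_gal_per_day:,.0f} gallons/day")      # roughly a quarter million
```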
The paper describes a plan to address this problem in the future by moving to recycled water sources. This is good to see, but I argue the industry needs to reduce overall water consumption, whether the source is fresh or recycled. Higher data center temperatures and aggressive use of air-side economization are both good steps in that direction, and industry-wide we’re all working hard on new techniques and approaches to reduce water consumption.
Then #2 – PUE.
The section on PUE is the most interesting in that they are documenting an at-scale facility running at a PUE of 1.13 during a quarter. Generally, you want full-year numbers since these numbers are very load- and weather-dependent. The best annual number quoted in the paper is 1.15, which is excellent. That means that for every watt delivered to the servers, only 0.15W is lost in power distribution and cooling.
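For clarity, here’s the PUE arithmetic as a tiny sketch; the 10MW IT load is a hypothetical figure used only for illustration.

```python
# Minimal sketch of the PUE arithmetic: PUE = total facility power / IT power,
# so a PUE of 1.15 means 0.15W of distribution and cooling overhead for every
# watt delivered to the servers. The 10MW IT load is an illustrative figure.

def pue(total_facility_kw: float, it_load_kw: float) -> float:
    return total_facility_kw / it_load_kw

it_load_kw = 10_000.0                      # hypothetical 10MW of servers
overhead_kw = 1_500.0                      # distribution + cooling losses
print(pue(it_load_kw + overhead_kw, it_load_kw))   # 1.15
```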
This number, with pure air-side cooling and good overall center design, is quite attainable. But, elsewhere in the document, they describe the use of cooling towers. Attaining a PUE of 1.15 with a conventional water-based cooling system is considerably more difficult. On the power distribution side, conventional designs waste about 8% to 9% of the power delivered. A rough breakdown of where it goes: three transformers taking 115kV down to 13.2kV, down to 480V, and then down to 208V for delivery to the load. Good transformer designs run around 99.7% efficiency. The uninterruptible power supply can be as poor as 94%, and roughly 1% is lost in switching and conductors. That approach gets us to roughly 8% lost in distribution. We can eliminate one layer of transformers and use a high-efficiency bypass UPS. Let’s use 97% efficiency for the UPS. Those two changes will get us to 4% to 5% lost in distribution. Let’s assume we can reliably hit 5% power distribution losses. That leaves 10% for all the losses in the mechanical systems. Powering the Computer Room Air Handlers, the water pumps, etc. at only 10% overhead would be both difficult and more impressive.
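The distribution-loss figures above come from chaining the component efficiencies together. Here’s a small sketch of that calculation using the efficiencies quoted in the paragraph; the chaining itself is my reconstruction, not a breakdown from the paper.

```python
# Sketch of the distribution-loss arithmetic in the paragraph above. The
# component efficiencies are the ones quoted there; the chaining is mine.

def distribution_loss(*efficiencies: float) -> float:
    """Fraction of power lost across a chain of components."""
    delivered = 1.0
    for eff in efficiencies:
        delivered *= eff
    return 1.0 - delivered

# Conventional design: three 99.7% transformers, a 94% UPS, ~1% lost in
# switching and conductors.
conventional = distribution_loss(0.997, 0.997, 0.997, 0.94, 0.99)
print(f"conventional: {conventional:.1%}")   # ~7.8%, i.e. roughly 8%

# Improved design: drop one transformer stage, use a 97% bypass UPS.
improved = distribution_loss(0.997, 0.997, 0.97, 0.99)
print(f"improved:     {improved:.1%}")       # ~4.5%, i.e. roughly 4% to 5%
```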
The 1.15 PUE with pure air-side economization in the right climate looks quite reasonable, but powering a conventional, high-scale, air and water, multi-conversion cooling system at this efficiency looks considerably harder to me. Unfortunately, there is no data published in the paper on the approach and whether it was simply attained by relying on favorable weather conditions and air-side economization with the water loops idle.
And last, #3 – the efficiency claim.
The paper concludes that “if all data centers operated at the same efficiency as ours, the U.S. alone would save enough electricity to power every household within the city limits of Atlanta, Los Angeles, Chicago, and Washington, D.C.”. This is hard to independently verify without much more information than the paper offers. Most of the techniques employed are not discussed in the paper published last week. If the large service providers like Google, Microsoft, Yahoo, Baidu, Amazon and a handful of others don’t publish the details, the rest of the world’s data centers will never run as efficiently as described in the paper. Only high-scale data center operators can afford the R&D programs needed to drive up efficiency and drive down water consumption. I’m arguing it’s up to all of us working in the industry to publish the details and allow smaller-scale deployments to operate at similar efficiency levels. If we don’t, it’ll continue to be the case that US data centers alone will needlessly waste enough power to supply every household in Atlanta, Los Angeles, Chicago, and Washington DC. Each day, every day.
Hopefully other press and bloggers will read James’s post and offer perspectives beyond Google’s marketing efforts.