Modeling the Path to Higher Efficiency Servers - PUE for Servers?

James Hamilton has a good post on the next point of server differentiation being efficiency at very high temperature.

Next Point of Server Differentiation: Efficiency at Very High Temperature

High data center temperatures are the next frontier for server competition (see pages 16 through 22 of my Data Center Efficiency Best Practices talk: http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_Google2009.pdf and 32C (90F) in the Data Center). At higher temperatures the difference between good and sloppy mechanical designs is much more pronounced and needs to be a purchasing criterion.

The infrastructure efficiency gains of running at higher temperatures are obvious. In a typical data center 1/3 of the power arriving at the property line is consumed by cooling systems. Large operational expenses can be avoided by raising the temperature set point. In most climates raising data center set points to the 95F range will allow a facility to move to a pure air-side economizer configuration, eliminating 10% to 15% of the overall capital expense, with the latter number being the more typical.
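
To put rough numbers on that 1/3 figure, here is a back-of-the-envelope sketch of what eliminating mechanical cooling does to a facility's PUE. The 10% distribution overhead and the 5% economizer fan load are my assumptions for illustration, not measured values:

```python
# Back-of-the-envelope: what moving to air-side economization does to PUE.
# Assumes cooling consumes 1/3 of total facility power (James's figure);
# the 10% "other overhead" and 5% economizer fan load are my assumptions.
total_power_kw = 1000.0
cooling_kw = total_power_kw / 3.0            # chillers, CRACs, pumps
other_overhead_kw = 0.10 * total_power_kw    # power distribution, lighting
it_load_kw = total_power_kw - cooling_kw - other_overhead_kw

pue_before = total_power_kw / it_load_kw
# Economizers still need fans and filters, so keep a small cooling load.
economizer_kw = 0.05 * total_power_kw
pue_after = (it_load_kw + other_overhead_kw + economizer_kw) / it_load_kw

print(f"PUE before: {pue_before:.2f}, after: {pue_after:.2f}")
# -> PUE before: 1.76, after: 1.26 (with these assumed numbers)
```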

James gives three downsides to higher temperatures.

These savings are substantial and exciting. But, there are potential downsides: 1) increased server mortality, 2) higher semiconductor leakage current at higher temperatures, 3) increased air movement costs driven by higher fan speeds at higher temperatures. The first, increased server mortality, has very little data behind it. I’ve seen some studies that confirm higher failure rates at higher temperature and I’ve seen some that actually show the opposite. For all servers there clearly is some maximum temperature beyond which failure rates will increase rapidly. What’s unclear is what that temperature point actually is.

In my early career at HP I worked as a reliability engineer, stress testing equipment in extreme cold and heat and analyzing the failures. This problem also reminds me of one of the lessons I learned working in distribution logistics at HP and Apple: it is cost-prohibitive to design the 99.9999% packaging to ship things, and you need to strike the right balance depending on what you are shipping and its value.

Intel, AMD, and disk drive vendors will discuss their energy efficiency, but just like packaging design, thermal efficiency is not sexy and not what people think about when they consider energy efficiency.

The complexity of this is huge.

We also know that the knee of the curve where failures start to get more common is heavily influenced by the server components chosen and the mechanical design. Designs that cool more effectively will operate without negative impact at higher temperatures. We could try to understand all details of each server and try to build a failure prediction model for different temperatures but this task is complicated by the diversity of servers and components and the near complete lack of data at higher temperatures.
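
For intuition on what such a failure prediction model might start from, here is a minimal sketch using the Arrhenius acceleration relationship that reliability engineers commonly apply to temperature-driven failure rates. The 0.7 eV activation energy and the temperature points are illustrative assumptions, not measured server data, and the knee-of-the-curve behavior James describes is exactly what this simple model cannot capture:

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration(t_use_c, t_stress_c, activation_energy_ev=0.7):
    """Acceleration factor for failures at t_stress_c relative to t_use_c.

    activation_energy_ev is component-specific; 0.7 eV is a common
    placeholder for silicon, not a measured value for any real server.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((activation_energy_ev / BOLTZMANN_EV)
                    * (1.0 / t_use_k - 1.0 / t_stress_k))

# How much faster might failures arrive at a 35C inlet vs. 20C?
print(f"{arrhenius_acceleration(20, 35):.2f}x")  # ~3.9x with these assumptions
```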

And here is where James totally gets it: he says we need models.

So, not being able to build a model, I chose to lean on a different technique that I’ve come to prefer: incent the server OEMs to produce the models themselves. If we ask the server OEMs to warrant the equipment at the planned operating temperature, we’re giving the modeling problem to the folks who have both the knowledge and the skills to model the problem faithfully and, much more importantly, the ability to change designs if they aren’t faring well in the field. The technique of transferring the problem to the party most capable of solving it and financially incenting them to solve it will bring success.

My belief is that this approach of transferring the risk, failure modeling, and field result tracking to the server vendor will control point 1 above (increased server mortality rate). We also know that the telecom world has been operating at 40C (104F) for years (see NEBS), so clearly equipment can be designed to operate correctly at these temperatures and to last longer than current servers are typically kept in service. This issue looks manageable.

How do you solve this problem?

One smart dev guy, Ade Miller, had a good answer, which I hope he’ll blog about soon: calculating a PUE for a desktop or server.

So it is more like an equipment PUE vs. a data center PUE.

What if server vendors started to publish their equipment PUE? What is the IT load of the motherboard, and what is the overhead of the power supply and fans? Would we be looking to buy the servers with the best PUE?
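
Here is a minimal sketch of how an equipment PUE could be defined, mirroring the data center ratio of total power to IT load. The function name and the component wattages are mine, for illustration only:

```python
def equipment_pue(motherboard_w, psu_loss_w, fan_w):
    """Total power drawn at the wall divided by 'useful' IT load.

    Mirrors data center PUE: treat the motherboard (CPU, RAM, I/O) as
    the IT load and the power supply losses and fans as overhead.
    """
    total_w = motherboard_w + psu_loss_w + fan_w
    return total_w / motherboard_w

# Illustrative numbers only -- not measurements from any real server.
print(f"{equipment_pue(motherboard_w=250, psu_loss_w=40, fan_w=30):.2f}")  # 1.28
```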

For those of you who understand power supplies, fans, and analog devices, this will make a lot of sense. Oh yeah, I was also the program manager for the Macintosh II power supplies, and I learned a lot from a great development team.

The Next Footprint - Water

WSJ.com has an article on the next footprint to watch – Water.

Yet Another 'Footprint' to Worry About: Water

Taking a Cue From Carbon Tracking, Companies and Conservationists Tally Hidden Sources of Consumption

By ALEXANDRA ALTER

It takes roughly 20 gallons of water to make a pint of beer, as much as 132 gallons of water to make a 2-liter bottle of soda, and about 500 gallons, including water used to grow, dye and process the cotton, to make a pair of Levi's stonewashed jeans.

Though much of that water is replenished through natural cycles, a handful of companies have started tracking such "water footprints" as a growing threat of fresh-water shortages looms. Some are measuring not just the water used to make beverages and cool factories, but also the gallons used to grow ingredients such as cotton, sugar, wheat, tea and tomatoes. The drive, modeled partly on carbon footprinting, a widely used measurement of carbon-dioxide emissions, comes as groundwater reserves are being depleted and polluted at unsustainable rates in many regions. Climate change has caused glaciers to shrink, eroding vital sources of fresh water. And growing global demand for food and energy is placing even more pressure on diminishing supplies.

[WSJ interactive graphic: see how a variety of common products stack up when it comes to water use.]

Two-thirds of the world's population is projected to face water scarcity by 2025, according to the United Nations. In the U.S., water managers in 36 states anticipate shortages by 2013, a General Accounting Office report shows. Last year, Georgia lawmakers tried, unsuccessfully, to move the state's border north so that Georgia could claim part of the Tennessee River.

Lately, water footprinting has gained currency among corporations seeking to protect their agricultural supply chains and factory operations from future water scarcity. Next week, representatives from about 100 companies, including Nike Inc., PepsiCo Inc., Levi Strauss & Co. and Starbucks Corp., will gather in Miami for a summit on calculating and shrinking corporate water footprints. In December, a coalition of scientists, companies and development agencies launched the Water Footprint Network, an international nonprofit…

They didn’t discuss data centers in this article, but I am sure someone will take notice soon.

I was surprised to see they discussed the use of models to understand the impact.

Water-management experts have started to build models for "water offset" projects so that beverage companies and other heavy water users can soften their impact by funding water sanitation and conservation projects. PepsiCo recently piloted a program to help rice farmers cultivating 4,000 acres in India switch from flood irrigation to direct seeding, a planting method that requires less water and makes crops more resilient to drought.

Some of the people I have met who focus on modeling software are busier than ever and understand that energy and water use need to be built into green data center models.

Modeling and Monitoring Top Mistake: Waiting for Risk to Be Eliminated

I recently had the chance to watch a situation where a person was presented with a strategic decision. I was presented with the same situation. My decision was made in seconds: decide what needs to be achieved and how priorities need to shift; the overall effects are thought through in minutes. There are unknowns: resolve some, ignore others, but keep moving.

The other person spent the first 24 hours wanting more information, and the following 24 hours evaluating the various alternatives. Then finally, after 72 hours, they decided on an action, and within 12 hours changed their mind again. As much time as the person spent collecting more information and analyzing the situation, they still could not make a decision. And anything they did now was a waste. They had missed the opportunity.

I was on a conference call last night (Sunday) where a group of us were discussing modeling, and it reminded me of how the group of us saw the opportunity and knew we needed to move fast. In my example above, we made decisions and started moving. The other approach, which wasn't present in the meeting, is "I need more information before I can make a decision. Let's reduce the risk."

I haven’t blogged much over the last week as I have been dealing with the crisis situation I mentioned in the first paragraph, but I’ll get back to blogging as soon as I resolve the situation.

Commoditization of Virtualization Technology, Time to Create a Model

One of the lessons I learned from Donald Trump is that there are only two things important to executives – does it work, and am I getting a deal?

Bottom line: out of all the complexity of green projects and all the various issues, there are only two things an executive wants to hear.

  1. Is it working?
  2. Did we get a good deal?

Anything else is not important.

With the current economy and the widespread deployment of virtualized solutions, virtualization has reached the stage of commoditization, and now users are looking for deals. Finding the best value requires looking at the virtualization solution holistically and adding up the total costs for the solution. The first place people look is the processor and virtualization software. Experienced hardware-oriented people know RAM, storage, I/O, and networking are next in costs and can have a dramatic impact on the overall performance of a virtualized solution. Next is the monitoring and management of the virtualized environment and how that data can be used to optimize the solution while meeting SLAs. With rising power costs and climate change, managing performance per watt is prudent.

A term used frequently for this approach is rightsizing. Keeping all these issues in mind, picking the right server hardware and software configurations has a huge effect on your efficiency. Whether you are in production for web, database, or applications, or in development and test, the configuration you pick and the combination of virtualized environments are now the big decisions that determine how efficient you are.
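
As a sketch of what that holistic comparison could look like, here is a toy rightsizing model. The configurations, prices, wattages, consolidation ratios, and utility rate are all made up for illustration:

```python
# Hypothetical rightsizing comparison: total cost per VM over three years
# and relative performance per watt for two server configurations.
POWER_COST_PER_KWH = 0.10  # assumed utility rate, USD
HOURS_3YR = 3 * 365 * 24

configs = [
    # name, capex USD, watts, VMs hosted, relative performance
    ("2-socket, 32 GB RAM", 6000, 350, 10, 100),
    ("2-socket, 96 GB RAM", 9000, 400, 25, 230),
]

for name, capex, watts, vms, perf in configs:
    energy_cost = watts / 1000 * HOURS_3YR * POWER_COST_PER_KWH
    total = capex + energy_cost
    print(f"{name}: ${total / vms:,.0f}/VM over 3 yrs, "
          f"{perf / watts:.2f} perf/watt")
```

With these made-up numbers the larger-memory box wins on both measures, which is the usual rightsizing result: RAM, not CPU, is the bottleneck for consolidation.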

This exercise is analogous to picking your vehicle fleet and your method for loading those vehicles. Think like the experts at UPS: the vehicle you pick, the route chosen, and how it is loaded all affect the overall cost and service level. Speaking of UPS, they have an interesting white paper on category management. Category management is one approach to the commoditization of things.

Model
This last step enhances the category review step from the original eight-step process. The category review process has typically involved perhaps hundreds of work hours to complete. This step needs to be backed by decision support and modeling capabilities. Category managers need to be able to simulate category performance results from changes in various inputs – category strategies, definitions, roles and tactics.

UPS has added the step of modeling, which few others do.
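
In code, "simulate category performance results from changes in various inputs" can start as small as a parameterized what-if function. Everything below, including the price elasticity, is a made-up illustration rather than anything from the UPS paper:

```python
def category_performance(base_revenue, price_change_pct, promo_lift_pct,
                         margin_pct):
    """What-if model for one category: tweak inputs, see projected profit.

    A crude stand-in for the decision-support tooling the UPS paper
    describes; the elasticity of -1.5 is an assumed value.
    """
    ELASTICITY = -1.5  # assumed: 1% price increase -> 1.5% volume drop
    volume_change_pct = ELASTICITY * price_change_pct + promo_lift_pct
    revenue = (base_revenue
               * (1 + price_change_pct / 100)
               * (1 + volume_change_pct / 100))
    return revenue * margin_pct / 100

# Compare two candidate category strategies on projected profit.
print(category_performance(1_000_000, price_change_pct=2,
                           promo_lift_pct=0, margin_pct=30))
print(category_performance(1_000_000, price_change_pct=0,
                           promo_lift_pct=5, margin_pct=30))
```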

To get you started on virtual system modeling, check out this DMTF document.

The CIM system virtualization model, including CIM schema additions and a set of supporting profile documents, enables the management of system virtualization. Virtualization is a substitution process producing virtual resources which change aspects of the way consumers interact with the resources. These virtual resources are usually based on underlying physical resources, but they may have different properties or qualities. For example, virtual resources may have different capacities or sizes than the underlying resources or may have different qualities of service, such as improved performance or reliability. In system virtualization a host computer system provides the underlying resources that compose virtual computer systems and their constituent virtual devices.
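
To make the host/virtual-resource relationship in that description concrete, here is a toy sketch. The class and attribute names are mine, not actual CIM schema classes, and the 4:1 overcommit limit is an assumption for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class HostSystem:
    """Toy stand-in for a CIM-style host providing underlying resources."""
    cpu_cores: int
    memory_gb: int
    guests: list = field(default_factory=list)

    def allocate(self, name, vcpus, memory_gb):
        used_cpu = sum(g["vcpus"] for g in self.guests)
        used_mem = sum(g["memory_gb"] for g in self.guests)
        # Virtual resources are backed by physical ones but may differ in
        # size or quality -- here we just enforce simple capacity limits.
        if used_cpu + vcpus > self.cpu_cores * 4:  # assumed 4:1 overcommit
            raise ValueError("vCPU overcommit limit reached")
        if used_mem + memory_gb > self.memory_gb:
            raise ValueError("not enough physical memory")
        self.guests.append({"name": name, "vcpus": vcpus,
                            "memory_gb": memory_gb})

host = HostSystem(cpu_cores=16, memory_gb=128)
host.allocate("web-vm", vcpus=8, memory_gb=16)
host.allocate("db-vm", vcpus=16, memory_gb=64)
print(len(host.guests), "virtual systems on one host")
```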

Strategy of The Fighter Pilot, Modeling Techniques

John Boyd was an innovator in fighter pilot techniques. I found his work while looking for material on control theory and modeling.

Fast Company has an article about John Boyd’s work that is shorter than the book about his life, which I am also reading.

Why is this interesting?

It's all about rapid assessment and adaptation to a complex and rapidly changing environment that you can't control.

This is the challenge to building a green data center.  The issues are constantly changing, and there is much beyond your control.  What you can do is assess the situation and adapt.

The model helps unify the approach.  An example used is Toyota’s organizational approach.

Bower and Hout's classic example -- and one that Boyd also studied -- was Toyota, which designed its organization to speed information, decisions, and materials through four interrelated cycles: product development, ordering, plant scheduling, and production. Self-organized, multifunctional teams at Toyota, they observed, developed products and manufacturing processes in response to demand, turning out new models in just three years compared with Detroit's cycle of four or five.

Systems like Toyota's worked so well, Boyd argued, because of schwerpunkt, a German term meaning organizational focus. Schwerpunkt, Boyd wrote, "represents a unifying medium that provides a directed way to tie initiative of many subordinate actions with superior intent as a basis to diminish friction and compress time." That is, employees decide and act locally, but they are guided by a keen understanding of the bigger picture.

Good models allow you to see things you didn’t know were there.

A modeling approach has the potential to create a convergence of ideas, drive new innovation, and better educate the team.

John Boyd achieved this in fighter pilot techniques.

Boyd theorized that large organizations such as corporations, governments, or militaries possessed a hierarchy of OODA loops at tactical, grand-tactical (operational art), and strategic levels. In addition, he stated that most effective organizations have a highly decentralized chain of command that utilizes objective-driven orders, or directive control, rather than method-driven orders in order to harness the mental capacity and creative abilities of individual commanders at each level. In 2003, this power to the edge concept took the form of a DOD publication "Power to the Edge: Command...Control...in the Information Age" by Dr. David S. Alberts and Richard E. Hayes. Boyd argued that such a structure creates a flexible "organic whole" that is quicker to adapt to rapidly changing situations. He noted, however, that any such highly decentralized organization would necessitate a high degree of mutual trust and a common outlook that came from prior shared experiences. Headquarters needs to know that the troops are perfectly capable of forming a good plan for taking a specific objective, and the troops need to know that Headquarters does not direct them to achieve certain objectives without good reason.

Trust is a key ingredient to these ideas.  Modeling can be used to promote trust.

Note: the OODA (observe, orient, decide, act) loop is described here.
