Momentum Builds for What the Smart People Know: Moving Out of AWS Can Be a Good Move

July 26, 2014 Dave Ohara

AWS’s slowing growth is all over the news. Here are two different views of what is causing it.

The NYTimes’s Quentin Hardy says the problem is that AWS needs a bigger sales team for the business market.

What Amazon’s service does have is a great roster of named clients, and probably lots more companies that aren’t ready to admit somebody else runs their computers. It has an enormous cloud and a technical understanding of global-scale computing that is second to none. All it needs is a bigger sales team for businesses and a way to get its checks signed faster.

Zynga and Sony moved out of AWS years ago.

Lessons from Zynga & Sony on moving from Amazon AWS

Published on March 14, 2012 by Nati Shalom in Cloud, Cloud Computing, EC2, GigaSpaces, OpenStack, syndicated

Earlier this month Zynga announced its move from Amazon AWS to its own private Z-Cloud. Sony also started to move increasing parts of its workload from Amazon to Rackspace OpenStack.

There isn't so much in common between these different use cases, except for the fact that they may indicate the beginning of a trend (I’ll get back to that toward the end) where companies start to take more control over their cloud infrastructure.

So what really brought Zynga and Sony to make such a move?

Moz dumped AWS.

Moz Dumps Amazon Web Services, Citing Expense and ‘Lacking’ Service

[Updated, 1/31/14, 12:01 pm] Seattle marketing technology company Moz had a worse-than-expected 2013 in terms of profitability and products. But what really jumped out at me in the privately held company’s startlingly frank review of the year was new CEO Sarah Bird’s blunt criticism of Amazon Web Services (AWS), which she says the company is leaving for reasons of cost, product stability, and service.

Gigaom’s Barb Darrow says AWS’s problem is that companies are leaving, prices are dropping, and competition is intense.

First let’s start with the facts.

AWS sales dipped this quarter. Amazon announced Thursday that for its second quarter, which ended June 30, the category that includes AWS saw a 3 percent sequential revenue slip. That “other” category — which also includes advertising services and co-branded credit card agreements — also logged 38 percent growth year over year. That sounds great until you realize year-over-year growth in the first quarter was 60 percent. There have been other slight quarterly dips in the category’s otherwise relentless rise over the past few years, but they’ve mostly happened between fourth and first quarters.

The news is starting to leak that another company has joined the move out of AWS.

However, a source familiar with Dropbox’s current strategy said the company lately has been moving more of its IT infrastructure away from AWS and onto its own turf. There are now 10,000 servers in Dropbox facilities running loads that had been on Amazon EC2, although it’s not clear what percentage of Dropbox’s computing requirements that represents. Dropbox is currently storing data both in its own data centers and on Amazon S3 until the end of the year, this source said.

In closing, Barb expects the pressure on AWS to keep building.

So as rival public cloud powers add services and cut prices, and as more customers see the benefits of hybrid as opposed to pure public cloud computing, expect the pressure on AWS to ratchet up.

AWS is in an all-out war for the cloud with Google, Microsoft, and many others.

With this news of Dropbox moving out, I would not want to be an AWS employee. Jeff Bezos has got to be livid. When internal PR shows the NYTimes saying all AWS needs is more salespeople, I doubt that will calm the troops.

15 years ago Google placed its largest server order, and did something big: starting site reliability engineering

July 23, 2014 Dave Ohara

Google’s Urs Hölzle posted about Google placing the largest server order in its history 15 years ago.

Urs Hölzle
Shared publicly - 11:41 AM

15 years ago we placed the largest server offer in our history: 1680 servers, packed into the now infamous "corkboard" racks that packed four small motherboards onto a single tray. (You can see some preserved racks at Google in Building 43, at the Computer History Museum in Mountain View, and at the American Museum of Natural History in DC, http://americanhistory.si.edu/press/fact-sheets/google-corkboard-server-1999.)

At the time of the order, we had a grand total of 112 servers so 1680 was a huge step. But by the summer, these racks were running search for millions of users. In retrospect the design of the racks wasn't optimized for reliability and serviceability, but given that we only had two weeks to design them, and not much money to spend, things worked out fine.

I read this wondering how impactful this large server order really was. I couldn’t figure out what I would post about why the order was significant.

Then I ran into this post on Site Reliability Engineering dated Apr 28, 2014, and realized that Google had a huge impact by starting the idea of a site reliability engineering team.

Here is one of the insights shared.

The solution that we have in SRE -- and it's worked extremely well -- is an error budget. An error budget stems from this basic observation: 100% is the wrong reliability target for basically everything. Perhaps a pacemaker is a good exception! But, in general, for any software service or system you can think of, 100% is not the right reliability target because no user can tell the difference between a system being 100% available and, let's say, 99.999% available. Because typically there are so many other things that sit in between the user and the software service that you're running that the marginal difference is lost in the noise of everything else that can go wrong.

If 100% is the wrong reliability target for a system, what, then, is the right reliability target for the system? I propose that's a product question. It's not a technical question at all. It's a question of what will the users be happy with, given how much they're paying, whether it's direct or indirect, and what their alternatives are.

The business or the product must establish what the availability target is for the system. Once you've done that, one minus the availability target is what we call the error budget; if it's 99.99% available, that means that it's 0.01% unavailable. Now we are allowed to have .01% unavailability and this is a budget. We can spend it on anything we want, as long as we don't overspend it.
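To put numbers on that budget, here is a minimal Python sketch of the arithmetic in the quote. The function names and the 30-day window are my own assumptions for illustration; nothing here comes from Google's tooling.

```python
def error_budget(availability_target: float) -> float:
    """Return the allowed fraction of unavailability: 1 - target."""
    if not 0.0 <= availability_target <= 1.0:
        raise ValueError("availability_target must be between 0 and 1")
    return 1.0 - availability_target


def allowed_downtime_minutes(availability_target: float,
                             period_days: float = 30.0) -> float:
    """Convert the error budget into minutes of downtime over a period.

    The 30-day default is an assumed reporting window, not a standard.
    """
    minutes_in_period = period_days * 24 * 60
    return error_budget(availability_target) * minutes_in_period


def budget_remaining_minutes(availability_target: float,
                             period_days: float,
                             downtime_so_far_minutes: float) -> float:
    """Minutes of budget left; a negative value means it was overspent."""
    allowed = allowed_downtime_minutes(availability_target, period_days)
    return allowed - downtime_so_far_minutes


if __name__ == "__main__":
    target = 0.9999  # the 99.99% example from the quote
    print(f"Error budget: {error_budget(target):.4%}")
    print(f"Allowed downtime per 30 days: "
          f"{allowed_downtime_minutes(target):.2f} minutes")
```

With a 99.99% target, the 0.01% budget comes to a little over four minutes of downtime in a 30-day window, which shows how quickly one bad release can spend it.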

Here is another rule that is good to think about when running operations.

One of the things we measure in the quarterly service reviews (discussed earlier), is what the environment of the SREs is like. Regardless of what they say, how happy they are, whether they like their development counterparts and so on, the key thing is to actually measure where their time is going. This is important for two reasons. One, because you want to detect as soon as possible when teams have gotten to the point where they're spending most of their time on operations work. You have to stop it at that point and correct it, because every Google service is growing, and, typically, they are all growing faster than the head count is growing. So anything that scales headcount linearly with the size of the service will fail. If you're spending most of your time on operations, that situation does not self-correct! You eventually get the crisis where you're now spending all of your time on operations and it's still not enough, and then the service either goes down or has another major problem.

Building the Best Software Services, can you find the secret guild?

July 23, 2014 Dave Ohara

I have been in the Bay Area for the past two weeks for business meetings before I head back to Redmond. Actually, I haven’t been here for two weeks straight; I took two trips. I’ve lived for 22 years in Redmond, and before that spent 32 years in Silicon Valley. I go back and forth often enough that I have office space in both locations. How Silicon Valley works is different from Seattle/Redmond, but there is a common trait: the guys who belong to the secret guild of low-level programmers who can build services that scale and run like the Energizer Bunny. Working on operating systems at Apple and Microsoft got me used to working with the developers who belong to the secret guild.

What is the secret guild? Here is a post that tells the story.

the secret guild of silicon valley

A couple of weeks ago, I was drinking beer in San Francisco with friends when someone quipped:

"You have too many hipsters, you won’t scale like that. Hire some fat guys who know C++."

It’s funny, but it got me thinking. Who are the “fat guys who know C++”, or as someone else put it, “the guys with neckbeards, who keep Google’s servers running”? And why is it that if you encounter one, it’s like pulling on a thread, and they all seem to know each other?

The reason is because the top engineers in Silicon Valley, whether they realize it or not, are part of a secret Guild. They are a confraternity of craftsmen who share a set of traits:

...

Read the post to get the rest of the story.

For those of you too lazy to click on the link, here are the closing paragraphs.

Finally, the implicit compact that the Guild makes with a company is that their efforts will not be in vain. The most powerfully attractive force for the Guild is the promise of building a product that will get into the happy hands of hundreds, thousands, or millions. This is the coveted currency that even companies that have struggled to build an engineering reputation, like foursquare, can offer.

The Guild of Silicon Valley is largely invisible, but their affiliations have determined the rise and fall of technology giants. The start-ups who recognize the unsung talents of its members today will be tomorrow’s success stories.

Turbine Blade Pressure Causes Bats' Trauma, Not Impact

July 23, 2014 Dave Ohara

The Telegraph has a post reporting that bats are dying because of pressure from turbine blades, not impact.

Bats get ‘the Bends’ when they fly too near wind turbines, experts have claimed.

Queen’s University Belfast said pressure from the turbine blades causes a similar condition as that experienced by divers when they surface too quickly.

Conservationists have warned that the bodies of bats are frequently seen around the bases of turbines, but it was previously assumed they had flown into the blades.

However, Dr Richard Holland claims that bats suffer from ‘barotrauma’ when they approach the structures which can pop their lungs from inside their bodies.

Dr. Holland’s suggested answer is to turn off the turbines during migration.

Dr Holland said energy companies should consider turning off turbines when bats are migrating.

"We know that bats must be 'seeing' the turbines, but it seems that the air pressure patterns around working turbines give the bats what's akin to the bends," he said.

The effect of wind turbines on wildlife is slowly being discovered.

Salon reports that offshore wind farms are helping seals find food.

Go wind power! For once, the green energy source has made the news for the wildlife it doesn’t inadvertently slaughter — and that it may even be helping to thrive. Offshore wind farms, finds a study published today in the journal Current Biology, are making more food available for seals.

A farm off the coast of Germany, researchers found, is acting as an “artificial reef,” attracting fish and crustaceans and the grey and harbor seals that feed on them.

Think Google Infrastructure will hit $3bil/Qtr in Q4 2014 or Q1 2015?

July 22, 2014 Dave Ohara

Google’s data center group is on a mind-blowing growth curve. Last quarter’s spend was $2.65bil.

Note that not all of this spend belongs to the data center group.

When you stare at this graph it seems like the $3bil mark is only a few quarters away.
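How fast would the spend need to grow to get there? Here is a rough Python sketch. It assumes the $2.65bil figure is for Q2 2014, so Q4 2014 is two quarters out and Q1 2015 is three, and it assumes steady compound growth each quarter, which is a simplification. The function name is hypothetical.

```python
def required_quarterly_growth(current: float, target: float,
                              quarters: int) -> float:
    """Compound growth rate per quarter needed to go from current to target."""
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    if quarters < 1:
        raise ValueError("quarters must be at least 1")
    return (target / current) ** (1.0 / quarters) - 1.0


if __name__ == "__main__":
    current_spend = 2.65  # $bil, last quarter (assumed to be Q2 2014)
    target_spend = 3.00   # $bil, the mark in question

    # Quarter offsets are assumptions based on the post date.
    for label, quarters in (("Q4 2014", 2), ("Q1 2015", 3)):
        rate = required_quarterly_growth(current_spend, target_spend, quarters)
        print(f"{label}: needs about {rate:.1%} growth per quarter")
```

Under those assumptions the required growth works out to roughly 6.4% per quarter to hit the mark in Q4 2014, and roughly 4.2% per quarter to hit it in Q1 2015.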

If you believe that size brings efficiency, then Google is clearly one of the leaders.