Matt Callanan's Blog: John Allspaw & Paul Hammond: 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr

10+ deploys per day: Dev & ops cooperation at Flickr. John Allspaw & Paul Hammond, Velocity 2009
3 billion photos; 40,000 photos per second
Dev versus Ops
“It’s not my machines, it’s your code!”
“It’s not my code, it’s your machines!”
Spock: Little bit weird; Sits closer to the boss; Thinks too hard
Scotty: Pulls levers & turns knobs; Easily excited; Yells a lot in emergencies
Says “No” all the time; Afraid that newfangled things will break the site; Fingerpointy
Ops stereotype: Because the site breaks unexpectedly; Because no one tells them anything; Because they say “No” all the time
Traditional thinking: Dev’s job is to add new features; Ops’ job is to keep the site stable and fast
Ops’ job is NOT to keep the site stable and fast
Ops’ job is to enable the business (this is dev’s job too)
The business requires change
But change is the root cause of most outages!
Discourage change in the interests of stability, or allow change to happen as often as it needs to
Lowering risk of change through tools and culture
Dev and Ops
Ops who think like devs; devs who think like ops
“But that’s me!”
You can always think more like them

Tools

Desktop software: branching releases (1.0, 1.0.1, 1.0.2, 1.1, 1.1.1, 1.2)
Web software: one line of revisions (r2301, r2302, r2306)
Always ship trunk
Everyone knows exactly where to look
Feature flags
#php
if ($cfg['enable_feature_video']) { … }
{* smarty *}
{if $cfg.enable_feature_beehive} … {/if}
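The slide elides the bodies of both checks, so here is a minimal sketch of the same idea, written in Python rather than the PHP and Smarty on the slide. The flag names come from the slide; the `is_enabled` and `render_photo_page` helpers and the flag values are assumptions made for illustration, not Flickr's code.

```python
# Hypothetical flag table; the keys mirror the slide's $cfg entries,
# the True/False values are invented for this example.
cfg = {
    "enable_feature_video": True,
    "enable_feature_beehive": False,
}


def is_enabled(flag: str, config: dict = cfg) -> bool:
    """Return True only if the flag exists and is switched on."""
    return bool(config.get(flag, False))


def render_photo_page(config: dict = cfg) -> list[str]:
    """Build the page from whichever features are currently enabled."""
    sections = ["photo"]
    if is_enabled("enable_feature_video", config):
        sections.append("video")
    if is_enabled("enable_feature_beehive", config):
        sections.append("beehive")
    return sections
```

Because unknown flags default to off, unfinished code can sit in trunk and ship without being visible until someone flips the flag.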
Private betas
Bucket testing
Dark launches
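The slides give only the names of these three techniques. The sketch below shows one common reading of each: a private beta as an allow-list of users, bucket testing as a stable percentage split, and a dark launch as running new code in production while discarding its output. All function names and the hashing scheme are assumptions for illustration, not taken from the talk.

```python
import hashlib


def bucket_for(user_id: str, experiment: str, buckets: int = 100) -> int:
    """Map a user to a stable bucket in [0, buckets) for one experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % buckets


def in_private_beta(user_id: str, beta_users: set[str]) -> bool:
    """Private beta: only explicitly listed users see the feature."""
    return user_id in beta_users


def in_bucket_test(user_id: str, experiment: str, percent: int) -> bool:
    """Bucket test: roughly `percent` of users see the feature, and the
    same user always gets the same answer."""
    return bucket_for(user_id, experiment) < percent


def handle_request(user_id, new_backend, dark_launch_enabled: bool) -> str:
    """Dark launch: exercise the new code path under real traffic,
    but never let its result or its failures reach the user."""
    if dark_launch_enabled:
        try:
            new_backend(user_id)
        except Exception:
            pass  # a failure in dark code must not break the page
    return "old page"
```

Hashing the experiment name together with the user ID keeps a user's bucket stable within one experiment while letting different experiments split the audience differently.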

Dev, Ops, and Robots: having a conversation (diagram labels: build, deploy, logs, alerts, monitors, IRC, search engine)
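One way to picture robots joining the conversation is a small bot that turns build, deploy, alert, and monitor events into chat lines and keeps them searchable. This is a sketch under stated assumptions: the `Event` and `ChannelBot` names are hypothetical, and `send` stands in for whatever actually posts to the channel, since the slides do not describe Flickr's tooling.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    source: str   # e.g. "build", "deploy", "alert", "monitor"
    who: str      # person or system that triggered the event
    message: str


def format_event(event: Event) -> str:
    """Render an event as a single chat line."""
    return f"[{event.source}] {event.who}: {event.message}"


class ChannelBot:
    """Posts events to a shared channel and keeps a searchable history."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send            # hypothetical transport, e.g. an IRC client
        self.history: list[str] = []

    def announce(self, event: Event) -> str:
        line = format_event(event)
        self.history.append(line)
        self._send(line)
        return line

    def search(self, term: str) -> list[str]:
        """Case-insensitive substring search over everything announced."""
        needle = term.lower()
        return [line for line in self.history if needle in line.lower()]
```

Putting human chatter and machine events in the same searchable stream means anyone can see what changed just before something broke.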

Culture

Failure will happen
If you think you can prevent failure then you aren’t developing your ability to respond

Developers: Remember that someone else will probably get woken up when your code breaks
Ops: Provide constructive feedback on current aches and pains
Tools
1. Automated infrastructure
2. Shared version control
3. One step build and deploy
4. Feature flags
5. Shared metrics
6. IRC and IM robots
Culture
1. Respect
2. Trust
3. Healthy attitude about failure
4. Avoiding blame
This is not easy. You could just carry on shouting at each other…

Wednesday, March 28, 2012