Monday, March 25, 2013

Devops Toolchains - Rundeck and Chef

This is a write-up of a great presentation by Anthony Shortland at ChefCon 2012 called "Using Rundeck And Chef To Build DevOps ToolChain":
Design points
  1. Everything is code
  2. Everything is packaged
  3. Separate code and config
    1. Code and configuration flow at different rates
  4. Separate env-dependent attributes
    1. More volatile - flows much faster
    2. Separate packages
  5. Balance distributed vs local orchestration
    1. On the deploy side - decision to be made about how to approach orchestration
    2. Distributed
      1. Network level coordination
      2. Multi-box, off-box
      3. Coordinating rolling upgrades
      4. Pool of boxes
      5. Workflow captured separately and encoded independently of the environment, e.g. in Rundeck: specify the business/technical process to orchestrate in a distributed manner, divorced from the environment
      6. Same template, same workflows meet the env-specific data/nodeset & get instantiated at runtime
    3. Local
      1. On-box, within given node
      2. Set of tasks to bring system to target state
      3. Modular automation - capture implementation/code in modules, move them around the network, e.g. chef cookbooks
    4. Separate concerns - met with separate tools fitting together in tool chain
  6. Resolve directed vs convergent orchestration
    1. Directed orchestration: 
      1. single node in network
      2. fire out commands in authoritative manner
      3. Continuous deploy / rolling upgrade - best met with directed orchestration
    2. Convergent orchestration:
      1. Rise of cloud/different design (e.g. chef)
      2. Fuzziness - room for environment to converge
      3. Good for scale / compliance
    3. Not either-or... need a solution that accommodates both 
  7. Integrate application and infrastructure provisioning
    1. "integrate build and deploy"
    2. Want a systemic solution
    3. Single orchestrated provisioning process
    4. Moving away from static infra - we need infra and app up quickly
  8. Design for flow not the organisation
    1. Subordinate organisational differences/divisions - forget those requirements/constraints
    2. Build for business/process flow across the system
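
To make the distributed-orchestration point concrete (the same workflow meeting env-specific node sets at runtime), here is a minimal Python sketch. It is not Rundeck's API: the workflow steps, node names and the `instantiate` helper are all hypothetical, chosen only to show the workflow and the environment data being kept apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    node: str
    command: str


# Hypothetical environment-independent workflow: an ordered list of commands.
ROLLING_UPGRADE = ["drain", "upgrade", "verify", "enable"]

# Hypothetical environment-specific node sets, stored apart from the workflow.
NODESETS = {
    "staging": ["stg-web1"],
    "production": ["web1", "web2", "web3"],
}


def instantiate(workflow, nodes):
    """Bind a workflow to a node set, finishing one node before the next."""
    return [Step(node, command) for node in nodes for command in workflow]


plan = instantiate(ROLLING_UPGRADE, NODESETS["production"])
```

The same `ROLLING_UPGRADE` list produces a four-step plan for staging and a twelve-step, node-by-node plan for production; only the node set changes.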


1289s

1466s - The tool for the job
  • There's often a number of tools to choose to do the job
  • Choose the tool to fit the people in the org
  • Java guy = Ant/Maven
  • Ruby = Rake
  • Systems = Make
  • All of these tools can build RPMs: Ant/Maven/Rake/Make
  • Tool you choose may vary depending on who's consuming them

2294s - Environment-Specific Application-Level Attributes
  • Mapping databag into context
  • Package version and application state defined per environment within package's databag
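
As a rough illustration of the per-environment idea, the Python sketch below models a data bag item as a plain dictionary. The layout (an `environments` key holding `version` and `state`) and the `attributes_for` helper are assumptions made for this example, not the structure shown in the talk and not Chef's API.

```python
# Hypothetical data bag item for one package, keyed by environment.
MYAPP_BAG = {
    "id": "myapp",
    "environments": {
        "staging": {"version": "1.4.0", "state": "running"},
        "production": {"version": "1.3.2", "state": "running"},
    },
}


def attributes_for(bag, environment):
    """Return the package attributes that apply to one environment."""
    try:
        attrs = bag["environments"][environment]
    except KeyError:
        raise LookupError(
            f"{bag['id']}: no attributes for {environment!r}"
        ) from None
    return {"package": bag["id"], **attrs}
```

Calling `attributes_for(MYAPP_BAG, "production")` yields the production version and state, while an unknown environment fails loudly instead of falling back to another environment's values.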

2548s - Demo

Build console (Jenkins) does build orchestration, deploy console (Rundeck) does deploy orchestration, Chef does local orchestration.









Puppet vs Fabric for deploys




I've been looking into various configuration management/automated deployment tools lately. At OpenX we used slack, but I wanted something with a bit more functionality than that (although I'm not badmouthing slack by any means -- it can definitely be bent to your will to do pretty much whatever you need in terms of automating your deployments).

From what I see, there are 2 types of configuration management tools:

  1. The first type I call 'pull', which means that the servers pull their configurations and their marching orders in terms of applying those configurations from a centralized location -- both slack and Puppet are in this category. I think this is great for initial configuration of a server. As I described in another post, you can have a server bootstrap itself by installing Puppet (or slack) and then 'call home' to the central Puppet master (or slack repository) and get all the information it needs to configure itself.
  2. The second type I call 'push', which means that you send configurations and commands to a list of servers from a centralized location -- Fabric is in this category. I think this is a more appropriate mode for application-specific deployments, where you might want to deploy first to a subset of servers, then push it to all servers.
So, as a rule of thumb, I think it makes sense to use a tool like Puppet for the initial configuration of the OS and of the packages required by your application (things like MySQL, Apache, Tomcat, Tornado, Nginx, or whatever your application relies on). When it comes time to deploy your application, I think a tool like Fabric is more appropriate, since it gives you more immediate and finer-grained control over what you want to do.
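
The 'push' pattern of deploying to a subset first and then to everything can be sketched in plain Python. This is only an outline of the control flow and deliberately avoids Fabric's API: `staged_push` and its `deploy` callback are hypothetical names, and in practice `deploy` would wrap whatever a tool like Fabric runs on each host.

```python
def staged_push(hosts, deploy, canary_count=1):
    """Push to the first canary_count hosts, then to the remaining hosts.

    deploy(host) must return True on success. If any canary fails, the
    remaining hosts are never touched. Returns a dict of host -> result
    for every host that was attempted.
    """
    canaries, rest = hosts[:canary_count], hosts[canary_count:]
    results = {host: deploy(host) for host in canaries}
    if not all(results.values()):
        return results  # stop here: the wider rollout is skipped
    results.update({host: deploy(host) for host in rest})
    return results
```

A caller could use `staged_push(["web1", "web2", "web3"], deploy)` and check whether every host appears in the result before declaring the release done.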

I also like the categorization of these tools done by the people at ControlTier. Check out their blog post on Achieving Fully Automated Provisioning (which also links to a white paper PDF) for a nice diagram of hierarchy of deployment tools:

  • at the bottom you have tools that install or launch the initial OS on physical servers (via Kickstart/Jumpstart/Cobbler) or on virtual machines/cloud instances (via various vendor tools, or by rolling your own)
  • in the middle you have what they call 'system configuration' tools, such as Puppet/Chef/SmartFrog/cfengine/bcfg2
  • at the top you have what they call 'application service deployment' tools, such as Fabric/Capistrano/Func -- and of course their own ControlTier tool
In a comment on one of my posts, Damon Edwards from ControlTier calls Fabric a "command dispatching tool", as opposed to Puppet, which he calls a "configuration management tool". I think this relates to the 2 types of tools I described above, where you 'push' or 'dispatch' commands with Fabric, and you 'pull' configurations and actions with Puppet.





New Whitepaper: Achieving Fully Automated Provisioning


What we are proposing is an outline of an open source toolchain for fully automated provisioning.

What are the criteria for "fully automated provisioning"?

The details are in the whitepaper but here's the list we came up with:

  1. Be able to automatically provision an entire environment -- from "bare-metal" to running business services -- completely from specification
  2. No direct management of individual boxes
  3. Be able to revert to a "previously known good" state at any time
  4. It's easier to re-provision than it is to repair
  5. Anyone on your team with minimal domain specific knowledge can deploy or update an environment
Here's the proposed open source toolchain (with ControlTier and Puppet highlighted):











Damon Edwards said...
Great post.

I found it interesting that you would directly compare a command dispatching tool like Fabric to a configuration management tool like Puppet. 

We did a whitepaper on how these different types of tools can be used as an open source toolchain to achieve fully automated provisioning (going from bare metal all the way to running multitier applications).

The whitepaper is written about a specific ControlTier and Puppet implementation but the toolchain and the general concepts hold for tools like Chef, Fabric, Cobbler, Capistrano, etc.

Here's the link:
http://blog.controltier.com/2009/04/new-whitepaper-achieving-fully.html
Grig Gheorghiu said...
Damon -- I guess my intention wasn't clear. I didn't mean to compare Fabric to Puppet as an apples-to-apples comparison. I was merely wishing that a Puppet-like configuration management tool existed in Python. I know about bcfg2, I haven't played with it though.

Thanks for the pointer to your white paper.

Grig
Damon Edwards said...
Ah got it. You know, that does touch on an excellent subject... when you build a toolchain of management tools, what kind of overhead are you adding because of differing languages or configuration methods? It begs the question: is it easier to bend (or some would say abuse) one tool to do most of everything that you want, or should you use a toolchain that lets each tool do what it's supposed to be good at?


______________________________________________________________

I'm new to Puppet and while I've been using *nix systems for many years, I've never worked as a sysadmin or in ops.
I'm currently writing Puppet manifests for hosting a set of (PHP/MySQL/MongoDB, code in git) web applications. Clearly Puppet needs to have some knowledge of the actual applications because I'll set up a virtual host for each one, but I'm not sure whether Puppet should be managing things like code deployment and database creation.
Is Puppet an appropriate tool for application deployment? If not, can you recommend a more appropriate tool?
If you're coming along to this later, all the answers are good, don't just read the one that I chose as the answer. – michaeltwofish Mar 25 '12 at 22:28

6 Answers


I'd look into either Capistrano or Fabric for deployments.
You'll have better control over how the deployment happens with these two tools.
I +1'd this answer because it gave two solutions. – François Beausoleil Mar 23 '12 at 3:45
I appreciate the pointers. We have Ruby elsewhere in our stack, so I'll look at Capistrano. – michaeltwofish Mar 25 '12 at 22:28

Puppet is used for deployments in many large organizations, but it's not always perfect. Much of it depends on your deployment methodology. Are you deploying to lots of machines at once? Do you do rolling deployments?
Some organizations use Puppet by building packages of their deployments and then having puppet enforce policy to be at the right version of that package. Because puppet has the concept of environments included, you can use environments to do deployments in stages (dev, test, prod for example).
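
The "enforce the right package version per environment" approach is a convergent check: compare what is installed with what policy says and act only on a difference. The Python sketch below shows just that logic; the `DESIRED` table, the `myapp` package name and the `converge` function are made up for illustration and are not Puppet code.

```python
# Hypothetical policy: the package version each environment should run.
DESIRED = {"dev": "2.1.0", "test": "2.0.3", "prod": "2.0.1"}


def converge(node_env, installed_version, desired=DESIRED):
    """Return the action needed to bring one node to its environment's version."""
    target = desired[node_env]
    if installed_version == target:
        return None  # already compliant, nothing to do
    return f"install myapp-{target}"
```

Promoting a release through dev, test and prod then means editing the policy table one environment at a time and letting each node converge on its next run.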
Other organizations use puppet to orchestrate deployment by firing off an rsync, a git checkout, or some recursive file copies using puppet (though that is rather slow).
There are other pretty good tools available for deployment too. I have used Whiskey Disk in the past (a simple ruby tool) and liked it a lot.
(Disclaimer, I work at Puppet Labs)
Thanks. Deployments are to two balanced servers, currently using manual rsync. – michaeltwofish Mar 23 '12 at 0:52









__________________________________________________________



Issues with MCollective as Deployment Orchestrator
https://groups.google.com/forum/#!msg/mcollective-users/624ZMIOcNf0/0nHwcZ2FVqgJ






Commit-Deploy-Test-Rollback Flowchart from IMVU

Learning Fast With A/B Testing and Continuous Deployment


Larry Wall: 5 Programming Languages Everyone Should Know


Video: https://www.youtube.com/watch?v=LR8fQiskYII

Back then it would have been: Fortran, COBOL, Basic, LISP, APL.

Now...

Javascript

  • If only to know if you should click the button that says to Enable Javascript


Java

  • The elephant in the room
  • The COBOL of the 21st century
  • heavyweight, verbose, everyone loves to hate it
  • Managers love it because it looks like you're getting a lot done... 
    • 100 lines of Java vs 5 lines in another language
    • You can eat a 1 pound steak or 100 pounds of shoe leather - and feel a greater sense of accomplishment after the shoe leather but... maybe there's some downsides
  • Considered industrial, programmers are considered interchangeable... parts.
    • Managers like it for that reason and for that reason a lot of Java jobs have been outsourced

Haskell

  • Functional... not as in other languages are dysfunctional
  • A language for geniuses by geniuses
  • Should know about it if only to be able to say "is this kind of like Haskell?" if so, you know you need to hire some really smart people to program in it
    • A modern LISP in that sense
C
  • Continues to be a fundamental language - if only because everyone is trying to reinvent it but not succeeding
Scripting language - Python, Ruby, Perl
  • Liveliest community
  • Redesigning to leapfrog other languages, removing warts
  • One chance to break backward compatibility: break the things that need breaking, keep it a joy to use, useful/enjoyable for decades
  • I'd recommend Perl - but I'm known to be prejudiced in the matter