Sunday, January 18, 2015

Great AWS Tips

Came across some great AWS tips (courtesy of Devops Weekly).

The list is below - go to this website for the full details: 

  • Application Development
    • Store no application state on your servers.
    • Store extra information in your logs.
    • If you need to interact with AWS, use the SDK for your language.
    • Have tools to view application logs.
  • Operations
    • Disable SSH access to all servers.
    • Servers are ephemeral, you don't care about them. You only care about the service as a whole.
    • Don't give servers static/elastic IPs.
    • Automate everything.
    • Everyone gets an IAM account. Never log in to the master account.
    • Get your alerts to become notifications.
  • Billing
    • Set up granular billing alerts.
  • Security
    • Use EC2 roles, do not give applications an IAM account.
    • Assign permissions to groups, not users.
    • Set up automated security auditing.
    • Use CloudTrail to keep an audit log.
  • S3
    • Use "-" instead of "." in bucket names for SSL.
    • Avoid filesystem mounts (FUSE, etc).
    • You don't have to use CloudFront in front of S3 (but it can help).
    • Use random strings at the start of your keys.
  • EC2/VPC
    • Use tags!
    • Use termination protection for non-auto-scaling instances. Thank me later.
    • Use a VPC.
    • Use reserved instances to save big $$$.
    • Lock down your security groups.
    • Don't keep unassociated Elastic IPs.
  • ELB
    • Terminate SSL on the load balancer.
    • Pre-warm your ELBs if you're expecting heavy traffic.
  • ElastiCache
    • Use the configuration endpoints, instead of individual node endpoints.
  • RDS
    • Set up event subscriptions for failover.
  • CloudWatch
    • Use the CLI tools.
    • Use the free metrics.
    • Use custom metrics.
    • Use detailed monitoring.
  • Auto-Scaling
    • Scale down on INSUFFICIENT_DATA as well as ALARM.
    • Use ELB health check instead of EC2 health checks.
    • Only use the availability zones (AZs) your ELB is configured for.
    • Don't use multiple scaling triggers on the same group.
  • IAM
    • Use IAM roles.
    • Users can have multiple API keys.
    • IAM users can have multi-factor authentication, use it!
  • Route53
    • Use ALIAS records.
  • Elastic MapReduce
    • Specify a directory on S3 for Hive results.
  • Miscellaneous Tips
    • Scale horizontally.
    • Your application may require changes to work on AWS.
    • Always be redundant across availability zones (AZs).
    • Be aware of AWS service limits before you deploy.
    • Decide on a naming convention early, and stick to it.
    • Decide on a key-management strategy from the start.
    • Make sure AWS is right for your workload.
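
To make the S3 tip about random strings at the start of keys a bit more concrete, here is a minimal Python sketch. It assumes a short hash of the original key is used as the prefix, so the prefix looks random but can always be recomputed. The function name, the prefix length of 4 and the example key are my own choices for illustration, not something taken from the linked article.

```python
import hashlib


def prefixed_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short hash-derived prefix to an S3 key.

    The prefix is derived from the key itself, so the same input key
    always maps to the same prefixed key.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{key}"


if __name__ == "__main__":
    # Hypothetical key name, purely for illustration.
    print(prefixed_key("logs/2015/01/18/app.log"))
```

Because the prefix is deterministic, nothing extra needs to be stored to find the object again: call the same function on the original key.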

Sunday, January 11, 2015

Henrik Kniberg's "Scaling Agile at Spotify"

Henrik Kniberg's done it again. He's taken a working example of a complex process implementation and compressed it into a highly informative, easily watchable and easily readable format. It makes it simple to grasp the concepts and, crucially, to show others and inspire those around you with what's possible in your IT department.

If you've read Henrik's Scrum and XP from the Trenches, Kanban vs Scrum or Lean from the Trenches, you need to see what he's been helping Spotify do with their engineering culture: scaling agile, lean, devops and A/B experiments.

Two-part animated series depicting Spotify's Engineering culture

Part 1:

Part 2:

Scaling Agile at Spotify with Tribes, Squads, Chapters and Guilds

Tuesday, January 6, 2015

Little's Law

Little's Law:

Avg queue time = WIP / Throughput
Delivery rate = WIP / Lead Time
MeanResponseTime = MeanNumberInSystem / MeanThroughput


Avg Inventory = Throughput * Avg queue time
Avg Customers in system = Avg arrival rate * avg customer time in system
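
As a quick sanity check of the formulas above, here is a minimal Python sketch. The numbers (12 items in progress, 3 items finished per day) are made up for illustration, and the function names are my own.

```python
def avg_time_in_system(wip: float, throughput: float) -> float:
    """Little's Law rearranged: average time in system = WIP / throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


def avg_wip(throughput: float, avg_time: float) -> float:
    """Little's Law: average WIP = throughput * average time in system."""
    return throughput * avg_time


if __name__ == "__main__":
    wip = 12.0        # hypothetical: items currently in progress
    throughput = 3.0  # hypothetical: items finished per day
    lead_time = avg_time_in_system(wip, throughput)
    print(f"Average lead time: {lead_time} days")
    print(f"WIP recovered from the law: {avg_wip(throughput, lead_time)} items")
```

The units have to agree: if throughput is measured per day, the resulting time is in days.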

Sunday, January 4, 2015

Experimenting with AWS EC2 Container Service

Amazon EC2 Container Service ("ECS" for short) is a Docker cluster management service that runs on top of EC2 instances. There is no additional charge for the service itself: you pay for the EC2 instances, whether you're using them or not. It's early days, but it looks like a promising service that should take a lot of the grunt work (networking, security, etc.) out of running your own cluster manager such as Kubernetes or Mesos.

ECS is currently in preview - I needed to wait around two weeks to be granted access after signing up here:

This is a transcript of how I fired up a simple Docker container on ECS using the Amazon instructions available on 4/Jan/2015.


Watch this video:
Start from 1m58s to skip the Amazon propaganda and watch the interesting visualisation.

This video has a good introduction to the terminology and a live demo:  (Slides)


  • Tasks: A grouping of related containers (e.g. Nginx, Rails app, MySql, Log collector)
  • Containers
  • Clusters: A grouping of container instances - a pool of resources for Tasks
  • Container Instances: EC2 instances on which Tasks are scheduled, launched from an AMI with the ECS agent installed

Setting Up

Follow these instructions:

This will walk you through the setup for the following:

  • IAM User
  • IAM Role
  • Key Pair
  • VPC
  • Security Group
  • Special copy of AWS CLI that includes "ecs" commands
    • NOTE: On OS X I needed to:
      • "brew uninstall awscli" (that removed /usr/local/bin/aws from my path)
      • And add "export PATH=$PATH:~/.local/lib/aws/bin" to my .bashrc

Creating The Cluster

Follow these instructions:

NOTE: I preferred to create the EC2 instance from the command line (instead of following the "Launch an Instance with the Amazon ECS AMI" instructions):
aws ec2 run-instances --image-id ami-34ddbe5c --count 1 --instance-type t2.small --subnet-id subnet-xxxxxxxx --key-name ecsdemo-keypair --iam-instance-profile Name=ecsdemo-role

... using the subnet-id for my default VPC and the "ecsdemo" keypair and IAM role name I created during the Setting Up phase above.

Then as per the instructions test it out with:

aws ecs list-container-instances

If you see
    "containerInstanceArns": []
... then something has gone wrong and you'll need to terminate your instance and try again.
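
The same check can be scripted. This is a minimal Python sketch, assuming the CLI prints JSON containing a containerInstanceArns list as shown above. The function name and the sample strings are hypothetical.

```python
import json


def has_container_instances(cli_output: str) -> bool:
    """Return True if the list-container-instances JSON names at least one instance."""
    data = json.loads(cli_output)
    return len(data.get("containerInstanceArns", [])) > 0


if __name__ == "__main__":
    # Hypothetical sample of the "something has gone wrong" case.
    empty = '{"containerInstanceArns": []}'
    print(has_container_instances(empty))
```

In practice you would feed it the captured output of "aws ecs list-container-instances" rather than a hard-coded string.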

To see more details about your instance:

aws ecs describe-container-instances

Running a Task (Docker process)

As per the instructions, register a Task Definition and start a Task that spins up a single Docker container (based on the busybox image) that simply sleeps for 6 minutes.

aws ecs register-task-definition --family sleep360 --container-definitions "[{\"environment\":[],\"name\":\"sleep\",\"image\":\"busybox\",\"cpu\":10,\"portMappings\":[],\"entryPoint\":[\"/bin/sh\"],\"memory\":10,\"command\":[\"sleep\",\"360\"],\"essential\":true}]"
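
Hand-escaping that JSON is error prone. As an alternative, here is a minimal Python sketch that builds the same container definition as a normal data structure and prints a shell-quoted command. The helper name is my own, and the script only prints the command; it does not call AWS.

```python
import json
import shlex

# Same container definition as in the command above.
container_definitions = [
    {
        "name": "sleep",
        "image": "busybox",
        "cpu": 10,
        "memory": 10,
        "essential": True,
        "entryPoint": ["/bin/sh"],
        "command": ["sleep", "360"],
        "environment": [],
        "portMappings": [],
    }
]


def build_register_command(family: str, definitions: list) -> str:
    """Return a shell-safe 'aws ecs register-task-definition' command line."""
    payload = json.dumps(definitions, separators=(",", ":"))
    return " ".join([
        "aws", "ecs", "register-task-definition",
        "--family", shlex.quote(family),
        "--container-definitions", shlex.quote(payload),
    ])


if __name__ == "__main__":
    print(build_register_command("sleep360", container_definitions))
```

shlex.quote wraps the JSON in single quotes, so the double quotes inside it no longer need backslashes.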

aws ecs list-task-definitions
aws ecs run-task --cluster default --task-definition sleep360:1 --count 1
aws ecs list-tasks
aws ecs describe-tasks --tasks 699d5420-1d0d-410e-b105-7e51027b8fd4
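
The task ID passed to describe-tasks comes from the list-tasks output. Here is a minimal Python sketch for pulling it out, assuming the output is JSON with a taskArns list and that each ARN ends in "/<task-id>". The region and account number in the sample ARN are made up.

```python
import json


def task_ids(list_tasks_output: str) -> list:
    """Extract the trailing task ID from each ARN in list-tasks JSON output."""
    data = json.loads(list_tasks_output)
    return [arn.rsplit("/", 1)[-1] for arn in data.get("taskArns", [])]


if __name__ == "__main__":
    # Hypothetical sample output; region and account number are invented.
    sample = json.dumps({
        "taskArns": [
            "arn:aws:ecs:us-east-1:123456789012:task/699d5420-1d0d-410e-b105-7e51027b8fd4"
        ]
    })
    for task_id in task_ids(sample):
        print(task_id)
```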

Log on to your instance and check the docker container is running:

ssh -i ecsdemo-keypair.pem ec2-user@ec2-instance-public-ip

docker ps
You should see:

[ec2-user@ip-ec2-instance-public-ip-name ~]$ docker ps
CONTAINER ID        IMAGE                            COMMAND             CREATED             STATUS              PORTS                        NAMES
ec8a9fca64b0        busybox:buildroot-2014.02        "sleep 360"         3 minutes ago       Up 3 minutes                                     ecs-sleep360-1-sleep-dc8dd4cdfcf593d07d00
58e68cc5bfc3        amazon/amazon-ecs-agent:latest   "/agent"            34 minutes ago      Up 34 minutes       127.0.0.1:51678->51678/tcp   ecs-agent

See more details about your docker container with:

docker inspect ec8a9fca64b0

After 6 minutes of sleeping, the docker process should disappear from the "docker ps" listing.

More Examples

More interesting examples, including Tasks that link together several containers, are shown in the videos linked above.