Tuesday, February 7, 2012

IMVU Continuous Deployment / Intermittent Test Failure

Timothy Fitz from IMVU on Continuous Deployment:
http://timothyfitz.wordpress.com/2009/02/08/continuous-deployment/

Timothy Fitz follow-up post about the organic evolution of their build-bot:
http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day/

IMVU
  • $1m/month revenue
Buildbot
  • 9-minute test suite sharded across 30-40 machines
  • up to 6 deploys an hour
  • 15,000 test cases
  • rock-solid tests (one-in-a-million failure rate or better) that drive Internet Explorer to click AJAX frontend buttons, exercising the backend: Apache, PHP, memcache, MySQL, Java and Solr
  • imvu_push script rsyncs the code to 100s of machines
  • a symlink switch takes a small subset of machines live; these are monitored for 5 minutes
  • if they look good, the rest are symlinked live
  • fixed queue of 5 copies of the website on each frontend
  • schema changes done out of band
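
The push steps in the list above can be sketched roughly as follows. This is only an illustration, not IMVU's imvu_push script: the function names, the `canary_count` parameter and the caller-supplied health check are all assumptions, and what happens when the check fails (here the rollout simply stops) is not stated in these notes.

```python
import os
from typing import Callable, Sequence


def switch_symlink(release_dir: str, live_link: str) -> None:
    """Point live_link at release_dir by renaming a temporary symlink over it."""
    tmp = live_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(release_dir, tmp)
    # On POSIX the rename replaces the old link in a single step.
    os.replace(tmp, live_link)


def staged_rollout(
    hosts: Sequence[str],
    switch_live: Callable[[str], None],
    is_healthy: Callable[[Sequence[str]], bool],
    canary_count: int = 2,
) -> bool:
    """Switch a small subset live, check it, then switch the rest.

    Returns True if every host was switched, False if the check on the
    first subset failed and the remaining hosts were left untouched.
    """
    canaries = list(hosts[:canary_count])
    rest = list(hosts[canary_count:])

    for host in canaries:
        switch_live(host)

    # The notes describe a 5-minute monitoring window; here the caller
    # supplies the check so the sketch runs without a real cluster.
    if not is_healthy(canaries):
        return False

    for host in rest:
        switch_live(host)
    return True
```

In a real push, `switch_live` would run something like `switch_symlink` on each remote machine after the rsync has finished, which is why the copy and the switch are separate steps.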

Buildbot running our tests sharded across 36 machines.
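
The linked posts, as summarized here, do not say how tests are assigned to machines. One common way to shard a suite so that every machine finishes at about the same time is a greedy "longest test first" assignment. The sketch below assumes per-test durations are known from earlier runs; `shard_tests` and its parameters are made-up names.

```python
import heapq
from typing import Dict, List


def shard_tests(durations: Dict[str, float], shard_count: int) -> List[List[str]]:
    """Assign tests to shards, always giving the next-longest test to the
    shard with the smallest total duration so far."""
    # Heap of (total_seconds_assigned, shard_index).
    heap = [(0.0, i) for i in range(shard_count)]
    heapq.heapify(heap)
    shards: List[List[str]] = [[] for _ in range(shard_count)]

    # Longest tests first; ties broken by name so the result is stable.
    for name, secs in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (total + secs, idx))
    return shards
```

For example, `shard_tests({"a": 5, "b": 4, "c": 3, "d": 2}, 2)` puts `a` and `d` on one shard and `b` and `c` on the other, 7 seconds each.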

Jon Watte from IMVU on handling intermittent test failures:
http://engineering.imvu.com/2011/01/19/buildbot-and-intermittent-tests/

  • 40,000 tests
  • informal goal: run all tests in 12 minutes or less (allows 5 builds an hour)
  • a test that fails intermittently will often pass on the next run
  • a failed test is rerun on another machine; if it passes, the build is green
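
The retry rule in the last bullet might look like the following. This is a minimal sketch, not Buildbot's API or IMVU's code: `run_with_retry` and the `run_test(test, machine)` callback are hypothetical names, and picking the first other machine in the list is an assumption.

```python
from typing import Callable, Sequence


def run_with_retry(
    test: str,
    first_machine: str,
    machines: Sequence[str],
    run_test: Callable[[str, str], bool],
) -> bool:
    """Run a test once; on failure, rerun it once on a different machine.

    Returns True (green) if either attempt passes.
    """
    if run_test(test, first_machine):
        return True

    # Retry on a different machine so a machine-specific problem
    # does not fail the test twice.
    others = [m for m in machines if m != first_machine]
    if not others:
        return False
    return run_test(test, others[0])
```

The build is then green when `run_with_retry` returns True for every test. A test that is genuinely broken still fails both attempts, so only intermittent failures are absorbed.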