Update on canary - how has our release impact changed?
We had two primary targets for our canary process: the main web app and the main API in our monolith. Deploying to either of them with the normal deploy process causes 'release impact'. Namely, all our systems go wonky, and the full scope of what gets disrupted is sometimes not even known (or rather, has never been tabulated).
For the backstory, check out the other two posts on this topic:
- http://blog.practicaltech.ca/2017/08/canary-deployments-of-iis-using-octopus.html
- http://blog.practicaltech.ca/2017/09/detailed-review-of-our-canary-iis-aws.html
TL;DR
- Customer impact with canary is now 0 (barring a broken deploy)
- We can now release at will instead of inside low-throughput release windows
- We have fast feedback via metrics that will automatically revert a failed deployment
- The actual deploy process is longer, but that's ok - because we don't have a time restriction
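To make the 'metrics that automatically revert' point concrete, here is a minimal sketch of what such a gate could look like. It is not our actual Octopus step: the names `CanaryThresholds` and `should_revert`, the choice of 95th-percentile latency plus error rate as signals, and the threshold values are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanaryThresholds:
    # Hypothetical limits; real values would come from your own baseline.
    max_p95_ms: float = 5000.0
    max_error_rate: float = 0.01  # 1% of requests


def should_revert(time_taken_ms, error_count, thresholds=CanaryThresholds()):
    """Return True when the canary's metrics breach either threshold."""
    total = len(time_taken_ms)
    if total == 0:
        # No traffic reached the canary, so it cannot be judged healthy.
        return True
    ordered = sorted(time_taken_ms)
    # Nearest-rank 95th percentile, using integer math to avoid float rounding.
    rank = max(1, (95 * total + 99) // 100)
    p95 = ordered[rank - 1]
    error_rate = error_count / total
    return p95 > thresholds.max_p95_ms or error_rate > thresholds.max_error_rate
```

A deploy step would collect the canary's 'time_taken' samples for a short window, call `should_revert(samples, errors)`, and roll back when it returns True. Treating 'no traffic' as a failure is a design choice in this sketch, not something the process above requires.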
Some definitions...
- Impact window (length of release impact) - how long users and systems are negatively impacted specifically because our new code is going out (releasing, or technically, deploying)
- Outage - the application is completely unavailable to serve requests
- Average latency - measured from IIS responses, as the average of the 'time_taken' field
  - It's really just for illustration
- Percentiles are measured from 'time_taken'
  - I chose those three because:
    - 75th being high means things are pretty bad
    - 95th is the one I've seen recommended as 'target to hit'
    - 99th highlights weird horrors
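For anyone who wants to reproduce these measurements, here is a rough sketch of computing the average and the three percentiles from a list of 'time_taken' values. It assumes the values have already been pulled out of the IIS logs as milliseconds, and it uses the simple nearest-rank percentile method; the tooling we used may calculate percentiles slightly differently. The function names and the sample data are made up for illustration.

```python
def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, so there is no float rounding to worry about.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def summarize(time_taken_ms):
    """Average plus the 75th, 95th and 99th percentiles of the samples."""
    return {
        "avg": sum(time_taken_ms) / len(time_taken_ms),
        "p75": percentile(time_taken_ms, 75),
        "p95": percentile(time_taken_ms, 95),
        "p99": percentile(time_taken_ms, 99),
    }


# Made-up samples: mostly fast requests with two slow outliers.
samples = [40, 42, 45, 50, 60, 80, 216, 1020]
summary = summarize(samples)
```

With these samples the average lands well above what most requests experienced, because two slow responses drag it up. That is why the average is 'just for illustration' and the percentiles are the numbers to watch.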
API Release Impact
- Impact window reduction: 150s -> 30s
- Outage reduction: 30s -> 0s
- Average latency during deploy: 8141ms -> 575ms
- 75th percentile reduction: 13752ms -> 575ms
- 95th percentile reduction: 19600ms -> 4350ms
- 99th percentile reduction: 20175ms -> 5900ms
[Graph: Normal Deploy - API]
[Graph: Canary Deploy - API]
Web Release Impact
- Impact window reduction: 60s -> 0s
- Outage reduction: 20s -> 0s
- Average latency during deploy: 4200ms -> 80ms
- 75th percentile reduction: 7150ms -> 42ms
- 95th percentile reduction: 14600ms -> 216ms
- 99th percentile reduction: 20600ms -> 1020ms
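The before and after numbers are easier to compare as percentage reductions. The sketch below does that arithmetic using the latency figures listed in this post; the helper name `reduction_pct` and the dictionary layout are just for illustration.

```python
def reduction_pct(before_ms, after_ms):
    """Percentage reduction from before to after, rounded to one decimal place."""
    return round((before_ms - after_ms) / before_ms * 100, 1)


# (before, after) latency in ms during a deploy, taken from the lists above.
latency_ms = {
    "api": {"avg": (8141, 575), "p75": (13752, 575), "p95": (19600, 4350), "p99": (20175, 5900)},
    "web": {"avg": (4200, 80), "p75": (7150, 42), "p95": (14600, 216), "p99": (20600, 1020)},
}

reductions = {
    app: {metric: reduction_pct(before, after) for metric, (before, after) in metrics.items()}
    for app, metrics in latency_ms.items()
}

for app, metrics in reductions.items():
    for metric, pct in metrics.items():
        print(f"{app} {metric}: {pct}% lower")
```

Every web metric comes out at 95% or better, while the API tail (95th and 99th percentiles) improves by roughly 70 to 80%, which matches the API still having a 30s impact window.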