As always, this kind of content is interesting to read. Short summary: slow autoscaling of an AWS managed network service led to cascading failures.