AWS re:Invent 2015 | (CMP406) Amazon ECS at Coursera – YouTube
Coursera are running user-submitted code in ECS! Interesting stuff about how they use Docker's security and resource-limiting features, along with a fork of the ecs-agent code, to sandbox it. :O
(tags: coursera user-submitted-code sandboxing docker security ecs aws resource-limits ops)
How both TCP and Ethernet checksums fail
At Twitter, a team had an unusual failure where corrupt data ended up in memcache. The root cause appears to have been a switch that was corrupting packets. Most packets were being dropped and the throughput was much lower than normal, but some were still making it through. The hypothesis is that occasionally the corrupt packets had valid TCP and Ethernet checksums. One “lucky” packet stored corrupt data in memcache. Even after the switch was replaced, the errors continued until the cache was cleared.
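To see how a corrupt packet can still carry a "valid" TCP checksum, here is a minimal sketch of the 16-bit ones' complement sum that TCP uses (the RFC 1071 algorithm). The payload bytes and the word-swap corruption are made up for illustration and are not taken from the Twitter incident; the Ethernet CRC is a separate, stronger check and is not modelled here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum, as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Treat each pair of bytes as one big-endian 16-bit word.
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical payload, and a corrupted copy with two 16-bit words swapped.
original = b"\x12\x34\xab\xcd\x00\x01\xff\xfe"
corrupted = b"\xab\xcd\x12\x34\x00\x01\xff\xfe"

# Addition is commutative, so reordering words leaves the sum unchanged.
same_checksum = internet_checksum(original) == internet_checksum(corrupted)
```

Because the checksum is only a sum, any corruption that reorders 16-bit words, or adds to one word what it subtracts from another, goes undetected.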
Yet another occurrence of this bug. When it happens, it tends to _really_ screw things up, because it’s so rare. We had monitoring for this in Amazon, and when it occurred, it was overwhelmingly due to host-level kernel/libc/RAM issues rather than anything in the network. The Amazon design principle was to add app-level checksumming throughout, which of course catches the lot.
(tags: networking tcp ip twitter ethernet checksums packets memcached)
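A rough sketch of what app-level checksumming could look like for cached values: prefix each value with a digest before it leaves the application, and verify it on the way back in. The `wrap` and `unwrap` names and the choice of SHA-256 are assumptions for illustration, not a description of what Amazon or Twitter actually run.

```python
import hashlib

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def wrap(value: bytes) -> bytes:
    """Prefix the value with its SHA-256 digest before storing it."""
    return hashlib.sha256(value).digest() + value


def unwrap(blob: bytes) -> bytes:
    """Verify the digest prefix and return the value, or raise ValueError."""
    if len(blob) < DIGEST_LEN:
        raise ValueError("blob too short to contain a checksum")
    digest, value = blob[:DIGEST_LEN], blob[DIGEST_LEN:]
    if hashlib.sha256(value).digest() != digest:
        raise ValueError("checksum mismatch: value is corrupt")
    return value
```

The point of doing this end to end is that a mismatch gets caught no matter where the corruption crept in (switch, NIC, kernel, RAM), and a reader can treat a failed `unwrap` as a cache miss instead of serving bad data.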
Designing the Spotify perimeter
How Spotify use nginx as a frontline for their sites and services
(tags: scaling spotify nginx ops architecture ssl tls http frontline security)