Good advice on running large-scale database stress tests
I’ve been bitten by poor key distribution in test workloads in the past, so this is spot on: ‘I’d run it with Zipfian, Pareto, and Dirac delta distributions, and I’d choose read-modify-write transactions.’ And, of course, use a dataset bigger than the combined RAM of all hosts, so the test actually exercises storage rather than cache. Also worth a look: http://smalldatum.blogspot.ie/2014/04/biebermarks.html — the “Biebermark”, where a single row out of the entire database is contended in a read-modify-write transaction: “the inspiration for this is maintaining counts for [highly contended] popular entities like Justin Bieber and One Direction.”
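To make the key-distribution point concrete, here’s a minimal sketch (not from the linked posts; the function name and parameters are my own) of generating Zipfian-skewed keys for a stress-test workload. With a skew exponent above 1, a handful of hot keys dominate the sample — exactly the contention pattern a uniform key generator would miss:

```python
import bisect
import random
from collections import Counter

def zipfian_keys(n_keys, skew, count, seed=42):
    """Sample `count` keys from [0, n_keys) with Zipfian skew.

    Key k is drawn with probability proportional to 1 / (k + 1) ** skew,
    sampled via inverse-CDF lookup. (Illustrative sketch only.)
    """
    rng = random.Random(seed)
    weights = [1.0 / (k + 1) ** skew for k in range(n_keys)]
    total = sum(weights)
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    return [bisect.bisect_left(cdf, rng.random()) for _ in range(count)]

samples = zipfian_keys(10_000, 1.2, 100_000)
hot = Counter(samples).most_common(3)
print(hot)  # the lowest-numbered keys receive a large share of all accesses
```

The Biebermark is the degenerate extreme of this — effectively a Dirac delta distribution, where every read-modify-write transaction hits the same single row.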
(tags: biebermark benchmarks testing performance stress-tests databases storage mongodb innodb foundationdb aphyr measurement distributions keys zipfian)