Justin's Linklog – Page 47 – (Things I found interesting recently.)

Links for 2015-10-27

Published October 27, 2015

The Okinawa missiles of October | Bulletin of the Atomic Scientists

‘By Bordne’s account, at the height of the Cuban Missile Crisis, Air Force crews on Okinawa were ordered to launch 32 missiles, each carrying a large nuclear warhead. Only caution and the common sense and decisive action of the line personnel receiving those orders prevented the launches—and averted the nuclear war that most likely would have ensued.’

(tags: okinawa nukes launch-codes pal cold-war cuban-missile-crisis history accidents ui security horror via:mattblaze)
Amazon ECS CLI Tutorial – Amazon EC2 Container Service

super-basic ECS tutorial, using a docker-compose.yml to create a new ECS-managed service fleet

(tags: ecs cli linux aws ec2 hosting docker tutorials)
Net neutrality: EU votes in favour of Internet fast lanes and slow lanes | Ars Technica UK

:(
In the end, sheer political fatigue may have played a major part in undermining net neutrality in the EU. However, the battle is not quite over. As Anne Jellema, CEO of the Web Foundation, which was established by Berners-Lee in 2009, notes in her response to today’s EU vote: “The European Parliament is essentially tossing a hot potato to the Body of European Regulators, national regulators and the courts, who will have to decide how these spectacularly unclear rules will be implemented. The onus is now on these groups to heed the call of hundreds of thousands of concerned citizens and prevent a two-speed Internet.”

(tags: eu net-neutrality internet europe ep politics)

Links for 2015-10-23

Published October 23, 2015

Analysing user behaviour – from histograms to random forests (PyData) at PyCon Ireland 2015 | Lanyrd

Swrve’s own Dave Brodigan on game user-data analysis techniques:
The goal is to give the audience a roadmap for analysing user data using python friendly tools. I will touch on many aspects of the data science pipeline from data cleansing to building predictive data products at scale. I will start gently with pandas and dataframes and then discuss some machine learning techniques like kmeans and random forests in scikitlearn and then introduce Spark for doing it at scale. I will focus more on the use cases rather than detailed implementation. The talk will be informed by my experience and focus on user behaviour in games and mobile apps.

(tags: swrve talks user-data big-data spark hadoop machine-learning data-science)
fabio

fast, modern, zero-conf load balancing HTTP(S) router managed by consul; serves 15k reqs/sec, in Go, from eBay

(tags: load-balancing consul http https routing ebay go open-source fabio)

Links for 2015-10-22

Published October 22, 2015

How Netty is used at Layer

pretty conventional HTTP/1.1, WebSockets and HTTP/2 front-end services with modern Netty practices

(tags: netty http api-services coding java servers)
RentTheRunway’s Engineering Ladder

One of the best things about working at Amazon was having a clear, well-defined career progression, and it’s something that’s always been absent in startups. Career growth, levelling, and tech management are important, and also help in hiring by providing clear levels. This is the RentTheRunway engineering ladder, Camille Fournier’s team, which they open sourced back in March 2015.

(tags: engineering hiring management career renttherunway camille-fournier amazon startups career-growth levelling ladder)

Links for 2015-10-21

Published October 21, 2015

How a criminal ring defeated the secure chip-and-PIN credit cards | Ars Technica

Ingenious —
The stolen cards were still considered evidence, so the researchers couldn’t do a full tear-down or run any tests that would alter the data on the card, so they used X-ray scans to look at where the chip cards had been tampered with. They also analyzed the way the chips distributed electricity when in use and used read-only programs to see what information the cards sent to a Point of Sale (POS) terminal. According to the paper, the fraudsters were able to perform a man-in-the-middle attack by programming a second hobbyist chip called a FUN card to accept any PIN entry, and soldering that chip onto the card’s original chip. This increased the thickness of the chip from 0.4mm to 0.7mm, “making insertion into a PoS somewhat uneasy but perfectly feasible,” the researchers write. [….] The researchers explain that a typical EMV transaction involves three steps: card authentication, cardholder verification, and then transaction authorization. During a transaction using one of the altered cards, the original chip was allowed to respond with the card authentication as normal. Then, during card holder authentication, the POS system would ask for a user’s PIN, the thief would respond with any PIN, and the FUN card would step in and send the POS the code indicating that it was ok to proceed with the transaction because the PIN checked out. During the final transaction authentication phase, the FUN card would relay the transaction data between the POS and the original chip, sending the issuing bank an authorization request cryptogram which the card issuer uses to tell the POS system whether to accept the transaction or not.

(tags: security chip-and-pin hacking pos emv transactions credit-cards debit-cards hardware chips pin fun-cards smartcards)
How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code

using Spark, Tesseract, HBase, Solr and Leptonica. Actually pretty feasible

(tags: spark tesseract hbase solr leptonica pdfs scanning cloudera hadoop architecture)
Existential Consistency: Measuring and Understanding Consistency at Facebook

The metric is termed φ(P)-consistency, and is actually very simple. A read for the same data is sent to all replicas in P, and φ(P)-consistency is defined as the frequency with which that read returns the same result from all replicas. φ(G)-consistency applies this metric globally, and φ(R)-consistency applies it within a region (cluster). Facebook have been tracking this metric in production since 2012.
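
A minimal sketch of the per-read agreement metric quoted above, assuming each read has already been fanned out to every replica in P and the replies collected. The function name and the sample data are hypothetical and not taken from the paper.

```python
def agreement_frequency(read_results):
    """Fraction of reads for which every replica returned the same value.

    read_results: list of lists; each inner list holds the values that the
    replicas in P returned for one read of the same data.
    """
    if not read_results:
        raise ValueError("need at least one read")
    # A read counts as consistent only if all replies are identical.
    agreeing = sum(1 for replies in read_results if len(set(replies)) == 1)
    return agreeing / len(read_results)


# Hypothetical sample: four reads, each sent to three replicas.
sample = [
    ["v1", "v1", "v1"],
    ["v2", "v2", "v2"],
    ["v3", "v2", "v3"],  # one replica is stale, so this read disagrees
    ["v4", "v4", "v4"],
]
print(agreement_frequency(sample))  # 0.75
```

Passing replies from all replicas gives the global figure; passing only the replies from one region's replicas gives the regional one.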

(tags: facebook eventual-consistency consistency metrics papers cap distributed-computing)
Holistic Configuration Management at Facebook

How FB push config changes from Git (where it is code reviewed, version controlled, and history tracked with strong auth) to Zeus (their Zookeeper fork) and from there to live production servers.

(tags: facebook configuration zookeeper git ops architecture)
Hyperscan

a high-performance multiple regex matching library. Hyperscan uses hybrid automata techniques to allow simultaneous matching of large numbers (up to tens of thousands) of regular expressions and for the matching of regular expressions across streams of data.
Via Tony Finch
(tags: via:fanf regexps regex dpi hyperscan dfa nfa hybrid-automata text-matching matching text strings streams)

Links for 2015-10-20

Published October 20, 2015

Hologram

Hologram exposes an imitation of the EC2 instance metadata service on developer workstations that supports the [IAM Roles] temporary credentials workflow. It is accessible via the same HTTP endpoint to calling SDKs, so your code can use the same process in both development and production. The keys that Hologram provisions are temporary, so EC2 access can be centrally controlled without direct administrative access to developer workstations.
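
A rough sketch of what "the same HTTP endpoint" means for calling code, assuming the standard EC2 instance metadata address and the usual JSON shape of an IAM role credentials response. The role name and the sample document are made up for illustration, and nothing here is Hologram's own code; real SDKs do this lookup for you.

```python
import json

# Standard EC2 instance metadata path for IAM role credentials. Per the
# description above, Hologram serves an imitation of this on a workstation.
METADATA_BASE = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def credentials_url(role_name):
    """URL a client would fetch for the given role's temporary keys."""
    return METADATA_BASE + role_name


def parse_credentials(body):
    """Pull the temporary key fields out of a metadata-service response."""
    doc = json.loads(body)
    return {
        "access_key": doc["AccessKeyId"],
        "secret_key": doc["SecretAccessKey"],
        "token": doc["Token"],
        "expiration": doc["Expiration"],
    }


# Hypothetical response body, with placeholder values.
sample_body = json.dumps({
    "Code": "Success",
    "AccessKeyId": "ASIAEXAMPLE",
    "SecretAccessKey": "example-secret",
    "Token": "example-session-token",
    "Expiration": "2015-10-20T18:00:00Z",
})

print(credentials_url("dev-role"))  # "dev-role" is a hypothetical role name
print(parse_credentials(sample_body)["expiration"])
```

Because the URL and response format are identical in both environments, the same parsing path runs on a laptop and on an EC2 instance.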

(tags: iam roles ec2 authorization aws adroll open-source cli osx coding dev)

Links for 2015-10-18

Published October 18, 2015

AWS re:Invent 2015 Video & Slide Presentation Links with Easy Index

Andrew Spyker’s roundup:
my quick index of all re:Invent sessions. Please wait for a few days and I’ll keep running the tool to fill in the index. It usually takes Amazon a few weeks to fully upload all the videos and slideshares.
Pretty definitive, full text descriptions of all sessions (and there are an awful lot of ’em).
(tags: aws reinvent andrew-spyker scraping slides presentations ec2 video)
(ARC308) The Serverless Company: Using AWS Lambda

Describing PlayOn! Sports’ Lambda setup. Sounds pretty productionizable

(tags: ops lambda aws reinvent slides architecture)

Links for 2015-10-16

Published October 16, 2015

Your Relative’s DNA Could Turn You Into A Suspect

Familial DNA searching has massive false positives, but is being used to tag suspects:
The bewildered Usry soon learned that he was a suspect in the 1996 murder of an Idaho Falls teenager named Angie Dodge. Though a man had been convicted of that crime after giving an iffy confession, his DNA didn’t match what was found at the crime scene. Detectives had focused on Usry after running a familial DNA search, a technique that allows investigators to identify suspects who don’t have DNA in a law enforcement database but whose close relatives have had their genetic profiles cataloged. In Usry’s case the crime scene DNA bore numerous similarities to that of Usry’s father, who years earlier had donated a DNA sample to a genealogy project through his Mormon church in Mississippi. That project’s database was later purchased by Ancestry, which made it publicly searchable—a decision that didn’t take into account the possibility that cops might someday use it to hunt for genetic leads. Usry, whose story was first reported in The New Orleans Advocate, was finally cleared after a nerve-racking 33-day wait — the DNA extracted from his cheek cells didn’t match that of Dodge’s killer, whom detectives still seek. But the fact that he fell under suspicion in the first place is the latest sign that it’s time to set ground rules for familial DNA searching, before misuse of the imperfect technology starts ruining lives.

(tags: dna familial-dna false-positives law crime idaho murder mormon genealogy ancestry.com databases biometrics privacy genes)

Links for 2015-10-15

Published October 15, 2015

Cluster benchmark: Scylla vs Cassandra

ScyllaDB (the C* clone in C++) is now actually looking promising; still need more reassurance about its consistency/reliability side though

(tags: scylla databases storage cassandra nosql)
_What We Know About Spreadsheet Errors_ [paper]

As we will see below, there has long been ample evidence that errors in spreadsheets are pandemic. Spreadsheets, even after careful development, contain errors in one percent or more of all formula cells. In large spreadsheets with thousands of formulas, there will be dozens of undetected errors. Even significant errors may go undetected because formal testing in spreadsheet development is rare and because even serious errors may not be apparent.
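
The arithmetic behind "dozens of undetected errors" can be sketched as follows, assuming the one-percent figure applies to every formula cell independently. That independence is a simplification of mine, not a claim from the paper.

```python
def expected_errors(formula_cells, error_rate=0.01):
    """Expected number of erroneous formula cells."""
    return formula_cells * error_rate


def prob_at_least_one_error(formula_cells, error_rate=0.01):
    """Chance that a sheet contains one or more erroneous formula cells,
    assuming cells fail independently (a simplifying assumption)."""
    return 1 - (1 - error_rate) ** formula_cells


for cells in (100, 1000, 3000):
    print(cells, expected_errors(cells), round(prob_at_least_one_error(cells), 3))
```

At a one-percent rate a 3,000-formula sheet expects about 30 bad cells, and even a 100-formula sheet is more likely than not to contain at least one.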

(tags: business coding maths excel spreadsheets errors formulas error-rate)
Defending Your Time

great post from Ross Duggan on avoiding developer burnout

(tags: coding burnout productivity work)
How is NSA breaking so much crypto?

If a client and server are speaking Diffie-Hellman, they first need to agree on a large prime number with a particular form. There seemed to be no reason why everyone couldn’t just use the same prime, and, in fact, many applications tend to use standardized or hard-coded primes. But there was a very important detail that got lost in translation between the mathematicians and the practitioners: an adversary can perform a single enormous computation to “crack” a particular prime, then easily break any individual connection that uses that prime. How enormous a computation, you ask? Possibly a technical feat on a scale (relative to the state of computing at the time) not seen since the Enigma cryptanalysis during World War II. Even estimating the difficulty is tricky, due to the complexity of the algorithm involved, but our paper gives some conservative estimates. For the most common strength of Diffie-Hellman (1024 bits), it would cost a few hundred million dollars to build a machine, based on special purpose hardware, that would be able to crack one Diffie-Hellman prime every year. Would this be worth it for an intelligence agency? Since a handful of primes are so widely reused, the payoff, in terms of connections they could decrypt, would be enormous. Breaking a single, common 1024-bit prime would allow NSA to passively decrypt connections to two-thirds of VPNs and a quarter of all SSH servers globally. Breaking a second 1024-bit prime would allow passive eavesdropping on connections to nearly 20% of the top million HTTPS websites. In other words, a one-time investment in massive computation would make it possible to eavesdrop on trillions of encrypted connections.
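
A toy illustration of why a shared prime matters, using deliberately tiny textbook parameters (p = 23, g = 5). The lookup table stands in for the "single enormous computation"; the real attack described in the paper uses number field sieve precomputation against 1024-bit primes, which this sketch does not attempt to model.

```python
P, G = 23, 5  # toy prime and generator; real deployments use 1024+ bit primes


def public_value(secret):
    """What each side sends over the wire: g^secret mod p."""
    return pow(G, secret, P)


def shared_secret(their_public, my_secret):
    """What each side computes privately after the exchange."""
    return pow(their_public, my_secret, P)


# One-time work per prime: map every possible public value back to its
# exponent. It is reusable against every connection that uses (P, G).
LOG_TABLE = {pow(G, x, P): x for x in range(1, P)}


def eavesdrop(client_public, server_public):
    """Recover the session secret from the two observed public values."""
    client_secret = LOG_TABLE[client_public]
    return pow(server_public, client_secret, P)


client_secret, server_secret = 4, 3
a_pub, b_pub = public_value(client_secret), public_value(server_secret)
print(shared_secret(b_pub, client_secret))  # 18
print(eavesdrop(a_pub, b_pub))              # 18
```

The expensive step depends only on the prime, so the cost is paid once and every later connection is cheap to break.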
(via Eric)
(tags: via:eric encryption privacy security nsa crypto)

Links for 2015-10-14

Published October 14, 2015

AWS re:Invent 2015 | (CMP406) Amazon ECS at Coursera – YouTube

Coursera are running user-submitted code in ECS! interesting stuff about how they use Docker security/resource-limiting features, forking the ecs-agent code, to run user-submitted code. :O

(tags: coursera user-submitted-code sandboxing docker security ecs aws resource-limits ops)
How both TCP and Ethernet checksums fail

At Twitter, a team had an unusual failure where corrupt data ended up in memcache. The root cause appears to have been a switch that was corrupting packets. Most packets were being dropped and the throughput was much lower than normal, but some were still making it through. The hypothesis is that occasionally the corrupt packets had valid TCP and Ethernet checksums. One “lucky” packet stored corrupt data in memcache. Even after the switch was replaced, the errors continued until the cache was cleared.
YA occurrence of this bug. When it happens, it tends to _really_ screw things up, because it’s so rare — we had monitoring for this in Amazon, and when it occurred, it overwhelmingly occurred due to host-level kernel/libc/RAM issues rather than stuff in the network. Amazon design principles were to add app-level checksumming throughout, which of course catches the lot.
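
A small demonstration of one known weakness of the 16-bit ones' complement checksum that TCP uses: it is a sum, so reordering 16-bit words leaves it unchanged. The payloads are made up, and this is only one way a corrupt packet can slip through, not necessarily the one in the Twitter incident. CRC32 stands in here for an app-level checksum.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"AABBCCDD"
corrupted = b"BBAACCDD"  # first two 16-bit words swapped in transit

# The TCP-style sum cannot tell the two payloads apart...
print(internet_checksum(original) == internet_checksum(corrupted))  # True
# ...but an application-level CRC32 over the payload can.
print(zlib.crc32(original) == zlib.crc32(corrupted))  # False
```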
(tags: networking tcp ip twitter ethernet checksums packets memcached)
Designing the Spotify perimeter

How Spotify use nginx as a frontline for their sites and services

(tags: scaling spotify nginx ops architecture ssl tls http frontline security)

Links for 2015-10-13

Published October 13, 2015

Chromecast Speakers

Supports Spotify — totally getting one of these

(tags: spotify speakers music home google gadgets toget)
Where do ‘mama’/’papa’ words come from?

The sounds came first — as experiments in vocalization — and parents adopted them as pet names for themselves. If you open your mouth and make a sound, it will probably be an open vowel like /a/ unless you move your tongue or lips. The easiest consonants are perhaps the bilabials /m/, /p/, and /b/, requiring no movement of the tongue, followed by consonants made by raising the front of the tongue: /d/, /t/, and /n/. Add a dash of reduplication, and you get mama, papa, baba, dada, tata, nana. That such words refer to people (typically parents or other guardians) is something we have imposed on the sounds and incorporated into our languages and cultures; the meanings don’t inhere in the sounds as uttered by babies, which are more likely calls for food or attention.

(tags: sounds voice speech babies kids phonetics linguist language)
remind101/conveyor

‘A fast build system for Docker images’, open source, in Go, hooks into GitHub

(tags: build ci docker github go)
England opens up 11TB of LiDAR data covering the entire country as open data

All 11 terabytes of our LIDAR data (that’s roughly equivalent to 2,750,000 MP3 songs) will eventually be available through our new Open LIDAR portal under an Open Government Licence, allowing it to be used for any purpose. We hope that by giving free access to our data businesses and local communities will develop innovative solutions to benefit the environment, grow our thriving rural economy, and boost our world-leading food and farming industry. The possibilities are endless and we hope that making LIDAR data open will be a catalyst for new ideas and innovation.
Are you reading, Ordnance Survey Ireland?
(tags: data maps uk lidar mapping geodata open-data ogl)

Links for 2015-10-12

Published October 12, 2015

SuperChief: From Apache Storm to In-House Distributed Stream Processing

Another sorry tale of Storm issues:
Storm has been successful at Librato, but we experienced many of the limitations cited in the Twitter Heron: Stream Processing at Scale paper and outlined here by Adrian Colyer, including: Inability to isolate, reason about, or debug performance issues due to the worker/executor/task paradigm. This led to building and configuring clusters specifically designed to attempt to mitigate these problems (i.e., separate clusters per topology, only running a worker per server.), which added additional complexity to development and operations and also led to over-provisioning. Ability of tasks to move around led to difficult to trace performance problems. Storm’s work provisioning logic led to some tasks serving more Kafka partitions than others. This in turn created latency and performance issues that were difficult to reason about. The initial solution was to over-provision in an attempt to get a better hashing/balancing of work, but eventually we just replaced the work allocation logic. Due to Storm’s architecture, it was very difficult to get a stack trace or heap dump because the processes that managed workers (Storm supervisor) would often forcefully kill a Java process while it was being investigated in this way. The propensity for unexpected and subsequently unhandled exceptions to take down an entire worker led to additional defensive verbose error handling everywhere. This nasty bug STORM-404 coupled with the aforementioned fact that a single exception can take down a worker led to several cascading failures in production, taking down entire topologies until we upgraded to 0.9.4. Additionally, we found the performance we were getting from Storm for the amount of money we were spending on infrastructure was not in line with our expectations. Much of this is due to the fact that, depending upon how your topology is designed, a single tuple may make multiple hops across JVMs, and this is very expensive. 
For example, in our time series aggregation topologies a single tuple may be serialized/deserialized and shipped across the wire 3-4 times as it progresses through the processing pipeline.

(tags: scalability storm kafka librato architecture heron ops)
librato/disco-java

Librato’s service discovery library using Zookeeper (so strongly consistent, but with the ZK downside that an AZ outage can stall service discovery updates region-wide)

(tags: zookeeper service-discovery librato java open-source load-balancing)
Tech companies like Facebook not above the law, says Max Schrems

“Big companies didn’t only rely on safe harbour: they also rely on binding corporate rules and standard contractual clauses. But it’s interesting that the court decided the case on fundamental rights grounds: so it doesn’t matter remotely what ground you transfer on, if that process is still illegal under 7 and 8 of charter, it can’t be done.”
Also:
“Ireland has no interest in doing its job, and will continue not to, forever. Clearly it’s an investment issue – but overall the policy is: we don’t regulate companies here. The cost of challenging any of this in the courts is prohibitive. And the people don’t seem to care.”
:(
(tags: ireland guardian max-schrems privacy surveillance safe-harbor eu us nsa dpc data-protection)
After Bara: All your (Data)base are belong to us

Sounds like the CJEU’s Bara decision may cause problems for the Irish government’s wilful data-sharing:
Articles 10, 11 and 13 of Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995, on the protection of individuals with regard to the processing of personal data and on the free movement of such data, must be interpreted as precluding national measures, such as those at issue in the main proceedings, which allow a public administrative body of a Member State to transfer personal data to another public administrative body and their subsequent processing, without the data subjects having been informed of that transfer or processing.

(tags: data databases bara cjeu eu law privacy data-protection)

Links for 2015-10-11

Published October 11, 2015

Dublin-traceroute

uses the techniques invented by the authors of Paris-traceroute to enumerate the paths of ECMP flow-based load balancing, but introduces a new technique for NAT detection.
handy. written by AWS SDE Andrea Barberio!
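
A simplified model of flow-based ECMP, to show why varying a field of the flow tuple enumerates paths. Real routers use vendor-specific hash functions; SHA-256 here is a stand-in, and the hop names, addresses, and the choice of source port as the varied field are all assumptions for illustration, not Dublin-traceroute's implementation.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow tuple to one next hop, the same one every time.

    flow = (src_ip, dst_ip, src_port, dst_port, proto)
    """
    key = "|".join(map(str, flow)).encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]


def enumerate_paths(src_ip, dst_ip, dst_port, next_hops, src_ports):
    """Probe with many source ports; record one port that reaches each hop."""
    seen = {}
    for sport in src_ports:
        flow = (src_ip, dst_ip, sport, dst_port, "udp")
        seen.setdefault(pick_next_hop(flow, next_hops), sport)
    return seen


hops = ["hop-a", "hop-b", "hop-c"]  # hypothetical equal-cost next hops
flow = ("10.0.0.1", "192.0.2.7", 40000, 33434, "udp")

# A fixed flow tuple always takes the same path.
print(pick_next_hop(flow, hops) == pick_next_hop(flow, hops))  # True
print(enumerate_paths("10.0.0.1", "192.0.2.7", 33434, hops, range(40000, 40064)))
```

Keeping the tuple constant gives a stable trace of one path; sweeping it reveals the others.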
(tags: internet tracing traceroute networking ecmp nat ip)
GZinga

‘Seekable and Splittable Gzip’, from eBay
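
One way to get seekable, splittable gzip with only the standard library is to write independent gzip members and keep an index of their byte offsets. This is a stand-in technique to illustrate the idea, not GZinga's actual file format, and the function names are hypothetical.

```python
import gzip
import zlib


def write_splittable(chunks):
    """Compress each chunk as its own gzip member; return blob and offsets."""
    blob = bytearray()
    offsets = []
    for chunk in chunks:
        offsets.append(len(blob))  # byte offset where this member starts
        blob += gzip.compress(chunk)
    return bytes(blob), offsets


def read_member(blob, offset):
    """Decompress only the member starting at offset."""
    decoder = zlib.decompressobj(wbits=31)  # 31 selects gzip framing
    return decoder.decompress(blob[offset:])  # stops at the member's end


chunks = [b"log line 1\n", b"log line 2\n", b"log line 3\n"]
blob, offsets = write_splittable(chunks)

print(read_member(blob, offsets[1]))  # b'log line 2\n'
print(gzip.decompress(blob) == b"".join(chunks))  # True
```

The concatenation is still a valid gzip file, so ordinary tools read it end to end, while a reader with the index can start at any member boundary.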

(tags: ebay gzip compression seeking streams splitting logs gzinga)

Links for 2015-10-10

Published October 10, 2015

Outage postmortem (2015-10-08 UTC) : Stripe: Help & Support

There was a breakdown in communication between the developer who requested the index migration and the database operator who deleted the old index. Instead of working on the migration together, they communicated in an implicit way through flawed tooling. The dashboard that surfaced the migration request was missing important context: the reason for the requested deletion, the dependency on another index’s creation, and the criticality of the index for API traffic. Indeed, the database operator didn’t have a way to check whether the index had recently been used for a query.
Good demo of how the Etsy-style chatops deployment approach would have helped avoid this risk.
(tags: stripe postmortem outages databases indexes deployment chatops deploy ops)
net.wars: Unsafe harbor

Wendy Grossman on where the Safe Harbor decision is leading.
One clause would require European companies to tell their relevant data protection authorities if they are being compelled to turn over data – even if they have been forbidden to disclose this under US law. Sounds nice, but doesn’t mobilize the rock or soften the hard place, since companies will still have to pick a law to violate. I imagine the internal discussions there revolving around two questions: which violation is less likely to land the CEO in jail and which set of fines can we afford?
(via Simon McGarr)
(tags: safe-harbor privacy law us eu surveillance wendy-grossman via:tupp_ed)
CHICKEN COOP & RUN

bookmarking as a potential future addition to the back garden

(tags: chickens pets food garden ebay)

Links for 2015-10-09

Published October 9, 2015

Rebuilding Our Infrastructure with Docker, ECS, and Terraform

Good writeup of current best practices for a production AWS architecture

(tags: aws ops docker ecs ec2 prod terraform segment via:marc)
The Totally Managed Analytics Pipeline: Segment, Lambda, and Dynamo

notable mainly for the details of Terraform support for Lambda: that’s a significant improvement to Lambda’s production-readiness

(tags: aws pipelines data streaming lambda dynamodb analytics terraform ops)
Gene patents probably dead worldwide following Australian court decision

The court based its reasoning on the fact that, although an isolated gene such as BRCA1 was “a product of human action, it was the existence of the information stored in the relevant sequences that was an essential element of the invention as claimed.” Since the information stored in the DNA as a sequence of nucleotides was a product of nature, it did not require human action to bring it into existence, and therefore could not be patented.
Via Tony Finch.
(tags: via:fanf australia genetics law ipr medicine ip patents)
Baker Street

client-side ‘service discovery and routing system for microservices’ — another Smartstack, then

(tags: python router smartstack baker-street microservices service-discovery routing load-balancing http)
How IFTTT develop with Docker

ugh, quite a bit of complexity here

(tags: docker osx dev ops building coding ifttt dns dnsmasq)

Links for 2015-10-08

Published October 8, 2015

Fuzzing Raft for Fun and Publication

Good intro to fuzz-testing a distributed system; I’ve had great results using similar approaches in unit tests

(tags: fuzzing fuzz-testing testing raft akka tests)
EC2 Spot Blocks for Defined-Duration Workloads

you can now launch Spot instances that will run continuously for a finite duration (1 to 6 hours). Pricing is based on the requested duration and the available capacity, and is typically 30% to 45% less than On-Demand.

(tags: ec2 aws spot-instances spot pricing time)
The Surveillance Elephant in the Room…

Very perceptive post on the next steps for safe harbor, post-Schrems.
And behind that elephant there are other elephants: if US surveillance and surveillance law is a problem, then what about UK surveillance? Is GCHQ any less intrusive than the NSA? It does not seem so – and this puts even more pressure on the current reviews of UK surveillance law taking place. If, as many predict, the forthcoming Investigatory Powers Bill will be even more intrusive and extensive than current UK surveillance laws this will put the UK in a position that could rapidly become untenable. If the UK decides to leave the EU, will that mean that the UK is not considered a safe place for European data? Right now that seems the only logical conclusion – but the ramifications for UK businesses could be huge. [….] What happens next, therefore, is hard to foresee. What cannot be done, however, is to ignore the elephant in the room. The issue of surveillance has to be taken on. The conflict between that surveillance and fundamental human rights is not a merely semantic one, or one for lawyers and academics, it’s a real one. In the words of historian and philosopher Quentin Skinner “the current situation seems to me untenable in a democratic society.” The conflict over Safe Harbor is in many ways just a symptom of that far bigger problem. The biggest elephant of all.

(tags: ec cjeu surveillance safe-harbor schrems privacy europe us uk gchq nsa)
ECJ ruling on Irish privacy case has huge significance

The only current way to comply with EU law, the judgment indicates, is to keep EU data within the EU. Whether those data can be safely managed within facilities run by US companies will not be determined until the US rules on an ongoing Microsoft case. Microsoft stands in contempt of court right now for refusing to hand over to US authorities, emails held in its Irish data centre. This case will surely go to the Supreme Court and will be an extremely important determination for the cloud business, and any company or individual using data centre storage. If Microsoft loses, US multinationals will be left scrambling to somehow, legally firewall off their EU-based data centres from US government reach.
(cough, Amazon)
(tags: aws hosting eu privacy surveillance gchq nsa microsoft ireland)

Links for 2015-10-07

Published October 7, 2015

Elasticsearch and data loss

“@alexbfree @ThijsFeryn [ElasticSearch is] fine as long as data loss is acceptable. https://aphyr.com/posts/317-call-me-maybe-elasticsearch . We lose ~1% of all writes on average.”

(tags: elasticsearch data-loss reliability data search aphyr jepsen testing distributed-systems ops)
Daragh O’Brien on the CJEU judgement on Safe Harbor

Many organisations I’ve spoken to have had the cunning plan of adopting model contract clauses as their fall back position to replace their reliance on Safe Harbor. [….] The best that can be said for Model Clauses is that they haven’t been struck down by the CJEU. Yet.

(tags: model-clauses cjeu eu europe safe-harbor us nsa surveillance privacy law)
5 takeaways from the death of safe harbor – POLITICO

Reacting to the ruling, the [EC] stressed that data transfers between the U.S. and Europe can continue on the basis of other legal mechanisms. A lot rides on what steps the Commission and national data protection supervisors take in response. “It is crucial for legal certainty that the EC sends a clear signal,” said Nauwelaerts. That could involve providing a timeline for concluding an agreement with U.S. authorities, together with a commitment from national data protection authorities not to block data transfers while negotiations are on-going, he explained.

(tags: safe-harbor data privacy eu ec snowden law us)
The New InfluxDB Storage Engine: A Time Structured Merge Tree

The new engine has similarities with LSM Trees (like LevelDB and Cassandra’s underlying storage). It has a write ahead log, index files that are read only, and it occasionally performs compactions to combine index files. We’re calling it a Time Structured Merge Tree because the index files keep contiguous blocks of time and the compactions merge those blocks into larger blocks of time. Compression of the data improves as the index files are compacted. Once a shard becomes cold for writes it will be compacted into as few files as possible, which yield the best compression.
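
A toy sketch of the compaction idea described above: read-only index files each cover a contiguous block of time, and compaction merges adjacent blocks into larger ones. The tuple layout and function name are invented for illustration; the write-ahead log and compression are not modelled, and this is not InfluxDB's code.

```python
def compact(index_files):
    """Merge index files whose time ranges touch into larger blocks.

    Each file is (start, end, points): it covers [start, end) and holds a
    list of (timestamp, value) pairs sorted by timestamp.
    """
    merged = []
    for start, end, points in sorted(index_files, key=lambda f: f[0]):
        if merged and merged[-1][1] == start:
            # Contiguous with the previous block: extend it.
            prev_start, _, prev_points = merged[-1]
            merged[-1] = (prev_start, end, prev_points + points)
        else:
            merged.append((start, end, list(points)))
    return merged


files = [
    (10, 20, [(12, "b")]),
    (0, 10, [(1, "a")]),
    (30, 40, [(31, "c")]),  # gap before this block, so it stays separate
]
print(compact(files))
# [(0, 20, [(1, 'a'), (12, 'b')]), (30, 40, [(31, 'c')])]
```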

(tags: influxdb storage lsm-trees leveldb tsm-trees data-structures algorithms time-series tsd compression)

Links for 2015-10-06

Published October 6, 2015

Marvin.ie: Order Takeaway Food Online

new Dublin delivery service takes Bitcoin?!

(tags: bitcoin food delivery takeaway payment ireland dublin wtf)
qp tries: smaller and faster than crit-bit tries

interesting new data structure from Tony Finch. “Some simple benchmarks say qp tries have about 1/3 less memory overhead and are about 10% faster than crit-bit tries.”

(tags: crit-bit popcount bits bitmaps tries data-structures via:fanf qp-tries crit-bit-tries hacks memory)
Schneier on Automatic Face Recognition and Surveillance

When we talk about surveillance, we tend to concentrate on the problems of data collection: CCTV cameras, tagged photos, purchasing habits, our writings on sites like Facebook and Twitter. We think much less about data analysis. But effective and pervasive surveillance is just as much about analysis. It’s sustained by a combination of cheap and ubiquitous cameras, tagged photo databases, commercial databases of our actions that reveal our habits and personalities, and – most of all – fast and accurate face recognition software. Don’t expect to have access to this technology for yourself anytime soon. This is not facial recognition for all. It’s just for those who can either demand or pay for access to the required technologies – most importantly, the tagged photo databases. And while we can easily imagine how this might be misused in a totalitarian country, there are dangers in free societies as well. Without meaningful regulation, we’re moving into a world where governments and corporations will be able to identify people both in real time and backwards in time, remotely and in secret, without consent or recourse. Despite protests from industry, we need to regulate this budding industry. We need limitations on how our images can be collected without our knowledge or consent, and on how they can be used. The technologies aren’t going away, and we can’t uninvent these capabilities. But we can ensure that they’re used ethically and responsibly, and not just as a mechanism to increase police and corporate power over us.

(tags: privacy regulation surveillance bruce-schneier faces face-recognition machine-learning ai cctv photos)

Links for 2015-10-05

Published October 5, 2015

In China, Your Credit Score Is Now Affected By Your Political Opinions – And Your Friends’ Political Opinions

China just introduced a universal credit score, where everybody is measured as a number between 350 and 950. But this credit score isn’t just affected by how well you manage credit – it also reflects how well your political opinions are in line with Chinese official opinions, and whether your friends’ are, too.
Measuring using online mass surveillance, naturally. This may be the most dystopian thing I’ve heard in a while….
(tags: via:raycorrigan dystopia china privacy mass-surveillance politics credit credit-score loans opinions)
Brand New Retro – The Book, November 2015

YESSSS. Joe and Brian have delivered — going to be giving a lot of copies of this for xmas ;)

(tags: brand-new-retro blogs friends retro history dublin ireland books toget)
Google Cloud Shell

your command line environment in the [Google] Cloud. This feature enables you to connect to a shell environment on a virtual machine, pre-loaded with the tools you need to easily run commands to develop, deploy and manage your projects. Currently, Cloud Shell is an f1-micro Google Compute Engine machine that exposes a Debian-based development environment. You are also assigned 5 GB of standard persistent disk space as the home disk so you can store files between sessions.
It’s also free. This is a great idea, handy both for beginners getting to grips with GoogCloud and for experts looking for a quick dev env to hack with. I wish AWS had something similar.
(tags: google cloud shell google-cloud gcs gce cli tools)
Amaro: A Bittersweet Obsession – Food & Wine

“A Neapolitan-American friend of mine, who’s in his mid-fifties, fondly remembers how his mother used to serve him an espresso with Fernet Branca and an egg yolk every morning before he went off to elementary school.”

(tags: amari amaro bitters digestifs booze cocktails recipes)

Links for 2015-10-02

Published October 2, 2015

Dash Wallets

come recommended by http://gearmoose.com/the-ten-best-minimalist-wallets-a-recap/ , looks pretty nice

(tags: wallets minimalism daily-carry pockets slimline gear toget)
Notes on Startup Engineering Management for Young Bloods

Below is a list of some lessons I’ve learned as a startup engineering manager that are worth being told to a new manager. Some are subtle, and some are surprising, and this being human beings, some are inevitably controversial. This list is for the new head of engineering to guide their thinking about the job they are taking on. It’s not comprehensive, but it’s a good beginning. The best characteristic of this list is that it focuses on social problems with little discussion of technical problems a manager may run into. The social stuff is usually the hardest part of any software developer’s job, and of course this goes triply for engineering managers.

(tags: engineering management camille-fournier teams dev)
Further reading on just culture and blameless post mortems

Some bookmarks around post-mortem activity

(tags: post-mortems culture etsy rafe-colburn rc3 john-allspaw ops coes)

Links for 2015-10-01

Published October 1, 2015

Han Sung: Probably the Best Korean Food in Dublin

Han Sung is bizarrely located in the back of an Asian supermarket just off the Millennium Walk on Great Strand Street. […] You’d see this a lot in Korea, I ask, a restaurant in the back of a supermarket? Not really, no, he says.

(tags: restaurants food eating dublin supermarkets korean nom)
Behold: The Ultimate Crowdsourced Map of Punny Businesses in America | Atlas Obscura

“Spex in the City”, “Fidler on the Tooth”, “Sight For Four Eyes”, “Fried Egg I’m In Love”, “Lice Knowing You” and many more

(tags: business humor map geography usa puns)

Links for 2015-09-30

Published September 30, 2015

SQL on Kafka using PipelineDB

this is quite nice. PipelineDB allows direct hookup of a Kafka stream, and will ingest durably and reliably, and provide SQL views computed over a sliding window of the stream.
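For anyone unfamiliar with the sliding-window idea, here is a rough Python sketch of a count over the last N seconds of a stream. It is not PipelineDB's SQL syntax or implementation; the class name and methods are made up for illustration, and it assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen in the last `window_seconds` seconds.

    Hypothetical helper for illustration; assumes timestamps are
    added in non-decreasing order.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp):
        # Record the event, then drop anything that has aged out.
        self._timestamps.append(timestamp)
        self._expire(timestamp)

    def count(self, now):
        # The "view" is always computed over the current window only.
        self._expire(now)
        return len(self._timestamps)

    def _expire(self, now):
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
```

A continuous view in a streaming database does roughly this bookkeeping for you, durably and per query.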

(tags: logging sql kafka pipelinedb streaming sliding-window databases search querying)
the impact of the economic crisis on public funding for universities in Europe

Ireland leading the pack with a drop of funding by 20% :(

(tags: universities ireland ucd tcd dcu funding public-funding europe history downturn)
CurrencyFair P2P International Money Transfers

recommended by Paul Hickey

(tags: via:phickey money money-transfer currency currency-conversion tools recommendations)
How the banks ignored the lessons of the crash

First of all, banks could be chopped up into units that can safely go bust – meaning they could never blackmail us again. Banks should not have multiple activities going on under one roof with inherent conflicts of interest. Banks should not be allowed to build, sell or own overly complex financial products – clients should be able to comprehend what they buy and investors understand the balance sheet. Finally, the penalty should land on the same head as the bonus, meaning nobody should have more reason to lie awake at night worrying over the risks to the bank’s capital or reputation than the bankers themselves. You might expect all major political parties to have come out by now with their vision of a stable and productive financial sector. But this is not what has happened.

(tags: banks banking guardian finance europe eu crash history)
The price of the Internet of Things will be a vague dread of a malicious world

So the fact is that our experience of the world will increasingly come to reflect our experience of our computers and of the internet itself (not surprisingly, as it’ll be infused with both). Just as any user feels their computer to be a fairly unpredictable device full of programs they’ve never installed doing unknown things to which they’ve never agreed to benefit companies they’ve never heard of, inefficiently at best and actively malignant at worst (but how would you know?), cars, street lights, and even buildings will behave in the same vaguely suspicious way. Is your self-driving car deliberately slowing down to give priority to the higher-priced models? Is your green A/C really less efficient with a thermostat from a different company, or it’s just not trying as hard? And your tv is supposed to only use its camera to follow your gestural commands, but it’s a bit suspicious how it always offers Disney downloads when your children are sitting in front of it. None of those things are likely to be legal, but they are going to be profitable, and, with objects working actively to hide them from the government, not to mention from you, they’ll be hard to catch.

(tags: culture bots criticism ieet iot internet-of-things law regulation open-source appliances)
excellent offline mapping app MAPS.ME goes open source

“MAPS.ME is an open source cross-platform offline maps application, built on top of crowd-sourced OpenStreetMap data. It was publicly released for iOS and Android.”

(tags: maps.me mapping maps open-source apache ios android mobile)

Links for 2015-09-29

Published September 29, 2015

Eircode cost the Irish government EUR38m

The C&AG has said it is not clear that the €38m scheme will achieve the data-matching benefits the Government had hoped.
Well, that’s putting it mildly.
(tags: eircode fail ireland costs money geo mapping geocoding)
Let a 1,000 flowers bloom. Then rip 999 of them out by the roots

The Twitter tech-debt story.
Somewhere along the way someone decided that it would be easier to convert the Birdcage to use Pants which had since learned how to build Scala and to deal with a maven-style layout. However at some point prior Pants had been open sourced in throw it over the wall fashion and picked up by a few engineers at other companies, such as Square and Foursquare and moved forward. In the meantime, again because there weren’t enough people whose job it was to take care of these things, Science was still on the original internally developed version and had in fact evolved independently of the open source version. However by the time we wanted to move Birdcage onto Pants, the open source version had moved ahead so that’s the one the Birdcage folks chose.
(cries)
(tags: tech-debt management twitter productivity engineering monorepo build-systems war-stories dev)
Cabel Sasser on Twitter: “From a “cash for phones” ATM in the mall (in maintenance mode): @daveaddey finds the most amazing UI ever created. http://t.co/0qKg68wHjQ”

Amazing. This is what happens when embedded software engineers make a UI, in my experience

(tags: embedded-software ui ux design graphics windows the-horror omgwtf atms)

Links for 2015-09-28

Published September 28, 2015

EPA opposed rules that would have exposed VW’s cheating

[…] Two months ago, the EPA opposed some proposed measures that would help potentially expose subversive code like the so-called “defeat device” software VW allegedly used by allowing consumers and researchers to legally reverse-engineer the code used in vehicles. EPA opposed this, ironically, because the agency felt that allowing people to examine the software code in vehicles would potentially allow car owners to alter the software in ways that would produce more emissions in violation of the Clean Air Act. The issue involves the 1998 Digital Millennium Copyright Act (DMCA), which prohibits anyone from working around “technological protection measures” that limit access to copyrighted works. The Library of Congress, which oversees copyrights, can issue exemptions to those prohibitions that would make it legal, for example, for researchers to examine the code to uncover security vulnerabilities.

(tags: dmca volkswagen vw law code open-source air-quality diesel cheating regulation us-politics)
From Radio to Porn, British Spies Track Web Users’ Online Identities

Inside KARMA POLICE, GCHQ’s mass-surveillance operation aimed to record the browsing habits of “every visible user on the internet”, including UK-to-UK internal traffic. more details on the other GCHQ mass surveillance projects at https://theintercept.com/gchq-appendix/

(tags: surveillance gchq security privacy law uk ireland karma-police snooping)
Streaming will soon pass traditional TV – Tech Insider

the percentage of people who say they stream video from services like Netflix, YouTube, and Hulu each day has increased dramatically over the last five years, from about 30% in 2010 to more than 50% this year. During the same period, the percentage of people who say they watch traditional TV […] has dropped by about 10%. When the beige line surpasses the purple line [looks like 2016], it will mean that more people are streaming each day than are watching traditional TV.

(tags: streaming hulu netflix tv television video youtube)
Is there a CAP theorem for Durability?

Marc Brooker with another thought-provoking blogpost

(tags: databases storage marc-brooker cap-theorem cap durability pacelc nosql)
ASCII art to PNG converter

(via Aman)

(tags: via:akohli graphics ascii-art ascii visualization text boxes diagrams)
Scale it to Billions — What They Don’t Tell you in the Cassandra README

large-scale C* tips

(tags: cassandra configuration tuning scale ops)
Introduction to HDFS Erasure Coding in Apache Hadoop

How Hadoop did EC. Erasure Coding support (“HDFS-EC”) is set to be released in Hadoop 3.0 apparently
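To get a feel for what erasure coding buys you, here is a toy Python sketch using a single XOR parity block. This is the simplest possible erasure code and is much weaker than the Reed-Solomon codes named in the tags; it is not how HDFS-EC is implemented, and the function names are invented for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index by XORing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

Three data blocks plus one parity block survive the loss of any single block at 1.33x storage overhead, versus 3x for triple replication. Tolerating more than one simultaneous loss needs a stronger code.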

(tags: erasure-coding reed-solomon algorithms hadoop hdfs cloudera raid storage)
Chaos Engineering Upgraded

some details on Netflix’s Chaos Monkey, Chaos Kong and other aspects of their availability/failover testing

(tags: architecture aws netflix ops chaos-monkey chaos-kong testing availability failover ha)
traefik

Træfɪk is a modern HTTP reverse proxy and load balancer made to deploy microservices with ease. It supports several backends (Docker, Mesos/Marathon, Consul, Etcd, Rest API, file…) to manage its configuration automatically and dynamically.
Hot-reloading is notably much easier than with nginx/haproxy.
(tags: proxy http proxying reverse-proxy traefik go ops)
muxy

a proxy that mucks with your system and application context, operating at Layers 4 and 7, allowing you to simulate common failure scenarios from the perspective of an application under test; such as an API or a web application. If you are building a distributed system, Muxy can help you test your resilience and fault tolerance patterns.

(tags: proxy distributed testing web http fault-tolerance failure injection tcp delay resilience error-handling)
Petabyte-Scale Data Pipelines with Docker, Luigi and Elastic Spot Instances — AdRoll

nice approach

(tags: data-pipelines docker luigi containers workflow)

Links for 2015-09-24

Published September 24, 2015

Byteman

a tool which simplifies tracing and testing of Java programs. Byteman allows you to insert extra Java code into your application, either as it is loaded during JVM startup or even after it has already started running. The injected code is allowed to access any of your data and call any application methods, including where they are private. You can inject code almost anywhere you want and there is no need to prepare the original source code in advance nor do you have to recompile, repackage or redeploy your application. In fact you can remove injected code and reinstall different code while the application continues to execute. The simplest use of Byteman is to install code which traces what your application is doing. This can be used for monitoring or debugging live deployments as well as for instrumenting code under test so that you can be sure it has operated correctly. By injecting code at very specific locations you can avoid the overheads which often arise when you switch on debug or product trace. Also, you decide what to trace when you run your application rather than when you write it so you don’t need 100% hindsight to be able to obtain the information you need.

(tags: tracing java byteman injection jvm ops debugging testing)
Henry Robinson on testing and fault discovery in distributed systems

‘Let’s talk about finding bugs in distributed systems for a bit. These chaos monkey-style fault testing systems are all well and good, but by being application independent they’re a very blunt instrument. Particularly they make it hard to search the fault space for bugs in a directed manner, because they don’t ‘know’ what the system is doing. Application-aware scripting of faults in a dist. systems seems to be rarely used, but allows you to directly stress problem areas. For example, if a bug manifests itself only when one RPC returns after some timeout, hard to narrow that down with iptables manipulation. But allow a script to hook into RPC invocations (and other trace points, like DTrace’s probes), and you can script very specific faults. That way you can simulate cross-system integration failures, *and* write reproducible tests for the bugs they expose! Anyhow, I’ve been doing this in Impala, and it’s been very helpful. Haven’t seen much evidence elsewhere.’
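A minimal Python sketch of what application-aware fault scripting could look like: RPC entry points are wrapped as trace points, and a test script attaches a fault to one specific RPC by name. Everything here (FaultInjector, the rpc decorator, get_status) is hypothetical and only illustrates the idea; it is not Impala's mechanism.

```python
import functools


class FaultInjector:
    """Registry of scripted faults keyed by RPC name (hypothetical helper)."""

    def __init__(self):
        self._faults = {}

    def inject(self, rpc_name, fault):
        # `fault` is any callable; it may sleep, raise, or mutate state.
        self._faults[rpc_name] = fault

    def clear(self):
        self._faults.clear()

    def before_call(self, rpc_name):
        fault = self._faults.get(rpc_name)
        if fault is not None:
            fault()


injector = FaultInjector()


def rpc(name):
    """Decorator that turns a function into a named RPC trace point."""
    def wrap(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            injector.before_call(name)
            return fn(*args, **kwargs)
        return wrapper
    return wrap


@rpc("get_status")
def get_status(node):
    # Stand-in for a real remote call.
    return f"{node}: ok"


def raise_timeout():
    raise TimeoutError("injected: get_status timed out")
```

A test can then call `injector.inject("get_status", raise_timeout)`, exercise the code path that depends on that RPC, and `injector.clear()` afterwards, which makes the failure reproducible in a way that iptables fiddling is not.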

(tags: henry-robinson testing fault-discovery rpc dtrace tracing distributed-systems timeouts chaos-monkey impala)
The Best Bourbon Cocktail You’ve Never Heard Of

The “Paper Plane”, by Sam Ross of Chicago’s “Violet Hour”: .75 oz Bourbon .75 oz Aperol .75 oz Amaro Nonino .75 oz Fresh lemon juice ice-filled shaker, shake, strain.

(tags: bourbon drinks cocktails recipes aperol amaro-nonino lemon)
Seastar

C++ high-performance app framework; ‘currently focused on high-throughput, low-latency I/O intensive applications.’ Scylla (Cassandra-compatible NoSQL store) is written in this.

(tags: c++ opensource performance framework scylla seastar latency linux shared-nothing multicore)

Links for 2015-09-23

Published September 23, 2015

How VW tricked the EPA’s emissions testing system

In July 2015, CARB did some follow up testing and again the cars failed—the scrubber technology was present, but off most of the time. How this happened is pretty neat. Michigan’s Stefanopolou says computer sensors monitored the steering column. Under normal driving conditions, the column oscillates as the driver negotiates turns. But during emissions testing, the wheels of the car move, but the steering wheel doesn’t. That seems to have been the signal for the “defeat device” to turn the catalytic scrubber up to full power, allowing the car to pass the test. Stefanopolou believes the emissions testing trick that VW used probably isn’t widespread in the automotive industry. Carmakers just don’t have many diesels on the road. And now that number may go down even more.
Depressing stuff — but at least they think VW’s fraud wasn’t widespread.
(tags: fraud volkswagen vw diesel emissions air-quality epa carb catalytic-converters testing)
EU court adviser: data-share deal with U.S. is invalid | Reuters

The Safe Harbor agreement does not do enough to protect EU citizens’ private information when it reaches the United States, Yves Bot, Advocate General at the European Court of Justice (ECJ), said. While his opinions are not binding, they tend to be followed by the court’s judges, who are currently considering a complaint about the system in the wake of revelations from ex-National Security Agency contractor Edward Snowden of mass U.S. government surveillance.

(tags: safe-harbor law eu ec ecj snowden surveillance privacy us data max-schrems)
Summary of the Amazon DynamoDB Service Disruption and Related Impacts in the US-East Region

Painful to read, but: tl;dr: monitoring oversight, followed by a transient network glitch triggering IPC timeouts, which increased load due to lack of circuit breakers, creating a cascading failure
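Since the missing piece here was circuit breakers, a minimal Python sketch of the pattern follows. It is a generic illustration, not anything from the AWS postmortem; the class, its parameters and the half-open behaviour are my own assumptions about a reasonable minimal design.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are being rejected."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Fail fast instead of piling more load on a sick dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-off elapsed: allow one trial call (half-open state).
            # A single further failure will re-open the circuit.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result
```

The point is that once a dependency starts timing out, callers stop retrying into it for a while, so a transient glitch does not snowball into extra load.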

(tags: aws postmortem outages dynamodb ec2 post-mortems circuit-breakers monitoring)
What Happens Next Will Amaze You

Maciej Ceglowski’s latest talk, on ads, the web, Silicon Valley and government:
‘I went to school with Bill. He’s a nice guy. But making him immortal is not going to make life better for anyone in my city. It will just exacerbate the rent crisis.’

(tags: talks slides funny ads advertising internet web privacy surveillance maciej silicon-valley)
Frame of Reference and Roaring Bitmaps

interesting performance-oriented algorithm tweak from Elastic/Lucene
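The basic frame-of-reference trick, sketched in Python: store a block's minimum once, then pack each value's offset from that minimum using only as many bits as the largest offset needs. This is my own simplified illustration with invented function names, not Lucene's code, and it assumes a non-empty block of integers.

```python
def for_encode(values):
    """Encode a non-empty block of ints as (base, bit_width, count, packed)."""
    base = min(values)
    offsets = [v - base for v in values]
    # Bits needed for the largest offset; 0 if all values are equal.
    bit_width = max(offsets).bit_length()
    packed = 0
    for i, off in enumerate(offsets):
        packed |= off << (i * bit_width)
    return base, bit_width, len(values), packed


def for_decode(base, bit_width, count, packed):
    """Invert for_encode, returning the original list of ints."""
    mask = (1 << bit_width) - 1
    return [base + ((packed >> (i * bit_width)) & mask) for i in range(count)]
```

For the block [1000, 1003, 1001, 1015] the offsets fit in 4 bits each, so the payload is 16 bits plus the base rather than four full-width integers.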

(tags: lucene elasticsearch performance optimization roaring-bitmaps bitmaps frame-of-reference integers algorithms)
Uber Goes Unconventional: Using Driver Phones as a Backup Datacenter – High Scalability

Initially I thought they were just tracking client state on the phone, but it actually sounds like they’re replicating other users’ state, too. Mad stuff! Must cost a fortune in additional data transfer costs…

(tags: scalability failover multi-dc uber replication state crdts)

Links for 2015-09-22

Published September 22, 2015

Brotli: a new compression algorithm for the internet from Google

While Zopfli is Deflate-compatible, Brotli is a whole new data format. This new format allows us to get 20–26% higher compression ratios over Zopfli. In our study ‘Comparison of Brotli, Deflate, Zopfli, LZMA, LZHAM and Bzip2 Compression Algorithms’ we show that Brotli is roughly as fast as zlib’s Deflate implementation. At the same time, it compresses slightly more densely than LZMA and bzip2 on the Canterbury corpus. The higher data density is achieved by a 2nd order context modeling, re-use of entropy codes, larger memory window of past data and joint distribution codes. Just like Zopfli, the new algorithm is named after Swiss bakery products. Brötli means ‘small bread’ in Swiss German.

(tags: brotli zopfli deflate gzip compression algorithms swiss google)

Links for 2015-09-21

Published September 21, 2015

Nelson recommends Ubiquiti

‘The key thing about Ubiquiti gear is the high quality radios and antennas. It just seems much more reliable than most consumer WiFi gear. Their airOS firmware is good too, it’s a bit complicated to set up but very capable and flexible. And in addition to normal 802.11n or 802.11ac they also have an optional proprietary TDMA protocol called airMax that’s designed for serving several long haul links from a single basestation. They’re mostly marketing to business customers but the equipment is sold retail and well documented for ordinary nerds to figure out.’

(tags: ubiquiti wifi wireless 802.11 via:nelson ethernet networking prosumer hardware wan)
httpry

a specialized packet sniffer designed for displaying and logging HTTP traffic. It is not intended to perform analysis itself, but to capture, parse, and log the traffic for later analysis. It can be run in real-time displaying the traffic as it is parsed, or as a daemon process that logs to an output file. It is written to be as lightweight and flexible as possible, so that it can be easily adaptable to different applications.
via Eoin Brazil
(tags: via:eoinbrazil httpry http networking tools ops testing tcpdump tracing)
ustwo Reimagines the In-Car Cluster

Designers behind the cult mobile game, Monument Valley, take on the legacy-bound in-car UI

(tags: ux ui cars driving safety ustwo monument-valley speed)
Little Drummer Boy Challenge

‘It’s very easy: So long as you don’t hear “The Little Drummer Boy,” you’re a contender. As soon as you hear it on the radio, on TV, in a store, wherever, you’re out.’

(tags: ldbc games funny xmas christmas music songs cheese)

Links for 2015-09-20

Published September 20, 2015

Geographically-accurate version of the London underground map

as Boing Boing says: ‘London’s subway system switched early to an abstract map (PDF), and it became a legendary work of design. It just published an internally-used geographic version of map (PDF), however, for the first time in a century—and it’s awesome.’

(tags: london maps mapping geography accuracy pdf subway underground)

Links for 2015-09-19

Published September 19, 2015

Critiki’s top 10 tiki bars in the world

not a one in Europe, of course! I need to hit up one of these sometime

(tags: tiki bars drinks polynesian midcentury trader-vic critiki)

Links for 2015-09-18

Published September 18, 2015

What is the fastest way to clone a git repository over a fast network connection? – Stack Overflow

“git bundle create” — neat trick

(tags: git distribution copying git-bundle cli)
Retina

a regex-based, Turing-complete programming language. Its main feature is taking some text via standard input and repeatedly applying regex operations to it (e.g. matching, splitting, and most of all replacing). Under the hood, it uses .NET’s regex engine, which means that both the .NET flavour and the ECMAScript flavour are available.
Reminiscent of sed(1); see http://codegolf.stackexchange.com/a/58166 for an example Retina program
(tags: retina regexps regexes regular-expressions coding hacks dot-net languages)
Time on multi-core, multi-socket servers

Nice update on the state of System.currentTimeMillis() and System.nanoTime() in javaland. Bottom line: both are non-monotonic nowadays:
The conclusion I’ve reached is that except for the special case of using nanoTime() in micro benchmarks, you may as well stick to currentTimeMillis() —knowing that it may sporadically jump forwards or backwards. Because if you switched to nanoTime(), you don’t get any monotonicity guarantees, it doesn’t relate to human time any more —and may be more likely to lead you into writing code which assumes a fast call with consistent, monotonic results.

(tags: java time monotonic sequencing nanotime timers jvm multicore distributed-computing)
Anatomy of a Modern Production Stack

Interesting post, but I think it falls into a common trap for the xoogler or ex-Amazonian: assuming that all the BigCo mod cons are required to operate, when some are luxuries that can be skipped for a few years to get some real products built

(tags: architecture ops stack docker containerization deployment containers rkt coreos prod monitoring xooglers)

Links for 2015-09-16

Published September 16, 2015

How We Use AWS Lambda for Rapidly Intensifying Workloads · CloudSploit

impressive — pretty much the entire workload is run from Lambda here

(tags: lambda aws ec2 autoscaling cloudsploit)
Introducing the Software Testing Cupcake (Anti-Pattern)

good post on the risks of overweighting towards manual testing rather than low-level automated tests (via Tony Byrne)

(tags: qa testing via:tonyjbyrne tests antipatterns dev)

Links for 2015-09-14

Published September 14, 2015

Kate Heddleston: How Our Engineering Environments Are Killing Diversity

‘[There are] several problem areas for [diversity in] engineering environments and ways to start fixing them. The problems we face aren’t devoid of solutions; there are a lot of things that companies, teams, and individuals can do to fix problems in their work environment. For the month of March, I will be posting detailed articles about the problem areas I will cover in my talk: argument cultures, feedback, promotions, employee on-boarding, benefits, safety, engineering process, and environment adaptation.’ via Baron Schwartz.

(tags: via:xaprb culture tech diversity sexism feminism engineering work workplaces feedback)
Michael Kagan | Prints

‘Heavily tinted blue paintings form space stations, spacesuits, and rockets just after blast. Michael Kagan paints these large-scale works to celebrate the man-made object—machinery that both protects and holds the possibility of instantly killing those that operate the equipment from the inside. To paint the large works, Kagan utilizes an impasto technique with thick strokes that are deliberate and unique, showing an aggression in his application of oil paint on linen. The New York-based artist focuses on iconic images in his practice, switching back and forth between abstract and representational styles. “The painting is finished when it can fall apart and come back together depending on how it is read and the closeness to the work,” said Kagan about his work. “Each painting is an image, a snapshot, a flash moment, a quick read that is locked into memory by the iconic silhouettes.”’ Via http://www.thisiscolossal.com/2015/08/michael-kagens-space-paintings/

(tags: paintings prints art michael-kagan space abstract-art tobuy)
Dark corners of Unicode

I’m assuming, if you are on the Internet and reading kind of a nerdy blog, that you know what Unicode is. At the very least, you have a very general understanding of it — maybe “it’s what gives us emoji”. That’s about as far as most people’s understanding extends, in my experience, even among programmers. And that’s a tragedy, because Unicode has a lot of… ah, depth to it. Not to say that Unicode is a terrible disaster — more that human language is a terrible disaster, and anything with the lofty goals of representing all of it is going to have some wrinkles. So here is a collection of curiosities I’ve encountered in dealing with Unicode that you generally only find out about through experience. Enjoy.

(tags: unicode characters encoding emoji utf-8 utf-16 utf mysql text)

Links for 2015-09-09

Published September 9, 2015

httpbin(1): HTTP Client Testing Service

Testing an HTTP Library can become difficult sometimes. RequestBin is fantastic for testing POST requests, but doesn’t let you control the response. This exists to cover all kinds of HTTP scenarios. Additional endpoints are being considered.

(tags: http httpbin networking testing web coding hacks)

Links for 2015-09-08

Published September 8, 2015

The Pixel Factory

amazing slideshow/WebGL demo talking about graphics programming, its maths, and GPUs

(tags: maths graphics webgl demos coding algorithms slides tflops gpus)
‘I wish to register a complaint’: know your consumer rights before the fight

Conor Pope on the basics of consumer law — and how to complain — in Ireland

(tags: consumer ireland irish-times articles law)
Stormpot

an object pooling library for Java. Use it to recycle objects that are expensive to create. The library will take care of creating and destroying your objects in the background. Stormpot is very mature, is used in production, and has done over a trillion claim-release cycles in testing. It is faster and scales better than any competing pool.
Apache-licensed, and extremely fast: https://medium.com/@chrisvest/released-stormpot-2-4-eeab4aec86d0
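For the general shape of object pooling, here is a small Python sketch built on the standard library's thread-safe queue. It is not Stormpot's API (Stormpot is Java and allocates in a background thread); this toy version creates everything eagerly, and the ObjectPool name and claim method are hypothetical.

```python
import queue
from contextlib import contextmanager


class ObjectPool:
    """Fixed-size pool of reusable objects (illustrative only)."""

    def __init__(self, factory, size):
        self._items = queue.Queue(maxsize=size)
        # Pay the creation cost once, up front.
        for _ in range(size):
            self._items.put(factory())

    @contextmanager
    def claim(self, timeout=None):
        # Blocks until an object is free; raises queue.Empty on timeout.
        obj = self._items.get(timeout=timeout)
        try:
            yield obj
        finally:
            # Always hand the object back, even if the caller raised.
            self._items.put(obj)
```

Usage is `with pool.claim() as conn: ...`, so the release half of the claim-release cycle cannot be forgotten.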
(tags: java stormpot object-pooling object-pools pools allocation gc open-source apache performance)
Evolution of Babbel’s data pipeline on AWS: from SQS to Kinesis

Good “here’s how we found it” blog post:
Our new data pipeline with Kinesis in place allows us to plug new consumers without causing any damage to the current system, so it’s possible to rewrite all Queue Workers one by one and replace them with Kinesis Workers. In general, the transition to Kinesis was smooth and there were no particularly tricky parts. Another outcome was significantly reduced costs – handling almost the same amount of data as SQS, Kinesis appeared to be many times cheaper than SQS.

(tags: aws kinesis kafka streaming data-pipelines streams sqs queues architecture kcl)

Links for 2015-09-07

Published September 7, 2015

You’re probably wrong about caching

Excellent cut-out-and-keep guide to why you should avoid adding a caching layer. I’ve been following this practice for the past few years, after I realised that #6 (recovering from a failed cache is hard) is a killer: I’ve seen a few large-scale outages where a production system had gained enough scale that it required a cache to operate, and once that cache was damaged, bringing the system back online required a painful rewarming protocol. Better to design for the non-cached case if possible.

(tags: architecture caching coding design caches ops production scalability)
The Alternative Universe Of Soviet Arcade Games

Unlike machines in the West, every single machine that was produced during Soviet-era Russia had to align with Marxist ideology. […] The most popular games were created to teach hand-eye coordination, reaction speed, and logical, focused thinking. Not unlike many American games, these games were influenced by military training, crafted to teach and instill patriotism for the state by making the human body better, stronger, and more willful. It also means no high scores, no adrenaline rushes, or self-serving feather-fluffing as you add your hard-earned initials to the list of the best. In Communist Russia, there was no overt competition.

(tags: high-scores communism russia cccp ussr arcade-games games history)

Links for 2015-09-06

Published September 6, 2015

Large Java HashMap performance overview

Large HashMap overview: JDK, FastUtil, Goldman Sachs, HPPC, Koloboke, Trove – January 2015 version

(tags: java performance hashmap hashmaps optimization fastutil hppc jdk koloboke trove data-structures)

Links for 2015-09-04

Published September 4, 2015

what3emojis?

Is it too late to replace Eircode?
Addresses are hard. Who can remember street addresses or latitude/longitude pairs? You could do much better with three totally random English words, but then there’s that pesky language barrier. No system is perfect, except for emoji.

(tags: eircode maps parody via:nelson location geocoding mapping pile-of-poo)

Links for 2015-09-03

Published September 3, 2015

Real Time Analytics With Spark Streaming and Cassandra

…and Kafka

(tags: spark-streaming kafka analytics cassandra architecture data batch)
Improvements to Kafka integration of Spark Streaming

looks decent as an approach

(tags: kafka spark spark-streaming data)
Diffy: Testing services without writing tests

Play requests against 2 versions of a service. A fair bit more complex than simply replaying logged requests, which took 10 lines of a shell script last time I did it
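My understanding of where the extra complexity goes is noise filtering: the same request is also sent to a second instance of the known-good version, and any field that differs between the two good instances (timestamps, random IDs) is treated as noise. A rough Python sketch of that comparison follows; the function names are invented, it only looks at top-level fields of dict-shaped responses, and it is not Diffy's actual code.

```python
_MISSING = object()


def diff_fields(a, b):
    """Return the top-level keys whose values differ between two dicts."""
    return {
        k for k in a.keys() | b.keys()
        if a.get(k, _MISSING) != b.get(k, _MISSING)
    }


def regressions(primary, secondary, candidate):
    """Fields where the candidate differs from primary, minus known noise.

    primary and secondary are responses from two instances of the
    known-good version; candidate is the response from the new version.
    """
    noise = diff_fields(primary, secondary)
    raw = diff_fields(primary, candidate)
    return raw - noise
```

With a plain replay script you would have to hand-maintain that list of noisy fields yourself.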

(tags: http testing thrift automation twitter diffy diff soa tests)

Links for 2015-09-02

Published September 2, 2015

Gmail supports animated emoji in e-mail subjects

Currently only used in spam, naturally. (via Hilary Mason)

(tags: spam gmail animation gif base64 emojis goomojis)
Algorithmist

The Algorithmist is a resource dedicated to anything algorithms – from the practical realm, to the theoretical realm. There are also links and explanation to problemsets.
A wiki for algorithms. Not sure if this is likely to improve on Wikipedia, which of course covers the same subject matter quite well, though
(tags: algorithms reference wikis coding data-structures)
Spot Bid Advisor

analyzes Spot price history to help you determine a bid price that suits your needs.

(tags: ec2 aws spot spot-instances history)

Links for 2015-09-01

Published September 1, 2015

What Are the Worst Airports in the World?

this is a great resource when picking a stopover for a 2-stop flight. Pity “best kids play area” isn’t a criterion

(tags: airports comparison via:boingboing flying travel ranking world skytrax)
Using Samsung’s Internet-Enabled Refrigerator for Man-in-the-Middle Attacks

Whilst the fridge implements SSL, it FAILS to validate SSL certificates, thereby enabling man-in-the-middle attacks against most connections. This includes those made to Google’s servers to download Gmail calendar information for the on-screen display. So, MITM the victim’s fridge from next door, or on the road outside and you can potentially steal their Google credentials.
The Internet of Insecure Things strikes again.
(tags: iot security fridges samsung fail mitm ssl tls google papers defcon)
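The class of bug is easy to illustrate with Python's standard `ssl` module. The fridge's actual software isn't shown in the write-up quoted above, so this is only a sketch of the difference between a client that validates certificates and one that doesn't; the function names are hypothetical.

```python
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Client context that validates the server's certificate chain and hostname."""
    # create_default_context() enables CERT_REQUIRED and hostname checking.
    return ssl.create_default_context()


def make_insecure_context() -> ssl.SSLContext:
    """Client context that encrypts but accepts ANY certificate (MITM-able)."""
    ctx = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def validates_certificates(ctx: ssl.SSLContext) -> bool:
    """True only if the context checks both the chain and the hostname."""
    return ctx.verify_mode == ssl.CERT_REQUIRED and ctx.check_hostname
```

A connection made with the second context is still "SSL", which is exactly why this failure is easy to miss: the traffic is encrypted, just possibly to the attacker.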
Malware infecting jailbroken iPhones stole 225,000 Apple account logins | Ars Technica

KeyRaider, as the malware family has been dubbed, is distributed through a third-party repository of Cydia, which markets itself as an alternative to Apple’s official App Store. Malicious code surreptitiously included with Cydia apps is creating problems for people in China and at least 17 other countries, including France, Russia, Japan, and the UK. Not only has it pilfered account data for 225,941 Apple accounts, it has also disabled some infected phones until users pay a ransom, and it has made unauthorized charges against some victims’ accounts.
Ouch. Not a good sign for Cydia.
(tags: cydia apple security exploits jailbreaking ios iphone malware keyraider china)
GoTTY

‘a simple command line tool that turns your CLI tools into web applications’

(tags: cli terminal web tools unix)
S3QL

a file system that stores all its data online using storage services like Google Storage, Amazon S3, or OpenStack. S3QL effectively provides a hard disk of dynamic, infinite capacity that can be accessed from any computer with internet access running Linux, FreeBSD or OS-X. S3QL is a standard conforming, full featured UNIX file system that is conceptually indistinguishable from any local file system. Furthermore, S3QL has additional features like compression, encryption, data de-duplication, immutable trees and snapshotting which make it especially suitable for online backup and archival. S3QL is designed to favor simplicity and elegance over performance and feature-creep. Care has been taken to make the source code as readable and serviceable as possible. Solid error detection and error handling have been included from the very first line, and S3QL comes with extensive automated test cases for all its components.

(tags: filesystems aws s3 storage unix google-storage openstack)
