Links for 2015-08-31

Published August 31, 2015

3 Lessons From The Amazon Takedown - Fortune

They are: The leaders we admire aren’t always that admirable; Economic performance and costs trump employee well-being; and people participate in and rationalize their own subjugation. 'In the end, “Amazonians” are not that different from other people in their psychological dynamics. Their company is just a more extreme case of what many other organizations regularly do. And most importantly, let’s locate the problem, if there is one, and its solution where it most appropriately belongs—not with a CEO who is greatly admired (and wealthy beyond measure) running a highly admired company, but with a society where money trumps human well-being and where any price, maybe even lives, is paid for status and success.' (via Lean)

(tags: amazon work work-life-balance life us fortune via:ldoody ceos employment happiness)
What does it take to make Google work at scale? [slides]

50-slide summary of Google's stack, compared vs Facebook, Yahoo!, and open-source-land, with the odd interesting architectural insight

(tags: google architecture slides scalability bigtable spanner facebook gfs storage)
Scaling Analytics at Amplitude

Good blog post on Amplitude's lambda architecture setup, based on S3 and a custom "real-time set database" they wrote themselves. antirez' comment from a Redis angle on the set database: http://antirez.com/news/92 HN thread: https://news.ycombinator.com/item?id=10118413

(tags: lambda-architecture analytics via:hn redis set-storage storage databases architecture s3 realtime)

Links for 2015-08-28

Published August 28, 2015

toxy

toxy is a fully programmatic and hackable HTTP proxy to simulate server failure scenarios and unexpected network conditions. It was mainly designed for fuzzing/evil testing purposes, when toxy becomes particularly useful to cover fault tolerance and resiliency capabilities of a system, especially in service-oriented architectures, where toxy may act as intermediate proxy among services. toxy allows you to plug in poisons, optionally filtered by rules, which essentially can intercept and alter the HTTP flow as you need, performing multiple evil actions in the middle of that process, such as limiting the bandwidth, delaying TCP packets, injecting network jitter latency or replying with a custom error or status code.

(tags: toxy proxies proxy http mitm node.js soa network failures latency slowdown jitter bandwidth tcp)
Drone Oversight Is Coming to Construction Sites

Grim Meathook Future

(tags: grim-meathook-future drones work panopticon future sacramento building-sites)
grsecurity

Open source security team has had enough of embedded-systems vendors taking the piss with licensing:
This announcement is our public statement that we've had enough. Companies in the embedded industry not playing by the same rules as every other company using our software violates users' rights, misleads users and developers, and harms our ability to continue our work. Though I've only gone into depth in this announcement on the latest trademark violation against us, our experience with two GPL violations over the previous year have caused an incredible amount of frustration. These concerns are echoed by the complaints of many others about the treatment of the GPL by the embedded Linux industry in particular over many years. With that in mind, today's announcement is concerned with the future availability of our stable series of patches. We decided that it is unfair to our sponsors that the above mentioned unlawful players can get away with their activity. Therefore, two weeks from now, we will cease the public dissemination of the stable series and will make it available to sponsors only. The test series, unfit in our view for production use, will however continue to be available to the public to avoid impact to the Gentoo Hardened and Arch Linux communities. If this does not resolve the issue, despite strong indications that it will have a large impact, we may need to resort to a policy similar to Red Hat's, described here or eventually stop the stable series entirely as it will be an unsustainable development model.

(tags: culture gpl linux opensource security grsecurity via:nelson gentoo arch-linux gnu)
London Calling: Two-Factor Authentication Phishing From Iran

some rather rudimentary anti-2FA attempts, presumably from Iranian security services

(tags: authentication phishing security iran activism 2fa mfa)
Vegemite May Power The Electronics Of The Future

Professor Marc in het Panhuis at the ARC Centre of Excellence for Electromaterials Science figured out that you can 3D print the paste and use it to carry current, effectively creating Vegemite bio-wires. What does this mean? Soon you can run electricity through your food. “The iconic Australian Vegemite is ideal for 3D printing edible electronics,” said the professor. “It contains water so it’s not a solid and can easily be extruded using a 3D printer. Also, it’s salty, so it conducts electricity.”
I'm sure the same applies for Marmite...

(tags: vegemite marmite 3d-printing electronics bread food silly)
Beoir.org Community - Recent Attack on McGargles

bizarre conspiracy theory going around about McGargles microbrewery being owned by Molson in an "astroturf craft beer" operation -- they apparently were set up by a bunch of ex-Molson employees. Their beer is getting stickered in off-licenses. Mental!

(tags: beer craft-beer ireland mcgargles conspiracy-theories bizarre beoir)

Links for 2015-08-27

Published August 27, 2015

Mining High-Speed Data Streams: The Hoeffding Tree Algorithm

This paper proposes a decision tree learner for data streams, the Hoeffding Tree algorithm, which comes with the guarantee that the learned decision tree is asymptotically nearly identical to that of a non-incremental learner using infinitely many examples. This work constitutes a significant step in developing methodology suitable for modern ‘big data’ challenges and has initiated a lot of follow-up research. The Hoeffding Tree algorithm has been covered in various textbooks and is available in several public domain tools, including the WEKA Data Mining platform.
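
For context, the guarantee rests on the Hoeffding bound: after n independent observations of a variable with range R, the observed mean is within epsilon = sqrt(R^2 * ln(1/delta) / (2n)) of the true mean with probability 1 - delta. A minimal Python sketch of the resulting split test follows; the function names and parameters are illustrative assumptions, not taken from the paper or from WEKA.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Epsilon such that the true mean lies within epsilon of the observed
    mean with probability 1 - delta, after n independent observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split once the gap between the two best attributes exceeds epsilon
    (hypothetical helper; real learners also need a tie-breaking rule)."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

Because epsilon shrinks as n grows, a learner built this way can commit to a split after seeing only as many examples as the gap between the top two attributes requires.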

(tags: hoeffding-tree algorithms data-structures streaming streams cep decision-trees ml learning papers)
Chinese scammers are now using Stingray tech to SMS-phish

A Stingray-style false GSM base station, hidden in a backpack; presumably they detect numbers in the vicinity, and SMS-spam those numbers with phishing messages. Reportedly the scammers used this trick in "Guangzhou, Zhuhai, Shenzhen, Changsha, Wuhan, Zhengzhou and other densely populated cities". Dodgy machine translation:
March 26, Zhengzhou police telecommunications fraud cases together, for the first time seized a small backpack can hide pseudo station equipment, and arrested two suspects. Yesterday, the police informed of this case, to remind the general public to pay attention to prevention. “I am the landlord, I changed number, please rent my wife hit the bank card, card number ×××, username ××.” Recently, Jiefang Road, Zhengzhou City Public Security Bureau police station received a number of cases for investigation brigade area of the masses police said, frequently received similar phone scam messages. Alarm, the police investigators to determine: the suspect may be in the vicinity of twenty-seven square, large-scale use of mobile pseudo-base release fraudulent information. [...] Yesterday afternoon, the Jiefang Road police station, the reporter saw the portable pseudo-base is made up of two batteries, a set-top box the size of the antenna box and a chassis, as well as a pocket computer composed together at most 5 kg.
(via t byfield and Danny O'Brien)

(tags: via:mala via:tbyfield privacy scams phishing sms gsm stingray base-stations mobile china)

Links for 2015-08-26

Published August 26, 2015

In search of performance - how we shaved 200ms off every POST request — GoCardless Blog

tl;dr: don't use Ruby's Net::HTTP and/or HAProxy prior to 1.4.19

(tags: http ruby tcp nagle performance rtt networking haproxy ack curl)

Links for 2015-08-25

Published August 25, 2015

Non-Celiac Gluten Sensitivity May Not Exist

The data clearly indicated that a nocebo effect, the same reaction that prompts some people to get sick from wind turbines and wireless internet, was at work here. Patients reported gastrointestinal distress without any apparent physical cause. Gluten wasn't the culprit; the cause was likely psychological. Participants expected the diets to make them sick, and so they did.

(tags: gluten placebo nocebo food science health diet gluten-free fodmaps)
Sorting out graph processing

Some nice real-world experimentation around large-scale data processing in differential dataflow:
If you wanted to do an iterative graph computation like PageRank, it would literally be faster to sort the edges from scratch each and every iteration, than to use unsorted edges. If you want to do graph computation, please sort your edges. Actually, you know what: if you want to do any big data computation, please sort your records. Stop talking sass about how Hadoop sorts things it doesn't need to, read some papers, run some tests, and then sort your damned data. Or at least run faster than me when I sort your data for you.

(tags: algorithms graphs coding data-processing big-data differential-dataflow radix-sort sorting x-stream counting-sort pagerank)
Docker image creation, tagging and traceability in Shippable

this is starting to look quite impressive as a well-integrated Docker-meets-CI model; Shippable is basing its builds off Docker baselines and is automatically cutting Docker images of the post-CI stage. Must take another look

(tags: shippable docker ci ops dev continuous-integration)

Links for 2015-08-24

Published August 24, 2015

Miller

'like sed, awk, cut, join, and sort for name-indexed data such as CSV'
Written in "modern C" with zero runtime dependencies. Looks great

(tags: cli csv unix miller tsv data tools)

Links for 2015-08-23

Published August 23, 2015

Analysis of PS4's security and the state of hacking

FreeBSD jails and Return-Oriented Programming:
Think of [Return-Oriented Programming] as writing a new chapter to a book, using only words that have appeared at the end of sentences in the previous chapters.

(tags: ps4 freebsd jails security exploits hacking sony rop return-oriented-programming)
10 Lesser-Known Cocktails You Should Be Drinking

like the sound of some of these

(tags: cocktails drinks recipes booze)
My wife found my email in the Ashley Madison database

On misdirected emails and the potential side-effects:
The reasons why these people give out my email instead of one that they can access have always been a bit mysterious to me. It’s one thing to save yourself some spam by using a throwaway address. But why use someone else’s for correspondence you actually want to receive? The closest I’ve come to a working theory is that a lot of them, having been slow off the mark to obtain their own gmail, have addresses like eratliff75@gmail.com. Either they believe they can leave off the numbers and receive the messages anyway, or they often simply forget. That or the E. Ratliffs of the world just view eratliff@gmail.com as some kind of shared resource.

(tags: email mail ashley-madison gmail mistakes misdirected-email)

Links for 2015-08-22

Published August 22, 2015

How to Make Raspberry-Thyme Shrub

looks tasty/non-tricky

(tags: shrubs raspberry thyme drinks vinegar nom)
How gaming terminology is part of modern mainstream Chinese slang

A few years ago, my mom called to ask for my advice on webcams. She explained (in the English-peppered Chinese that's the official language of our Chinese-American household) that some of her friends had started sharing videos of themselves singing karaoke. She thought she could do better. "?????PK??," she remarked: "I want to PK them a little."

(tags: china language gaming pk)

Links for 2015-08-21

Published August 21, 2015

sogeti-esec-lab/HomePlugPWN

Powerline networking is vulnerable to sniffing and brute-force attacks. See also http://www.nosuchcon.org/talks/2014/D1_03_Sebastien_Dudek_HomePlugAV_PLC.pdf

(tags: powerline-networking power networking han home exploits security qualcomm homeplug plcs)

Links for 2015-08-19

Published August 19, 2015

buildfarm_deployment/cleanup_docker_images.py

Clean up old/obsolete Docker images in a repo.

(tags: disk-space ops docker cleanup cron)
Call me Maybe: Chronos

Chronos (the Mesos distributed scheduler) comes out looking pretty crappy here

(tags: aphyr mesos chronos cron scheduling outages ops jepsen testing partitions cap)
Kubernetes and AWS VPC Peering – Ben Straub

the perils of overloading 10/8
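
A generic illustration of the underlying problem (not code from the linked post): two networks can only be routed to each other unambiguously if their address ranges do not intersect. The sketch below uses Python's standard `ipaddress` module; the address plan is a made-up example.

```python
import ipaddress


def overlapping_pairs(cidrs):
    """Return every pair of CIDR blocks whose address ranges intersect."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [(str(a), str(b))
            for i, a in enumerate(nets)
            for b in nets[i + 1:]
            if a.overlaps(b)]


# Hypothetical address plan: a VPC carved out of 10/8, a cluster-internal
# range that claims all of 10/8, and an unrelated private range.
plan = ["10.0.0.0/16", "10.0.0.0/8", "192.168.0.0/24"]
conflicts = overlapping_pairs(plan)
```

Running a check like this over every range in use before peering is a cheap way to catch a collision early.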

(tags: 10/8 ec2 aws vpc kubernetes ops internet ip-addresses)
How your entire financial life will be stored in a new 'digital vault' - Telegraph

In a move to make it easier to open bank accounts and Isas, people will be asked to share all of their accounts, tax records and personal details with a central service. To check someone's identity, a company would then ask potential customers a series of questions and check the answers against the information in the vault. The checks would replace the current system in which new customers must send by post copies of their passports, cross-signed by a friend, along with bank statements and utility bills.
hahahaha NO FUCKING WAY.

(tags: bills banking uk tax privacy digital-vault accounts authentication identity-theft bad-ideas)

Links for 2015-08-18

Published August 18, 2015

Someone discovered that the Facebook iOS application is composed of over 18,000 classes. : programming

_FBGraphQLConnectionStorePersistentPageLoaderOperationDelegate-Protocol.h _FBReactionAcornSportsContentSettingsSetShouldNotPushNotificationsMutationCall.h FBBoostedComponentCreateInputDataCreativeObjectStorySpecLinkDataCallToActionValue.h FBEventUpdateNotificationSubscriptionLevelMutationOptimisticPayloadFactoryProtocol-Protocol.h
I just threw up a little. See also https://www.facebook.com/notes/facebook-engineering/under-the-hood-dalvik-patch-for-facebook-for-android/10151345597798920 , in which the FB Android devs happily reveal that they hot-patch the Dalvik VM at runtime to work around a limit -- rather than refactoring their app.

(tags: facebook horrors coding ios android dalvik hot-patching apps)
Food Blogger Mehreen And Anges De Sucre's Patisserie Owner Reshmi Bennett In Online War Over #BloggerBlackmail

I can't believe this is the state of food blogging in the UK and Ireland: full-on payola for reviews. See also @damienmulley's excellent rant on the subject in this country: https://twitter.com/damienmulley/status/633353368757497858 -- there are even rate cards for positive review tweets/posts/facebook updates etc.

(tags: food blogging restaurants uk bakeries reviews payola blogger-blackmail pr)
The reusable holdout: Preserving validity in adaptive data analysis

Useful stats hack from Google: "We show how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses."

(tags: statistics google reusable-holdout training ml machine-learning data-analysis holdout corpus sampling)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)

Extremely authoritative slide deck on building a recommendation system, from Xavier Amatriain, Research/Engineering Manager at Netflix

(tags: netflix recommendations recommenders ml machine-learning cmu clustering algorithms)
rwasa

our full-featured, high performance, scalable web server designed to compete with the likes of nginx. It has been built from the ground-up with no external library dependencies entirely in x86_64 assembly language, and is the result of many years' experience with high volume web environments. In addition to all of the common things you'd expect a modern web server to do, we also include assembly language function hooks ready-made to facilitate Rapid Web Application Server (in Assembler) development.

(tags: assembly http performance https ssl x86_64 web ops rwasa tls)

Links for 2015-08-17

Published August 17, 2015

The world beyond batch: Streaming 101 - O'Reilly Media

To summarize, in this post I’ve: Clarified terminology, specifically narrowing the definition of “streaming” to apply to execution engines only, while using more descriptive terms like unbounded data and approximate/speculative results for distinct concepts often categorized under the “streaming” umbrella. Assessed the relative capabilities of well-designed batch and streaming systems, positing that streaming is in fact a strict superset of batch, and that notions like the Lambda Architecture, which are predicated on streaming being inferior to batch, are destined for retirement as streaming systems mature. Proposed two high-level concepts necessary for streaming systems to both catch up to and ultimately surpass batch, those being correctness and tools for reasoning about time, respectively. Established the important differences between event time and processing time, characterized the difficulties those differences impose when analyzing data in the context of when they occurred, and proposed a shift in approach away from notions of completeness and toward simply adapting to changes in data over time. Looked at the major data processing approaches in common use today for bounded and unbounded data, via both batch and streaming engines, roughly categorizing the unbounded approaches into: time-agnostic, approximation, windowing by processing time, and windowing by event time.
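
To make the event-time versus processing-time distinction concrete, here is a minimal Python sketch of fixed (tumbling) windows keyed by event time. It assumes integer timestamps in seconds and an in-memory list of events; the names are illustrative and do not come from the post or from any streaming engine.

```python
from collections import defaultdict


def window_by_event_time(events, window_seconds):
    """Group (event_time, value) pairs into fixed windows keyed by window start.

    Arrival order is ignored: an event lands in the window its own
    timestamp belongs to, however late it shows up.
    """
    windows = defaultdict(list)
    for event_time, value in events:
        start = event_time - (event_time % window_seconds)
        windows[start].append(value)
    return dict(windows)


# Events listed in arrival (processing) order; "b" arrives late.
arrivals = [(61, "c"), (5, "a"), (130, "d"), (59, "b")]
by_event_time = window_by_event_time(arrivals, 60)
```

Windowing the same list by arrival order would put "b" alongside "d". A real engine additionally has to decide when a window may be emitted, which this sketch leaves out.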

(tags: streaming batch big-data lambda-architecture dataflow event-processing cep millwheel data data-processing)
What the hell is going on with SoundCloud?

tl;dr: major labels.
Despite having revenue coming in from ads and subscriptions, SoundCloud still relies on outside investment. While the company received $150 million in a funding round at the end of last year, it pales next to the reported $526 million Spotify gained in June, and if one report is to be believed, SoundCloud is running very low on cash. Furthermore, sources suggest that potential investors are waiting to see what happens with Sony and Universal before ploughing in more money. With the high sums reported to be involved, it’s a stalemate that could potentially break the company whether it decides to pay or not.

(tags: soundcloud music mp3 copyright sony universal spotify funding startups)
GSMem: Data Exfiltration from Air-Gapped Computers over GSM Frequencies

Holy shit.
Air-gapped networks are isolated, separated both logically and physically from public networks. Although the feasibility of invading such systems has been demonstrated in recent years, exfiltration of data from air-gapped networks is still a challenging task. In this paper we present GSMem, a malware that can exfiltrate data through an air-gap over cellular frequencies. Rogue software on an infected target computer modulates and transmits electromagnetic signals at cellular frequencies by invoking specific memory-related instructions and utilizing the multichannel memory architecture to amplify the transmission. Furthermore, we show that the transmitted signals can be received and demodulated by a rootkit placed in the baseband firmware of a nearby cellular phone.

(tags: gsmem gsm exfiltration air-gaps memory radio mobile-phones security papers)

Links for 2015-08-16

Published August 16, 2015

An Amazonian’s response to “Inside Amazon: Wrestling Big Ideas in a Bruising Workplace” — Medium

excellent response to the NYT hatchet job

(tags: amazon work life nytimes journalism workplace)
Sweary Australian Mountains

This is great. Featuring Mount Buggery:
There were no tracks of any sort until they reached Mt Howitt and Stewart, perhaps not quite as fit as he could have been, was finding the going tough after the descent from Mt Speculation. Faced with the prospect of yet another laborious climb he exploded with the words 'What another bugger! I'll call this mountain Mt Buggery.'
and Mount Arsehole:
"We always called it Mt Arsehole... Then they came along with all their fancy bloody maps and ideas. Changed it to Mt Arthur. Christ knows why. Bastard of a place anyway!"

(tags: swearing australia mount-buggery mount-arsehole nsw victoria places history names mountains)
minimaxir/big-list-of-naughty-strings

Late to this one -- a nice list of bad input (Unicode zero-width spaces, etc) for testing

(tags: testing strings text data unicode utf-8 tests input corrupt)

Links for 2015-08-14

Published August 14, 2015

Preventing Dependency Chain Attacks in Maven

using a whitelist of allowed dependency JARs and their SHAs
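
The general idea can be sketched in a few lines of Python. This is not the Maven mechanism from the linked post; the hash algorithm (SHA-256), the artifact name, and the helper names are all assumptions made for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_allowed(artifact_name: str, data: bytes, whitelist: dict) -> bool:
    """True only if the artifact is listed and its digest matches."""
    expected = whitelist.get(artifact_name)
    return expected is not None and sha256_hex(data) == expected


# Hypothetical whitelist; in a real build the bytes would be read from the JAR.
jar_bytes = b"pretend this is a JAR"
whitelist = {"example-lib-1.0.jar": sha256_hex(jar_bytes)}
```

An unlisted artifact and a listed artifact with altered contents are both rejected, which is the property that blocks a swapped-in dependency.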

(tags: security whitelisting dependencies coding jar maven java jvm)

Links for 2015-08-12

Published August 12, 2015

The Travis CI Blog: Making Travis CI a Family-Friendly Place to Work: Our Maternity and Paternity Leave for US Employees

This is excellent -- I wish more companies took this attitude. Applause for Travis CI.
after a couple of weeks of research, we made a decision to offer our expectant mothers AND fathers: 2 weeks before the due date paid at 100% (optional, but recommended); 20 weeks for normal births paid at 100%; 24 weeks for births with complications paid at 100%; Flexible working hours after the 20/24 weeks are complete (part-time arrangements can be made); Your job will be here for you when you return. When we relayed this information to the two US employees, one became a little teary because her last employer (a much bigger and older company), didn't offer anything. This being her second child, it was a huge relief to know she was going to have paid time off with flexibility upon return. While it was a great reaction, it shouldn't happen this way. If you value your employees, you should value their need for time away. At the same time, if you want to hire someone, whether or not they are already pregnant should be irrelevant.
Well exceeding even the Irish maternity leave entitlements, since it covers fathers too. And this is a startup!

(tags: travisci startups work life family kids paternity-leave maternity-leave)
Improving The Weather On Twitter

lovely open-source dataviz improvement for near-term historical rainfall-radar images

(tags: dataviz weather rain rainfall radar nws twitter bots graphics ui)
Somewhere Over the Rainbow: How to Make Effective Use of Colors in Meteorological Visualizations

Linked from the "Improving the Weather On Twitter" post -- choosing the "best" colour scheme for meteorological visualization. Great dataviz resource post

(tags: dataviz colour color meteorological weather nws papers rgb hcl)

Links for 2015-08-11

Published August 11, 2015

Reddit comments from a nuclear-power expert

Reddit user "Hiddencamper" is a senior nuclear reactor operator in the US, and regularly posts very knowledgeable comments about reactor operations, safety procedures, and other details. It's fascinating (via Maciej)

(tags: via:maciej nuclear-power nuclear atomic power energy safety procedures operations history chernobyl scram)
Amazon EC2 2015 Benchmark: Testing Speeds Between AWS EC2 and S3 Regions

Here we are again, a year later, and still no bloody percentiles! Just amateurish averaging. This is not how you measure anything, ffs. Still, better than nothing I suppose
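
A small Python sketch of why averages mislead for latency. The numbers are made up, and the nearest-rank percentile helper is an illustrative assumption, not anything from the benchmark.

```python
import statistics


def percentile(samples, p):
    """Nearest-rank percentile; p is an integer from 1 to 100."""
    ordered = sorted(samples)
    rank = -(-p * len(ordered) // 100)  # ceil(p * n / 100) without floats
    return ordered[max(rank, 1) - 1]


# Made-up latencies in ms: 95 fast requests and 5 very slow ones.
latencies = [10] * 95 + [1000] * 5
mean = statistics.mean(latencies)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
```

Here the mean (59.5 ms) describes no request that actually happened, while the median (10 ms) and the 99th percentile (1000 ms) show both the typical case and the tail.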

(tags: fail latency measurement aws ec2 percentiles s3)
background doc on the Jeep hack

"Remote Exploitation of an Unaltered Passenger Vehicle", by Dr. Charlie Miller (cmiller@openrce.org) and Chris Valasek (cvalasek@gmail.com). QNX, unauthenticated D-Bus, etc.
'Since a vehicle can scan for other vulnerable vehicles and the exploit doesn’t require any user interaction, it would be possible to write a worm. This worm would scan for vulnerable vehicles, exploit them with their payload which would scan for other vulnerable vehicles, etc. This is really interesting and scary. Please don’t do this. Please.'

(tags: jeep hacks exploits d-bus qnx cars safety risks)
Care.data and access to UK health records: patient privacy and public trust

'In 2013, the United Kingdom launched care.data, an NHS England initiative to combine patient records, stored in the machines of general practitioners (GPs), with information from social services and hospitals to make one centralized data archive. One aim of the initiative is to gain a picture of the care being delivered between different parts of the healthcare system and thus identify what is working in health care delivery, and what areas need greater attention and resources. This case study analyzes the complications around the launch of care.data. It explains the historical context of the program and the controversies that emerged in the course of the rollout. It explores problems in management and communications around the centralization effort, competing views on the safety of “anonymous” and “pseudonymous” health data, and the conflicting legal duties imposed on GPs with the introduction of the 2012 Health and Social Care Act. This paper also explores the power struggles in the battle over care.data and outlines the tensions among various stakeholders, including patients, GPs, the Health and Social Care Information Centre (HSCIC), the government, privacy experts and data purchasers. The predominant public policy question that emerges from this review centers on how best to utilize technological advances and simultaneously strike a balance between the many competing interests around health and personal privacy.'

(tags: care.data privacy healthcare uk nhs trust anonymity anonymization gps medicine)

Links for 2015-08-10

Published August 10, 2015

"Hate-Selling"

coining a term for the awful buyer's experience on sites like car-hire or air-travel websites

(tags: hate-selling conversion marking upselling travel web consumer)
How Irish Navy’s expertise saved 367 from 30-second sinking in Mediterranean

War-game exercises saved the day:
As the Ribs made their assessment of the situation and began reassuring those on board that help was at hand, the hopelessly overloaded vessel suddenly listed and sank. The sinking took just over 30 seconds. In those 30 seconds, the Captain of the LE Niamh took a number of instant command decisions that saved hundreds of lives. Most of the refugees cannot swim. Their life expectancy in the water would be measured in seconds. The crew of the Ribs immediately began throwing orange lifejackets into the water – encouraging the now frenzied and milling survivors to cling to them. Individuals, then groups clung to the lifejackets – and one another – as the Ribs rallied around trying to keep the floating human mass from dispersal into wider waters and almost certain death. In the meantime, the commander of the LE Niamh managed to manoeuvre close in to the survivors where spare life-rafts were launched into the water. These 25-man inflatable life-rafts were specifically ordered and kept on board the LE Niamh following a “war-gaming” exercise, where the officers and crew envisaged such a nightmare scenario. Had this forward planning not taken place – there would have been no such extra inflatable lifeboats on board.

(tags: war-gaming planning navy ireland mediterranean sea boats refugees migration drowning liferafts)
A collection of postmortems

A well-maintained list with a potted description of each one (via HN)

(tags: postmortems ops uptime reliability)
Advantages of Monolithic Version Control

another Dan Luu post -- good summary of the monorepo's upside

(tags: monorepo git mercurial versioning source-control coding dependencies)
"A Review Of Criticality Accidents, 2000 Revision"

Authoritative report from LANL on accidents involving runaway nuclear reactions over the years from 1945 to 1999, around the world. Illuminating example of how incident post-mortems are handled in other industries, and (of course) fascinating in its own right

(tags: criticality nuclear safety atomic lanl post-mortems postmortems fission)

Links for 2015-08-09

Published August 9, 2015

This Is My Jam shutting down

but, crucially, with an Andy-Baio-approved archival process. Nicely done -- this is a good example of how to do it

(tags: api archiving music mp3 this-is-my-jam archival shutdown)

Links for 2015-08-07

Published August 7, 2015

The Netflix Test Video

Netflix' official test video -- contains various scenarios which exercise frequent tricky edge cases in video compression and playback; A/V sync, shades of black, running water, etc.

(tags: networking netflix streaming video compression tests)
How to get your water tested for lead in Dublin

Ossian has written up this very informative post:
Irish Water is writing to thousands of people living in Dublin this week to warn them that their water is supplied through lead pipes. Irish Water says that most people receiving these letters have a level of lead in their water which is above safe limits. So, if you get one of these letters how do you get your water tested? Irish Water is refusing to supply test kits or to test everyone’s water who asks. However the HSE’s Public Analyst Lab has told me that they will test water for lead for a fee of €10.

(tags: ossian-smyth dun-laoghaire dublin drinking-water water lead green hse irish-water health)
Implementing Efficient and Reliable Producers with the Amazon Kinesis Producer Library - AWS Big Data Blog

Good advice on production-quality, decent-scale usage of Kinesis in Java with the official library: batching, retries, partial failures, backoff, and monitoring. (Also, jaysus, the AWS Cloudwatch API is awful, looking at this!)
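
The retry pattern can be sketched generically in Python. This is not the KPL API (the KPL is a Java library); `send_batch` is a hypothetical stand-in for any batch call that reports per-record failures, and the delay values are arbitrary.

```python
import random
import time


def send_with_retries(records, send_batch, max_attempts=5,
                      base_delay=0.1, sleep=time.sleep):
    """Send records as a batch, re-sending only the ones that failed.

    send_batch(records) must return the sublist of records that failed.
    Returns the records still unsent after max_attempts.
    """
    pending = list(records)
    for attempt in range(max_attempts):
        if not pending:
            break
        pending = send_batch(pending)
        if pending and attempt < max_attempts - 1:
            # Exponential backoff with full jitter before the next attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    return pending
```

Retrying only the failed subset avoids duplicating records that already went through, and the caller still has to decide what to do with whatever is returned as unsent.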

(tags: kpl aws kinesis tips java batching streaming production cloudwatch monitoring coding)
IrishCycle.com on the Irish Times' terrible victim-blaming anti-cycling op-ed

Even if The Irish Times wants to deny that it has engaged in victim blaming at a high level, it has also clearly errored in fact in a very significant way. It would be more forgiving if this was an isolated editorial. But it’s after two days of wrong or misleading coverage, which now seems to be a trend with the newspaper with unbalanced articles or headlines negatively focusing on cycle routes.

(tags: irish-times newspapers op-eds cycling dublin ireland safety)
Ironman 70.3 Road Closures

plenty of stuff out of bounds in Dublin tomoz

(tags: dublin races ironman roads traffic)

Links for 2015-08-06

Published August 6, 2015

India lifts porn ban after widespread outrage - BBC News

After a brief couple of days.
News of the ban caused a furore on Indian social media, with several senior politicians and members of civil society expressing their opposition to the move. The Indian government said that it was merely complying with the Supreme Court order and was committed to the freedom of communication on the Internet. "I reject with contempt the charge that it is a Talibani government, as being said by some of the critics. Our government supports free media, respects communication on social media and has respected freedom of communication always," Mr Prasad told PTI.

(tags: india porn filtering isps internet web child-porn censorship)
17 of the most important things to ever happen to Irish Twitter

definitive. The David O'Doherty / Not The RTE Guide "your ma" battle is legendary (http://thedailyedge.thejournal.ie/your-ma-david-odoherty-1290482-Jan2014/)

(tags: ireland twitter funny social-media)

Links for 2015-08-05

Published August 5, 2015

Amazon S3 Introduces New Usability Enhancements

bucket limit increase, and read-after-write consistency in US Standard. About time too! ;)

(tags: aws s3 storage consistency)
New study shows Spain’s “Google tax” has been a disaster for publishers

A study commissioned by Spanish publishers has found that a new intellectual property law passed in Spain last year, which charges news aggregators like Google for showing snippets and linking to news stories, has done substantial damage to the Spanish news industry. In the short-term, the study found, the law will cost publishers €10 million, or about $10.9 million, which would fall disproportionately on smaller publishers. Consumers would experience a smaller variety of content, and the law "impedes the ability of innovation to enter the market." The study concludes that there's no "theoretical or empirical justification" for the fee.

(tags: google news publishing google-tax spain law aggregation snippets economics)
Inside the sad, expensive failure of Google+

"It was clear if you looked at the per user metrics, people weren’t posting, weren't returning and weren’t really engaging with the product," says one former employee. "Six months in, there started to be a feeling that this isn’t really working." Some lay the blame on the top-down structure of the Google+ department and a leadership team that viewed success as the only option for the social network. Failures and disappointing data were not widely discussed. "The belief was that we were always just one weird feature away from the thing taking off," says the same employee.

(tags: google google+ failures post-mortems business facebook social-media fail bureaucracy vic-gundotra)
8,000 sq ft start-up meeting space revealed for Dublin

Neat. this is a good location for post-work user-group meetups and the like (via Oisin)

(tags: via:oisin meetups meetings ulster-bank dublin startups chq)

Links for 2015-08-04

Published August 4, 2015

Introducing Nurse: Auto-Remediation at LinkedIn

Interesting to hear about auto-remediation in prod -- we built a (very targeted) auto-remediation system in Amazon on the Network Monitoring team, but this is much bigger in focus

(tags: nurse auto-remediation outages linkedin ops monitoring)

Links for 2015-08-03

Published August 3, 2015

choco-solver.org

Choco is [FOSS] dedicated to Constraint Programming[2]. It is a Java library written under BSD license. It aims at describing hard combinatorial problems in the form of Constraint Satisfaction Problems and solving them with Constraint Programming techniques. The user models its problem in a declarative way by stating the set of constraints that need to be satisfied in every solution. Then, Choco solves the problem by alternating constraint filtering algorithms with a search mechanism. [...] Choco is among the fastest CP solvers on the market. In 2013 and 2014, Choco has been awarded many medals at the MiniZinc challenge that is the world-wide competition of constraint-programming solvers.
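
For a feel of what "state the constraints, then alternate filtering with search" means, here is a toy sketch in Python. It is not Choco's Java API: `solve`, the domain dictionaries and the (variable, variable, predicate) constraint triples are all made up for illustration, and a real solver uses far smarter propagation.

```python
def solve(domains, constraints, assignment=None):
    """Tiny backtracking solver with forward-checking-style filtering.

    domains: dict mapping variable name -> set of candidate values
    constraints: list of (var_a, var_b, predicate) binary constraints
    (Hypothetical structures for illustration, not Choco's model.)
    """
    assignment = dict(assignment or {})
    if len(assignment) == len(domains):
        return assignment
    # Search step: pick the unassigned variable with the smallest domain.
    var = min((v for v in domains if v not in assignment),
              key=lambda v: len(domains[v]))
    for value in sorted(domains[var]):
        trial = dict(domains)
        trial[var] = {value}
        # Filtering step: prune values elsewhere that conflict with this choice.
        ok = True
        for a, b, pred in constraints:
            if a == var:
                trial[b] = {y for y in trial[b] if pred(value, y)}
                ok = ok and bool(trial[b])
            elif b == var:
                trial[a] = {x for x in trial[a] if pred(x, value)}
                ok = ok and bool(trial[a])
        if ok:
            result = solve(trial, constraints, {**assignment, var: value})
            if result is not None:
                return result
    return None


# Declarative model: three variables in 1..3 with x < y < z.
domains = {"x": {1, 2, 3}, "y": {1, 2, 3}, "z": {1, 2, 3}}
constraints = [("x", "y", lambda a, b: a < b),
               ("y", "z", lambda a, b: a < b)]
solution = solve(domains, constraints)
```

The model only says what must hold; the order in which values are tried is left entirely to the solver.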

(tags: choco constraint-programming solving search combinatorial algorithms)
Three Flavours Cornetto trilogy

Shaun Of The Dead, Hot Fuzz, and The World's End are a trilogy. I had no idea! (via David Malone)

(tags: movies edgar-wright via:dwmalone funny film cornetto)
Postmortem for July 27 outage of the Manta service - Blog - Joyent

Summary: PostgreSQL's dreaded unpredictable "vacuum" GC

(tags: postgres outages joyent manta ops)

Links for 2015-07-30

Published July 30, 2015

danilop/yas3fs · GitHub

YAS3FS (Yet Another S3-backed File System) is a Filesystem in Userspace (FUSE) interface to Amazon S3. It was inspired by s3fs but rewritten from scratch to implement a distributed cache synchronized by Amazon SNS notifications. A web console is provided to easily monitor the nodes of a cluster.

(tags: aws s3 s3fs yas3fs filesystems fuse sns)
danilop/runjop · GitHub

RunJOP (Run Just Once Please) is a distributed execution framework to run a command (i.e. a job) only once in a group of servers [built using AWS DynamoDB and S3].
nifty! Distributed cron is pretty easy when you've got Dynamo doing the heavy lifting.
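
A rough sketch of why a conditional write makes this easy, assuming the claim boils down to "put this item only if nobody else has". The `ConditionalStore` class below is an in-memory stand-in I invented for that primitive; it is not RunJOP's code and not the DynamoDB API.

```python
import threading


class ConditionalStore:
    """In-memory stand-in for a table that supports 'put only if absent'."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key, value):
        # Atomic check-and-set: only the first caller for a key succeeds.
        with self._lock:
            if key in self._items:
                return False
            self._items[key] = value
            return True


def run_once(store, job_id, node, command):
    """Run command only on the node that wins the claim for job_id."""
    if store.put_if_absent(job_id, node):
        command()
        return True
    return False


store = ConditionalStore()
ran = []
results = [
    run_once(store, "nightly-report-2015-07-30", node,
             lambda node=node: ran.append(node))
    for node in ("node-a", "node-b", "node-c")
]
```

Every server fires the same cron entry; the job id (here a made-up name plus date) decides which single claim can succeed.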

(tags: dynamodb cron distributed-cron scheduling runjop danilop hacks aws ops)

Links for 2015-07-29

Published July 29, 2015

Why Docker is Not Yet Succeeding Widely in Production

Spot-on points which Docker needs to address. It's still production-ready, and _should_ be used there, it just has significant rough edges...

(tags: docker containers devops deployment releases linux ops)
How to Create RSS Feeds for Twitter

The latest hacky workaround to Twitter's API shortcomings

(tags: rss-feeds feeds twitter favorites api social-media workaround google-script)
Testing without mocking in Scala

mocks are the sound of your code crying out, "please structure me differently!"
+1

(tags: scala via:jessitron mocks mock-objects testing testability coding)
Newegg vs. Patent Trolls: When We Win, You Win

go NewEgg: 'Newegg went against a company that claimed its patent covered SSL and RC4 encryption, a common encryption system used by many retailers and websites. This particular patent troll has gone against over 100 other companies, and brought in $45 million in settlements before going after Newegg. We won.'

(tags: via:nelson ip law patent-trolls patents newegg crypto)
Festina Lente

A lovely eulogy for Nóirín Plunkett, from Rich Bowen. RIP Nóirín :(

(tags: noirin-plunkett memorials eulogies rip asf apache)
A Visual Introduction to Machine Learning

beautiful visualisation of a decision tree

(tags: decision-trees dataviz via:nelson d3 ml machine-learning)

Links for 2015-07-28

Published July 28, 2015

Taming Complexity with Reversibility

This is a great post from Kent Beck, putting a lot of recent deployment/rollout patterns in a clear context -- that of supporting "reversibility":
Development servers. Each engineer has their own copy of the entire site. Engineers can make a change, see the consequences, and reverse the change in seconds without affecting anyone else.
Code review. Engineers can propose a change, get feedback, and improve or abandon it in minutes or hours, all before affecting any people using Facebook.
Internal usage. Engineers can make a change, get feedback from thousands of employees using the change, and roll it back in an hour.
Staged rollout. We can begin deploying a change to a billion people and, if the metrics tank, take it back before problems affect most people using Facebook.
Dynamic configuration. If an engineer has planned for it in the code, we can turn off an offending feature in production in seconds. Alternatively, we can dial features up and down in tiny increments (i.e. only 0.1% of people see the feature) to discover and avoid non-linear effects.
Correlation. Our correlation tools let us easily see the unexpected consequences of features so we know to turn them off even when those consequences aren't obvious.
IRC. We can roll out features potentially affecting our ability to communicate internally via Facebook because we have uncorrelated communication channels like IRC and phones.
Right hand side units. We can add a little bit of functionality to the website and turn it on and off in seconds, all without interfering with people's primary interaction with NewsFeed.
Shadow production. We can experiment with new services under real load, from a tiny trickle to the whole flood, without affecting production.
Frequent pushes. Reversing some changes require a code change. On the website we never more than eight hours from the next schedule code push (minutes if a fix is urgent and you are willing to compensate Release Engineering). The time frame for code reversibility on the mobile applications is longer, but the downward trend is clear from six weeks to four to (currently) two.
Data-informed decisions. (Thanks to Dave Cleal) Data-informed decisions are inherently reversible (with the exceptions noted below). "We expect this feature to affect this metric. If it doesn't, it's gone."
Advance countries. We can roll a feature out to a whole country, generate accurate feedback, and roll it back without affecting most of the people using Facebook.
Soft launches. When we roll out a feature or application with a minimum of fanfare it can be pulled back with a minimum of public attention.
Double write/bulk migrate/double read. Even as fundamental a decision as storage format is reversible if we follow this format: start writing all new data to the new data store, migrate all the old data, then start reading from the new data store in parallel with the old.
We do a bunch of these in work, and the rest are on the to-do list. +1 to these!
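
The "dial features up and down in tiny increments" item can be sketched as a deterministic percentage rollout. This is one common way to do it, not a description of Facebook's system; the `in_rollout` helper, the "new-feed" feature name and the basis-point dial are all hypothetical.

```python
import hashlib


def in_rollout(feature, user_id, basis_points):
    """Return True if user_id is inside the rollout for feature.

    basis_points is the dial: 10 means 0.1% of users, 10000 means everyone.
    Hashing feature and user together gives each user a stable bucket, so
    turning the dial up only ever adds users and turning it down removes them.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10000  # 0..9999
    return bucket < basis_points


users = range(20000)
at_tenth_percent = {u for u in users if in_rollout("new-feed", u, 10)}
at_one_percent = {u for u in users if in_rollout("new-feed", u, 100)}
```

Reversing the change is then a config edit (set the dial back to 0) rather than a code push, which is the whole point of the item.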

(tags: software deployment complexity systems facebook reversibility dark-releases releases ops cd migration)

Links for 2015-07-27

Published July 27, 2015

Benchmarking GitHub Enterprise - GitHub Engineering

Walkthrough of debugging connection timeouts in a load test. Nice graphs (using matplotlib)

(tags: github listen-backlog tcp debugging timeouts load-testing benchmarking testing ops linux)
How .uk came to be (and why it's not .gb)

WB: By the late 80s the IANA [the Internet Assigned Numbers Authority, set up in 1988 to manage global IP address allocations] was trying to get all those countries that were trying to join the internet to use the ISO 3166 standard for country codes. It was used for all sorts of things — you see it on cars, “GB” for the UK. [...] At that point, we’re faced with a problem that Jon Postel would like to have changed it to .gb to be consistent with the rest of the world. Whereas .uk had already been established, with a few tens of thousands of domain names with .uk on them. I remember chairing one of the JANET net workshops that were held every year, and the Northern Irish were adamant that they were part of the UK — so the consensus was, we’d try and keep .uk, we’d park .gb and not use it. PK: I didn’t particularly want to change to .gb because I was responsible for Northern Ireland as well. And what’s more, there was a certain question as to whether a research group in the US should be allowed to tell the British what to do. So this argy-bargy continued for a little while and, in the meantime, one of my clients was the Ministry of Defence, and they decided they couldn’t wait this long, and they decided I was going to lose the battle, and so bits of MOD went over to .gb — I didn’t care, as I was running .gb and .uk in any case.

(tags: dot-uk history internet dot-gb britain uk northern-ireland ireland janet)
That time the Internet sent a SWAT team to my mom's house - Boing Boing

The solution is for social media sites and the police to take threats or jokes about swatting, doxxing, and organized crime seriously. Tweeting about buying a gun and shooting up a school would be taken seriously, and so should the threat of raping, doxxing, swatting or killing someone. Privacy issues and online harassment are directly linked, and online harassment isn’t going anywhere. My fear is that, in reaction to online harassment, laws will be passed that will break down our civil freedoms and rights online, and that more surveillance will be sold to users under the guise of safety. More surveillance, however, would not have helped me or my mother. A platform that takes harassment and threats seriously instead of treating them like jokes would have.

(tags: twitter gamergate 4chan 8chan privacy doxxing swatting harrassment threats social-media facebook law feminism)
Why Google's Deep Dream Is Future Kitsch

Deep Dream estranges us from our fears, perhaps, but it doesn't make them go away. It's easy to discuss Deep Dream as an independent creature, a foreign intelligence that we interact with for fun. Yet like all kitsch, it comes straight back to its creators.

(tags: kitsch deep-dream art graphics google inceptionism)
It’s Not Climate Change — It’s Everything Change

now this is a Long Read. the inimitable Margaret Atwood on climate change, beautifully illustrated

(tags: climate climate-change margaret-atwood long-reads change life earth green future)
In Praise of the AK-47 — Dear Design Student — Medium

While someone can certainly make the case that an AK-47, or any other kind of gun or rifle is designed, nothing whose primary purpose is to take away life can be said to be designed well. And that attempting to separate an object from its function in order to appreciate it for purely aesthetic reasons, or to be impressed by its minimal elegance, is a coward’s way of justifying the death they’ve designed into the word, and the money with which they’re lining their pockets.

(tags: design ux ak-47 kalashnikov guns function work)

Links for 2015-07-22

Published July 22, 2015

A Tour Through Random Ruby

turns out Ruby has a good set of random-text-generation gems on offer

(tags: random ruby coding text-generation markov-chain gems)
The Titanium Gambit | History | Air & Space Magazine

Amazing story of 1960s detente via Maciej: 'During the Cold War, Boeing execs got a strange call from the State Department: Would you guys mind trading secrets with the Russians?'

(tags: via:maciej titanium history cold-war detente ussr usa boeing russia aerospace)
I’ve seen more than my fair share of abuse online, but Lorraine Higgins’ bill isn’t the answer

Tom Murphy:
This bill prioritises other peoples’ “alarm or distress” over your communications not just TO them but also ABOUT them. Don’t like what Joan Burton is doing with the water charges? Want to write something on independent media about what you think of that? Better not alarm or distress or harm her! This is the core of my issue with the bill. It’s not just that almost all the agreeable parts of it are already covered by other laws. It’s not just that it’s utterly unenforceable with our current justice system. It’s not just that it’s so vague and fluffy. It’s that it’s so ill-defined and over-reaching that its interpretation will inevitably have to be left to judges. Leaving anything to judges is a bad idea in general. This overly broad and poorly worded bill is a god-send to people who like to bully others into silence. Ironic that eh?!

(tags: lorraine-higgins law seanad abuse harrassment trolls)

Links for 2015-07-21

Published July 21, 2015

Java lambdas and performance

Lambdas in Java 8 introduce some unpredictable performance implications, due to reliance on escape analysis to eliminate object allocation on every lambda invocation. Peter Lawrey has some details

(tags: lambdas java-8 java performance low-latency optimization peter-lawrey coding escape-analysis)
Mikhail Panchenko's thoughts on the July 2015 CircleCI outage

an excellent followup operational post on CircleCI's "database is not a queue" outage

(tags: database-is-not-a-queue mysql sql databases ops outages postmortems)
Men who harass women online are quite literally losers, new study finds

(1) players are anonymous, and the possibility of “policing individual behavior is almost impossible”; (2) they only encounter each other a few times in passing — it’s very possible to hurl an expletive at another player, and never “see” him or her again; and (3) finally, and perhaps predictably, the sex-ratio of players is biased pretty heavily toward men. (A 2014 survey of gender ratios on Reddit found that r/halo was over 95 percent male.) [....] In each of these environments, Kasumovic suggests, a recent influx of female participants has disrupted a pre-existing social hierarchy. That’s okay for the guys at the top — but for the guys at the bottom, who stand to lose more status, that’s very threatening. (It’s also in keeping with the evolutionary framework on anti-lady hostility, which suggests sexism is a kind of Neanderthal defense mechanism for low-status, non-dominant men trying to maintain a shaky grip on their particular cave’s supply of women.) “As men often rely on aggression to maintain their dominant social status,” Kasumovic writes, “the increase in hostility towards a woman by lower-status males may be an attempt to disregard a female’s performance and suppress her disturbance on the hierarchy to retain their social rank.”

(tags: losers sexism mysogyny women halo gaming gamergate 4chan abuse harrassment papers bullying social-status)
The old suburban office park is the new American ghost town - The Washington Post

Most analyses of the market indicate that office parks simply aren’t as appealing or profitable as they were in the 20th century and that Americans just aren’t as keen to cloister themselves in workspaces that are reachable only by car.

(tags: cbd cities work life office-parks commuting america history workplaces)
HACKERS REMOTELY KILL A JEEP ON THE HIGHWAY—WITH ME IN IT

Jaysus, this is terrifying.
Miller and Valasek’s full arsenal includes functions that at lower speeds fully kill the engine, abruptly engage the brakes, or disable them altogether. The most disturbing maneuver came when they cut the Jeep’s brakes, leaving me frantically pumping the pedal as the 2-ton SUV slid uncontrollably into a ditch.
Avoid any car which supports this staggeringly-badly-conceived Uconnect feature:
All of this is possible only because Chrysler, like practically all carmakers, is doing its best to turn the modern automobile into a smartphone. Uconnect, an Internet-connected computer feature in hundreds of thousands of Fiat Chrysler cars, SUVs, and trucks, controls the vehicle’s entertainment and navigation, enables phone calls, and even offers a Wi-Fi hot spot.
:facepalm: Also, Chrysler's response sucks: "Chrysler’s patch must be manually implemented via a USB stick or by a dealership mechanic."

(tags: hacking security cars driving safety brakes jeeps chrysler fiat uconnect can-bus can)

Links for 2015-07-20

Published July 20, 2015

RFC 3339: Date and Time on the Internet: Timestamps

the RFC take on ISO-8601. I need to update my mental bookmarks to start referring to this instead
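
For reference, a small sketch of producing RFC 3339-style timestamps from Python's standard library. The `to_rfc3339` helper is my own naming, and it only covers the common whole-second case with a whole-minute UTC offset.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(dt):
    """Format an aware datetime as an RFC 3339 timestamp, using 'Z' for UTC."""
    if dt.tzinfo is None:
        # RFC 3339 timestamps always carry an offset (or 'Z').
        raise ValueError("RFC 3339 timestamps need an explicit UTC offset")
    text = dt.isoformat(timespec="seconds")
    return text[:-6] + "Z" if text.endswith("+00:00") else text


utc_stamp = to_rfc3339(datetime(2015, 7, 20, 12, 30, 0, tzinfo=timezone.utc))
offset_stamp = to_rfc3339(
    datetime(2015, 7, 20, 13, 30, 0, tzinfo=timezone(timedelta(hours=1))))
```

Both values describe the same instant; the second just spells out a +01:00 offset instead of UTC.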

(tags: iso8601 dates times datetimes rfc standards)
"Customer data is a liability, not an asset."

Great turn of phrase from Matthew Green (@matthew_d_green). Emin Gün Sirer adds some detail: "well, an asset with bounded value, and an unbounded liability"

(tags: data privacy data-protection ashleymadison hacks security liability)
Deep Dive Into Docker Storage Drivers

good detail in this presentation

(tags: docker overlayfs aufs btrfs filesystems ops linux containers)
Google Flights

oh look, Google has a flight search engine! I had no idea

(tags: google flights travel search holidays)
One Direction Offers Remix Competition, Then Sony/Soundcloud Punish The Entrants As Copyright Infringers | Techdirt

TorrentFreak has the story of a UK-producer and songwriter named Lee Adams who took part in an official remix competition of boy band One Direction's music, put on by the band and its label, Sony Music. The stems for remixing were released on Soundcloud. The rules of the contest required entrants to upload their remixes on Soundcloud... and that's exactly what Adams did. And yet those works still got taken down via copyright claims from Sony Music as infringing.

(tags: sony soundcloud anti-piracy automation piracy stems remixing one-direction lee-adams)

Links for 2015-07-19

Published July 19, 2015

WereBank | Were Bank Energy for the People

The Freeman-On-The-Land movement is starting a bank. lols guaranteed

(tags: freemen funny werebank banking money on-my-oath maritime-law)

Links for 2015-07-17

Published July 17, 2015

Angela Merkel told a sobbing girl she couldn't save her from deportation. It was a lie. - Vox

Argentina has, as a matter of constitutional law, effectively open borders. There are no caps or quotas or lottery systems. You can move there legally if you have an employer or family member to sponsor you. That's all you need. If you don't have a sponsor, and make your way in illegally, you're recognized as an "irregular migrant." Discrimination against irregular migrants in health care or education is illegal, and deportation in noncriminal cases is exceptionally rare. Large-scale amnesties are the norm. Obviously Argentina is not nearly as rich as Germany or the US or the UK. But it's considerably richer than three of its neighbors (Bolivia, Paraguay, and Brazil). And yet it doesn't try hard to keep their residents out. It welcomes them — as it should. "One could have expected catastrophe—an uncontrollable flow of poorer immigrants streaming into the country coupled with angry public backlash," Elizabeth Slater writes in the World Policy Journal. "That hasn't happened." Angela Merkel clearly expects catastrophe if she lets people like this weeping young Palestinian girl stay in Germany. That catastrophe is simply a myth; it wouldn't happen. What would happen is that Germany's economy would grow, its culture would grow richer, and that girl and more like her could see their lives improve immeasurably.

(tags: argentina immigration angela-merkel germany eu migrants deportation economics)

Links for 2015-07-16

Published July 16, 2015

ArnoldC

'A programming language based on the one liners of Arnold Schwarzenegger'. Presenting hello.arnoldc:
IT'S SHOWTIME
TALK TO THE HAND "hello world"
YOU HAVE BEEN TERMINATED
(via Robert Walsh)

(tags: via:rjwalsh c arnold-schwarzenegger one-liners funny coding silly languages)
A simple guide to 9-patch for Android UI

This is a nifty hack. TIL! '9-patch uses png transparency to do an advanced form of 9-slice or scale9. The guides are straight, 1-pixel black lines drawn on the edge of your image that define the scaling and fill of your image. By naming your image file name.9.png, Android will recognize the 9.png format and use the black guides to scale and fill your bitmaps.'

(tags: android design 9-patch scaling images bitmaps scale9 9-slice ui graphics)

Links for 2015-07-15

Published July 15, 2015

Government forum to discuss increasing use of personal data

Mr Murphy said it was the Government’s objective for Ireland to be a leader on data protection and data-related issues. The members of the forum include Data Protection Commissioner Helen Dixon, John Barron, chief technology officer with the Revenue Commissioners, Seamus Carroll, head of civil law reform division at the Department of Justice and Tim Duggan, assistant secretary with the Department of Social Protection. Gary Davis, director of privacy and law enforcement requests with Apple, is also on the forum. Mr Davis is a former deputy data protection commissioner in Ireland. There are also representatives from Google, Twitter, LinkedIn and Facebook, from the IDA, the Law Society and the National Statistics Board. Chair of Digital Rights Ireland Dr TJ McIntyre and Dr Eoin O’Dell, associate professor, School of Law, Trinity College Dublin are also on the voluntary forum.

(tags: ireland government dri law privacy data data-protection dpc)
How to monitor NGINX

From DataDog. See also "How to collect NGINX metrics": https://www.datadoghq.com/blog/how-to-collect-nginx-metrics/

(tags: nginx monitoring metrics howto datadog ops)
From Zero to Docker: Migrating to the Whale

nicely detailed writeup of how New Relic are dockerizing

(tags: docker ops deployment packaging new-relic)
Docker with OverlayFS first impressions

a brief howto

(tags: overlayfs docker filesystems ops linux)
"last seen" sketch

a new sketch algorithm from Baron Schwartz and Preetam Jinka of VividCortex; similar to Count-Min but with last-seen timestamp instead of frequency.
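
My reading of "Count-Min but with last-seen timestamp instead of frequency", as a sketch. This is a guess at the shape of the idea, not VividCortex's implementation: the class name, the SHA-256 based hashing and the default sizes are all my own assumptions.

```python
import hashlib


class LastSeenSketch:
    """Count-Min style grid where each cell keeps the newest timestamp seen."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # One cell per row, chosen by a row-specific hash of the key.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def update(self, key, timestamp):
        for row, col in self._cells(key):
            self.rows[row][col] = max(self.rows[row][col], timestamp)

    def last_seen(self, key):
        """Upper bound on when key was last seen (0 means never seen)."""
        return min(self.rows[row][col] for row, col in self._cells(key))


sketch = LastSeenSketch()
sketch.update("host-a", 100)
sketch.update("host-b", 250)
sketch.update("host-a", 180)
estimate = sketch.last_seen("host-a")
```

Collisions can only push a cell's timestamp later, so taking the minimum across rows gives an estimate that is never earlier than the true last-seen time, the mirror image of Count-Min never under-counting.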

(tags: sketch algorithms estimation approximation sampling streams big-data)

Links for 2015-07-14

Published July 14, 2015

Code-Point Open

The UK Ordnance Survey's "open data" free product, free for all uses:
Code-Point Open is FREE to view, download and use for commercial, educational and personal purposes.
(via Antoin)

(tags: via:antoin postcodes mapping open-data ordnance-survey uk gb royal-mail maps)
Apple now biases towards IPv6 with a 25ms delay on connections

Interestingly, they claim that IPv6 tends to be more reliable and has lower latency now:
Based on our testing, this makes our Happy Eyeballs implementation go from roughly 50/50 IPv4/IPv6 in iOS 8 and Yosemite to ~99% IPv6 in iOS 9 and El Capitan betas. While our previous implementation from four years ago was designed to select the connection with lowest latency no matter what, we agree that the Internet has changed since then and reports indicate that biasing towards IPv6 is now beneficial for our customers: IPv6 is now mainstream instead of being an exception, there are less broken IPv6 tunnels, IPv4 carrier-grade NATs are increasing in numbers, and throughput may even be better on average over IPv6.
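
A deliberately simplified model of what a 25ms bias does to the race. It assumes the bias amounts to IPv6 starting first and IPv4 only starting 25 ms later, and it ignores everything else a real Happy Eyeballs implementation has to deal with; `pick_family` and its parameters are hypothetical.

```python
def pick_family(v6_connect_ms, v4_connect_ms, v6_head_start_ms=25):
    """Return which address family wins the connection race.

    Connect times are in milliseconds; None means that attempt failed.
    IPv4 is only started v6_head_start_ms after IPv6 (simplifying assumption).
    """
    if v6_connect_ms is None and v4_connect_ms is None:
        return None
    if v6_connect_ms is None:
        return "IPv4"
    if v4_connect_ms is None:
        return "IPv6"
    v4_finish = v6_head_start_ms + v4_connect_ms
    return "IPv6" if v6_connect_ms <= v4_finish else "IPv4"


slightly_slower_v6 = pick_family(v6_connect_ms=40, v4_connect_ms=20)
much_slower_v6 = pick_family(v6_connect_ms=60, v4_connect_ms=20)
broken_v6 = pick_family(v6_connect_ms=None, v4_connect_ms=20)
```

In this model IPv6 wins unless it is more than 25 ms slower than IPv4, and a dead IPv6 path still falls back to IPv4.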

(tags: apple ipv6 ip tcp networking internet happy-eyeballs ios osx)
Eircode - The Alternatives

lest we forget -- this is a 2014-era writeup of OpenPostcode (open), Loc8 and GoCode (proprietary) as alternative options to the Eircode system

(tags: eircode openpostcode loc8 gocode ireland geocoding mapping location history open-data)
Identify a tree by its leaf

handy step-by-step clickthrough guide

(tags: leaf tree nature identification plant)
Outlier Detection at Netflix | Hacker News

Excellent HN thread re automated anomaly detection in production, Q&A with the dev team

(tags: machine-learning ml remediation anomaly-detection netflix ops time-series clustering)

Links for 2015-07-13

Published July 13, 2015

OkHttp

A new HTTP client library for Android and Java, with a lot of nice features:
HTTP/2 and SPDY support allows all requests to the same host to share a socket. Connection pooling reduces request latency (if SPDY isn’t available). Transparent GZIP shrinks download sizes. Response caching avoids the network completely for repeat requests. OkHttp perseveres when the network is troublesome: it will silently recover from common connection problems. If your service has multiple IP addresses OkHttp will attempt alternate addresses if the first connect fails. This is necessary for IPv4+IPv6 and for services hosted in redundant data centers. OkHttp initiates new connections with modern TLS features (SNI, ALPN), and falls back to TLS 1.0 if the handshake fails. Using OkHttp is easy. Its 2.0 API is designed with fluent builders and immutability. It supports both synchronous blocking calls and async calls with callbacks.

(tags: android http java libraries okhttp http2 spdy microservices jdk)
Eircode tech specs

via Ossian.

(tags: via:smytho tech-specs specs eircode addresses geocoding ireland mapping)
AWS Best Practices for DDoS Resiliency [pdf]

Reasonably solid white paper

(tags: ddos amazon aws security dos whitepapers pdf)

Links for 2015-07-11

Published July 11, 2015

Self-driving cars drive like your grandma

'Honestly, I don't think it will take long for other drivers to realize that self-driving cars are "easy targets" in traffic.' -- also, an insurance expert suggests that self-driving cars won't increase premiums

(tags: driving google cars traffic social insurance)
New Zealand's Harmful Digital Communications Act: Harmful to Everyone Except Online Harassers | Electronic Frontier Foundation

NZ's HDC Act gets the EFF thumbs-down

(tags: eff new-zealand nz hdc-act dmca trolls)

Links for 2015-07-10

Published July 10, 2015

Revised and much faster, run your own high-end cloud gaming service on EC2!

a g2.2xlarge provides decent Windows GPU performance over the internet, at about $0.53 per hour

(tags: gaming games ec2 amazon aws cloud windows hacks)

Links for 2015-06-25

Published June 25, 2015

jgc on Cloudflare's log pipeline

Cloudflare are running a 40-machine, 50TB Kafka cluster, ingesting at 15 Gbps, for log processing. Also: Go producers/consumers, capnproto as wire format, and CitusDB/Postgres to store rolled-up analytics output. Also using Space Saver (top-k) and HLL (counting) estimation algorithms.
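
Assuming "Space Saver" here means the Space-Saving top-k algorithm, this is roughly how it works. A minimal sketch only: it says nothing about how Cloudflare implement it, and the URL paths are invented sample data.

```python
def space_saving(stream, k):
    """Approximate top-k counts using at most k counters (Space-Saving)."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # Evict the smallest counter; the newcomer inherits its count + 1.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return sorted(counters.items(), key=lambda kv: -kv[1])


stream = ["/"] * 50 + ["/login"] * 30 + ["/a", "/b", "/c", "/d"] * 3
top = space_saving(stream, k=3)
```

Memory stays fixed at k counters however many distinct keys go past. The price is that counts are upper bounds: the heavy hitters come out exact here, but the third slot ends up held by a rare path with an inflated count.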

(tags: logs cloudflare kafka go capnproto architecture citusdb postgres analytics streaming)
sjk

a command line tool for JVM diagnostic troubleshooting and profiling.

(tags: java jvm monitoring commandline jmx sjk tools ops)
peco

'Simplistic interactive filtering tool' -- live incremental-search filtering in a terminal window

(tags: cli shell terminal tools go peco interactive incremental-search search ui unix)

Links for 2015-06-24

Published June 24, 2015

CFSSL

Cloudflare's open source CA/PKI infrastructure app

(tags: cloudflare pki ca ssl tls ops)

Links for 2015-06-23

Published June 23, 2015

Google Cloud Platform announces new Container Registry

Yay. Sensible Docker registry pricing at last. Given the high prices, rough edges and slow performance of the other registry offerings, I'm quite happy to see this.
Google Container Registry helps make it easy for you to store your container images in a private and encrypted registry, built on Cloud Platform. Pricing for storing images in Container Registry is simple: you only pay Google Cloud Storage costs. Pushing images is free, and pulling Docker images within a Google Cloud Platform region is free (Cloud Storage egress cost when outside of a region). Container Registry is now ready for production use: * Encrypted and Authenticated - Your container images are encrypted at rest, and access is authenticated using Cloud Platform OAuth and transmitted over SSL * Fast - Container Registry is fast and can handle the demands of your application, because it is built on Cloud Storage and Cloud Networking. * Simple - If you’re using Docker, just tag your image with a gcr.io tag and push it to the registry to get started. Manage your images in the Google Developers Console. * Local - If your cluster runs in Asia or Europe, you can now store your images in ASIA or EU specific repositories using asia.gcr.io and eu.gcr.io tags.

(tags: docker registry google gcp containers cloud-storage ops deployment)
Docker at Shopify: From This-Looks-Fun to Production

Pragmatic evolution story, adding Docker as a packaging/deploy format for an existing production Capistrano/Rails fleet

(tags: docker ops deployment packaging shopify slides)
Semian

Hystrix-style Circuit Breakers and Bulkheads for Ruby/Rails, from Shopify
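
For anyone who hasn't met the pattern, a bare-bones circuit breaker looks something like the sketch below. It is a generic illustration in Python, not Semian's Ruby API: the class, the thresholds and the injectable `clock` are all assumptions, and bulkheads are not covered.

```python
import time


class CircuitOpenError(Exception):
    """Raised instead of calling a resource that is marked unhealthy."""


class CircuitBreaker:
    def __init__(self, error_threshold=3, reset_timeout=10.0,
                 clock=time.monotonic):
        self.error_threshold = error_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("failing fast; resource marked unhealthy")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.error_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None
        return result


now = [0.0]  # fake clock so the example does not need to sleep
breaker = CircuitBreaker(error_threshold=2, reset_timeout=5.0,
                         clock=lambda: now[0])


def flaky():
    raise IOError("backend down")


outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except IOError:
        outcomes.append("failed")
now[0] = 6.0
outcomes.append(breaker.call(lambda: "ok"))
```

The third call never touches the backend, which is the fail-fast behaviour that stops one sick dependency from tying up every worker.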

(tags: circuit-breaker bulkhead patterns architecture microservices shopify rails ruby networking reliability fallback fail-fast)
Brubeck, a statsd-compatible metrics aggregator - GitHub Engineering

GitHub's statsd replacement in C

(tags: github monitoring statsd c rewrites ops metrics)

Links for 2015-06-22

Published June 22, 2015

Patrick Shuff - Building A Billion User Load Balancer - SCALE 13x - YouTube

'Want to learn how Facebook scales their load balancing infrastructure to support more than 1.3 billion users? We will be revealing the technologies and methods we use to route and balance Facebook's traffic. The Traffic team at Facebook has built several systems for managing and balancing our site traffic, including both a DNS load balancer and a software load balancer capable of handling several protocols. This talk will focus on these technologies and how they have helped improve user performance, manage capacity, and increase reliability.' Can't find the standalone slides, unfortunately.

(tags: facebook video talks lbs load-balancing http https scalability scale linux)
Codeface

a good collection of coding fonts (via Tony Finch)

(tags: via:fanf fonts coding ui)
Facebook's Folly Futures

Finagle Futures ported to C++11

(tags: futures async c++ c++11 facebook coding callbacks threading)

Links for 2015-06-21

Published June 21, 2015

jwz on Inceptionism

"Shoggoth ovipositors":
So then they reach inside to one of the layers and spin the knob randomly to fuck it up. Lower layers are edges and curves. Higher layers are faces, eyes and shoggoth ovipositors. [....] But the best part is not when they just glitch an image -- which is a fun kind of embossing at one end, and the "extra eyes" filter at the other -- but is when they take a net trained on some particular set of objects and feed it static, then zoom in, and feed the output back in repeatedly. That's when you converge upon the platonic ideal of those objects, which -- it turns out -- tend to be Giger nightmare landscapes. Who knew. (I knew.)
This stuff is still boggling my mind. All those doggy faces! That is one dog-obsessed ANN.

(tags: neural-networks ai jwz funny shoggoths image-recognition hr-giger art inceptionism)

Links for 2015-06-20

Published June 20, 2015

Levenshtein automata can be simple and fast

Nice algorithm for fuzzy text search with a limited Levenshtein edit distance using a DFA
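
One way to get the flavour of it: treat a row of the edit-distance table as the automaton state, and step it one character at a time. This sketch simulates the states on the fly rather than building the DFA up front, and `fuzzy_filter` is a made-up helper, so take it as an illustration of the idea rather than the post's exact code.

```python
class LevenshteinAutomaton:
    """Automaton state = one row of the edit-distance table, capped."""

    def __init__(self, string, max_edits):
        self.string = string
        self.max_edits = max_edits

    def start(self):
        return list(range(len(self.string) + 1))

    def step(self, state, char):
        new_state = [state[0] + 1]
        for i in range(len(state) - 1):
            cost = 0 if self.string[i] == char else 1
            new_state.append(min(new_state[i] + 1,   # insertion
                                 state[i] + cost,    # match or substitution
                                 state[i + 1] + 1))  # deletion
        # Capping keeps the number of distinct states finite.
        return [min(x, self.max_edits + 1) for x in new_state]

    def is_match(self, state):
        return state[-1] <= self.max_edits

    def can_match(self, state):
        return min(state) <= self.max_edits


def fuzzy_filter(words, query, max_edits):
    automaton = LevenshteinAutomaton(query, max_edits)
    matches = []
    for word in words:
        state = automaton.start()
        for char in word:
            state = automaton.step(state, char)
            if not automaton.can_match(state):
                break  # no suffix can bring this word back within range
        else:
            if automaton.is_match(state):
                matches.append(word)
    return matches


found = fuzzy_filter(["woof", "wolf", "roof", "banana"], "woof", 1)
```

The early exit via `can_match` is what makes this attractive for searching a trie or sorted index: whole subtrees can be skipped once a prefix is already too far away.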

(tags: dfa algorithms levenshtein text edit-distance fuzzy-search search python)

Links for 2015-06-19

Published June 19, 2015

Discretized Streams: Fault Tolerant Stream Computing at Scale

The paper describing the innards of Spark Streaming and its RDD-based recomputation algorithm:
we use a data structure called Resilient Distributed Datasets (RDDs), which keeps data in memory and can recover it without replication by tracking the lineage graph of operations that were used to build it. With RDDs, we show that we can attain sub-second end-to-end latencies. We believe that this is sufficient for many real-world big data applications, where the timescale of the events tracked (e.g., trends in social media) is much higher.

(tags: rdd spark streaming fault-tolerance batch distcomp papers big-data scalability)
Improving testing by using real traffic from production

Gor, a very nice-looking tool to log and replay HTTP traffic, specifically designed to "tee" live traffic from production to staging for pre-release testing

(tags: gor performance testing http tcp packet-capture tests staging tee)
Git team workflows: merge or rebase?

Well-written description of the pros and cons. I'm a rebaser, fwiw. (via Darrell)

(tags: via:darrell git merging rebasing history git-log coding workflow dev teams collaboration github)
How to receive a million packets per second on Linux

To sum up, if you want a perfect performance you need to: Ensure traffic is distributed evenly across many RX queues and SO_REUSEPORT processes. In practice, the load usually is well distributed as long as there are a large number of connections (or flows). You need to have enough spare CPU capacity to actually pick up the packets from the kernel. To make the things harder, both RX queues and receiver processes should be on a single NUMA node.
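A minimal sketch of just the SO_REUSEPORT part, assuming Linux and UDP: each receiver process opens its own socket on the same port and the kernel spreads incoming flows across them. The helper name is mine, and this does nothing about RX queue distribution or NUMA pinning.

```python
import socket


def open_reuseport_udp_socket(host, port):
    """Open a UDP socket that may share its port with other such sockets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Must be set before bind(), and on every socket sharing the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock
```

Each worker process would call this with the same host and port, then loop on recvfrom().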

(tags: linux networking performance cloudflare packets numa so_reuseport sockets udp)

Links for 2015-06-18

Published June 18, 2015

Inceptionism: Going Deeper into Neural Networks

This is amazing, and a little scary.
If we choose higher-level layers, which identify more sophisticated features in images, complex features or even whole objects tend to emerge. Again, we just start with an existing image and give it to our neural net. We ask the network: “Whatever you see there, I want more of it!” This creates a feedback loop: if a cloud looks a little bit like a bird, the network will make it look more like a bird. This in turn will make the network recognize the bird even more strongly on the next pass and so forth, until a highly detailed bird appears, seemingly out of nowhere.
An enlightening comment from the G+ thread:
This is the most fun we've had in the office in a while. We've even made some of those 'Inceptionistic' art pieces into giant posters. Beyond the eye candy, there is actually something deeply interesting in this line of work: neural networks have a bad reputation for being strange black boxes that are opaque to inspection. I have never understood those charges: any other model (GMM, SVM, Random Forests) of any sufficient complexity for a real task is completely opaque for very fundamental reasons: their non-linear structure makes it hard to project back the function they represent into their input space and make sense of it. Not so with backprop, as this blog post shows eloquently: you can query the model and ask what it believes it is seeing or 'wants' to see simply by following gradients. This 'guided hallucination' technique is very powerful and the gorgeous visualizations it generates are very evocative of what's really going on in the network.

(tags: art machine-learning algorithm inceptionism research google neural-networks learning dreams feedback graphics)

Links for 2015-06-17

Published June 17, 2015

Apple to switch APNS protocol to HTTP/2

This is great news -- the current protocol is a binary, proprietary horrorshow, particularly around error reporting. Available "later this year" in production, and Pushy plan to support it.

(tags: http2 apns pushy apple push-notifications protocols http)
Comparing the Defect Reduction Benefits of Code Inspection and Test-Driven Development

tl;dr: Code review trumps TDD alone for finding bugs. (Via Mark Dennehy)

(tags: via:markdennehy code-review coding tdd unit-tests testing papers bugs)
Evidence-Based Software Engineering

Objective: Our objective is to describe how software engineering might benefit from an evidence-based approach and to identify the potential difficulties associated with the approach. Method: We compared the organisation and technical infrastructure supporting evidence-based medicine (EBM) with the situation in software engineering. We considered the impact that factors peculiar to software engineering (i.e. the skill factor and the lifecycle factor) would have on our ability to practice evidence-based software engineering (EBSE). Results: EBSE promises a number of benefits by encouraging integration of research results with a view to supporting the needs of many different stakeholder groups. However, we do not currently have the infrastructure needed for widespread adoption of EBSE. The skill factor means software engineering experiments are vulnerable to subject and experimenter bias. The lifecycle factor means it is difficult to determine how technologies will behave once deployed. Conclusions: Software engineering would benefit from adopting what it can of the evidence approach provided that it deals with the specific problems that arise from the nature of software engineering.
(via Mark Dennehy)

(tags: papers toread via:markdennehy software coding ebse evidence-based-medicine medicine research)
Amazon offer a WhatsMyIp service as part of AWS

curl -s http://checkip.amazonaws.com/

(tags: checkip networking internet whats-my-ip ops)
Huge Loss For Free Speech In Europe: Human Rights Court Says Sites Liable For User Comments | Techdirt

The ruling is terrible through and through. First off, it insists that the comments on the news story were clearly "hate speech" and that, as such, "did not require any linguistic or legal analysis since the remarks were on their face manifestly unlawful." To the court, this means that it's obvious such comments should have been censored straight out. That's troubling for a whole host of reasons at the outset, and highlights the problematic views of expressive freedom in Europe. Even worse, however, the Court then notes that freedom of expression is "interfered with" by this ruling, but it doesn't seem to care -- saying that it is deemed "necessary in a democratic society."
This is going to have massive chilling effects. Terrible ruling from the ECHR.

(tags: echr freedom via:tjmcintyre law europe eu comments free-speech censorship hate-speech)
Shock European court decision: Websites are liable for users’ comments | Ars Technica

In the wake of this judgment, the legal situation is complicated. In an e-mail to Ars, T J McIntyre, who is a lecturer in law and Chairman of Digital Rights Ireland, the lead organization that won an important victory against EU data retention in the Court of Justice of the European Union last year, explained where things now stand. "Today's decision doesn't have any direct legal effect. It simply finds that Estonia's laws on site liability aren't incompatible with the ECHR. It doesn't directly require any change in national or EU law. Indirectly, however, it may be influential in further development of the law in a way which undermines freedom of expression. As a decision of the Grand Chamber of the ECHR it will be given weight by other courts and by legislative bodies."

(tags: ars-technica delfi free-speech eu echr tj-mcintyre law europe estonia)
Google Cloud Platform Blog: A look inside Google’s Data Center Networks

We used three key principles in designing our datacenter networks: We arrange our network around a Clos topology, a network configuration where a collection of smaller (cheaper) switches are arranged to provide the properties of a much larger logical switch. We use a centralized software control stack to manage thousands of switches within the data center, making them effectively act as one large fabric. We build our own software and hardware using silicon from vendors, relying less on standard Internet protocols and more on custom protocols tailored to the data center.

(tags: clos-networks google data-centers networking sdn gcp ops)
Automated Nginx Reverse Proxy for Docker

Nice hack. An automated nginx reverse proxy which regenerates as the Docker containers update

(tags: nginx reverse-proxy proxies web http ops docker)
6 Reasons Modern Movie CGI Looks Surprisingly Crappy

Spot on

(tags: color-grading teal-and-orange cgi movies film sfx jurassic-world)

Links for 2015-06-16

Published June 16, 2015

Cover Story: “Playdate” - The New Yorker

the story behind Chris Ware's lovely Minecraft New Yorker cover

(tags: minecraft chris-ware art kids play gaming games)

Links for 2015-06-15

Published June 15, 2015

How We Moved Our API From Ruby to Go and Saved Our Sanity

Parse on their ditching-Rails story. I haven't heard a nice thing about Ruby or Rails as an operational, production-quality platform in a long time :(

(tags: go ruby rails ops parse languages platforms)
VPC Flow Logs

we are introducing Flow Logs for the Amazon Virtual Private Cloud. Once enabled for a particular VPC, VPC subnet, or Elastic Network Interface (ENI), relevant network traffic will be logged to CloudWatch Logs for storage and analysis by your own applications or third-party tools. You can create alarms that will fire if certain types of traffic are detected; you can also create metrics to help you to identify trends and patterns. The information captured includes information about allowed and denied traffic (based on security group and network ACL rules). It also includes source and destination IP addresses, ports, the IANA protocol number, packet and byte counts, a time interval during which the flow was observed, and an action (ACCEPT or REJECT).
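A small sketch of parsing one flow log record, assuming the space-separated default record layout (version, account, interface, addresses, ports, protocol, packets, bytes, start, end, action, log status). The field names and the sample line used below are illustrative assumptions, not output from a real VPC.

```python
from collections import namedtuple

# Assumed field order for the default flow log record layout.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()
INT_FIELDS = {
    "version", "srcport", "dstport", "protocol",
    "packets", "bytes", "start", "end",
}

FlowRecord = namedtuple("FlowRecord", FIELDS)


def parse_flow_record(line):
    """Split one record into named fields, converting numeric ones to int."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    values = []
    for name, raw in zip(FIELDS, parts):
        if raw == "-":
            values.append(None)  # field not populated for this record
        elif name in INT_FIELDS:
            values.append(int(raw))
        else:
            values.append(raw)
    return FlowRecord(*values)
```

For example, filtering for `record.action == "REJECT"` and grouping by `record.dstport` gives a quick view of what is being blocked.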

(tags: ec2 aws vpc logging tracing ops flow-logs network tcpdump packets packet-capture)
Tim Hunt "jokes" about women scientists. Or not. (with image, tweets) · deborahblum · Storify

'[Tim Hunt] said that while he meant to be ironic, he did think it was hard to collaborate with women because they are too emotional - that he was trying to be honest about the problems.' So much for the "nasty twitter took my jokes seriously" claims then.

(tags: twitter science misogyny women tim-hunt deborah-blum journalism)
Why I dislike systemd

Good post, and hard to disagree.
One of the "features" of systemd is that it allows you to boot a system without needing a shell at all. This seems like such a senseless manoeuvre that I can't help but think of it as a knee-jerk reaction to the perception of Too Much Shell in sysv init scripts. In exactly which universe is it reasonable to assume that you have a running D-Bus service (or kdbus) and a filesystem containing unit files, all the binaries they refer to, all the libraries they link against, and all the configuration files any of them reference, but that you lack that most ubiquitous of UNIX binaries, /bin/sh?

(tags: history linux unix systemd bsd system-v init ops dbus)
Adrian Colyer reviews the Twitter Heron paper

ouch, really sounds like Storm didn't cut the mustard. 'It’s hard to imagine something more damaging to Apache Storm than this. Having read it through, I’m left with the impression that the paper might as well have been titled “Why Storm Sucks”, which coming from Twitter themselves is quite a statement.' If I was to summarise the lessons learned, it sounds like: backpressure is required; and multi-tenant architectures suck.

(tags: storm twitter heron big-data streaming realtime backpressure)

Links for 2015-06-14

Published June 14, 2015

Security theatre at Allied Irish Banks

Allied Irish Banks's web and mobile banking portals are ludicrously insecure. Vast numbers of accounts have easily-guessable registration numbers and are thus 'protected' by a level of security that is twice as easy to crack as would be provided by a single password containing only two lowercase letters. A person of malicious intent could easily gain access to hundreds, possibly thousands, of accounts as well as completely overwhelm the branch network by locking an estimated several 100,000s of people out of their online banking. Both AIB and the Irish Financial Services Ombudsman have refused to respond meaningfully to multiple communications each in which these concerns were raised privately.

(tags: aib banking security ireland hacking ifso online-banking)
Leveraging AWS to Build a Scalable Data Pipeline

Nice detailed description of an auto-scaled SQS worker pool

(tags: sqs aws ec2 auto-scaling asg worker-pools architecture scalability)

Links for 2015-06-13

Published June 13, 2015

China’s Spies Hit the Blackmail Jackpot With Data on 4 Million Federal Workers

The Daily Beast is scathing re the OPM hack:
Here’s where things start to get scary. Whoever has OPM’s records knows an astonishing amount about millions of federal workers, members of the military, and security clearance holders. They can now target those Americans for recruitment or influence. After all, they know their vices, every last one—the gambling habit, the inability to pay bills on time, the spats with former spouses, the taste for something sexual on the side—since all that is recorded in security clearance paperwork. (To get an idea of how detailed this gets, you can see the form, called an SF86, here.) Speaking as a former counterintelligence officer, it really doesn’t get much worse than this.

(tags: daily-beast sf86 clearance us-government america china cyberwar hacking opm privacy)

Links for 2015-06-12

Published June 12, 2015

For a Good Strftime

'Easy Skeezy Ruby Date/Time Formatting' -- or indeed anywhere else strftime() is supported
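As a quick illustration that the same format codes carry over outside Ruby, here is a minimal Python example. The date and format strings are arbitrary choices of mine.

```python
from datetime import datetime


def format_timestamp(moment, pattern="%Y-%m-%d %H:%M"):
    """Format a datetime with a strftime() pattern."""
    return moment.strftime(pattern)


example = datetime(2015, 6, 12, 9, 5)
iso_like = format_timestamp(example)            # "2015-06-12 09:05"
slashed = format_timestamp(example, "%d/%m/%y")  # "12/06/15"
```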

(tags: strftime time date formatting coding ruby via:oisin)
etcd Clustering in AWS

'a fully-automated solution to build auto-scaling etcd clusters in AWS'

(tags: aws cluster docker etcd asg autoscaling ops)

Links for 2015-06-11

Published June 11, 2015

Facebook Infer

New static analysis goodness, freshly open-sourced by Facebook:
Facebook Infer uses logic to do reasoning about a program's execution, but reasoning at this scale — for large applications built from millions of lines of source code — is hard. Theoretically, the number of possibilities that need to be checked is more than the number of estimated atoms in the observable universe. Furthermore, at Facebook our code is not a fixed artifact but an evolving system, updated frequently and concurrently by many developers. It is not unusual to see more than a thousand modifications to our mobile code submitted for review in a given day. The requirements on the program analyzer then become even more challenging because we expect a tool to report quickly on these code modifications — in the region of 10 minutes — to fit in with developers' workflow. Coping with this scale and velocity requires advanced mathematical techniques. Facebook Infer uses two such techniques: separation logic and bi-abduction. Separation logic is a theory that allows Facebook Infer's analysis to reason about small, independent parts of the application storage, rather than having to consider the entirety of the memory potentially at every step. That would be a daunting task on modern processors with their large addressable virtual memories. Bi-abduction is a logical inference technique that allows Facebook Infer to discover properties about the behavior of independent parts of the application code. By storing these properties between runs, Facebook Infer needs to analyze only the parts of the software that have changed, reusing the results of its previous analysis where it can. By combining these approaches, our analyzer is able to find complex problems in modifications to an application built from millions of lines of code, in minutes.
(via Bryan O'Sullivan)

(tags: via:bos infer facebook static-analysis lint code java ios android coding bugs)
The Tamborzão Goes to Thailand

This is great: the story of how the cheesy funk carioca tune “A Minha Amiga Fran” turned into "Kawo Kawo" and became a massive hit in Thailand

(tags: thai brazil carioca music dance-music kawo-kawo)

Links for 2015-06-10

Published June 10, 2015

AV vendors still relying on MD5 to identify malware

oh dear. I can see how this happened -- in many cases they may not still have samples to derive new sums from :(

(tags: md5 hashing antivirus malware security via:fanf bugs)
Google Photos - Can I get out?

what's the export policy for Google's new Photos service? pretty good, it turns out

(tags: google export data google-photos photos archive history storage)
A higher order estimate of the optimum checkpoint interval for restart dumps

tl;dr:
the bottom line is as follows: If the time it takes to create a dump, δ < M/2 then use τopt = √(2δM) – δ. Otherwise (it takes longer than M/2 to create a dump), just use τopt = M.
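A minimal sketch of that rule, assuming the dump time is δ, the interval between dumps is τ, the square root covers 2δM, and M is the mean time between failures (the excerpt does not define M, so that reading is my assumption). Both arguments must be in the same time unit.

```python
import math


def optimum_checkpoint_interval(dump_time, m):
    """Return the interval between dumps suggested by the quoted rule."""
    if dump_time < m / 2:
        return math.sqrt(2 * dump_time * m) - dump_time
    # Dumps take longer than M/2, so just use M.
    return m
```

With a 2-minute dump and M = 100 minutes this gives sqrt(400) - 2 = 18 minutes between dumps.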

(tags: dumping periodic-tasks scheduling frequency maths optimal interval checkpointing)

Links for 2015-06-09

Published June 9, 2015

Dogestry

Simple CLI app for storing Docker images on Amazon S3.

(tags: dogestry registry docker s3 github)

Links for 2015-06-08

Published June 8, 2015

Testing@LMAX – Aliases

Creating a user with our DSL looks like: registrationAPI.createUser("user"); You might expect this to create a user with the username ‘user’, but then we’d get conflicts between every test that wanted to call their user ‘user’ which would prevent tests from running safely against the same deployment of the exchange. Instead, ‘user’ is just an alias that is only meaningful while this one test is running. The DSL creates a unique username that it uses when talking to the actual system. Typically this is done by adding a postfix so the real username is still reasonably understandable e.g. user-fhoai42lfkf.
Nice approach -- makes sense.
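A toy sketch of the alias idea, not LMAX's actual DSL: each test gets its own registry, which maps a readable alias to a unique real name the first time it is used. The class name and suffix format are assumptions of mine.

```python
import secrets


class AliasRegistry:
    """Per-test mapping from human-readable aliases to unique real names."""

    def __init__(self):
        self._real_names = {}

    def resolve(self, alias):
        # Generate the real name once, then reuse it for the rest of the test.
        if alias not in self._real_names:
            suffix = secrets.token_hex(6)
            self._real_names[alias] = f"{alias}-{suffix}"
        return self._real_names[alias]
```

A createUser("user") style call would pass `registry.resolve("user")` to the real system, so two tests that both say "user" never collide.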

(tags: testing lmax system-tests naming coding)
Orbit Async

Orbit Async implements async-await methods in the JVM. It allows programmers to write asynchronous code in a sequential fashion. It was developed by BioWare, a division of Electronic Arts.
Open source, BSD-licensed.

(tags: async await java jvm bioware coding threading)
Who wrote this amazing, mysterious book satirizing tech startup culture?

very cool

(tags: books reading startups silicon-valley mysteries pranks san-francisco)
1172401 – Add Amazon root certificates

Well, well -- looks like AWS is about to disrupt PKI, and about time too. If they come up with a Plex-style "provision a cert" API, it'll be revolutionary

(tags: pki ssl tls amazon aws apis web-services ops)

Links for 2015-06-07

Published June 7, 2015

Vintage Illustrations for Tolkien’s The Hobbit from Around the World | Brain Pickings

including a lovely set from Tove Jansson

(tags: tove-jansson art illustration tolkien the-hobbit books via:ianmoore)
How Plex is doing HTTPS for all its users

large-scale automated TLS certificate deployment. very impressive and not easy to reproduce, good work Plex! (via Nelson)

(tags: via:nelson https ssl tls certificates pki digicert security plex)

Links for 2015-06-06

Published June 6, 2015

Tuning Java Garbage Collection for Spark Applications

So much for G1GC being fire-and-forget

(tags: g1gc gc java jvm performance spark ops tuning)

Links for 2015-06-05

Published June 5, 2015

Airflow

Airbnb's workflow management system; works off a DAG defined in Python code (ugh). Nice UI though, but I think Pinboard's take is neater

(tags: airbnb open-source python workflow jobs cron scheduling batch)
A Complete Taxonomy of Internet Chum - The Awl

Introducing the chumbox

(tags: chum chumbox spam ads web content)
Buck

A high-performance Java build tool, from Facebook. Make-like

(tags: android build java make coding facebook)

Links for 2015-06-04

Published June 4, 2015

Twitter ditches Storm

in favour of a proprietary ground-up rewrite called Heron. Reading between the lines it sounds like Storm had problems with latency, reliability, data loss, and supporting back pressure.

(tags: analytics architecture twitter storm heron backpressure streaming realtime queueing)
Hybrid Logical Clocks

neat substitute for physical-time clocks in synchronization and ordering in a distributed system, based on Lamport's Logical Clocks and Google's TrueTime. 'HLC captures the causality relationship like LC, and enables easy identification of consistent snapshots in distributed systems. Dually, HLC can be used in lieu of PT clocks since it maintains its logical clock to be always close to the PT clock.'
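A compact sketch of the HLC update rules as I understand them from the paper: each timestamp is a pair (l, c), where l tracks the largest physical time seen so far and c is a counter that breaks ties when l does not advance. The class and method names are mine, and the physical clock is injectable so the behaviour can be checked deterministically.

```python
import time


class HybridLogicalClock:
    def __init__(self, physical_clock=None):
        # physical_clock returns the current physical time as an integer.
        self._pt = physical_clock or (lambda: int(time.time() * 1000))
        self.l = 0
        self.c = 0

    def now(self):
        """Timestamp a local or send event."""
        prev_l = self.l
        self.l = max(prev_l, self._pt())
        self.c = self.c + 1 if self.l == prev_l else 0
        return (self.l, self.c)

    def update(self, msg_l, msg_c):
        """Timestamp the receipt of a message stamped (msg_l, msg_c)."""
        prev_l = self.l
        self.l = max(prev_l, msg_l, self._pt())
        if self.l == prev_l and self.l == msg_l:
            self.c = max(self.c, msg_c) + 1
        elif self.l == prev_l:
            self.c += 1
        elif self.l == msg_l:
            self.c = msg_c + 1
        else:
            self.c = 0
        return (self.l, self.c)
```

Comparing the (l, c) tuples lexicographically gives an ordering that respects causality, while l stays close to the physical clock.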

(tags: hlc clocks logical-clocks time synchronization ordering events logs papers algorithms truetime distcomp)
Me vs An Post

Increasingly bizarre postal address obfuscation with An Post, the Irish postal service. Example:
I have decided to see what you can post [....] My first experiment was a dice [sic] with one line of the address on each side. An Post delivered two days later. They win this round
Via JG

(tags: fun an-post post games funny tumblr via:johngilbert)
Netty's async DNS resolver

'Can do ~1M queries to ~3K public DNS servers within ~3 minutes with just a few threads.' via Trustin Lee. Netty is the business

(tags: netty dns async crawlers resolver benchmarks scanning)

Links for 2015-06-03

Published June 3, 2015

Performance Testing at LMAX

Good series of blog posts on the LMAX trading platform's performance testing strategy -- they capture live traffic off the wire, then build statistical models simulating its features. See also http://epickrram.blogspot.co.uk/2014/07/performance-testing-at-lmax-part-two.html and http://epickrram.blogspot.co.uk/2014/08/performance-testing-at-lmax-part-three.html .

(tags: performance testing tests simulation latency lmax trading sniffing packet-capture)
The Violence of Algorithms: Why Big Data Is Only as Smart as Those Who Generate It

The modern state system is built on a bargain between governments and citizens. States provide collective social goods, and in turn, via a system of norms, institutions, regulations, and ethics to hold this power accountable, citizens give states legitimacy. This bargain created order and stability out of what was an increasingly chaotic global system. If algorithms represent a new ungoverned space, a hidden and potentially ever-evolving unknowable public good, then they are an affront to our democratic system, one that requires transparency and accountability in order to function. A node of power that exists outside of these bounds is a threat to the notion of collective governance itself. This, at its core, is a profoundly undemocratic notion—one that states will have to engage with seriously if they are going to remain relevant and legitimate to their digital citizenry who give them their power.

(tags: palantir algorithms big-data government democracy transparency accountability analytics surveillance war privacy protest rights)

Links for 2015-06-02

Published June 2, 2015

Dong detection in LEGO Universe

great example of how Minecraft solved the problem the easy way -- by simply not making an MMO, the whole problem effectively goes away

(tags: penis funny games lego lego-universe minecraft gaming mmo ugc)
HTTP/2 is here, let's optimize! - Velocity SC 2015 - Google Slides

Changes which server-side developers will need to start considering as HTTP/2 rolls out. Remove domain sharding; stop concatenating resources; stop inlining resources; use server push.

(tags: http2 http protocols streaming internet web dns performance)
Five different ways to handle leap seconds with NTP

Without switching to chronyd, ntpd -x sounds not too suboptimal:
With ntpd, the kernel backward step is used by default. With ntpd versions before 4.2.6, or 4.2.6 and later patched for this bug, the -x option (added to /etc/sysconfig/ntpd) can be used to disable the kernel leap second correction and ignore the leap second as far as the local clock is concerned. The one-second error gained after the leap second will be measured and corrected later by slewing in normal operation using NTP servers which already corrected their local clocks.
It's all pretty messy though :(

(tags: ntpd ntp chronyd clocks time synchronization via:fanf linux leap-seconds)
The Agency - NYTimes.com

Russia's troll farms. Ladies and gentlemen -- the future

(tags: future abuse trolls russia trolling politics social-media twitter facebook)

Links for 2015-05-29

Published May 29, 2015

Ireland's media silenced over MP's speech about Denis O'Brien

this is appalling. And of course we can only find out about it from overseas media because our own media is quaking in their boots :(

(tags: media ireland he-who-cannot-be-named censorship omgwtfbbq law libel injunctions high-court)
How Ireland's same-sex marriage referendum played out on Twitter

nice clear data there

(tags: ireland ssm marref history twitter hashtags yesequality)
murbul comments on The security issue of Blockchain.info's Android Wallet is not about system's entropy. It's their own BUGs on PRNG again!

I was in the middle of writing a breakdown of what went wrong, but you've beat me to it. Basically, they have a LinuxSecureRandom class that's supposed to override the standard SecureRandom. This class reads from /dev/urandom and should provide cryptographically secure random values. They also seed the generator using SecureRandom#setSeed with data pulled from random.org. With their custom SecureRandom, this is safe because it mixes the entropy using XOR, so even if the random.org data is dodgy it won't reduce security. It's just an added bonus. BUT! On some devices under some circumstances, the LinuxSecureRandom class doesn't get registered. This is likely because /dev/urandom doesn't exist or can't be accessed for some reason. Instead of screaming bloody murder like any sensible implementation would, they just ignore that and fall back to using the standard SecureRandom. If the above happens, there's a problem because the default implementation of SecureRandom#setSeed doesn't mix. If you set the seed, it replaces the entropy entirely. So now the entropy is coming solely from random.org. And the final mistake: They were using HTTP instead of HTTPS to make the webservice call to random.org. On Jan 4, random.org started enforcing HTTPS and returning a 301 Permanently Moved error for HTTP - see https://www.random.org/news/. So since that date, the entropy has actually been the error message (turned into bytes) instead of the expected 256-bit number. Using that seed, SecureRandom will generate the private key for address 1Bn9ReEocMG1WEW1qYjuDrdFzEFFDCq43F 100% of the time. Ouch. This is around the time that address first appears, so the timeline matches. I haven't had a thorough look at what they've replaced it with in the latest version, but initial impressions are that it's not ideal. Not disastrous, but not good.
Always check return values; always check HTTP status codes.
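A minimal sketch of those two lessons, with hypothetical helper names and a generic seed service rather than random.org's real API: refuse anything but a 200 with the expected length, and XOR the remote bytes into local entropy instead of replacing it.

```python
def require_ok_seed(status_code, body, expected_len=32):
    """Return body only if the response looks like a real seed."""
    if status_code != 200:
        raise RuntimeError(f"seed service returned HTTP {status_code}")
    if len(body) != expected_len:
        raise RuntimeError(f"expected {expected_len} seed bytes, got {len(body)}")
    return body


def mix_seed(local_entropy, remote_seed):
    """XOR remote bytes into local entropy so a bad remote seed cannot weaken it."""
    if len(local_entropy) != len(remote_seed):
        raise ValueError("entropy and seed must be the same length")
    return bytes(a ^ b for a, b in zip(local_entropy, remote_seed))
```

In real use local_entropy would come from something like os.urandom(32); with these checks a redirect page fails loudly instead of quietly becoming the seed.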

(tags: bugs android fail securerandom random prng blockchain.info bitcoin http randomness entropy error-checking)
CommonMark

A strongly specified, highly compatible implementation of Markdown

(tags: reference markdown commonmark specs formatting text compatibility)
GitTorrent

'A Decentralized GitHub'. nifty

(tags: distributed git github bittorrent bitcoin gittorrent dvcs)

Links for 2015-05-28

Published May 28, 2015

I Fooled Millions Into Thinking Chocolate Helps Weight Loss

“Slim by Chocolate!” the headlines blared. A team of German researchers had found that people on a low-carb diet lost weight 10 percent faster if they ate a chocolate bar every day. It made the front page of Bild, Europe’s largest daily newspaper, just beneath their update about the Germanwings crash. From there, it ricocheted around the internet and beyond, making news in more than 20 countries and half a dozen languages. It was discussed on television news shows. It appeared in glossy print, most recently in the June issue of Shape magazine (“Why You Must Eat Chocolate Daily”, page 128). Not only does chocolate accelerate weight loss, the study found, but it leads to healthier cholesterol levels and overall increased well-being. The Bild story quotes the study’s lead author, Johannes Bohannon, Ph.D., research director of the Institute of Diet and Health: “The best part is you can buy chocolate everywhere.” I am Johannes Bohannon, Ph.D. Well, actually my name is John, and I’m a journalist. I do have a Ph.D., but it’s in the molecular biology of bacteria, not humans. The Institute of Diet and Health? That’s nothing more than a website. Other than those fibs, the study was 100 percent authentic. My colleagues and I recruited actual human subjects in Germany. We ran an actual clinical trial, with subjects randomly assigned to different diet regimes. And the statistically significant benefits of chocolate that we reported are based on the actual data. It was, in fact, a fairly typical study for the field of diet research. Which is to say: It was terrible science. The results are meaningless, and the health claims that the media blasted out to millions of people around the world are utterly unfounded.
Interesting bit: the online commenters commenting on the published stories quickly saw through the bullshit. Why can't the churnalising journos do that?

(tags: chocolate journalism science diet food churnalism pr bild health clinical-trials papers peer-review research)
Snake-Oil Superfoods

mainly interesting for the dataviz and the Google-Doc-driven backend. wish they published the script though

(tags: google snake-oil superfoods food dataviz bubble-race-chart graphics infographics google-docs spreadsheets)

Links for 2015-05-27

Published May 27, 2015

Three Questions to Answer When Reporting an Error

Very long, but tl;dr:
the trick to creating an effective error message is to answer the 3 Questions within your message: What is the error? What was the probable cause of the error? What is the probable remedy?
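One way to make the three questions hard to skip is to require all three answers when the error is constructed. A small sketch; the class name, message layout and example wording are my own.

```python
class ReportedError(Exception):
    """An error that must state what happened, why, and what to try."""

    def __init__(self, what, probable_cause, probable_remedy):
        self.what = what
        self.probable_cause = probable_cause
        self.probable_remedy = probable_remedy
        super().__init__(
            f"{what} Probable cause: {probable_cause} "
            f"Probable remedy: {probable_remedy}"
        )


error = ReportedError(
    "Could not save report.csv.",
    "the output directory is read-only.",
    "choose a writable directory and retry.",
)
```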

(tags: errors ui ux reporting logging coding)
Volvo says horrible 'self-parking car accident' happened because driver didn't have 'pedestrian detection'

Grim meathook future, courtesy of Volvo:
“The Volvo XC60 comes with City Safety as a standard feature however this does not include the Pedestrian detection functionality [...] The pedestrian detection feature [...] costs approximately $3,000.”
However, there's another lesson here, in crappy car UX and the risks thereof:
But even if it did have the feature, Larsson says the driver would have interfered with it by the way they were driving and “accelerating heavily towards the people in the video.” “The pedestrian detection would likely have been inactivated due to the driver inactivating it by intentionally and actively accelerating,” said Larsson. “Hence, the auto braking function is overrided by the driver and deactivated.” Meanwhile, the people in the video seem to ignore their instincts and trust that the car assumed to be endowed with artificial intelligence knows not to hurt them. It is a sign of our incredible faith in the power of technology, but also, it’s a reminder that companies making AI-assisted vehicles need to make safety features standard and communicate clearly when they aren’t.

(tags: self-driving-cars cars ai pedestrian computer-vision volvo fail accidents grim-meathook-future)
iPhone UTF-8 text vulnerability

'Due to how the banner notifications process the Unicode text. The banner briefly attempts to present the incoming text and then "gives up" thus the crash'. Apparently the entire Springboard launcher crashes.

(tags: apple vulnerability iphone utf-8 unicode fail bugs springboard ios via:abetson)

Links for 2015-05-26

Published May 26, 2015

Schedule Recurring AWS Lambda Invocations With The Unreliable Town Clock (UTC)

The Unreliable Town Clock (UTC) is a new, free, public SNS Topic (Amazon Simple Notification Service) that broadcasts a “chime” message every quarter hour to all subscribers. It can send the chimes to AWS Lambda functions, SQS queues, and email addresses. You can use the chime attributes to run your code every fifteen minutes, or only run your code once an hour (e.g., when minute == "00") or once a day (e.g., when hour == "00" and minute == "00") or any other series of intervals. You can even subscribe a function you only want to run only once at a specific time in the future: Have the function ignore all invocations until it’s after the time it wants. When it is time, it can perform its job, then unsubscribe itself from the SNS Topic.
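A sketch of a Lambda handler that filters chimes, assuming the SNS message body is JSON with string "hour" and "minute" fields as the excerpt's comparisons suggest. The exact chime payload is an assumption here, and the two job labels are placeholders.

```python
import json


def chimes_from_sns_event(event):
    """Yield the decoded chime from each SNS record in a Lambda event."""
    for record in event["Records"]:
        yield json.loads(record["Sns"]["Message"])


def is_hourly(chime):
    return chime.get("minute") == "00"


def is_daily(chime):
    return chime.get("hour") == "00" and chime.get("minute") == "00"


def handler(event, context):
    """Decide which (placeholder) jobs to run for the chimes in this event."""
    ran = []
    for chime in chimes_from_sns_event(event):
        if is_daily(chime):
            ran.append("daily")
        elif is_hourly(chime):
            ran.append("hourly")
        # Other quarter-hour chimes are ignored.
    return ran
```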

(tags: alestic aws lambda cron time clock periodic-tasks recurrence hacks)

Links for 2015-05-25

Published May 25, 2015

Soylent, Neoliberalism and the Politics of Life Hacking - CounterPunch: Tells the Facts, Names the Names

Soylent’s not purchased by the Mark Zuckerbergs or the Larry Pages or the other tech aristocrats [...] Rather, it’s been taken up by white-collar workers and students destined for perpetual toil in the digital mills. Their embrace of life hacking represents the internalisation of management practices by the managed themselves.

(tags: life-hacks soylent food politics taylorism efficiency capitalism work life)
Working with Apache Spark: Or, How I Learned to Stop Worrying and Love the Shuffle | Cloudera Engineering Blog

some good Spark optimization tips

(tags: spark performance optimization rdd emr big-data cloudera tips akka)
Elements of Scale: Composing and Scaling Data Platforms

Great, encyclopedic blog post rounding up common architectural and algorithmic patterns used in scalable data platforms. Cut out and keep!

(tags: architecture storage databases data big-data scaling scalability ben-stopford cqrs druid parquet columnar-stores lambda-architecture)
ISIS vs. 3D Printing | Motherboard

Morehshin Allahyari, an Iranian born artist, educator, and activist [....] is working on digitally fabricating [the] sculptures [ISIS destroyed] for a series called “Material Speculation” as part of a residency in Autodesk's Pier 9 program. The first in the series is “Material Speculation: ISIS,” which, through intense research, is modeling and reproducing statues destroyed by ISIS in 2015. Allahyari isn't just interested in replicating lost objects but making it possible for anyone to do the same: Embedded within each semi-translucent copy is a flash drive with Allahyari’s research about the artifacts, and an online version is coming. In this way, “Material Speculation: ISIS,” is not purely a metaphorical affront to ISIS, but a practical one as well. Allahyari’s work is similar to conservation efforts, including web-based Project Mosul, a small team and group of volunteers that are three-dimensionally modeling ISIS-destroyed artifacts based on crowd-sourced photographs. "Thinking about 3D printers as poetic and practical tools for digital and physical archiving and documenting has been a concept that I've been interested in for the last three years,” Allahyari says. Once she began exploring the works, she discovered a thorough lack of documentation. Her research snowballed. “It became extremely important for me to think about ways to gather this information and save them for both current and future civilizations.”

(tags: 3d-printing fabrication scanning isis niniveh iraq morehshin-allahyari history preservation archives archival)

Links for 2015-05-24

Published May 24, 2015

Kubernetes for developers

great intro

(tags: kubernetes ops docker containers rocket deployment packaging)
A Piece of Apple II History Cracks Open — May 24, 2015

Lovely description of cracking (ie. copy-protection removal) in the Apple-II era. Very reminiscent of the equivalent in the C=64 scene, from my experience. ;)

(tags: history c=64 apple-ii personal-computers archive cracks copy-protection hacking)

Links for 2015-05-19

Published May 19, 2015

Deploying Elastic Beanstalk Applications from Docker Containers - Elastic Beanstalk

oh wow, this actually sounds pretty cool

(tags: docker aws ec2 beanstalk deployment ops containers)

Links for 2015-05-18

Published May 18, 2015

TIL we have more gravity than Canada

'Early gravity mapping efforts in the 1960s revealed that the Hudson Bay area in particular exerts a weaker gravitational force. Since less mass equals less gravity, there must be less mass underneath these areas.' informed!

(tags: gravity canada geode earth science hudson-bay mass)
SolarCapture Packet Capture Software

Interesting product line -- I didn't know this existed, but it makes good sense as a "network flight recorder". Big in finance.
SolarCapture is powerful packet capture product family that can transform every server into a precision network monitoring device, increasing network visibility, network instrumentation, and performance analysis. SolarCapture products optimize network monitoring and security, while eliminating the need for specialized appliances, expensive adapters relying on exotic protocols, proprietary hardware, and dedicated networking equipment.
See also Corvil (based in Dublin!): 'I'm using a Corvil at the moment and it's awesome- nanosecond precision latency measurements on the wire.' (via mechanical sympathy list)

(tags: corvil timing metrics measurement latency network solarcapture packet-capture financial performance security network-monitoring)
Top 10 data mining algorithms in plain English

This is a phenomenally useful ML/data-mining resource post -- 'the top 10 most influential data mining algorithms as voted on by 3 separate panels in [ICDM '06's] survey paper', but with a nice clear intro and description for each one. Here are the algorithms covered:
1. C4.5 2. k-means 3. Support vector machines 4. Apriori 5. EM 6. PageRank 7. AdaBoost 8. kNN 9. Naive Bayes 10. CART
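
To give a flavour of how small some of these are at their core, here is a minimal sketch of k-means (no. 2 on the list) on one-dimensional data. It's my own illustration rather than code from the linked post; the function name, the fixed starting centroids and the fixed iteration count are simplifying assumptions.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on numbers: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


centres = kmeans_1d([1, 2, 3, 10, 11, 12], centroids=[1, 12])
```

With two obvious clumps like these, the centroids settle on the clump means after the first pass.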

(tags: svm k-means c4.5 apriori em pagerank adaboost knn naive-bayes cart ml data-mining machine-learning papers algorithms unsupervised supervised)
Developer believes he can turn digital game into global hit

g'wan the Colm!

(tags: colm-larkin guild-of-dungeoneering games press)
Trend Micro Locality Sensitive Hash

a fuzzy matching library. Given a byte stream with a minimum length of 512 bytes, TLSH generates a hash value which can be used for similarity comparisons. Similar objects will have similar hash values which allows for the detection of similar objects by comparing their hash values. Note that the byte stream should have a sufficient amount of complexity. For example, a byte stream of identical bytes will not generate a hash value.
Paper here: https://drive.google.com/file/d/0B6FS3SVQ1i0GTXk5eDl3Y29QWlk/edit via adulau

(tags: nilsimsa sdhash ssdeep locality-sensitive hashing algorithm hashes trend-micro tlsh hash fuzzy-matching via:adulau)
Eric Brewer interview on Kubernetes

What is the relationship between Kubernetes, Borg and Omega (the two internal resource-orchestration systems Google has built)? I would say, kind of by definition, there’s no shared code but there are shared people. You can think of Kubernetes -- especially some of the elements around pods and labels -- as being lessons learned from Borg and Omega that are, frankly, significantly better in Kubernetes. There are things that are going to end up being the same as Borg -- like the way we use IP addresses is very similar -- but other things, like labels, are actually much better than what we did internally. I would say that’s a lesson we learned the hard way.

(tags: google architecture kubernetes docker containers borg omega deployment ops)

Links for 2015-05-17

Published May 17, 2015

'Can People Distinguish Pâté from Dog Food?'

Ugh.
Considering the similarity of its ingredients, canned dog food could be a suitable and inexpensive substitute for pâté or processed blended meat products such as Spam or liverwurst. However, the social stigma associated with the human consumption of pet food makes an unbiased comparison challenging. To prevent bias, Newman's Own dog food was prepared with a food processor to have the texture and appearance of a liver mousse. In a double-blind test, subjects were presented with five unlabeled blended meat products, one of which was the prepared dog food. After ranking the samples on the basis of taste, subjects were challenged to identify which of the five was dog food. Although 72% of subjects ranked the dog food as the worst of the five samples in terms of taste (Newell and MacFarlane multiple comparison, P<0.05), subjects were not better than random at correctly identifying the dog food.

(tags: pate food omgwtf science research dog-food meat economics taste flavour)
Redditor runs the secret Python code in Ex Machina

and finds:
when you run with python2.7 you get the following: ISBN = 9780199226559 Which is Embodiment and the inner life: Cognition and Consciousness in the Space of Possible Minds. and so now I have a lot more respect for the Director.

(tags: python movies ex-machina cool books easter-eggs)
Metalwoman beer recipe

via the Dublin Ladies Beer Society ;)

(tags: metalman metalwoman recipes beer brewing hops dlbs)

Links for 2015-05-15

Published May 15, 2015

Linux futex_wait() bug

major bug in kernel versions 3.14 - 3.18 on Haswell hardware

(tags: haswell linux futex_wait futexes kernel bugs hang)

Links for 2015-05-14

Published May 14, 2015

repo

'The multiple repository tool'. How Google kludged around the split-repo problem when you don't have a monorepo.

(tags: kludges git monorepo monorepi google android aosp repo coding version-control dvcs)
Declaratively Provision Docker Images Using Nix

I really wish Docker/CoreOS would look at copying some of the deterministic-build ideas from Nix; see also http://gregoryszorc.com/blog/2014/10/13/deterministic-and-minimal-docker-images/

(tags: build packaging docker nix nix-docker deterministic-builds nixos apollo brazil)
Please stop calling databases CP or AP

In his excellent blog post [...] Jeff Hodges recommends that you use the CAP theorem to critique systems. A lot of people have taken that advice to heart, describing their systems as “CP” (consistent but not available under network partitions), “AP” (available but not consistent under network partitions), or sometimes “CA” (meaning “I still haven’t read Coda’s post from almost 5 years ago”). I agree with all of Jeff’s other points, but with regard to the CAP theorem, I must disagree. The CAP theorem is too simplistic and too widely misunderstood to be of much use for characterizing systems. Therefore I ask that we retire all references to the CAP theorem, stop talking about the CAP theorem, and put the poor thing to rest. Instead, we should use more precise terminology to reason about our trade-offs.

(tags: cap databases storage distcomp ca ap cp zookeeper consistency reliability networking)

Links for 2015-05-12

Published May 12, 2015

Input: Fonts for Code

Non-monospaced coding fonts! I'm all in favour...
As writing and managing code becomes more complex, today’s sophisticated coding environments are evolving to include everything from breakpoint markers to code folding and syntax highlighting. The typography of code should evolve as well, to explore possibilities beyond one font style, one size, and one character width.

(tags: input fonts via:its typography code coding font text ide monospace)
Apache HTrace

a Zipkin-compatible distributed-system tracing framework in Java, in the Apache Incubator

(tags: zipkin tracing trace apache incubator java debugging)
Intel speeds up etcd throughput using ADR Xeon-only hardware feature

To reduce the latency impact of storing to disk, Weaver’s team looked to buffering as a means to absorb the writes and sync them to disk periodically, rather than for each entry. Tradeoffs? They knew memory buffers would help, but there would be potential difficulties with smaller clusters if they violated the stable storage requirement. Instead, they turned to Intel’s silicon architects about features available in the Xeon line. After describing the core problem, they found out this had been solved in other areas with ADR. After some work to prove out a Linux OS supported use for this, they were confident they had a best-of-both-worlds angle. And it worked. As Weaver detailed in his CoreOS Fest discussion, the response time proved stable. ADR can grab a section of memory, persist it to disk and power it back. It can return entries back to disk and restore back to the buffer. ADR provides the ability to make small (<100MB) segments of memory “stable” enough for Raft log entries. It means it does not need battery-backed memory. It can be orchestrated using Linux or Windows OS libraries. ADR allows the capability to define target memory and determine where to recover. It can also be exposed directly into libs for runtimes like Golang. And it uses silicon features that are accessible on current Intel servers.

(tags: kubernetes coreos adr performance intel raft etcd hardware linux persistence disk storage xeon)

Links for 2015-05-11

Published May 11, 2015

streamtools: a graphical tool for working with streams of data | nytlabs

Visual programming, Yahoo! Pipes style, back again:
we have created streamtools – a new, open source project by The New York Times R&D Lab which provides a general purpose, graphical tool for dealing with streams of data. It provides a vocabulary of operations that can be connected together to create live data processing systems without the need for programming or complicated infrastructure. These systems are assembled using a visual interface that affords both immediate understanding and live manipulation of the system.
via Aman

(tags: via:akohli streaming data nytimes visual-programming coding)
MappedBus

a Java based low latency, high throughput message bus, built on top of a memory mapped file; inspired by Java Chronicle with the main difference that it's designed to efficiently support multiple writers – enabling use cases where the order of messages produced by multiple processes are important. MappedBus can be also described as an efficient IPC mechanism which enable several Java programs to communicate by exchanging messages.

(tags: ipc java jvm mappedbus low-latency mmap message-bus data-structures queue message-passing)

Links for 2015-05-10

Published May 10, 2015

Amazon's Drone Delivery Patent Just Feels Like Trolling At This Point

Oh dear, Amazon.
These aren’t actual technologies yet. [...] All of which underscores that Amazon might never ever ever ever actually implement delivery drones. The patent paperwork was filed nearly a year after Amazon’s splashy drone program reveal on 60 Minutes. At the time we called it revolutionary marketing because, you know, delivery drones are technical and logistical madness, not to mention that commercial drone use is illegal right now. Although, in fairness the FAA did just relax some rules so that Amazon could test drones. At this point it feels like Amazon is just trolling. It’s trolling us with public relations BS about its future drones, and it’s trolling future competitors -- Google is also apparently working on this -- so that if somebody ever somehow does anything relating to drone delivery, Amazon can sue them. If I’m wrong, I’ll deliver my apology via Airmail.

(tags: amazon trolling patents uspto delivery drones uavs competition faa)
Red Hat on rkt vs Docker

This is like watching a train-wreck in slow motion on Groundhog Day. We, in the broader Linux and open source community, have been down this path multiple times over the past fifteen years, specifically with package formats. While there needs to be room for experimentation, having two incompatible specs driven by two startups trying to differentiate and in direct competition is *not* a good thing. It would be better for the community and for everyone who depends on our collective efforts if CoreOS and Docker collaborated on a standardized common spec, image format, and distribution protocol. To this end, we at Red Hat will continue to contribute to both initiatives with the goal of driving convergence.

(tags: rkt docker appc coreos red-hat dpkg rpm linux packaging collaboration open-source)

Links for 2015-05-09

Published May 9, 2015

Migration to, Expectations, and Advanced Tuning of G1GC

Bookmarking for future reference. Recommended by one of the GC experts, though I can't recall exactly who ;)

(tags: gc g1gc jvm java tuning performance ops migration)
Deploy a registry - Docker Documentation

Looks like it's pretty feasible to run a private Docker registry on every host, backed by S3 (according to the ECS team's AMA). SPOF-free -- handy

(tags: docker registry ops deployment s3)
How to change Gradle cache location

$GRADLE_USER_HOME, basically -- it may also be possible to set it from the Gradle script itself

(tags: gradle build caching environment unix cache)
Internet of 404's

"An archive of the former Internet of Things"

(tags: archive iot things internet nabaztag startups acquisitions tumblr gadgets history)
Memory Layouts for Binary Search

Key takeaway:
Nearly universally, B-trees win when the data gets big enough.
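
For anyone wondering what a "memory layout" for binary search even means: the same sorted keys can be stored in a different array order so that a search touches memory more predictably. Below is a small sketch of one such alternative, the Eytzinger (breadth-first, implicit binary tree) layout. It's an illustration of the general idea only, not code from the linked post, the function names are mine, and Python obviously won't show the cache effects that make the difference in practice.

```python
def eytzinger(sorted_values):
    """Rearrange sorted values into breadth-first (implicit tree) order.

    The children of index i live at 2*i + 1 and 2*i + 2, as in a binary heap.
    """
    n = len(sorted_values)
    layout = [None] * n
    values = iter(sorted_values)

    def fill(i):
        # In-order traversal of the implicit tree hands out values in sorted order.
        if i < n:
            fill(2 * i + 1)
            layout[i] = next(values)
            fill(2 * i + 2)

    fill(0)
    return layout


def contains(layout, x):
    """Search the Eytzinger array by walking from the root towards a leaf."""
    i = 0
    while i < len(layout):
        if layout[i] == x:
            return True
        i = 2 * i + 1 if x < layout[i] else 2 * i + 2
    return False


layout = eytzinger([1, 2, 3, 4, 5, 6, 7])
```

A B-tree layout takes the same idea further by packing several keys into each node, so each cache line fetched does more work.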

(tags: caches cpu performance optimization memory binary-search b-trees algorithms search memory-layout)
Understanding the Docker Cache for Faster Builds

good advice. see also the Best Practices official doc at https://docs.docker.com/articles/dockerfile_best-practices/

(tags: docker build packaging cache best-practices tips)

Links for 2015-05-08

Published May 8, 2015

Your Google Algorithm Cheat Sheet: Panda, Penguin, and Hummingbird

Interesting that GOOG are still doing these big-bang releases -- I guess crunching the data to come up with new weights/rules is a heavyweight, time-consuming process

(tags: google search ranking releases panda penguin hummingbird weighting)
Dublin Bike Theft Survey Results

Dublin Cycling Campaign's survey results: estimated 20,000 bikes stolen per year in Dublin; only 1% of thefts result in a conviction

(tags: dublin bikes cycling theft crime statistics infographics dcc)
DRUG PUMP’S SECURITY FLAW LETS HACKERS RAISE DOSE LIMITS

The Hospira drug pump vulnerabilities described here sound pretty horrific

(tags: drugs drug-pumps hospira exploits vulnerabilities security root dosage limits)
Making End-to-End Tests Work

+1 to ALL of this. We are doing exactly the same in Swrve and it has radically improved our release quality

(tags: end-to-end testing acceptance-tests tests system-tests lmax)
How to do named entity recognition: machine learning oversimplified

Good explanation of this NLP tokenization/feature-extraction technique. Example result: "Jimi/B-PER Hendrix/I-PER played/O at/O Woodstock/B-LOC ./O"
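
The B-/I-/O tags in that example result mark the beginning of an entity, its continuation, and everything outside one. A short sketch of turning such tagged output back into entities, assuming the `word/TAG` format shown above (the function name is mine, and this covers only the decoding step, not the ML model that produces the tags):

```python
def extract_entities(tagged_text):
    """Group B-/I- tagged tokens into (entity text, entity type) pairs."""
    entities = []
    words, kind = [], None

    def flush():
        if words:
            entities.append((" ".join(words), kind))

    for token in tagged_text.split():
        word, _, tag = token.rpartition("/")
        if tag.startswith("B-"):
            flush()
            words, kind = [word], tag[2:]
        elif tag.startswith("I-") and words and tag[2:] == kind:
            words.append(word)
        else:
            # "O" (outside any entity) or a stray I- tag ends the current entity.
            flush()
            words, kind = [], None
    flush()
    return entities


found = extract_entities(
    "Jimi/B-PER Hendrix/I-PER played/O at/O Woodstock/B-LOC ./O")
```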

(tags: named-entities feature-extraction tokenization nlp ml algorithms machine-learning)
The Discovery of Apache ZooKeeper's Poison Packet - PagerDuty

Excellent deep dive into a production issue. Root causes: crappy error handling code in ZooKeeper; lack of bounds checking in ZK; and a nasty kernel bug.

(tags: zookeeper bugs error-handling bounds-checking oom poison-packets pagerduty packets tcpdump xen aes linux kernel)
The Injector: A new Executor for Java

This honestly fits a narrow niche, but one that is gaining in popularity. If your messages take > 100µs to process, or your worker threads are consistently saturated, the standard ThreadPoolExecutor is likely perfectly adequate for your needs. If, on the other hand, you’re able to engineer your system to operate with one application thread per physical core you are probably better off looking at an approach like the LMAX Disruptor. However, if you fall in the crack in between these two scenarios, or are seeing a significant portion of time spent in futex calls and need a drop in ExecutorService to take the edge off, the injector may well be worth a look.

(tags: performance java executor concurrency disruptor algorithms coding threads threadpool injector)

Links for 2015-05-07

Published May 7, 2015

KillBiller

Excellent mobile-phone plan comparison site for the Irish market, using apps which you install and which analyse your call history, data usage, etc. over the past month to compute the optimal plan based on your usage. Pretty amazing results in my case! The only downside is the privacy policy, which allows the company to resell your usage data (anonymised, and in aggregate) -- I'd really prefer if this wasn't the case :(

(tags: mobile-phones shopping tesco emobile 3g 4g ireland plans comparison-shopping killbiller via:its)
Family in No poster Says YES to Marriage Equality | Amnesty International

Beyond the politics, the risks of stock photo usage are pretty evident too:
"In 2014, as a young family, we did a photo shoot with a photographer friend to get some nice shots for the family album. No money was exchanged – we got nice photos for free, they got nice images for their portfolio. As part of this agreement, we agreed to let them upload them to a stock photo album. We knew that these were available for purchase and we gave permission. Perhaps, naïvely, we imagined that on the off chance that any was ever selected, it might be for a small magazine or website. To confirm, we have not received any money for the photo – then or now, and nor do we expect any. We were surprised and upset to see that the photo was being used as part of a campaign with which we do not agree. We completely support same-sex marriage, and we believe that same-sex couples’ should of course be able to adopt, as we believe that they are equally able to provide children with much-needed love and care. To suggest otherwise is offensive to us, and to many others."

(tags: ssm ireland politics amnesty stock-photos ip rights photos campaigns ads)
Lambda: Bees with Frickin' Laser Beams

an HTTP testing tool in AWS Lambda. nice enough, but still a toy...

(tags: lambda aws node javascript hacks http load-testing)

Links for 2015-05-06

Published May 6, 2015

Why Loggly loves Apache Kafka

Some good factoids about Loggly's Kafka usage and scale

(tags: scalability logging loggly kafka queueing ops reliabilty)
Patterns for building a resilient and scalable microservices platform on AWS

Some good details from Boyan Dimitrov at Hailo, on the orchestration, deployment, and provisioning infra they've built

(tags: deployment ops devops hailo microservices platform patterns slides)
hyperlogsandwich

A probabilistic data structure for frequency/k-occurrence cardinality estimation of multisets. Sample implementation
(via Patrick McFadin)

(tags: via:patrickmcfadin hyperloglog cardinality data-structures algorithms hyperlogsandwich counting estimation lossy multisets)
"Trash Day: Coordinating Garbage Collection in Distributed Systems"

Another GC-coordination strategy, similar to Blade (qv), with some real-world examples using Cassandra

(tags: blade via:adriancolyer papers gc distsys algorithms distributed java jvm latency spark cassandra)
Five Takeaways on the State of Natural Language Processing

Good overview of the state of the art in NLP nowadays. I find word2vec particularly interesting:
Embedding words as real-numbered vectors using a skip-gram, negative-sampling model (word2vec code) was mentioned in nearly every talk I attended. Either companies are using various word2vec implementations directly or they are building diffs off of the basic framework. Trained on large corpora, the vector representations encode concepts in a large dimensional space (usually 200-300 dim).
Quite similar to some tokenization approaches we experimented with in SpamAssassin, so I don't find this too surprising....

(tags: word2vec nlp tokenization machine-learning language parsing doc2vec skip-grams data-structures feature-extraction via:lemonodor)

Links for 2015-05-05

Published May 5, 2015

Smarter testing Java code with Spock Framework

hmm, looks quite nice as a potential next-gen JUnit replacement for unit tests

(tags: java testing bdd tests junit unit-tests spock via:trishagee)
Tots To Travel

'Baby Friendly Holidays | Child, Toddler & Family Villas | France | Spain | Portugal | Italy'. Joe swears by it, will give it a go next year

(tags: holidays vacation travel europe kids children via:joe)
How the NSA Converts Spoken Words Into Searchable Text - The Intercept

This hits the nail on the head, IMO:
To Phillip Rogaway, a professor of computer science at the University of California, Davis, keyword-search is probably the “least of our problems.” In an email to The Intercept, Rogaway warned that “When the NSA identifies someone as ‘interesting’ based on contemporary NLP methods, it might be that there is no human-understandable explanation as to why beyond: ‘his corpus of discourse resembles those of others whom we thought interesting'; or the conceptual opposite: ‘his discourse looks or sounds different from most people’s.' If the algorithms NSA computers use to identify threats are too complex for humans to understand, it will be impossible to understand the contours of the surveillance apparatus by which one is judged. All that people will be able to do is to try your best to behave just like everyone else.”

(tags: privacy security gchq nsa surveillance machine-learning liberty future speech nlp pattern-analysis cs)
awslabs/aws-lambda-redshift-loader

Load data into Redshift from S3 buckets using a pre-canned Lambda function. Looks like it may be a good example of production-quality Lambda

(tags: lambda aws ec2 redshift s3 loaders etl pipeline)
Call me maybe: Aerospike

'Aerospike offers phenomenal latencies and throughput -- but in terms of data safety, its strongest guarantees are similar to Cassandra or Riak in Last-Write-Wins mode. It may be a safe store for immutable data, but updates to a record can be silently discarded in the event of network disruption. Because Aerospike’s timeouts are so aggressive–on the order of milliseconds -- even small network hiccups are sufficient to trigger data loss. If you are an Aerospike user, you should not expect “immediate”, “read-committed”, or “ACID consistency”; their marketing material quietly assumes you have a magical network, and I assure you this is not the case. It’s certainly not true in cloud environments, and even well-managed physical datacenters can experience horrible network failures.'

(tags: aerospike outages cap testing jepsen aphyr databases storage reliability)

Links for 2015-05-04

Published May 4, 2015

Emojineering Part 1: Machine Learning for Emoji Trends - Instagram Engineering

Instagram figuring out meanings from Emoji usage contexts using ML.

(tags: instagram emoji cool language text internet web speech communication trends machine-learning analysis)
Call me maybe: Elasticsearch 1.5.0

tl;dr: Elasticsearch still hoses data integrity on partition, badly

(tags: elasticsearch reliability data storage safety jepsen testing aphyr partition network-partitions cap)

Links for 2015-05-02

Published May 2, 2015

In the privacy of your own home

I didn't know about this:
Last spring, as 41,000 runners made their way through the streets of Dublin in the city’s Women’s Mini Marathon, an unassuming redheaded man by the name of Candid Wueest stood on the sidelines with a scanner. He had built it in a couple of hours with $75 worth of parts, and he was using it to surreptitiously pick up data from activity trackers worn on the runners’ wrists. During the race, Wueest managed to collect personal info from 563 racers, including their names, addresses, and passwords, as well as the unique IDs of the devices they were carrying.

(tags: dublin candid-wueest privacy data marathon running iot activity-trackers)

Links for 2015-05-01

Published May 1, 2015

Cyclists! Why do they ride in the middle of the road?

sense!

(tags: cycling roads cars driving safety uk)

Links for 2015-04-30

Published April 30, 2015

David P. Reed on the history of UDP

'UDP was actually “designed” in 30 minutes on a blackboard when we decided pull the original TCP protocol apart into TCP and IP, and created UDP on top of IP as an alternative for multiplexing and demultiplexing IP datagrams inside a host among the various host processes or tasks. But it was a placeholder that enabled all the non-virtual-circuit protocols since then to be invented, including encapsulation, RTP, DNS, …, without having to negotiate for permission either to define a new protocol or to extend TCP by adding “features”.'
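
The "multiplexing and demultiplexing ... among the various host processes" part is visible in how little UDP adds on top of IP: an 8-byte header of four 16-bit big-endian fields (source port, destination port, length, checksum, per RFC 768), with the destination port picking the receiving process. A small sketch that packs and parses such a header; the sample port numbers and payload are made up for illustration.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")  # src port, dst port, length, checksum


def parse_udp(datagram):
    """Split a raw UDP datagram into its header fields and payload."""
    src, dst, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return {
        "src_port": src,
        "dst_port": dst,        # the key used to pick the receiving process
        "length": length,       # header plus payload, in bytes
        "checksum": checksum,   # 0 means "no checksum computed" over IPv4
        "payload": datagram[UDP_HEADER.size:],
    }


payload = b"hello"
sample = UDP_HEADER.pack(40000, 53, UDP_HEADER.size + len(payload), 0) + payload
parsed = parse_udp(sample)
```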

(tags: udp ip tcp networking internet dpr history protocols)
Oops: Instagram forgot to renew its SSL certificate

hooray for cert renewal pain

(tags: certs ssl renewal expiry instagram outages lifecycle web https)

Links for 2015-04-29

Published April 29, 2015

s3.amazonaws.com "certificate verification failed" errors due to crappy Verisign certs and overzealous curl policies

Seth Vargo is correct. Its not the bit length of the key which is at issue, its the signature algorithm. The entire keychain for the s3.awsamazon.com key is signed with SHA1withRSA: https://www.ssllabs.com/ssltest/analyze.html?d=s3.amazonaws.com&s=54.231.244.0&hideResults=on At issue is that the root verisign key has been marked as weak because of SHA1 and taken out of the curl bundle which is widely popular, and this issue will continue to cause more and more issues going forwards as that bundle makes it way into shipping o/s distributions and aws certification verification breaks.
'This is still happening and curl is now failing on my machine causing all sorts of fun issues (including breaking CocoaPods that are using S3 for storage).' -- @jmhodges This may be a contributory factor to the issue @nelson saw: https://nelsonslog.wordpress.com/2015/04/28/cyberduck-is-responsible-for-my-bad-ssl-certificate/ Curl's ca-certs bundle is also used by Node: https://github.com/joyent/node/issues/8894 and doubtless many other apps and packages. Here's a mailing list thread discussing the issue: http://curl.haxx.se/mail/archive-2014-10/0066.html -- looks like the curl team aren't too bothered about it.

(tags: curl s3 amazon aws ssl tls certs sha1 rsa key-length security cacerts)
Cassandra moving to using G1 as the default recommended GC implementation

This is a big indicator that G1 is ready for primetime. CMS has long been the go-to GC for production usage, but requires careful, complex hand-tuning -- if G1 is getting to a stage where it's just a case of giving it enough RAM, that'd be great. Also, looks like it'll be the JDK9 default: https://twitter.com/shipilev/status/593175793255219200

(tags: cassandra tuning ops g1gc cms gc java jvm production performance memory)
The Colossal Shop

ThisIsColossal now have a shop! bookmarking for some lovely gifts

(tags: art design shop colossal shopping christmas gifts)

Links for 2015-04-28

Published April 28, 2015

Eight lessons learned hacking on GitHub Pages for six months

Pages is actually pretty solid -- nice one GitHub

(tags: github api pages html web jekyll hosting)
ShellCheck

Static code analysis for shell scripts (via Tony Finch)

(tags: bash cli sh linux shell coding static-analysis lint)
'Microservice AntiPatterns'

presentation from last week's Craft Conference in Budapest; Tammer Saleh of Pivotal with a few antipatterns observed in dealing with microservices.

(tags: microservices soa architecture design coding software presentations slides tammer-saleh pivotal craft)
Kappa

'a command line tool that (hopefully) makes it easier to deploy, update, and test functions for AWS Lambda.' much needed IMO -- Lambda is too closed

(tags: aws lambda mitch-garnaat coding testing cli kappa)
Vault

HashiCorp's take on the secrets-storage system. looks good

(tags: hashicorp deployment security secrets authentication vault storage keys key-rotation)

Links for 2015-04-27

Published April 27, 2015

Everything Science Knows Right Now About Standing Desks | Co.Design

"Overall, current evidence suggests that both standing and treadmill desks may be effective in improving overall health considering both physiological and mental health components."

(tags: standing-desks treadmill-desks desks exercise health work workplace back sitting standing)
Race conditions on Facebook, DigitalOcean and others

good trick -- exploit eventual consistency and a lack of distributed transactions by launching race-condition-based attacks
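
A toy, deterministic sketch of the underlying check-then-act problem, assuming a hypothetical "redeem a coupon once" endpoint (none of these names come from the linked write-up). Each request reads the coupon state, pauses, then acts on what it read; if two requests both read before either writes, both succeed.

```python
def redeem(account):
    """One redemption request, split at the point where a race can occur."""
    already_used = account["coupon_used"]   # check
    yield                                   # another request may run here
    if not already_used:                    # act on a possibly stale read
        account["credit"] += 10
        account["coupon_used"] = True


def run(order):
    """Run two concurrent requests, stepping them in the given order."""
    account = {"coupon_used": False, "credit": 0}
    requests = [redeem(account), redeem(account)]
    for index in order:
        next(requests[index], None)  # advance that request by one step
    return account["credit"]


sequential = run([0, 0, 1, 1])    # request 0 finishes before request 1 starts
interleaved = run([0, 1, 0, 1])   # both check before either one acts
```

The sequential run credits the coupon once; the interleaved run credits it twice. Without a transaction or an atomic compare-and-set around the check and the write, an attacker only needs to fire the requests close enough together.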

(tags: attacks exploits race-conditions bugs eventual-consistency distributed-transactions http facebook digitalocean via:aphyr)

Links for 2015-04-26

Published April 26, 2015

StackShare

'Discover and discuss the best dev tools and cloud infrastructure services' -- fun!

(tags: stackshare architecture stack ops software ranking open-source)
OWASP KeyBox

a web-based SSH console that centrally manages administrative access to systems. Web-based administration is combined with management and distribution of users' public SSH keys. Key management and administration is based on profiles assigned to defined users. Administrators can login using two-factor authentication with FreeOTP or Google Authenticator. From there they can create and manage public SSH keys or connect to their assigned systems through a web-shell. Commands can be shared across shells to make patching easier and eliminate redundant command execution.

(tags: keybox owasp security ssh tls ssl ops)
32-bit overflow in BitGo js code caused an accidental 85 BTC transaction fee

Yes, this is a fucking 32-bit integer overflow. Whatever software was used, it calculated the sum of all inputs using 32-bit variables, which overflow at about 20 BTC if signed or 40 BTC if not. The fee was supposed to be 0xC350 = 50,000 satoshis, but it turned out to be 0x2,0000,C350 = 8,589,984,592 satoshis. Captains of the industry. If they were captains of any other industry, like say for example automotive, we'd have people dying in car crashes between two stationary vehicles.
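To make the arithmetic concrete, here is a small sketch of one way a 32-bit wraparound in the input sum could produce exactly that fee. The input and payment amounts are made up for illustration, and the real BitGo code path may well have differed; only the intended fee (0xC350) and the resulting fee come from the quote above.

```python
MASK_32 = 0xFFFFFFFF  # keep only the low 32 bits, as an unsigned 32-bit variable would


def wrap_u32(n):
    return n & MASK_32


# Hypothetical transaction, all amounts in satoshis.
inputs = [4_000_000_000, 3_000_000_000, 2_500_000_000]
payment = 500_000_000
intended_fee = 0xC350  # 50,000 satoshis, per the quote

true_total = sum(inputs)

# Buggy sum: every partial result is truncated to 32 bits.
wrapped_total = 0
for value in inputs:
    wrapped_total = wrap_u32(wrapped_total + value)

# Change is computed from the (too small) wrapped total...
change = wrapped_total - payment - intended_fee

# ...but the network sees the real inputs, so the difference becomes the fee.
actual_fee = true_total - payment - change

print(hex(actual_fee), actual_fee)
```

With these made-up inputs the sum wraps twice, so the fee paid is the intended fee plus 2 * 2^32 satoshis.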

(tags: bitcoin fail bitgo javascript bugs 32-bit overflow btc)
Eight Docker Development Patterns

good Docker tips

(tags: tips docker ops deployment)
Google Online Security Blog: A Javascript-based DDoS Attack [the Greatfire DDoS] as seen by Safe Browsing

We hope this report helps to round out the overall facts known about this attack. It also demonstrates that collectively there is a lot of visibility into what happens on the web. At the HTTP level seen by Safe Browsing, we cannot confidently attribute this attack to anyone. However, it makes it clear that hiding such attacks from detailed analysis after the fact is difficult. Had the entire web already moved to encrypted traffic via TLS, such an injection attack would not have been possible. This provides further motivation for transitioning the web to encrypted and integrity-protected communication. Unfortunately, defending against such an attack is not easy for website operators. In this case, the attack Javascript requests web resources sequentially and slowing down responses might have helped with reducing the overall attack traffic. Another hope is that the external visibility of this attack will serve as a deterrent in the future.
Via Nelson.

(tags: google security via:nelson ddos javascript tls ssl safe-browsing networking china greatfire)

Links for 2015-04-24

Published April 24, 2015

Amazon EC2 Container Service team AmA

a few answers here. Mostly people pointing out shortcomings and the team asking them to start a thread on their forum though :(

(tags: ec2 ecs docker aws ops ama reddit)
Cluster-Based Architectures Using Docker and Amazon EC2 Container Service

In this post, we’re going to take a deeper dive into the architectural concepts underlying cluster computing using container management frameworks such as ECS. We will show how these frameworks effectively abstract the low-level resources such as CPU, memory, and storage, allowing for highly efficient usage of the nodes in a compute cluster. Building on some of the concepts detailed in the earlier posts, we will discover why containers are such a good fit for this type of abstraction, and how the Amazon EC2 Container Service fits into the larger ecosystem of cluster management frameworks.

(tags: docker aws ecs ec2 ops hosting containers mesos clusters)
Kubernetes compared to Borg

'Here are four Kubernetes features that came from our experiences with Borg.'

(tags: google ops kubernetes borg containers docker networking)

Links for 2015-04-23

Published April 23, 2015

attacks using U+202E - RIGHT-TO-LEFT OVERRIDE

Security implications of in-band signalling strikes again, 43 years after the "Blue Box" hit the mainstream. Jamie McCarthy on Twitter: ".@cmdrtaco - Remember when we had to block the U+202E code point in Slashdot comments to stop siht ekil stnemmoc? https://t.co/TcHxKkx9Oo" See also http://krebsonsecurity.com/2011/09/right-to-left-override-aids-email-attacks/ -- GMail was vulnerable too; and http://en.wikipedia.org/wiki/Unicode_control_characters for more inline control chars. http://unicode.org/reports/tr36/#Bidirectional_Text_Spoofing has some official recommendations from the Unicode consortium on dealing with bidi override chars.
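A minimal sketch of the blunt "just block it" approach to user-supplied text, assuming that dropping every bidi control character is acceptable for the field in question. The function names are hypothetical, and legitimate right-to-left content may need some of these characters, so treat this as one possible policy rather than the recommended one.

```python
# Unicode bidirectional control characters (marks, embeddings, overrides, isolates).
BIDI_CONTROLS = {
    "\u061c",  # ARABIC LETTER MARK
    "\u200e",  # LEFT-TO-RIGHT MARK
    "\u200f",  # RIGHT-TO-LEFT MARK
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c",  # POP DIRECTIONAL FORMATTING
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def contains_bidi_controls(text):
    """Return True if the text carries any bidi control character."""
    return any(ch in BIDI_CONTROLS for ch in text)


def strip_bidi_controls(text):
    """Drop all bidi control characters, leaving the other characters untouched."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)
```

For a comment field, rejecting or stripping on `contains_bidi_controls` is probably enough; filenames and email display names are where the spoofing risk is highest.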

(tags: security attacks rlo unicode control-characters codepoints bidi text gmail slashdot sanitization input)
Meet the e-voting machine so easy to hack, it will take your breath away | Ars Technica

The AVS WinVote system -- mind-bogglingly shitty security.
If an election was held using the AVS WinVote, and it wasn’t hacked, it was only because no one tried. The vulnerabilities were so severe, and so trivial to exploit, that anyone with even a modicum of training could have succeeded. They didn’t need to be in the polling place—within a few hundred feet (e.g., in the parking lot) is easy, and within a half mile with a rudimentary antenna built using a Pringles can. Further, there are no logs or other records that would indicate if such a thing ever happened, so if an election was hacked any time in the past, we will never know. I’ve been in the security field for 30 years, and it takes a lot to surprise me. But the VITA report really shocked me—as bad as I thought the problems were likely to be, VITA’s five-page report showed that they were far worse. And the WinVote system was so fragile that it hardly took any effort. While the report does not state how much effort went into the investigation, my estimation based on the description is that it was less than a person week.

(tags: security voting via:johnke winvote avs shoup wep wifi windows)

Links for 2015-04-22

Published April 22, 2015

'Continuous Deployment: The Dirty Details'

Good slide deck from Etsy's Mike Brittain regarding their CD setup. Some interesting little-known details: Slide 41: database schema changes are not CD'd -- they go out on "Schema change Thursdays". Slide 44: only the webapp is CD'd -- PHP, Apache, memcache components (Etsy.com, support and back-office tools, developer API, gearman async worker queues). The external "services" are not -- databases, Solr/JVM search (rolling restarts), photo storage (filters, proxy cache, S3), payments (PCI-DSS, controlled access). They avoid schema changes and breaking changes using an approach they call "non-breaking expansions" -- expose new version in a service interface; support multiple versions in the consumer. Example from slides 50-63, based around a database schema migration. Slide 66: "dev flags" (rollout oriented) are promoted to "feature flags" (long lived degradation control). Slide 71: some architectural philosophies: deploying is cheap; releasing is cheap; gathering data should be cheap too; treat first iterations as experiments. Slide 102: "Canary pools". They have multiple pools of users for testing in production -- the staff pool, users who have opted in to see prototypes/beta stuff, 0-100% gradual phased rollout.
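The "canary pools" idea can be sketched roughly as below. This is not Etsy's code: the function names, the hashing scheme, and the way the staff and opt-in pools are represented are all assumptions made for illustration.

```python
import hashlib


def rollout_bucket(user_id, flag):
    """Map a user to a stable bucket in 0..99 for a given flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(user_id, flag, percent, staff=frozenset(), opted_in=frozenset()):
    """Decide whether a user sees the feature.

    Staff and opted-in users always see it; everyone else is admitted
    once their bucket falls under the current rollout percentage.
    """
    if user_id in staff or user_id in opted_in:
        return True
    return rollout_bucket(user_id, flag) < percent
```

Because the bucket is derived from a hash of the flag and user ID, a user who is enabled at 30% stays enabled as the percentage is ramped up, and different flags get independent populations.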

(tags: cd deploy etsy slides migrations database schema ops ci version-control feature-flags)
Etsy's Release Management process

Good info on how Etsy use their Deployinator tool, end-to-end. Slide 11: git SHA is visible for each env, allowing easy verification of what code is deployed. Slide 14: Code is deployed to "princess" staging env while CI tests are running; no need to wait for unit/CI tests to complete. Slide 23: smoke tests of pre-prod "princess" (complete after 8 mins elapsed). Slide 31: dashboard link for deployed code is posted during deploy; post-release prod smoke tests are run by Jenkins. (short ones! they complete in 42 seconds)

(tags: deployment etsy deploy deployinator princess staging ops testing devops smoke-tests production jenkins)
Makerbot’s Saddest Hour | TechCrunch

I’ve been speaking to a few people [at Makerbot] who prefer to remain anonymous and most of my contacts there are gone (the head of PR was apparently fired) and don’t want to talk. But the news from inside is troubling. The mass-layoffs are blamed on low revenue and one former employee wrote “Company was failing. Couldn’t pay vendors, had to downsize.” Do I think Makerbot will sink? At this point I don’t know.

(tags: makerbot 3d-printing startups downsizing layoffs ouch)
credstash

'CredStash is a very simple, easy to use credential management and distribution system that uses AWS Key Management System (KMS) for key wrapping and master-key storage, and DynamoDB for credential storage and sharing.'

(tags: aws credstash python security keys key-management secrets kms)
ferd.ca -> Lessons Learned while Working on Large-Scale Server Software

Good advice

(tags: distributed scalability systems coding server-side erlang devops networking reliability)

Links for 2015-04-21

Published April 21, 2015

Internet Scale Services Checklist

good aspirational checklist, inspired heavily by James Hamilton's seminal 2007 paper, "On Designing And Deploying Internet-Scale Services"

(tags: james-hamilton checklists ops internet-scale architecture operability monitoring reliability availability uptime aspirations)

Links for 2015-04-20

Published April 20, 2015

FBI admits flaws in hair analysis over decades

Wow, this is staggering.
The Justice Department and FBI have formally acknowledged that nearly every examiner in an elite FBI forensic unit gave flawed testimony in almost all trials in which they offered evidence against criminal defendants over more than a two-decade period before 2000. [....] The review confirmed that FBI experts systematically testified to the near-certainty of “matches” of crime-scene hairs to defendants, backing their claims by citing incomplete or misleading statistics drawn from their case work. In reality, there is no accepted research on how often hair from different people may appear the same. Since 2000, the lab has used visual hair comparison to rule out someone as a possible source of hair or in combination with more accurate DNA testing. Warnings about the problem have been mounting. In 2002, the FBI reported that its own DNA testing found that examiners reported false hair matches more than 11 percent of the time.

(tags: fbi false-positives hair dna biometrics trials justice experts crime forensics inaccuracy csi)
The missing MtGox bitcoins

Most or all of the missing bitcoins were stolen straight out of the MtGox hot wallet over time, beginning in late 2011. As a result, MtGox operated at fractional reserve for years (knowingly or not), and was practically depleted of bitcoins by 2013. A significant number of stolen bitcoins were deposited onto various exchanges, including MtGox itself, and probably sold for cash (which at the bitcoin prices of the day would have been substantially less than the hundreds of millions of dollars they were worth at the time of MtGox's collapse). MtGox' bitcoins continuously went missing over time, but at a decreasing pace. Again by the middle of 2013, the curve goes more or less flat, matching the hypothesis that by that time there may not have been any more bitcoins left to lose. The rate of loss otherwise seems unusually smooth and at the same time not strictly relative to any readily available factors such as remaining BTC holdings, transaction volumes or the BTC price. Worth pointing out is that, thanks to having matched up most of the deposit/withdrawal log earlier, we can at this point at least rule out the possibility of any large-scale fake deposits — the bitcoins going into MtGox were real, meaning the discrepancy was likely rather caused by bitcoins leaving MtGox without going through valid withdrawals.

(tags: mtgox bitcoin security fail currency theft crime btc)
Bank of the Underworld - The Atlantic

Prosecutors analyzed approximately 500 of Liberty Reserve’s biggest accounts, which constituted 44 percent of its business. The government contends that 32 of these accounts were connected to the sale of stolen credit cards and 117 were used by Ponzi-scheme operators. All of this activity flourished, prosecutors said, because Liberty Reserve made no real effort to monitor its users for criminal behavior. What’s more, records showed that one of the company’s top tech experts, Mark Marmilev, who was also arrested, appeared to have promoted Liberty Reserve in chat rooms devoted to Ponzi schemes.
(via Nelson)

(tags: scams fraud crime currency the-atlantic liberty-reserve ponzi-schemes costa-rica arthur-budovsky banking anonymity cryptocurrency money-laundering carding)
I was a Lampedusa refugee. Here’s my story of fleeing Libya – and surviving

'The boy next to me fell to the floor and for a moment I didn’t know if he had fainted or was dead – then I saw that he was covering his eyes so he didn’t have to see the waves any more. A pregnant woman vomited and started screaming. Below deck, people were shouting that they couldn’t breathe, so the men in charge of the boat went down and started beating them. By the time we saw a rescue helicopter, two days after our boat had left Libya with 250 passengers on board, some people were already dead – flung into the sea by the waves, or suffocated downstairs in the dark.'

(tags: lampedusa migration asylum europe fortress-europe italy politics immigration libya refugees)
Run your own high-end cloud gaming service on EC2

Using Steam streaming and EC2 g2.2xlarge spot instances -- 'comes out to around $0.52/hr'. That's pretty compelling IMO

(tags: aws ec2 gaming games graphics spot-instances hacks windows steam)
Running Arbitrary Executables in AWS Lambda

actually an officially-supported mode. huh

(tags: lambda aws architecture ops node.js javascript unix linux)

Links for 2015-04-18

Published April 18, 2015

Exclusive: Chopra says ECB's threats to Ireland were 'outrageous' - Independent.ie

The letters urged the then-government to commit to structural reforms and restructuring of the financial sector. "That is not their job," Mr Chopra said. "Their mandate is to meet inflation. And if you lecture the ECB as to how they might go about that, they talk about their independence. "But when it comes to lecturing others about fiscal policy or structural policy, they're not at all hesitant. I'm not surprised that the people in Ireland were very upset about these letters from [Jean-Claude] Trichet."

(tags: trichet banking ireland politics ajai-chopra ecb history)
Writing Minecraft Plugins - The Book

wow, Walter Higgins' book (from Peachpit Press) is looking great

(tags: books reading minecraft walter-higgins javascript)
Pinball

Pinterest's Hadoop workflow manager; 'scalable, reliable, simple, extensible' apparently. Hopefully it allows upgrades of a workflow component without breaking an existing run in progress, which is where LinkedIn's Azkaban falls down :(

(tags: python pinterest hadoop workflows ops pinball big-data scheduling)

Links for 2015-04-17

Published April 17, 2015

HACKERS COULD COMMANDEER NEW PLANES THROUGH PASSENGER WI-FI

Boeing 787 Dreamliner jets, as well as Airbus A350 and A380 aircraft, have Wi-Fi passenger networks that use the same network as the avionics systems of the planes
What the fucking fuck. Air-gap or gtfo

(tags: air-gap security planes boeing a380 a350 dreamliner networking firewalls avionics)
Tips for debugging EC2 Container Service

some basic ECS tips from Gilt

(tags: gilt ecs tips ops advice ec2 aws)
_Blade: a Data Center Garbage Collector_

Essentially, add a central GC scheduler to improve tail latencies in a cluster, by taking instances out of the pool to perform slow GC activity instead of letting them impact live operations. I've been toying with this idea for a while, nice to see a solid paper about it
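A toy version of the idea, to show the shape of it. This is not the paper's API or algorithm: the class, method names, and the "at most N draining at once" policy are assumptions made for the sketch.

```python
class GCScheduler:
    """Central coordinator that lets instances leave the serving pool to collect."""

    def __init__(self, instances, max_draining=1):
        self.in_pool = set(instances)   # instances currently taking traffic
        self.draining = set()           # instances out of the pool, doing slow GC
        self.max_draining = max_draining

    def request_gc(self, instance):
        """Grant GC only if enough capacity stays in the pool."""
        if instance not in self.in_pool:
            return False
        if len(self.draining) >= self.max_draining:
            return False
        if len(self.in_pool) <= 1:
            return False  # never drain the last serving instance
        self.in_pool.remove(instance)
        self.draining.add(instance)
        return True

    def gc_finished(self, instance):
        """Put the instance back into rotation once its collection is done."""
        if instance in self.draining:
            self.draining.remove(instance)
            self.in_pool.add(instance)
```

An instance that is refused simply keeps serving and asks again later, so the slow collection never overlaps with live requests on that node.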

(tags: gc latency tail-latencies papers blade go java scheduling clustering load-balancing low-latency performance)
SCADA systems online, and a horror story about a non-airgapped Boeing 747 engine management system

747's are big flying Unix hosts. At the time, the engine management system on this particular airline was Solaris based. The patching was well behind and they used telnet as SSH broke the menus and the budget did not extend to fixing this. The engineers could actually access the engine management system of a 747 en route. If issues are noted, they can re-tune the engine in air. The issue here is that all that separated the engine control systems and the open network was NAT based filters. There were (and as far as I know this is true today), no extrusion controls. They filter incoming traffic, but all outgoing traffic is allowed.
(via Paddy Benson)

(tags: air-gap planes boeing security 747 solaris unix)
Squarespace

Nice, simple "build a website" platform. Keeping this one bookmarked for the next time someone non-techie asks me for the simplest way to do just that (thanks for the tip, Oisin)

(tags: via:oisin blog cms design hosting web-design web websites)

Links for 2015-04-16

Published April 16, 2015

Extracting Structured Data From Recipes Using Conditional Random Fields

nice probabilistic/ML approach to recipe parsing

(tags: nytimes recipes parsing text nlp machine-learning probabilistic crf++ algorithms feature-extraction)
Large-scale cluster management at Google with Borg

Google's Borg system is a cluster manager that runs hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters each with up to tens of thousands of machines. It achieves high utilization by combining admission control, efficient task-packing, over-commitment, and machine sharing with process-level performance isolation. It supports high-availability applications with runtime features that minimize fault-recovery time, and scheduling policies that reduce the probability of correlated failures. Borg simplifies life for its users by offering a declarative job specification language, name service integration, real-time job monitoring, and tools to analyze and simulate system behavior. We present a summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it.
(via Conall)

(tags: via:conall clustering google papers scale to-read borg cluster-management deployment packing reliability redundancy)
Keeping Your Car Safe From Electronic Thieves - NYTimes.com

In a normal scenario, when you walk up to a car with a keyless entry and try the door handle, the car wirelessly calls out for your key so you don’t have to press any buttons to get inside. If the key calls back, the door unlocks. But the keyless system is capable of searching for a key only within a couple of feet. Mr. Danev said that when the teenage girl turned on her device, it amplified the distance that the car can search, which then allowed my car to talk to my key, which happened to be sitting about 50 feet away, on the kitchen counter. And just like that, open sesame.
What the hell -- who designed a system that would auto-unlock based on signal strength alone?!!

(tags: security fail cars keys signal proximity keyless-entry prius toyota crime amplification power-amplifiers 3db keyless)
Closed access means people die

'We've paid 100 BILLION USD over the last 10 years to "publish" science and medicine. Ebola is a massive systems failure.' See also https://www.techdirt.com/articles/20150409/17514230608/dont-think-open-access-is-important-it-might-have-prevented-much-ebola-outbreak.shtml : 'The conventional wisdom among public health authorities is that the Ebola virus, which killed at least 10,000 people in Liberia, Sierra Leone and Guinea, was a new phenomenon, not seen in West Africa before 2013. [...] But, as the team discovered, that "conventional wisdom" was wrong. In fact, they found a bunch of studies, buried behind research paywalls, that revealed that there was significant evidence of antibodies to the Ebola virus in Liberia and in other nearby nations. There was one from 1982 that noted: "medical personnel in Liberian health centers should be aware of the possibility that they may come across active cases and thus be prepared to avoid nosocomial epidemics."'

(tags: deaths liberia ebola open-access papers elsevier science medicine reprints)
Making Pinterest — Learn to stop using shiny new things and love MySQL

'The third reason people go for shiny is because older tech isn’t advertised as aggressively as newer tech. The younger companies need to differentiate from the old guard and be bolder, more passionate and promise to fulfill your wildest dreams. But most new tech sales pitches aren’t generally forthright about their many failure modes. In our early days, we fell into this third trap. We had a lot of growing pains as we scaled the architecture. The most vocal and excited database companies kept coming to us saying they’d solve all of our scalability problems. But nobody told us of the virtues of MySQL, probably because MySQL just works, and people know about it.' It's true! -- I'm still a happy MySQL user for some use cases, particularly read-mostly relational configuration data...

(tags: mysql storage databases reliability pinterest architecture)
Microservices and elastic resource pools with Amazon EC2 Container Service

interesting approach to working around ECS' shortcomings -- bit specific to Hailo's microservices arch and IPC mechanism though. aside: I like their version numbering scheme: an ISO-8601-style timestamp, YYYYMMDDHHMMSS. keep it simple!
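A timestamp version like that is a one-liner to generate. A quick sketch, assuming UTC build times (the post doesn't say which timezone Hailo use) and a hypothetical helper name:

```python
from datetime import datetime, timezone


def version_stamp(moment=None):
    """Format a build time as a YYYYMMDDHHMMSS version string."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    return moment.strftime("%Y%m%d%H%M%S")
```

The nice property is that plain string comparison orders versions chronologically, so no special version-parsing logic is needed.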

(tags: versioning microservices hailo aws ec2 ecs docker containers scheduling allocation deployment provisioning qos)
Please Kill Me (Eventually) | Motherboard

There is much that the wise application of technology can do to help us ease off this mortal coil, instead of tormenting ourselves at the natural end of life in a futile, undignified and excruciating attempt to keep it somehow duct-taped on. Train more people in geriatrics, for example. Learn new ways to make life safe, healthy, fun and interesting for the old. Think like a community, a brotherhood, not like atomized competing individuals a few of whom can somehow "beat the system" of the universe. Maybe it is better to examine clearly what we are with a view to understanding and acceptance than it is to try to escape what perhaps should be our inevitable ending.

(tags: death mortality cryogenics alcor geriatrics life singularity mind-uploading ray-kurzweil)
CGA in 1024 Colors - a New Mode: the Illustrated Guide

awesome hackery. brings me back to my C=64 demo days

(tags: pc cga graphics hacks art 1024-colours)

Links for 2015-04-15

Published April 15, 2015

Keywhiz

'a secret management and distribution service [from Square] that is now available for everyone. Keywhiz helps us with infrastructure secrets, including TLS certificates and keys, GPG keyrings, symmetric keys, database credentials, API tokens, and SSH keys for external services — and even some non-secrets like TLS trust stores. Automation with Keywhiz allows us to seamlessly distribute and generate the necessary secrets for our services, which provides a consistent and secure environment, and ultimately helps us ship faster. [...] Keywhiz has been extremely useful to Square. It’s supported both widespread internal use of cryptography and a dynamic microservice architecture. Initially, Keywhiz use decoupled many amalgamations of configuration from secret content, which made secrets more secure and configuration more accessible. Over time, improvements have led to engineers not even realizing Keywhiz is there. It just works. Please check it out.'

(tags: square security ops keys pki key-distribution key-rotation fuse linux deployment secrets keywhiz)

Links for 2015-04-14

Published April 14, 2015

Bigcommerce Status Page blasts IBM Softlayer Object Storage service

This is pretty heavy stuff:
Bigcommerce engineers have been very pro-active in working with our storage provider, IBM Softlayer, in finding solutions. Unfortunately, it takes two parties to come to a solution. In this case, IBM Softlayer intentionally let their Object Storage cluster fall into disrepair and chose not to scale it. This has impacted Bigcommerce, IBM and many other Softlayer customers. Our engineers placed too much trust in IBM Softlayer and that's on us. However, the catastrophic failures to see metrics and rapidly scale capacity, the decisions to let hard drives sit at 90% utilization for weeks and months, the cascading failures of an undersized cluster of 52 nodes for the busiest data center in their business speaks to IBM Softlayer’s lack of concern for their customers. We found this out 3 days ago.
(via Oisin)

(tags: softlayer bigcommerce outages shambles ibm fail object-storage storage iaas cloud)
Subscribing AWS Lambda Function To SNS Topic With aws-cli

how to use the AWS command line tools to do this

(tags: aws aws-cli cli lambda sns hacks)
Yelp Product & Engineering Blog | True Zero Downtime HAProxy Reloads

Using tc and qdisc to delay SYNs while haproxy restarts. Definitely feels like on-host NAT between 2 haproxy processes would be cleaner and easier though!

(tags: linux networking hacks yelp haproxy uptime reliability tcp tc qdisc ops)

Links for 2015-04-13

Published April 13, 2015

Amazon Machine Learning

Upsides of this new AWS service: * great UI and visualisations. * solid choice of metric to evaluate the results. Things may have moved on since, but the use of AUC, false positives and false negatives was pretty new when I was working in this area (er, 10 years ago!). Downsides: * it could do with more support for unsupervised learning algorithms. Supervised learning means you need to provide training data, which in itself can be hard work. My experience with logistic regression in the past is that it requires very accurate training data, too -- its tolerance for misclassified training examples is poor. * Also, in my experience, 80% of the hard work of using ML algorithms is writing good tokenisation and feature extraction algorithms. I don't see any help for that here unfortunately. (probably not that surprising as it requires really detailed knowledge of the input data to know what classes can be abbreviated into a single class, etc.)
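For anyone unfamiliar with those metrics, a small self-contained sketch. The labels and scores are made up, and this is the textbook pairwise definition of AUC rather than anything specific to how Amazon ML computes it.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion(labels, scores, threshold):
    """Count true/false positives and negatives at a given score threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


# Hypothetical classifier output: 1 = positive class, score = model confidence.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
```

AUC summarises ranking quality across all thresholds, while the false positive and false negative counts show what a single chosen threshold actually costs you.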

(tags: amazon aws ml machine-learning auc data-science)
Rob Pike's 5 rules of optimization

these are great. I've run into rule #3 ("fancy algorithms are slow when n is small, and n is usually small") several times...

(tags: twitter rob-pike via:igrigorik coding rules laws optimization performance algorithms data-structures aphorisms)
AWS Lambda Event-Driven Architecture With Amazon SNS

Any message posted to an SNS topic can trigger the execution of custom code you have written, but you don’t have to maintain any infrastructure to keep that code available to listen for those events and you don’t have to pay for any infrastructure when the code is not being run. This is, in my opinion, the first time that Amazon can truly say that AWS Lambda is event-driven, as we now have a central, independent, event management system (SNS) where any authorized entity can trigger the event (post a message to a topic) and any authorized AWS Lambda function can listen for the event, and neither has to know about the other.
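A minimal sketch of what the receiving function does with such an event, written in Python purely for illustration. The handler name and the returned structure are assumptions; the `Records` / `Sns` / `Message` field names follow the SNS notification event shape delivered to Lambda.

```python
def handler(event, context):
    """Collect the topic, subject and body of each SNS notification in the event."""
    messages = []
    for record in event.get("Records", []):
        sns = record.get("Sns", {})
        messages.append({
            "topic": sns.get("TopicArn"),
            "subject": sns.get("Subject"),
            "body": sns.get("Message"),
        })
    return messages


# Hypothetical event, trimmed to the fields used above.
sample_event = {
    "Records": [
        {
            "EventSource": "aws:sns",
            "Sns": {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:example-topic",
                "Subject": "hello",
                "Message": "something happened",
            },
        }
    ]
}
```

The publisher only needs to know the topic, and the function only needs to know the event shape, which is the decoupling the quote is describing.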

(tags: aws ec2 lambda sns events cep event-processing coding cloud hacks eric-hammond)
Texting at the wheel kills more US teenagers every year than drink-driving

Texting while behind the wheel has overtaken drink driving as the biggest cause of death among teenagers in America. More than 3,000 teenagers are killed every year in car crashes caused by texting while driving compared to 2,700 from drink driving. The study by Cohen Children’s Medical Center also discovered that 50 per cent of students admit to texting while driving.

(tags: texting sms us driving car-safety safety drink-driving)
China’s Great Cannon

Conducting such a widespread attack clearly demonstrates the weaponization of the Chinese Internet to co-opt arbitrary computers across the web and outside of China to achieve China’s policy ends. The repurposing of the devices of unwitting users in foreign jurisdictions for covert attacks in the interests of one country’s national priorities is a dangerous precedent — contrary to international norms and in violation of widespread domestic laws prohibiting the unauthorized use of computing and networked systems.

(tags: censorship ddos internet security china great-cannon citizen-lab reports web)
Sirius: An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers

How to build an Intelligent Personal Assistant: 'Sirius is an open end-to-end standalone speech and vision based intelligent personal assistant (IPA) similar to Apple’s Siri, Google’s Google Now, Microsoft’s Cortana, and Amazon’s Echo. Sirius implements the core functionalities of an IPA including speech recognition, image matching, natural language processing and a question-and-answer system. Sirius is developed by Clarity Lab at the University of Michigan. Sirius is published at the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2015.'

(tags: sirius siri cortana google-now echo ok-google ipa assistants search video audio speech papers clarity nlp wikipedia)
Why We Will Not Be Registering easyDNS.SUCKS - blog.easydns.org

If you're not immersed in the naming business you may find the jargon in it hard to understand. The basic upshot is this: the IPC believes that the mechanisms that were enacted to protect trademark holders during the deluge of new TLD rollouts are being gamed by the .SUCKS TLD operator to extort inflated fees from trademark holders.
(via Nelson)

(tags: shakedown business internet domains dns easydns dot-sucks scams tlds trademarks ip)