Category: Uncategorized

Links for 2014-10-07

Links for 2014-10-06

  • Reddit’s crappy ultimatum to remote workers and offices

    Reddit forces all remote workers (about half the workforce, in SLC and NYC) to move to SF, provoking a shitstorm:

    In a tweet confirming the move, Reddit’s CEO justified his treatment of non-San Francisco workers with a push for Optimal Teamwork to drive the New And $50M Improved Reddit forward. I shit you not. That was the actual term! (I added the New & Improved fan fiction here). So let’s leave aside the debate over whether working remotely is as efficient as being in the same office all the time. Let’s just focus on the size of the middle finger given to the people who work at Reddit outside the Bay Area, given the choice of forced, express relocation or a pink slip. How optimal do you think these employees will feel about leadership and the rest of the team going forward? Do you think they’ll just show up at the new, apparently-not-even-in-San-Francisco-proper office with a smile from ear to ear, ready to begin in earnest on Optimal Teamwork, left-behind former colleagues be damned?

    (tags: telecommuting reddit working remote-working ceos optimal-teamwork teamwork relocation)

  • Space Jacket

    'I designed this jacket as a tribute to the continuing legacy of American spaceflight. I wanted it to embody everything I loved about the space program, and to eventually serve as an actual flight jacket for present-day astronauts on missions to the ISS (International Space Station). There are other “replica” flight jackets made for space enthusiasts, but I decided to come up with something boldly different, yet also completely wearable and well-suited for space.'

    (tags: space clothing fashion geekery jackets)

  • How did Twitter become the hate speech wing of the free speech party?

    Kevin Marks has a pretty good point here:

    Your tweet could win the fame lottery, and everyone on the Internet who thinks you are wrong could tell you about it. Or one of the "verified" could call you out to be the tribute for your community and fight in their Hunger Games. Say something about feminism, or race, or sea lions and you'd find yourself inundated by the same trite responses from multitudes. Complain about it, and they turn nasty, abusing you, calling in their friends to join in. Your phone becomes useless under the weight of notifications; you can't see your friends' support amongst the flood. The limited tools available - blocking, muting, going private - do not match well with these floods. Twitter's abuse reporting form takes far longer than a tweet, and is explicitly ignored if friends try to help.

    (tags: harassment twitter 4chan abuse feminism hate-speech gamergate sea-lions filtering social-media kevin-marks)

  • Mnesia and CAP

    A common “trick” is to claim: 'We assume network partitions can’t happen. Therefore, our system is CA according to the CAP theorem.' This is a nice little twist. By asserting network partitions cannot happen, you just made your system into one which is not distributed. Hence the CAP theorem doesn’t even apply to your case and anything can happen. Your system may be linearizable. Your system might have good availability. But the CAP theorem doesn’t apply. [...] In fact, any well-behaved system will be “CA” as long as there are no partitions. This makes the statement of a system being “CA” very weak, because it doesn’t put honesty first. It tries to avoid the hard question, which is how the system operates under failure. By assuming no network partitions, you assume perfect information knowledge in a distributed system. This isn’t the physical reality.

    (tags: cap erlang mnesia databases storage distcomp reliability ca postgres partitions)

  • Integrating Kafka and Spark Streaming: Code Examples and State of the Game

    Spark Streaming has been getting some attention lately as a real-time data processing tool, often mentioned alongside Apache Storm. [...] I added an example Spark Streaming application to kafka-storm-starter that demonstrates how to read from Kafka and write to Kafka, using Avro as the data format and Twitter Bijection for handling the data serialization. In this post I will explain this Spark Streaming example in further detail and also shed some light on the current state of Kafka integration in Spark Streaming. All this with the disclaimer that this happens to be my first experiment with Spark Streaming.

    (tags: spark kafka realtime architecture queues avro bijection batch-processing)

  • Mandos

    'a system for allowing servers with encrypted root file systems to reboot unattended and/or remotely.' (via Tony Finch)

    (tags: via:fanf mandos encryption security server ops sysadmin linux)

  • Zonify

    'a set of command line tools for managing Route53 DNS for an AWS infrastructure. It intelligently uses tags and other metadata to automatically create the associated DNS records.'

    (tags: zonify aws dns ec2 route53 ops)

  • Mike Perham on Twitter: "Sweet, monit just sent a DMCA takedown notice to @github to remove Inspeqtor."

    'The work, Inspeqtor which is hosted at GitHub, is far from a “clean-room” implementation. This is basically a rewrite of Monit in Go, even using the same configuration language that is used in Monit, verbatim. a. [private] himself admits that Inspeqtor is "heavily influenced“ by Monit https://github.com/mperham/inspeqtor/wiki/Other-Solutions. b. This tweet by [private] demonstrate intent. https://twitter.com/mperham/status/452160352940064768 "OSS nerds: redesign and build monit in Go. Sell it commercially. Make $$$$. I will be your first customer.”' IANAL, but using the same config language does not demonstrate copyright infringement...

    (tags: copyright dmca tildeslash monit inspeqtor github ops oss agpl)

  • YOU AND YOUR DAMNED GAMES, JON STONE — Why bother with #gamergate?

    So what is #gamergate? #gamergate is a mob with torches aloft, hunting for any combustible dwelling and calling it a monster’s lair. #gamergate is a rage train, and everyone with an axe to grind wants a ride. Its fuel is a sour mash of entitlement, insecurity, arrogance and alienation. #gamergate is a vindication quest for political intolerance. #gamergate is revenge for every imagined slight. #gamergate is Viz’s Meddlesome Ratbag.

    (tags: gamergate culture gaming 4chan mobs feminism)

Links for 2014-10-05

  • 'In 1976 I discovered Ebola, now I fear an unimaginable tragedy' | World news | The Observer

    An interview with the scientist who was part of the team which discovered the Ebola virus in 1976:

    Other samples from the nun, who had since died, arrived from Kinshasa. When we were just about able to begin examining the virus under an electron microscope, the World Health Organisation instructed us to send all of our samples to a high-security lab in England. But my boss at the time wanted to bring our work to conclusion no matter what. He grabbed a vial containing virus material to examine it, but his hand was shaking and he dropped it on a colleague's foot. The vial shattered. My only thought was: "Oh, shit!" We immediately disinfected everything, and luckily our colleague was wearing thick leather shoes. Nothing happened to any of us.

    (tags: ebola epidemiology health africa labs history medicine)

Links for 2014-10-04

  • Ebola vaccine delayed by IP spat

    This is the downside of publicly-funded labs selling patent-licensing rights to private companies:

    Given the urgency, it's inexplicable that one of the candidate vaccines, developed at the Public Health Agency of Canada (PHAC) in Winnipeg, has yet to go in the first volunteer's arm, says virologist Heinz Feldmann, who helped develop the vaccine while at PHAC. "It’s a farce; these doses are lying around there while people are dying in Africa,” says Feldmann, who now works at the Rocky Mountain Laboratories of the U.S. National Institute of Allergy and Infectious Diseases (NIAID) in Hamilton, Montana. At the center of the controversy is NewLink Genetics, a small company in Ames, Iowa, that bought a license to the vaccine's commercialization from the Canadian government in 2010, and is now suddenly caught up in what WHO calls "the most severe acute public health emergency seen in modern times.” Becker and others say the company has been dragging its feet the past 2 months because it is worried about losing control over the development of the vaccine.

    (tags: ip patents drugs ebola canada phac newlink-genetics health epidemics vaccines)

  • sferik/t

    "A command-line power tool for Twitter." It really is -- much better timeline searchability than the "real" Twitter UI, for example

    (tags: twitter ruby github cli tools unix search)

  • Ebola: While Big Pharma Slept

    We’ve had almost 40 years to develop, test and stockpile an Ebola vaccine. That has not happened because big pharma has been entirely focused on shareholder value and profits over safety and survival from a deadly virus. For the better part of Ebola’s 38 years, big pharma has been asleep. The question ahead is what virus or superbug will wake them up?

    (tags: pharma ebola ip patents health drugs africa research)

Links for 2014-10-02

Links for 2014-10-01

Links for 2014-09-29

  • Prototype

    Prototype is a brand new festival of play and interaction. This is your chance to experience the world from a new perspective with removable camera eyes, to jostle and joust to a Bach soundtrack whilst trying to disarm an opponent, to throw shapes as you figure out who got an invite to the silent disco, to duel with foam pool noodles, and play chase in the dark with flashlights. A unique festival that incites new types of social interaction, involving technology and the city, Prototype is a series of performances, workshops, talks, and games that spill across the city, alongside an adult playground in the heart of Temple Bar.
    Project Arts Centre, 17-18 October. Looks nifty.

    (tags: prototype festivals dublin technology make vr gaming)

  • Confessions of a former internet troll - Vox

    I want to tell you about when violent campaigns against harmless bloggers weren't any halfway decent troll's idea of a good time — even the then-malicious would've found it too easy to be fun. When the punches went up, not down. Before the best players quit or went criminal or were changed by too long a time being angry. When there was cruelty, yes, and palpable strains of sexism and racism and every kind of phobia, sure, but when these things had the character of adolescents pushing the boundaries of cheap shock, disagreeable like that but not criminal. Not because that time was defensible — it wasn't, not really — but because it was calmer and the rage wasn't there yet. Because trolling still meant getting a rise for a laugh, not making helpless people fear for their lives because they're threatening some Redditor's self-proclaimed monopoly on reason. I want to tell you about it because I want to make sense of how it is now and why it changed.

    (tags: vox trolls blogging gamergate 4chan weev history teenagers)

Links for 2014-09-27

  • La Maison des Amis

    Paul Hickey's gite near Toulouse, available for rent! 'a beautifully converted barn on 5 acres, wonderfully located in the French countryside. 4 Bedrooms, sleeps 2-10, Large Pool, Tennis Court, Large Trampoline, Broadband Internet, 30 Mins Toulouse/Albi, 65 Mins Carcassonne, 90 Mins Rodez'

    (tags: ex-iona gites france holidays vacation-rentals vacation hotels)

  • waxpancake on Ello

    The Ello founders are positioning it as an alternative to other social networks — they won't sell your data or show you ads. "You are not the product." If they were independently-funded and run as some sort of co-op, bootstrapped until profitable, maybe that's plausible. Hard, but possible. But VCs don't give money out of goodwill, and taking VC funding — even seed funding — creates outside pressures that shape the inevitable direction of a company.

    (tags: advertising money vc ello waxy funding series-a)

  • Inviso: Visualizing Hadoop Performance

    With the increasing size and complexity of Hadoop deployments, being able to locate and understand performance is key to running an efficient platform.  Inviso provides a convenient view of the inner workings of jobs and platform.  By simply overlaying a new view on existing infrastructure, Inviso can operate inside any Hadoop environment with a small footprint and provide easy access and insight.  
    This sounds pretty useful.

    (tags: inviso netflix hadoop emr performance ops tools)

  • The End of Linux

    'Linux is becoming the thing that we adopted Linux to get away from.' Great post on the horrible complexity of systemd. It reminds me of nothing more than mid-90s AIX, which I had the displeasure of opsing for a while -- the Linux distros have taken a very wrong turn here.

    (tags: linux unix complexity compatibility ops rant systemd bloat aix)

Links for 2014-09-24

Links for 2014-09-23

  • Avoiding Chef-Suck with Auto Scaling Groups - forty9ten

    Some common problems which arise when using Chef with ASGs in EC2, and how these guys avoided them -- they stopped using Chef for service provisioning, and instead baked AMIs when a new version was released. ASGs using pre-baked AMIs definitely work well, so this makes good sense IMO.

    (tags: infrastructure chef ops asg auto-scaling ec2 provisioning deployment)

  • Introducing Groups.io

    Mark "ONEList" Fletcher's back, and he's reinventing the email group! Awesome.

    email groups (the modern version of mailing lists) have stagnated over the past decade. Yahoo Groups and Google Groups both exude the dank air of benign neglect. Google Groups hasn’t been updated in years, and some of Yahoo’s recent changes have actually made Yahoo Groups worse! And yet, millions of people put up with this uncertainty and neglect, because email groups are still one of the best ways to communicate with groups of people. And I have a plan to make them even better. So today I’m launching Groups.io in beta, to bring email groups into the 21st Century. At launch, we have many features that those other services don’t have, including: Integration with other services, including: Github, Google Hangouts, Dropbox, Instagram, Facebook Pages, and the ability to import Feeds into your groups. Businesses and organizations can have their own private groups on their own subdomain. Better archive organization, using hashtags. Many more email delivery options. The ability to mute threads or hashtags. Fully searchable archives, including searching within attachments. One other feature that Groups.io has that Yahoo and Google don’t, is a business model that’s not based on showing ads to you. Public groups are completely free on Groups.io. Private groups and organizations are very reasonably priced.

    (tags: email groups communication discussion mailing-lists groups.io yahoo google google-groups yahoo-groups)

Links for 2014-09-22

  • The Open Source Software Engagement Award

    SFU announces award for students who demonstrate excellence in contributing to an Open Source project

    (tags: sfu awards students open-source oss universities funding)

  • DublinDashboard

    'provides citizens, public sector workers and companies with real-time information, time-series indicator data, and interactive maps about all aspects of the city. It enables users to gain detailed, up to date intelligence about the city that aids everyday decision making and fosters evidence-informed analysis.'

    (tags: dublin dashboards maps geodata time-series open-data ireland)

  • mcrouter: A memcached protocol router for scaling memcached deployments

    New from Facebook engineering:

    Last year, at the Data@Scale event and at the USENIX Networked Systems Design and Implementation conference, we spoke about turning caches into distributed systems using software we developed called mcrouter (pronounced “mick-router”). Mcrouter is a memcached protocol router that is used at Facebook to handle all traffic to, from, and between thousands of cache servers across dozens of clusters distributed in our data centers around the world. It is proven at massive scale — at peak, mcrouter handles close to 5 billion requests per second. Mcrouter was also proven to work as a standalone binary in an Amazon Web Services setup when Instagram used it last year before fully transitioning to Facebook's infrastructure. Today, we are excited to announce that we are releasing mcrouter’s code under an open-source BSD license. We believe it will help many sites scale more easily by leveraging Facebook’s knowledge about large-scale systems in an easy-to-understand and easy-to-deploy package.
    This is pretty crazy -- basically turns a memcached cluster into a much more usable clustered-storage system, with features like shadowing production traffic, cold cache warmup, online reconfiguration, automatic failover, prefix-based routing, replicated pools, etc. Lots of good features.

    (tags: facebook scaling cache proxy memcache open-source clustering distcomp storage)
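
    To make "prefix-based routing" a bit more concrete, here is a minimal sketch of the general idea. This is not mcrouter's code or its configuration format: the pool names, server addresses, and the `pick_pool` / `pick_server` helpers are all made up for illustration, and the server choice uses plain modulo hashing rather than whatever hashing mcrouter really uses.

```python
import zlib

# Hypothetical pools keyed by key prefix (illustrative only, not mcrouter config).
POOLS = {
    "user:": ["cache-a1:11211", "cache-a2:11211"],
    "session:": ["cache-b1:11211"],
}
# Hypothetical fallback pool for keys that match no prefix.
DEFAULT_POOL = ["cache-c1:11211", "cache-c2:11211"]


def pick_pool(key):
    """Return the pool for the longest prefix matching the key, else the default."""
    matches = [prefix for prefix in POOLS if key.startswith(prefix)]
    if not matches:
        return DEFAULT_POOL
    return POOLS[max(matches, key=len)]


def pick_server(key):
    """Pick one server in the key's pool with a simple CRC32 modulo hash."""
    pool = pick_pool(key)
    return pool[zlib.crc32(key.encode("utf-8")) % len(pool)]
```

    Modulo hashing is the simplest thing that works for a sketch; it reshuffles most keys when a pool changes size, which is why real deployments tend to prefer consistent hashing.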

  • DIRECT MARKETING - A GENERAL GUIDE FOR DATA CONTROLLERS

    In particular:

    Where you have obtained contact details in the context of the sale of a product or service, you may only use these details for direct marketing by electronic mail if the following conditions are met: the product or service you are marketing is of a kind similar to that which you sold to the customer at the time you obtained their contact details; at the time you collected the details, you gave the customer the opportunity to object, in an easy manner and without charge, to their use for marketing purposes; each time you send a marketing message, you give the customer the right to object to receipt of further messages; the sale of the product or service occurred not more than twelve months prior to the sending of the electronic marketing communication or, where applicable, the contact details were used for the sending of an electronic marketing communication in that twelve month period.

    (tags: email spam regulations ireland law dpc marketing direct-marketing)

Links for 2014-09-21

Links for 2014-09-19

Links for 2014-09-18

  • 75% of domestic violence victims in US shelters were spied on by their abusers using spyware

    via Mikko

  • Alex Payne — Thoughts On Five Years of Emerging Languages

    One could read the success of Go as an indictment of contemporary PLT, but I prefer to see it as a reminder of just how much language tooling matters. Perhaps even more critical, Go’s lean syntax, selective semantics, and cautiously-chosen feature set demonstrate the importance of a strong editorial voice in a language’s design and evolution. Having co-authored a book on Scala, it’s been painful to see systems programmers in my community express frustration with the ambitious hybrid language. I’ve watched them abandon ship and swim back to the familiar shores of Java, or alternately into the uncharted waters of Clojure, Go, and Rust. A pity, but not entirely surprising if we’re being honest with ourselves. Unlike Go, Scala has struggled with tooling from its inception. More than that, Scala has had a growing editorial problem. Every shop I know that’s been successful with Scala has limited itself to some subset of the language. Meanwhile, in pursuit of enterprise developers, its surface area has expanded in seemingly every direction. The folks behind Scala have, thankfully, taken notice: upcoming releases are promised to focus on simplicity, clarity, and better tooling.

    (tags: scala go coding languages)

Links for 2014-09-17

  • Texas Judge References 'The Big Lebowski'

    "The First Amendment of the U.S. Constitution is similarly suspicious of prior restraints," wrote Justice Lehrmann in the decision highlighting a cornerstone that has "been reaffirmed time and again by the Supreme Court, this Court, Texas courts of appeals, legal treatises, and even popular culture." That last reference to popular culture contained an interesting footnote citing none other than Walter Sobchak, a character in ['The Big Lebowski'].

    (tags: lebowski movies coen-brothers prior-restraint law supreme-court walter-sobchak funny)

  • on using JSON as a config file format

    Ben Hughes on twitter: "JSON is fine for config files, if you don't want to comment your config file. Which is a way of saying, it isn't fine for config files."

    (tags: ben-hughes funny json file-formats config-files configuration software coding)
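
    The point is easy to check with Python's standard `json` module. A minimal sketch, where the config text and the `try_load` helper are made up for illustration:

```python
import json

# Hypothetical config text: valid JSON apart from one "//" comment line.
CONFIG_WITH_COMMENT = """{
    // port the service listens on
    "port": 8080
}"""

# The same config with the comment stripped out.
CONFIG_PLAIN = '{"port": 8080}'


def try_load(text):
    """Return the parsed config, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


commented = try_load(CONFIG_WITH_COMMENT)  # rejected: JSON has no comment syntax
plain = try_load(CONFIG_PLAIN)             # parses fine
```

    The usual workarounds are a fake "comment" key in the data, or a different format that allows comments.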

  • Understanding weak isolation is a serious problem

    Peter Bailis complaining about the horrors of modern transactional databases and their unserializability, which no one seems to be paying attention to: 'As you’re probably aware, there’s an ongoing and often lively debate between transactional adherents and more recent “NoSQL” upstarts about related issues of usability, data corruption, and performance. But, in contrast, many of these transactional adherents and the research community as a whole have effectively ignored weak isolation — even in a single server setting and despite the fact that literally millions of businesses today depend on weak isolation and that many of these isolation levels have been around for almost three decades.' 'Despite the ubiquity of weak isolation, I haven’t found a database architect, researcher, or user who’s been able to offer an explanation of when, and, probably more importantly, why isolation models such as Read Committed are sufficient for correct execution. It’s reasonably well known that these weak isolation models represent “ACID in practice,” but I don’t think we have any real understanding of how so many applications are seemingly (!?) okay running under them. (If you haven’t seen these models before, they’re a little weird. For example, Read Committed isolation generally prevents users from reading uncommitted or non-final writes but allows a number of bad things to happen, like lost updates during concurrent read-modify-write operations. Why is this apparently okay for many applications?)'

    (tags: acid consistency databases peter-bailis transactional corruption serializability isolation reliability)
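
    The "lost update" mentioned above can be shown with a toy, single-threaded simulation. This is a sketch only: `ReadCommittedStore` and the two `run_*` functions are invented for illustration and model just one property of Read Committed (reads see committed data only), not any real database.

```python
class ReadCommittedStore:
    """Toy single-value store: reads only ever see committed data."""

    def __init__(self, value):
        self.committed = value

    def read(self):
        return self.committed

    def commit(self, value):
        self.committed = value


def run_interleaved(store):
    # Both transactions read the committed balance before either one writes.
    seen_by_a = store.read()
    seen_by_b = store.read()
    # Each adds 10 to the value it read and commits; no dirty reads occur.
    store.commit(seen_by_a + 10)
    store.commit(seen_by_b + 10)  # silently overwrites A's update
    return store.committed


def run_serial(store):
    # Same two increments, but B only starts after A has committed.
    store.commit(store.read() + 10)
    store.commit(store.read() + 10)
    return store.committed


interleaved_result = run_interleaved(ReadCommittedStore(100))
serial_result = run_serial(ReadCommittedStore(100))
```

    Neither transaction ever reads uncommitted data, yet the interleaved run ends 10 short of the serial one, which is the read-modify-write anomaly the quote describes.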

  • "Left-Right: A Concurrency Control Technique with Wait-Free Population Oblivious Reads" [pdf]

    'In this paper, we describe a generic concurrency control technique with Blocking write operations and Wait-Free Population Oblivious read operations, which we named the Left-Right technique. It is of particular interest for real-time applications with dedicated Reader threads, due to its wait-free property that gives strong latency guarantees and, in addition, there is no need for automatic Garbage Collection. The Left-Right pattern can be applied to any data structure, allowing concurrent access to it similarly to a Reader-Writer lock, but in a non-blocking manner for reads. We present several variations of the Left-Right technique, with different versioning mechanisms and state machines. In addition, we constructed an optimistic approach that can reduce synchronization for reads.' See also http://concurrencyfreaks.blogspot.ie/2013/12/left-right-concurrency-control.html for java implementation code.

    (tags: left-right concurrency multithreading wait-free blocking realtime gc latency reader-writer locking synchronization java)

  • Russell91/sshrc

    'bring your .bashrc, .vimrc, etc. with you when you ssh'. A really nice implementation of this idea (much nicer than my own version!)

    (tags: hacks productivity ssh remote shell sh bash via:johnke home-directory unix)

  • Troubleshooting Production JVMs with jcmd

    Remotely trigger GCs, finalization, heap dumps, etc. Handy.

    (tags: jvm jcmd debugging ops java gc heap troubleshooting)

Links for 2014-09-16

Links for 2014-09-15

  • The State of ZFS on Linux

    Linux users familiar with other filesystems or ZFS users from other platforms will often ask whether ZFS on Linux (ZoL) is “stable”. The short answer is yes, depending on your definition of stable. The term stable itself is somewhat ambiguous.
    Oh dear, that's not a good start. Good reference page, though.

    (tags: zfs linux filesystems ops solaris)

  • Screen time: Steve Jobs was a low tech parent

    “This is rule No. 1: There are no screens in the bedroom. Period. Ever.”

    (tags: screen-time kids children tv mobile technology life rules parenting)

  • CausalImpact: A new open-source package for estimating causal effects in time series

    How can we measure the number of additional clicks or sales that an AdWords campaign generated? How can we estimate the impact of a new feature on app downloads? How do we compare the effectiveness of publicity across countries? In principle, all of these questions can be answered through causal inference. In practice, estimating a causal effect accurately is hard, especially when a randomised experiment is not available. One approach we've been developing at Google is based on Bayesian structural time-series models. We use these models to construct a synthetic control — what would have happened to our outcome metric in the absence of the intervention. This approach makes it possible to estimate the causal effect that can be attributed to the intervention, as well as its evolution over time. We've been testing and applying structural time-series models for some time at Google. For example, we've used them to better understand the effectiveness of advertising campaigns and work out their return on investment. We've also applied the models to settings where a randomised experiment was available, to check how similar our effect estimates would have been without an experimental control. Today, we're excited to announce the release of CausalImpact, an open-source R package that makes causal analyses simple and fast. With its release, all of our advertisers and users will be able to use the same powerful methods for estimating causal effects that we've been using ourselves. Our main motivation behind creating the package has been to find a better way of measuring the impact of ad campaigns on outcomes. However, the CausalImpact package could be used for many other applications involving causal inference. Examples include problems found in economics, epidemiology, or the political and social sciences.

    (tags: causal-inference r google time-series models bayes adwords advertising statistics estimation metrics)
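
    A rough feel for the "what would have happened without the intervention" idea, as a sketch only. This is not the CausalImpact API and it is not a Bayesian structural time-series model: it swaps in a plain least-squares fit against a single control series, and the numbers and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_effects(control, outcome, intervention_index):
    """Fit on the pre-intervention period, then compare actual vs predicted after it."""
    intercept, slope = fit_line(control[:intervention_index],
                                outcome[:intervention_index])
    post = zip(control[intervention_index:], outcome[intervention_index:])
    # Pointwise effect = observed outcome minus the counterfactual prediction.
    return [y - (intercept + slope * x) for x, y in post]


# Hypothetical data: the outcome tracks the control until index 5, then gets a lift.
control = [10, 12, 11, 13, 14, 15, 16, 17]
outcome = [25, 29, 27, 31, 33, 38, 40, 42]
effects = estimate_effects(control, outcome, intervention_index=5)
```

    The real package also gives uncertainty intervals around the estimate, which this toy version does not attempt.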

  • Top 10 Historic Sites in Ireland and Northern Ireland -- National Geographic

    Shamefully, I haven't visited most of these!

    (tags: history neolithic ireland northern-ireland national-geographic tourism places)

  • Software patents are crumbling, thanks to the Supreme Court

    Now a series of decisions from lower courts is starting to bring the ruling's practical consequences into focus. And the results have been ugly for fans of software patents. By my count there have been 11 court rulings on the patentability of software since the Supreme Court's decision — including six that were decided this month.  Every single one of them has led to the patent being invalidated. This doesn't necessarily mean that all software patents are in danger — these are mostly patents that are particularly vulnerable to challenge under the new Alice precedent. But it does mean that the pendulum of patent law is now clearly swinging in an anti-patent direction. Every time a patent gets invalidated, it strengthens the bargaining position of every defendant facing a lawsuit from a patent troll.

    (tags: patents law alice swpats software supreme-court patent-trolls)

  • Riding with the Stars: Passenger Privacy in the NYC Taxicab Dataset

    A practical demo of "differential privacy" -- allowing public data dumps to happen without leaking privacy, using Laplace noise addition

    (tags: differential-privacy privacy leaks public-data open-data data nyc taxis laplace noise randomness)
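
    For reference, Laplace noise addition on a simple counting query looks roughly like this. A minimal sketch, not the code from the linked post: the function names, the epsilon value, and the example count are assumptions for illustration, and a counting query is assumed to have sensitivity 1.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    # expovariate takes the rate, so 1/scale gives each draw a mean of `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return true_count + laplace_noise(scale, rng)


# Hypothetical usage: publish a noisy version of "42 pickups in this block".
rng = random.Random(2014)
released = noisy_count(42, epsilon=0.5, rng=rng)
```

    Smaller epsilon means more noise and stronger privacy; the noise averages out over many queries, which is why the total privacy budget across releases matters.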

Links for 2014-09-14

  • Platform Game

    I'm ambivalent about Microsoft acquiring Mojang. Will they Embrace and Extend Minecraft as they've done with other categories? Let's hope not. On the other hand, some adult supervision and a Plugin API would be welcome. Mojang have the financial resources but lack the will and focus needed to publish and support a Plugin API. Perhaps Mojang themselves don't realise just how important their little game has become.

    (tags: minecraft platforms games plugins mojang microsoft)

Links for 2014-09-11

Links for 2014-09-10

  • Apple: Untrustable

    Today, Apple announced their “Most Personal Device Ever”. They also announced Apple Pay (the only mentions of “security” and “privacy” in today’s event), and are rolling out health tracking and home automation in iOS 8. Given their feckless track record [with cloud-service security], would you really trust Apple with (even more of) your digital life?

    (tags: icloud apple fail security hacks privacy)

  • Not Safe For Not Working On

    Excellent post from Dan Kaminsky on concrete actions that cloud service providers like Apple and Google need to start taking.

    *It's time to ban Password1*: [...] Defenders are using simple rules like “doesn’t have an uppercase letter” and “not enough punctuation” to block passwords while attackers are just straight up analyzing password dumps and figuring out the most likely passwords to attempt in any scenario.  Attackers are just way ahead.  That has to change.  Defenders have password dumps too now.  It’s time we start outright blocking passwords common enough that they can be online brute forced, and it’s time we admit we know what they are. [...] *People use communication technologies for sexy times. Deal with it*: Just like browsers have porn mode for the personal consumption of private imagery, cell phones have applications that are significantly less likely to lead to anyone else but your special friends seeing your special bits. I personally advise Wickr, an instant messaging firm that develops secure software for iPhone and Android. What’s important about Wickr here isn’t just the deep crypto they’ve implemented, though it’s useful too. What’s important in this context is that with this code there’s just a lot fewer places to steal your data from. Photos and other content sent in Wickr don’t get backed up to your desktop, don’t get saved in any cloud, and by default get removed from your friend’s phone after an amount of time you control. Wickr is of course not the only company supporting what’s called “ephemeral messaging”; SnapChat also dramatically reduces the exposure of your private imagery. [...]
    via Leonard.

    (tags: icloud apple privacy security via:lhl snapchat wickr dan-kaminsky cloud-services backup)
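
    The "ban Password1" point in a few lines of code. A sketch under obvious assumptions: the four-entry blocklist is a made-up stand-in for a list derived from real password dumps, and both function names are hypothetical.

```python
# Hypothetical stand-in for a blocklist built from real password dumps.
COMMON_PASSWORDS = {"password1", "123456", "qwerty", "letmein"}


def passes_composition_rules(password):
    """Typical defender rule: minimum length plus upper, lower and digit."""
    return (
        len(password) >= 8
        and any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
    )


def is_blocked(password, blocklist=COMMON_PASSWORDS):
    """Reject any password that appears on the common-password list, ignoring case."""
    return password.lower() in blocklist


candidate = "Password1"
allowed_by_rules = passes_composition_rules(candidate)  # composition rules accept it
blocked_by_list = is_blocked(candidate)                  # the blocklist catches it
```

    The composition rules wave "Password1" through while the blocklist rejects it, which is the gap between defenders and attackers that the post is complaining about.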

  • Inside Apple’s Live Event Stream Failure, And Why It Happened: It Wasn’t A Capacity Issue

    The bottom line with this event is that the encoding, translation, JavaScript code, the video player, the call to S3 single storage location and the millisecond refreshes all didn’t work properly together and was the root cause of Apple’s failed attempt to make the live stream work without any problems. So while it would be easy to say it was a CDN capacity issue, which was my initial thought considering how many events are taking place today and this week, it does not appear that a lack of capacity played any part in the event not working properly. Apple simply didn’t provision and plan for the event properly.

    (tags: cdn streaming apple fail scaling s3 akamai caching)

Links for 2014-09-09

Links for 2014-09-08

Links for 2014-09-06

  • 'The very first release of Gmail simply used spamassassin on the backend'

    Excellent. Confirming what I'd heard from a few other sources, too ;) This is a well-written history of the anti-spam war so far, from Mike Hearn, writing with the Google/Gmail point of view:

    Brief note about my background, to establish credentials: I worked at Google for about 7.5 years. For about 4.5 of those I worked on the Gmail abuse team, which is very tightly linked with the spam team (they use the same software, share the same on-call rotations etc).
    Reading this kind of stuff is awesome for me, since it's a nice picture of a fun problem to work on -- the Gmail team took the right ideas about how to fight spam, and scaled them up to the 10s-of-millions DAU mark. Nicely done. The second half is some interesting musings on end-to-end encrypted communications and how it would deal with spam. Worth a read...

    (tags: gmail google spam anti-spam filtering spamassassin history)

  • The FBI Finally Says How It ‘Legally’ Pinpointed Silk Road’s Server

    The answer, according to a new filing by the case’s prosecution, is far more mundane: The FBI claims to have found the server’s location without the NSA’s help, simply by fiddling with the Silk Road’s login page until it leaked its true location.

    (tags: fbi nsa silk-road tor opsec dread-pirate-roberts wired)

Links for 2014-09-05

Links for 2014-09-04

  • Visualizing Garbage Collection Algorithms

    Great dataviz with animated GIFs

    (tags: algorithms gc memory visualization garbage-collection dataviz refcounting mark-and-sweep)

  • Standard Markdown

    John Gruber’s canonical description of Markdown’s syntax does not specify the syntax unambiguously. In the absence of a spec, early implementers consulted the original Markdown.pl code to resolve these ambiguities. But Markdown.pl was quite buggy, and gave manifestly bad results in many cases, so it was not a satisfactory replacement for a spec. Because there is no unambiguous spec, implementations have diverged considerably. As a result, users are often surprised to find that a document that renders one way on one system (say, a GitHub wiki) renders differently on another (say, converting to docbook using Pandoc). To make matters worse, because nothing in Markdown counts as a “syntax error,” the divergence often isn't discovered right away. There's no standard test suite for Markdown; the unofficial MDTest is the closest thing we have. The only way to resolve Markdown ambiguities and inconsistencies is Babelmark, which compares the output of 20+ implementations of Markdown against each other to see if a consensus emerges. We propose a standard, unambiguous syntax specification for Markdown, along with a suite of comprehensive tests to validate Markdown implementations against this specification. We believe this is necessary, even essential, for the future of Markdown.

    (tags: writing markdown specs standards text formats html)

  • Postcodes at last but random numbers don’t address efficiency

    Karlin Lillington assembles a fine collection of quotes from various sources panning the new Eircode system:

    Critics say the opportunity has been missed to use Ireland’s clean-slate status to produce a technologically innovative postcode system that would be at the cutting edge globally; similar to the competitive leap that was provided when the State switched to a digital phone network in the 1980s, well ahead of most of the world. Instead, say organisations such as the Freight Transport Association of Ireland (FTAI), the proposed seven-digit format of scrambled letters and numbers is almost useless for a business sector that should most benefit from a proper postcode system: transport and delivery companies, from international giants like FedEx and UPS down to local courier, delivery and service supplier firms. Because each postcode will reveal the exact address of a home or business, privacy advocates are concerned that online use of postcodes could link many types of internet activity, including potentially sensitive online searches, to a specific household or business.

    (tags: eircode government fail ireland postcodes location ftai random)

Links for 2014-09-03

Links for 2014-09-02

  • Nix: The Purely Functional Package Manager

    'a powerful package manager for Linux and other Unix systems that makes package management reliable and reproducible. It provides atomic upgrades and rollbacks, side-by-side installation of multiple versions of a package, multi-user package management and easy setup of build environments.' Basically, this is a third-party open source reimplementation of Amazon's (excellent) internal packaging system, using symlinks to versioned package directories to ensure atomicity and the ability to roll back. This is definitely the *right* way to build packages -- I know what tool I'll be pushing for, next time this question comes up. See also nixos.org for a Linux distro built on Nix.
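
    To make the symlink trick concrete, here is a minimal sketch of the general idea, not Nix's actual implementation: each version lives in its own directory, and a "current" symlink is repointed with an atomic rename. The directory names and the `activate` helper are made up for illustration, and the sketch assumes a POSIX filesystem.

```python
import os
import tempfile


def activate(profile_link, version_dir):
    """Atomically repoint profile_link at version_dir (POSIX only)."""
    tmp_link = profile_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(version_dir, tmp_link)
    # rename(2) swaps the link in one step, so readers see either the
    # old target or the new one, never a half-installed state.
    os.replace(tmp_link, profile_link)


def demo():
    root = tempfile.mkdtemp()
    v1 = os.path.join(root, "pkg-1.0")  # hypothetical versioned directories
    v2 = os.path.join(root, "pkg-2.0")
    os.mkdir(v1)
    os.mkdir(v2)
    current = os.path.join(root, "current")

    activate(current, v1)               # initial install
    activate(current, v2)               # upgrade
    upgraded = os.path.basename(os.readlink(current))
    activate(current, v1)               # rollback is just another switch
    rolled_back = os.path.basename(os.readlink(current))
    return upgraded, rolled_back


if __name__ == "__main__":
    print(demo())
```

    Because old version directories are never modified in place, a rollback costs no more than an upgrade.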

    (tags: ops linux devops unix packaging distros nix nixos atomic upgrades rollback versioning)

Links for 2014-09-01

  • Facebook's drop-in replacement for std::vector

    Fixes some low-hanging fruit, performance-wise. 'Simply replacing std::vector with folly::fbvector (after having included the folly/FBVector.h header file) will improve the performance of your C++ code using vectors with common coding patterns. The improvements are always non-negative, almost always measurable, frequently significant, sometimes dramatic, and occasionally spectacular.' (via Tony Finch)

    (tags: c++ facebook performance algorithms vectors via:fanf optimization)

  • Applying cardiac alarm management techniques to your on-call

    An ops-focused take on a recent story about alarm fatigue, and how a Boston hospital dealt with it. When I was in Amazon, many of the teams in our division had a target to reduce false positive pages, with a definite monetary value attached to it, since many teams had "time off in lieu" payments for out-of-hours pages to the on-call staff. As a result, reducing false-positive pages was reasonably high priority and we dealt with this problem very proactively, with a well-developed sense of how to do so. It's interesting to see how the outside world is only just starting to look into its amelioration. (Another benefit of a TOIL policy ;)

    (tags: ops monitoring sysadmin alerts alarms nagios alarm-fatigue false-positives pages)

  • "Invertible Bloom Lookup Tables" [paper]

    'We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of key-value pairs, but also allows a complete listing of the pairs it contains with high probability, as long as the number of key-value pairs is below a designed threshold. Our structure allows the number of key-value pairs to greatly exceed this threshold during normal operation. Exceeding the threshold simply temporarily prevents content listing and reduces the probability of a successful lookup. If entries are later deleted to return the structure below the threshold, everything again functions appropriately. We also show that simple variations of our structure are robust to certain standard errors, such as the deletion of a key without a corresponding insertion or the insertion of two distinct values for a key. The properties of our structure make it suitable for several applications, including database and networking applications that we highlight.'
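
    A toy version helps show how the listing works. The sketch below is my own simplification of the basic structure in the abstract, restricted to integer keys and values: each cell keeps a count plus XOR sums of keys and values, and listing repeatedly "peels" cells that hold exactly one pair. The class name, the sizes and the SHA-256-based cell choice are assumptions for illustration, and it assumes every delete matches an earlier insert (it has none of the paper's error-tolerant variations).

```python
import copy
import hashlib


class IBLT:
    """Toy invertible Bloom lookup table for non-negative integer keys/values."""

    def __init__(self, num_hashes=3, cells_per_hash=100):
        self.k = num_hashes
        self.m = cells_per_hash
        size = self.k * self.m
        self.count = [0] * size    # number of pairs mapped to each cell
        self.key_sum = [0] * size  # XOR of the keys mapped to each cell
        self.val_sum = [0] * size  # XOR of the values mapped to each cell

    def _cells(self, key):
        # One cell per hash function, each in its own sub-table, so the
        # k cells for a key are always distinct.
        cells = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            cells.append(i * self.m + int.from_bytes(digest[:8], "big") % self.m)
        return cells

    def _update(self, key, value, delta):
        for c in self._cells(key):
            self.count[c] += delta
            self.key_sum[c] ^= key
            self.val_sum[c] ^= value

    def insert(self, key, value):
        self._update(key, value, +1)

    def delete(self, key, value):
        # Assumes (key, value) was inserted earlier.
        self._update(key, value, -1)

    def get(self, key):
        for c in self._cells(key):
            if self.count[c] == 0:
                return None  # key is not in the table
            if self.count[c] == 1:
                return self.val_sum[c] if self.key_sum[c] == key else None
        raise LookupError("every cell is shared; cannot answer")

    def list_entries(self):
        """Return (pairs, complete) by repeatedly peeling single-pair cells."""
        table = copy.deepcopy(self)  # peeling is destructive, so use a copy
        pairs = {}
        progress = True
        while progress:
            progress = False
            for c in range(len(table.count)):
                if table.count[c] == 1:
                    key, value = table.key_sum[c], table.val_sum[c]
                    pairs[key] = value
                    table.delete(key, value)
                    progress = True
        return pairs, not any(table.count)
```

    Overloading the table well past its capacity makes `list_entries` report an incomplete listing; once enough pairs are deleted again, the same table lists its contents in full, which is the behaviour the abstract describes.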

    (tags: iblt bloom-filters data-structures performance algorithms coding papers probabilistic)

  • Some UX Dark Patterns now illegal in the EU

    The EU’s new consumer rights law bans certain dark patterns related to e-commerce across Europe. The “sneak into basket” pattern is now illegal. Full stop, end of story. You cannot create a situation where additional items and services are added by default. [...] Hidden costs are now illegal, whether that’s an undeclared subscription, extra shipping charges, or extra items. [....] Forced continuity, when imposed on the user as a form of bait-and-switch, has been banned. Just the other day a web designer mentioned to me that he had only just discovered he had been charged for four years of annual membership dues in a “theme club”, having bought what he thought was a one-off theme. Since he lives in Europe, he may be able to claim all of this money back. All he needs to do is prove that the website did not inform him that the purchase included a membership with recurring payments.

    (tags: design europe law ecommerce ux dark-patterns scams ryanair selling online consumer consumer-rights bait-and-switch)

  • Girl Not Against Fluoride

    The CDC (Centre for Disease Control) lists water fluoridation as one of the ten great public health achievements of the 20th Century. Today, Dublin City Council will vote on whether to remove fluoride from our water supply, and when they do, it will not be because the CDC or the WHO have changed their mind about fluoridation, or because new and compelling information makes it the only choice. It will be because people who believe in angel healing, homeopathy, and chemtrails, have somehow gained the ability to influence public policy.

    (tags: dcc dublin law flouride science zenbuffy homeopathy woo health teeth)

  • Revisiting How We Put Together Linux Systems

    Building a running OS out of layered btrfs filesystems. This sounds awesome.

    Instantiating a new system or OS container (which is exactly the same in this scheme) just consists of creating a new appropriately named root sub-volume. Completely naturally you can share one vendor OS copy in one specific version with a multitude of container instances. Everything is double-buffered (or actually, n-fold-buffered), because usr, runtime, framework, app sub-volumes can exist in multiple versions. Of course, by default the execution logic should always pick the newest release of each sub-volume, but it is up to the user keep multiple versions around, and possibly execute older versions, if he desires to do so. In fact, like on ChromeOS this could even be handled automatically: if a system fails to boot with a newer snapshot, the boot loader can automatically revert back to an older version of the OS.
    (via Tony Finch)

    (tags: via:fanf linux docker btrfs filesystems unionfs copy-on-write os hacking unix)

Links for 2014-08-29

Links for 2014-08-28

Links for 2014-08-27

  • Apache Kafka 0.8 basic training

    This is a pretty voluminous and authoritative presentation about getting started with Kafka; wish this was around when we started using it for 0.7. (We use our own homegrown realtime system nowadays, due to better partitioning, monitoring and operability.)

    (tags: storm kafka presentations documentation ops)

  • Wiki Loves Monuments

    Wiki Loves Monuments is an international photo contest, organised by Wikimedia [...]. This year, the Wikimedia Ireland Community are running the competition for the very first time in Ireland. The contest is inspired by the successful 2010 pilot in the Netherlands which resulted in 12,500 freely licensed images uploaded to Wikimedia Commons. It has grown substantially since its inception; in 2013 369,589 photographs were submitted by 11,943 participants from over 50 countries. Cultural heritage is an important part of the knowledge that Wikipedia collects and disseminates. An image is worth a thousand words, in any language and local enthusiasts can (re)discover the cultural, historical, or scientific significance of their neighbourhood. The Irish contest, focussing on Ireland’s national monuments, runs from August 23 - September 30. Follow our step-by-step guide to find out how you can take part.

    (tags: wikipedia wikimedia images monuments history ireland contests creative-commons licensing)

  • "CryptoPhone" claims to detect IMSI catchers in operation

    To show what the CryptoPhone can do that less expensive competitors cannot, he points me to a map that he and his customers have created, indicating 17 different phony cell towers known as “interceptors,” detected by the CryptoPhone 500 around the United States during the month of July alone.  Interceptors look to a typical phone like an ordinary tower.  Once the phone connects with the interceptor, a variety of “over-the-air” attacks become possible, from eavesdropping on calls and texts to pushing spyware to the device. “Interceptor use in the U.S. is much higher than people had anticipated,” Goldsmith says.  “One of our customers took a road trip from Florida to North Carolina and he found 8 different interceptors on that trip.  We even found one at South Point Casino in Las Vegas.”

    (tags: imsi-catchers security cryptophone phones mobile 3g 4g eavesdropping surveillance)

Links for 2014-08-25

Links for 2014-08-22

Links for 2014-08-21

Links for 2014-08-19

  • Nyms Identity Directory

    The way that [problems with the PGP bootstrapping] are supposed to be resolved is with an authentication model called the Web of Trust where users sign keys of other users after verifying that they are who they say they are. In theory, if some due diligence is applied in signing other people’s keys and a sufficient number of people participate you’ll be able to follow a short chain of signatures from people you already know and trust to new untrusted keys you download from a key server. In practice this has never worked out very well as it burdens users with the task of manually finding people to sign their keys and even experts find the Web of Trust model difficult to reason about. This also reveals the social graph of certain communities which may place users at risk for their associations. Such signatures also reveal metadata about times and thus places for meetings for key signings. The Nyms Identity Directory is a replacement for all of this. Keyservers are replaced with an identity directory that gives users full control over publication of their key information and web of trust is replaced with a distributed network of trusted notaries which validate user keys with an email verification protocol.
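
    The "short chain of signatures" idea is essentially a shortest-path search over a graph of who signed whose key. A rough sketch, assuming the signatures have already been fetched and verified; the `trust_path` function, the hop limit and the names are all hypothetical, and this is neither GnuPG's trust model nor anything from Nyms:

```python
from collections import deque


def trust_path(signatures, trusted, target, max_hops=3):
    """Breadth-first search for the shortest signature chain to target.

    signatures maps a signer to the list of keys that signer has signed;
    trusted is the list of keys the user already trusts.
    """
    queue = deque((key, [key]) for key in trusted)
    seen = set(trusted)
    while queue:
        key, path = queue.popleft()
        if key == target:
            return path
        if len(path) - 1 >= max_hops:
            continue  # chain is already as long as we will accept
        for signed in signatures.get(key, ()):
            if signed not in seen:
                seen.add(signed)
                queue.append((signed, path + [signed]))
    return None  # no acceptable chain: the key stays untrusted


if __name__ == "__main__":
    sigs = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"]}
    print(trust_path(sigs, ["alice"], "carol"))
```

    Note that the `signatures` mapping is exactly the social graph the excerpt warns about: anyone who can run this search can also see who has vouched for whom.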

    (tags: web-of-trust directories nyms privacy crypto identity trust pgp gpg security via:ioerror keyservers notaries)

  • Frogsort

    Frogsort as an exam question (via qwghlm)

    (tags: via:qwghlm frogsort sorting big-o algorithms funny comics smbc)

  • Punished for Being Poor: Big Data in the Justice System

    This is awful. Totally the wrong tool for the job -- a false positive rate which is minuscule for something like spam filtering could translate to a really horrible outcome for a human life.

    Currently, over 20 states use data-crunching risk-assessment programs for sentencing decisions, usually consisting of proprietary software whose exact methods are unknown, to determine which individuals are most likely to re-offend. The Senate and House are also considering similar tools for federal sentencing. These data programs look at a variety of factors, many of them relatively static, like criminal and employment history, age, gender, education, finances, family background, and residence. Indiana, for example, uses the LSI-R, the legality of which was upheld by the state’s supreme court in 2010. Other states use a model called COMPAS, which uses many of the same variables as LSI-R and even includes high school grades. Others are currently considering the practice as a way to reduce the number of inmates and ensure public safety. (Many more states use or endorse similar assessments when sentencing sex offenders, and the programs have been used in parole hearings for years.) Even the American Law Institute has embraced the practice, adding it to the Model Penal Code, attesting to the tool’s legitimacy.
    (via stroan)

    (tags: via:stroan statistics false-positives big-data law law-enforcement penal-code risk sentencing)

Links for 2014-08-18

  • Microservices - Not a free lunch! - High Scalability

    Some good reasons not to adopt microservices blindly. Testability and distributed-systems complexity are my biggest fears.

    (tags: microservices soa devops architecture testing distcomp)

  • Richard Clayton - Failing at Microservices

    Solid warts-and-all confessional blogpost about a team failing to implement a microservices architecture. I'd put most of the blame on insufficient infrastructure to support them (at a code level), inter-personal team problems, and inexperience with large-scale complex multi-service production deployment and the work it was going to require.

    (tags: microservices devops collaboration architecture fail team deployment soa)

  • Box Tech Blog » A Tale of Postmortems

    How Box introduced COE-style dev/ops outage postmortems, and got them working. This PIE metric sounds really useful to head off the dreaded "it'll all have to come out missus" action item:

    The picture was getting clearer, and we decided to look into individual postmortems and action items and see what was missing. As it was, action items were wasting away with no owners. Digging deeper, we noticed that many action items entailed massive refactorings or vague requirements like “make system X better” (i.e. tasks that realistically were unlikely to be addressed). At a higher level, postmortem discussions often devolved into theoretical debates without a clear outcome. We needed a way to lower and focus the postmortem bar and a better way to categorize our action items and our technical debt. Out of this need, PIE (“Probability of recurrence * Impact of recurrence * Ease of addressing”) was born. By ranking each factor from 1 (“low”) to 5 (“high”), PIE provided us with two critical improvements: 1. A way to police our postmortems discussions. I.e. a low probability, low impact, hard to implement solution was unlikely to get prioritized and was better suited to a discussion outside the context of the postmortem. Using this ranking helped deflect almost all theoretical discussions. 2. A straightforward way to prioritize our action items. What’s better is that once we embraced PIE, we also applied it to existing tech debt work. This was critical because we could now prioritize postmortem action items alongside existing work. Postmortem action items became part of normal operations just like any other high-priority work.
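
    The PIE arithmetic is simple enough to sketch. This is only an illustration of the formula quoted above (three factors ranked 1 to 5, multiplied together); the `ActionItem` class and the sample items are invented, not Box's tooling.

```python
from dataclasses import dataclass


@dataclass
class ActionItem:
    title: str
    probability: int  # probability of recurrence, 1 ("low") to 5 ("high")
    impact: int       # impact of recurrence, 1 to 5
    ease: int         # ease of addressing, 1 to 5

    def pie(self):
        for factor in (self.probability, self.impact, self.ease):
            if not 1 <= factor <= 5:
                raise ValueError("PIE factors must be between 1 and 5")
        return self.probability * self.impact * self.ease


def prioritise(items):
    """Highest PIE score first."""
    return sorted(items, key=lambda item: item.pie(), reverse=True)


if __name__ == "__main__":
    # Hypothetical action items from an imaginary postmortem.
    backlog = [
        ActionItem("Rewrite system X from scratch", 1, 2, 1),
        ActionItem("Add alert on queue depth", 4, 4, 5),
        ActionItem("Document failover runbook", 3, 4, 4),
    ]
    for item in prioritise(backlog):
        print(item.pie(), item.title)
```

    The "massive refactoring" style of item sinks to the bottom because a low ease score drags the whole product down, which is the policing effect the post describes.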

    (tags: postmortems action-items outages ops devops pie metrics ranking refactoring prioritisation tech-debt)

  • NTP's days are numbered for consumer devices

    An accurate clock is required to negotiate SSL/TLS, so clock sync is important for internet-of-things usage. But:

    Unfortunately for us, the traditional and most widespread method for clock synchronisation (NTP) has been caught up in a DDoS issue which has recently caused some ISPs to start blocking all NTP communication. [....] Because the DDoS attacks are so widespread, and the lack of obvious commercial pressure to fix the issue, it’s possible that the days of using NTP as a mechanism for setting clocks may well be numbered. Luckily for us there is a small but growing project that replaces it. tlsdate was started by Jacob Appelbaum of the Tor project in 2012, making use of the SSL handshake in order to extract time from a remote server, and its usage is on the rise. [....] Since we started encountering these problems, we’ve incorporated tlsdate into an over-the-air update, and have successfully started using this in situations where NTP is blocked.

    (tags: tlsdate ntp clocks time sync iot via:gwire ddos isps internet protocols security)

  • Cloudwash – Creating the Technical Prototype

    This is a lovely demo of integrating modern IoT connectivity functionality (remote app control, etc.) with a washing machine using Bergcloud's hardware and backend, and a little logic-analyzer reverse engineering.

    (tags: arduino diy washing-machines iot bergcloud hacking reversing logic-analyzers hardware)

  • Systemd: Harbinger of the Linux apocalypse

    While there are many defensible aspects of Systemd, other aspects boggle the mind. Not the least of these was that, as of a few months ago, trying to debug the kernel from the boot line would cause the system to crash. This was because of Systemd's voracious logging and the fact that Systemd responds to the "debug" flag on the kernel boot line -- a flag meant for the kernel, not anything else. That, straight up, is a bug. However, the Systemd developers didn't see it that way and actively fought with those experiencing the problem. Add the fact that one of the Systemd developers was banned by Linus Torvalds for poor attitude and bad design and another was responsible for causing significant issues with Linux audio support, but blamed the problem on everything else but his software, and you have a bad situation on your hands. There's no shortage of egos in the open source development world. There's no shortage of new ideas and veteran developers and administrators pooh-poohing something new simply because it's new. But there are also 45 years of history behind Unix and extremely good reasons it's still flourishing. Tools designed like Systemd do not fit the Linux mold, to their own detriment. Systemd's design has more in common with Windows than with Unix -- down to the binary logging.
    The link re systemd consuming the "debug" kernel boot arg is a canonical example of inflexible coders refusing to fix their own bugs. (via Jason Dixon)

    (tags: systemd linux red-hat egos linus-torvalds unix init booting debugging logging design software via:obfuscurity)

  • Inside a Chinese Bitcoin Mine

    The mining operation resides on an old, repurposed factory floor, and contains 2500 machines hashing away at 230 Gh/s, each. (That’s 230 billion calculations per second, per unit). [...] The operators told me that the power bill of this specific operation is in excess of ¥400,000 per month [..] about $60,000 USD.

    (tags: currency china economics bitcoin power environment green mining datacenters)

  • Moving Big Data into the Cloud with Tsunami UDP - AWS Big Data Blog

    Pretty serious speedup. 81 MB/sec with Tsunami UDP, compared to 9 MB/sec with plain old scp. Probably kills internet performance for everyone else though!

    (tags: tsunami-udp udp scp copying transfers internet long-distance performance speed)

  • The "sidecar" pattern

    Ha, great name. We use this (in the form of Smartstack).

    For what it is worth, we faced a similar challenge in earlier services (mostly due to existing C/C++ applications) and we created what was called a "sidecar".  By sidecar, what I mean is a second process on each node/instance that did Cloud Service Fabric operations on behalf of the main process (the side-managed process).  Unfortunately those sidecars all went off and created one-offs for their particular service.  In this post, I'll describe a more general sidecar that doesn't force users to have these one-offs. Sidenote:  For those not familiar with sidecars, think of the motorcycle sidecar below.  Snoopy would be the main process with Woodstock being the sidecar process.  The main work on the instance would be the motorcycle (say serving your users' REST requests).  The operational control is the sidecar (say serving health checks and management plane requests of the operational platform).

    (tags: netflix sidecars architecture patterns smartstack netflixoss microservices soa)

Links for 2014-08-15

Links for 2014-08-14

Links for 2014-08-13

Links for 2014-08-12

Links for 2014-08-10

Links for 2014-08-09

Links for 2014-08-08

  • AWS Speed Test: What are the Fastest EC2 and S3 Regions?

    My god, this test is awful -- this is how NOT to test networked infrastructure. (1) testing from a single EC2 instance in each region; (2) uploading to a single test bucket for each test; (3) results don't include min/max or percentiles, just an averaged measurement for each test. FAIL
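
    On point (3), a quick sketch of why an average alone is misleading. The latency figures below are made up, not taken from the test in question; the point is only that a summary should carry min/max and percentiles alongside the mean. It uses `statistics.quantiles`, which needs Python 3.8 or later.

```python
import statistics


def summarise(latencies_ms):
    """Summarise latency samples with more than just the mean."""
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "min": min(latencies_ms),
        "mean": statistics.mean(latencies_ms),
        "p50": cuts[49],  # cuts[i] is the (i + 1)th percentile
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(latencies_ms),
    }


if __name__ == "__main__":
    # Hypothetical samples: 95 fast requests and 5 very slow ones.
    samples = [20] * 95 + [1000] * 5
    for name, value in summarise(samples).items():
        print(name, value)
```

    With these made-up samples the mean is 69 ms even though the median request took 20 ms and the slowest took a full second; the average describes no request that actually happened.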

    (tags: fail testing networking performance ec2 aws s3 internet)

  • Hacker Redirects Traffic From 19 Internet Providers to Steal Bitcoins | Threat Level | WIRED

    'The attacker specifically targeted a collection of bitcoin mining “pools”–bitcoin-producing cooperatives in which users contribute their computers’ processing power and are rewarded with a cut of the resulting cryptocurrency the pool produces. The redirection technique tricked the pools’ participants into continuing to devote their processors to bitcoin mining while allowing the hacker to keep the proceeds. At its peak, according to the researchers’ measurements, the hacker’s scam was pocketing a flow of bitcoins and other digital currencies including dogecoin and worldcoin worth close to $9,000 a day. “With this kind of hijacking, you can quite easily grab a large collection of clients,” says Pat Litke, one of the Dell researchers. “It takes less than a minute, and you end up with a lot of mining traffic under your control.”' 'In total, Stewart and Litke were able to measure $83,000 worth of cryptocurrency stolen in the BGP attack [...] but the total haul could be larger'

    (tags: bitcoin mining fraud internet bgp routing security attacks hacking)

Links for 2014-08-07

Links for 2014-08-06

Links for 2014-08-05

Links for 2014-08-01

Links for 2014-07-31

  • UK private copying exception plans face possible legal action

    Under the proposed private copying exception, individuals in the UK would be given a new right to make a copy of copyrighted material they have lawfully and permanently acquired for their private use, provided it was not for commercial ends. Making a private copy of the material in these circumstances would not be an act of copyright infringement, although making a private copy of a computer program would still be prohibited under the plans. There is no mechanism envisaged in the draft legislation for rights holders to be specifically compensated for the act of private copying. This prompted the Joint Committee on Statutory Instruments (JCSI), tasked with scrutinising the proposals, to warn parliamentarians that the rules may be deemed to be in breach of EU copyright laws as a result of the lack of 'fair compensation' mechanism. [...] "We are disappointed that the private copying exception will be introduced without providing fair compensation for British songwriters, performers and other rights holders within the creative sector. A mechanism for fair compensation is a requirement of European law. In response we are considering our legal options," [UK Music] said.

    (tags: uk law copyright music copying private-copying personal infringement piracy transcoding backup)

  • Moominvalley Map Print | Magic Pony

    Lovely print! Shipping would be a bit crazy, though. There has to be an English-language print of one of Tove Jansson's maps on sale somewhere in Europe...

    (tags: prints moomins moominvalley maps hattifatteners magic-pony tove-jannson art)

Links for 2014-07-29

  • How to take over the computer of any JVM developer

    To prove how easy [MITM attacking Mavencentral JARs] is to do, I wrote dilettante, a man-in-the-middle proxy that intercepts JARs from maven central and injects malicious code into them. Proxying HTTP traffic through dilettante will backdoor any JARs downloaded from maven central. The backdoored version will retain their functionality, but display a nice message to the user when they use the library.

    (tags: jars dependencies java build clojure security mitm http proxies backdoors scala maven gradle)

  • Spain pushes for 'Google tax' to restrict linking

    The government wants to put a tax on linking on the internet. They say that if you want to link to some newspaper's content, you have to pay a tax. The primary targets of this law are Google News and other aggregators. It would be absurd enough just like that, but the law goes further: they declared it an "inalienable right" so even if I have a blog or a new small digital media publication and I want to let people freely link to my content, I can't opt-out--they are charging the levy, and giving it to the big press media. It was just the last and only way that the old traditional media companies can get some money from the government, and they strongly lobbied for it. The bill has passed in the Congress where the party in the government has majority (PP, Partido Popular) and it's headed to the Senate, where they have a majority also.

    (tags: spain stupidity law via:boingboing linking links web news google google-news newspapers old-media taxes)

  • Keyes New Starter Kit for Arduino Fans

    $53 for a reasonable-looking Arduino starter kit, from DealExtreme. Cheap cheap! In the inimitable DX style:

    Keyes new beginner starter kit, pay more attention to beginners learning. Users can get rid of the difficult technological learning, from module used to quick start production.

    (tags: learning arduino hardware hacking robotics toys dealextreme tobuy)

Links for 2014-07-28

  • Check If A Hotel’s WiFi Sucks Before It’s Too Late

    http://www.hotelwifitest.com/ and http://speedspot.org/ .

    (tags: wifi hotels travel reviews techcrunch internet)

  • Collection Pipeline

    A nice summarisation of the state of pipe/stream-oriented collection operations in various languages, from Martin Fowler.
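
    For anyone who hasn't met the pattern, a small sketch of what a collection pipeline looks like in Python: each stage takes a collection and hands a new one to the next stage. The article data and function names are invented for illustration and are not taken from Fowler's piece.

```python
from itertools import islice

# Hypothetical sample data, invented for this sketch.
ARTICLES = [
    {"title": "Streams in Java 8", "tags": ["java", "fp"], "words": 1200},
    {"title": "Unix pipes, revisited", "tags": ["unix"], "words": 800},
    {"title": "Lazy seqs in Clojure", "tags": ["clojure", "fp"], "words": 1500},
    {"title": "Ruby Enumerable tricks", "tags": ["ruby", "fp"], "words": 600},
]


def total_words(articles, tag):
    tagged = (a for a in articles if tag in a["tags"])  # filter stage
    counts = (a["words"] for a in tagged)               # map stage
    return sum(counts)                                  # reduce stage


def longest_titles(articles, tag, limit):
    tagged = filter(lambda a: tag in a["tags"], articles)
    ordered = sorted(tagged, key=lambda a: a["words"], reverse=True)
    return [a["title"] for a in islice(ordered, limit)]


if __name__ == "__main__":
    print(total_words(ARTICLES, "fp"))
    print(longest_titles(ARTICLES, "fp", 2))
```

    The generator expressions are lazy, so nothing is computed until `sum` pulls values through, much like processes connected by a Unix pipe.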

    (tags: martin-fowler patterns coding ruby clojure streams pipelines pipes unix lambda fp java languages)

  • REST Commander: Scalable Web Server Management and Monitoring

    We dynamically monitor and manage a large and rapidly growing number of web servers deployed on our infrastructure and systems. However, existing tools present major challenges when making REST/SOAP calls with server-specific requests to a large number of web servers, and then performing aggregated analysis on the responses. We therefore developed REST Commander, a parallel asynchronous HTTP client as a service to monitor and manage web servers. REST Commander on a single server can send requests to thousands of servers with response aggregation in a matter of seconds. And yes, it is open-sourced at http://www.restcommander.com. Feature highlights: Click-to-run with zero installation; Generic HTTP request template supporting variable-based replacement for sending server-specific requests; Ability to send the same request to different servers, different requests to different servers, and different requests to the same server; Maximum concurrency control (throttling) to accommodate server capacity; Commander itself is also “as a service”: with its powerful REST API, you can define ad-hoc target servers, an HTTP request template, variable replacement, and a regular expression all in a single call. In addition, intuitive step-by-step wizards help you achieve the same functionality through a GUI.
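
    The core combination here (a request template with per-server variables, fanned out in parallel under a concurrency cap) can be sketched with asyncio. This is not REST Commander's API: `fan_out`, `fake_send` and the example hosts are hypothetical, and the "HTTP call" is a stub that just sleeps, so the sketch runs without a network.

```python
import asyncio
from string import Template


async def fake_send(url):
    """Stand-in for a real HTTP request; it only echoes the URL."""
    await asyncio.sleep(0.01)
    return f"200 OK {url}"


async def fan_out(template, targets, max_concurrency, send=fake_send):
    """Fill the template once per target and send with throttling."""
    gate = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def one(variables):
        nonlocal in_flight, peak
        url = Template(template).substitute(variables)
        async with gate:  # at most max_concurrency requests at once
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await send(url)
            finally:
                in_flight -= 1

    # gather() returns responses in the same order as the targets.
    responses = await asyncio.gather(*(one(v) for v in targets))
    return responses, peak


if __name__ == "__main__":
    hosts = [{"host": f"web{i}.example.com", "verbose": 1} for i in range(10)]
    results, peak = asyncio.run(
        fan_out("http://$host/status?verbose=$verbose", hosts, max_concurrency=3)
    )
    print(peak, results[0])
```

    Swapping `fake_send` for a real async HTTP client call would turn this into a working fan-out; aggregating the responses (for example with a regular expression, as the excerpt mentions) would be a further step on top of `responses`.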

    (tags: rest http clients load-testing ebay soap async testing monitoring)

  • South Downs litter picker has truck named after him - West Sussex County Times

    This is amazing. In http://www.newyorker.com/magazine/2014/06/30/stepping-out-3 , David Sedaris had written: 'in recognition of all the rubbish I’ve collected since getting my Fitbit, my local council is naming a garbage truck after me'; naturally, I assumed he was joking, but it looks like he wasn't:

    Horsham District Council has paid thanks to a volunteer who devotes a great deal of time and energy to walking many miles clearing litter from near where he lives as well as surrounding areas. David Sedaris litter picks in areas including Parham, Coldwaltham, Storrington and beyond. In recognition for all his fantastic work and dedication and as a token of Horsham District Council’s appreciation, the council has named one of their waste vehicles after him. The vehicle, bedecked with its bespoke ‘Pig Pen Sedaris’ sign was officially unveiled by the Lord-Lieutenant of West Sussex Mrs Susan Pyper at an outdoor ceremony on July 23.
    Best of all, the article utterly fails to mention who he is. Amazing. (via John Braine)

    (tags: via:john-braine funny david-sedaris litter uk horsham rubbish garbage cleaning volunteering walking)

Links for 2014-07-24

Links for 2014-07-23

  • This tree produces 40 different types of fruit

    An art professor from Syracuse University in the US, Van Aken grew up on a family farm before pursuing a career as an artist, and has combined his knowledge of the two to develop his incredible Tree of 40 Fruit. In 2008, Van Aken learned that an orchard at the New York State Agricultural Experiment Station was about to be shut down due to a lack of funding. This single orchard grew a great number of heirloom, antique, and native varieties of stone fruit, and some of these were 150 to 200 years old. To lose this orchard would render many of these rare and old varieties of fruit extinct, so to preserve them, Van Aken bought the orchard, and spent the following years figuring out how to graft parts of the trees onto a single fruit tree. [...] Aken’s Tree of 40 Fruit looks like a normal tree for most of the year, but in spring it reveals a stunning patchwork of pink, white, red and purple blossoms, which turn into an array of plums, peaches, apricots, nectarines, cherries and almonds during the summer months, all of which are rare and unique varieties.

    (tags: fruit art amazing food agriculture grafting orchards sam-van-aken farming)

Links for 2014-07-22

  • Metrics-Driven Development

    we believe MDD is equal parts engineering technique and cultural process. It separates the notion of monitoring from its traditional position of exclusivity as an operations thing and places it more appropriately next to its peers as an engineering process. Provided access to real-time production metrics relevant to them individually, both software engineers and operations engineers can validate hypotheses, assess problems, implement solutions, and improve future designs.
    Broken down into the following principles: 'Instrumentation-as-Code', 'Single Source of Truth', 'Developers Curate Visualizations and Alerts', 'Alert on What You See', 'Show me the Graph', 'Don’t Measure Everything (YAGNI)'. We do all of these at Swrve, naturally (a technique I happily stole from Amazon).

    (tags: metrics coding graphite mdd instrumentation yagni alerting monitoring graphs)
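
    For a flavour of what 'Instrumentation-as-Code' can look like, here is a minimal sketch, assuming Graphite's plaintext line format (path, value, Unix timestamp). The `timed` helper, the `sink` list and the metric names are made up for illustration; a real setup would ship the lines to a collector instead of appending them to a list.

```python
import time
from contextlib import contextmanager


def graphite_line(path: str, value: float, timestamp: int) -> str:
    """Format one sample as a newline-terminated 'path value timestamp' line."""
    return f"{path} {value} {timestamp}\n"


@contextmanager
def timed(path: str, sink: list, clock=time.time):
    """Time the enclosed block and append a Graphite-style line to sink."""
    start = clock()
    try:
        yield
    finally:
        elapsed_ms = (clock() - start) * 1000.0
        sink.append(graphite_line(path, round(elapsed_ms, 3), int(start)))
```

    Wrapping a code path is then a one-liner, e.g. `with timed("app.login.ms", sink): do_login()`, where `do_login` is whatever you want measured.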

  • Auto Scale DynamoDB With Dynamic DynamoDB

    Nicely-packaged auto-scaler for DynamoDB

    (tags: dynamodb autoscaling scalability provisioning aws ec2 cloudformation)

Links for 2014-07-21

Links for 2014-07-18

Links for 2014-07-16

Links for 2014-07-15

Links for 2014-07-14

Links for 2014-07-11

  • Netflix/ribbon

    a client side IPC library that is battle-tested in cloud. It provides the following features: Load balancing; Fault tolerance; Multiple protocol (HTTP, TCP, UDP) support in an asynchronous and reactive model; Caching and batching.
    I like the integration of Eureka and Hystrix in particular, although I would really like to read more about Eureka's approach to availability during network partitions and CAP. https://groups.google.com/d/msg/eureka_netflix/LXKWoD14RFY/-5nElGl1OQ0J has some interesting discussion on the topic. It actually sounds like the Eureka approach is more correct than using ZK: 'Eureka is available. ZooKeeper, while tolerant against single node failures, doesn't react well to long partitioning events. For us, it's vastly more important that we maintain an available registry than a necessary consistent registry. If us-east-1d sees 23 nodes, and us-east-1c sees 22 nodes for a little bit, that's OK with us.' See also http://ispyker.blogspot.ie/2013/12/zookeeper-as-cloud-native-service.html which corroborates this:
    I went into one of the instances and quickly did an iptables DROP on all packets coming from the other two instances. This would simulate an availability zone continuing to function, but that zone losing network connectivity to the other availability zones. What I saw was that the two other instances noticed that the first server “going away”, but they continued to function as they still saw a majority (66%). More interestingly the first instance noticed the other two servers “going away” dropping the ensemble availability to 33%. This caused the first server to stop serving requests to clients (not only writes, but also reads). [...] To me this seems like a concern, as network partitions should be considered an event that should be survived. In this case (with this specific configuration of zookeeper) no new clients in that availability zone would be able to register themselves with consumers within the same availability zone. Adding more zookeeper instances to the ensemble wouldn’t help considering a balanced deployment as in this case the availability would always be majority (66%) and non-majority (33%).

    (tags: netflix ribbon availability libraries java hystrix eureka aws ec2 load-balancing networking http tcp architecture clients ipc)
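
    The majority arithmetic in the quoted ZooKeeper experiment is easy to check. A minimal sketch, assuming only the strict-majority rule described in the quote; `has_quorum` and `partition_report` are illustrative helpers, not ZooKeeper APIs.

```python
def has_quorum(visible_nodes: int, ensemble_size: int) -> bool:
    """A partition keeps serving only if it sees a strict majority."""
    return visible_nodes > ensemble_size // 2


def partition_report(ensemble_size: int, partition_sizes: list) -> list:
    """Return (size, percent of ensemble visible, still serving?) per partition."""
    return [
        (size, 100 * size // ensemble_size, has_quorum(size, ensemble_size))
        for size in partition_sizes
    ]
```

    `partition_report(3, [2, 1])` gives `[(2, 66, True), (1, 33, False)]`, matching the 66%/33% split above. A balanced six-node ensemble split 4/2 comes out the same way, which is why adding instances doesn't help the isolated zone.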

  • The Myth of Schema-less [NoSQL]

    We don't seem to gain much in terms of database flexibility. Is our application more flexible? I don't think so. Even without our schema explicitly defined in our database, it's there... somewhere. You simply have to search through hundreds of thousands of lines to find all the little bits of it. It has the potential to be in several places, making it harder to properly identify. The reality of these codebases is that they are error prone and often lack the necessary documentation. This problem is magnified when there are multiple codebases talking to the same database. This is not an uncommon practice for reporting or analytical purposes. Finally, all this "flexibility" rears its head in the same way that PHP and Javascript's "neat" weak typing stabs you right in the face. There are some things you can be cavalier about, and some things you should be strict about. Your data model is one you absolutely need to be strict on. If a field should store an int, it should store nothing else. Not a string, not a picture of a horse, but an integer. It's nice to know that I have my database doing type checking for me and I can expect a field to be the same type across all records. All this leads us to an undeniable fact: There is always a schema. Wearing "I don't do schema" as a badge of honor is a complete joke and encourages a terrible development practice.

    (tags: nosql databases storage schema strong-typing)

  • Latest EBS tuning tips

    from yesterday's AWS Summit in NYC:

    Cheat sheet of EBS-optimized instances. http://t.co/vmTlhUtpWk Optimize your queue depth to achieve lower latency & highest IOPS. http://t.co/EO48oa0D6X When configuring your RAID, use a stripe size of 128KB or 256KB. http://t.co/N0ldtFJ4t6 Use larger block size to speed up the pre-warming process. http://t.co/8UoIeWE2px

    (tags: ebs aws amazon iops raid ops tuning)

Links for 2014-07-10

Links for 2014-07-09

  • Google's Influential Papers for 2013

    Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google. Below are some of the especially influential papers co-authored by Googlers in 2013.

    (tags: google papers toread reading 2013 scalability machine-learning algorithms)

Links for 2014-07-08

  • #BPjMleak

    'Leak of the secret German Internet Censorship URL blacklist BPjM-Modul'. Turns out there's a blocklist of adult-only or prohibited domains issued by a German government department, The Federal Department for Media Harmful to Young Persons (German: "Bundesprüfstelle für jugendgefährdende Medien" or BPjM), issued in the form of a list of hashes of those domains. These were extracted from an AVM router, then the hashes were brute forced using several other plaintext URL blocklists and domain lists. Needless to say, there's an assortment of silly false positives, such as the listing of the website for the 1997 3D Realms game "Shadow Warrior": http://en.wikipedia.org/wiki/Shadow_Warrior

    (tags: hashes reversing reverse-engineering germany german bpjm filtering blocklists blacklists avm domains censorship fps)
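
    The brute-forcing step amounts to a dictionary attack: hash every candidate domain and look for matches in the leaked list. A rough sketch, with MD5 chosen purely as an assumption (the summary above only says 'hashes'), and with `digest`, `recover` and the example domains all made up for illustration.

```python
import hashlib


def digest(domain: str) -> str:
    """Hash a domain name. MD5 is an assumption made for this sketch."""
    return hashlib.md5(domain.encode("utf-8")).hexdigest()


def recover(blocklist_hashes, candidates) -> dict:
    """Map each blocklist hash to the candidate domain that produces it."""
    wanted = set(blocklist_hashes)
    found = {}
    for candidate in candidates:
        domain = candidate.strip().lower()  # normalise before hashing
        hashed = digest(domain)
        if hashed in wanted:
            found[hashed] = domain
    return found
```

    Hashes with no matching candidate simply stay unrecovered, so the quality of the plaintext lists fed in decides how much of the blocklist can be reversed.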

  • Brave Men Take Paternity Leave - Gretchen Gavett - Harvard Business Review

    The use of paternity leave has a "snowball effect":

    In the end, Dahl says, “coworkers and brothers who were linked to a father who had his child immediately after the [Norwegian paid paternity leave] reform — versus immediately before the reform — were 3.5% and 4.7% more likely, respectively, to take parental leave.” But when a coworker actually takes parental leave, “the next coworker to have a child at his workplace is 11% more likely to take paternity leave.” Slightly more pronounced, the next brother to have a child is 15% more likely to take time off. And while any male coworker taking leave can reduce stigma, the effect of a manager doing so is more profound. Specifically, “the estimated peer effect is over two and a half times larger if the peer father is predicted to be a manager in the firm as opposed to a regular coworker.”

    (tags: paternity-leave parenting leave work norway research)

  • "The Tail at Scale"

    by Jeffrey Dean and Luiz Andre Barroso, Google. A selection of Google's architectural mechanisms used to defeat 99th-percentile latency spikes: hedged requests, tied requests, micro-partitioning, selective replication, latency-induced probation, canary requests.

    (tags: google architecture distcomp soa http partitioning replication latency 99th-percentile canary-requests hedged-requests)
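
    Of the mechanisms listed, hedged requests are the simplest to sketch: send to one replica, and if it hasn't answered after a short delay, send the same request to a second replica and take whichever reply arrives first. The code below is an illustration under that reading, not Google's implementation; `hedged`, `fake_replica` and the latencies are invented for the example.

```python
import asyncio


async def hedged(make_request, replicas, hedge_after: float):
    """Try replicas[0]; if it is still pending after hedge_after seconds,
    also try replicas[1] and return whichever reply arrives first."""
    primary = asyncio.ensure_future(make_request(replicas[0]))
    done, _ = await asyncio.wait({primary}, timeout=hedge_after)
    if done:
        return primary.result()
    backup = asyncio.ensure_future(make_request(replicas[1]))
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # drop the slower request
    return done.pop().result()


async def fake_replica(name: str) -> str:
    """Simulated replica (an assumption): 'slow' is stuck in the latency tail."""
    await asyncio.sleep({"slow": 0.5, "fast": 0.01}[name])
    return name
```

    With these simulated latencies, `asyncio.run(hedged(fake_replica, ["slow", "fast"], 0.05))` returns `"fast"` after roughly 60ms instead of waiting 500ms. The delay is the knob that trades extra load for tail latency.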

Links for 2014-07-07

Links for 2014-07-06

  • Layered Glass Table Concept Creates a Cross-Section of the Ocean

    beautiful stuff -- and a snip at only UKP 5,800 ex VAT. It'd make a good DIY project though ;)

    (tags: art tables glass layering 3d cross-sections water ocean sea mapping cartography layers this-is-colossal design furniture)

  • Two traps in iostat: %util and svctm

    Marc Brooker:

    As a measure of general IO busyness %util is fairly handy, but as an indication of how much the system is doing compared to what it can do, it's terrible. Iostat's svctm has even fewer redeeming strengths. It's just extremely misleading for most modern storage systems and workloads. Both of these fields are likely to mislead more than inform on modern SSD-based storage systems, and their use should be treated with extreme care.

    (tags: ioutil iostat svctm ops ssd disks hardware metrics stats linux)

  • New Amazon Web Services region: eu-central-1 (soon)

    Iiiinteresting. Sounds like new anti-NSA-snooping privacy laws will be driving a lot of new mini-regions in AWS. Hope Amazon have their new-region-standup process a little more streamlined by now than when I was there ;)

    (tags: aws germany privacy ec2 eu-central-1 nsa snooping)

  • How A Spam Newsletter Caused a Bank Run in Bulgaria

    According to the Bulgarian National Security Agency (see here, for a reporting in English), an investment company that “built a network of associated companies for marketing services” that was used to diffuse panic by means of an alert, uncomfortably titled “Information Bulletin of on the Risk of Deposits in Bulgarian Banks”. The “bulletin” claimed – Bloomberg reports – KTB was undergoing a liquidity shortage. The message apparently also said that the government deposit guarantee fund was under-capitalised to meet possible repayments, that banks could go bankrupt and that the peg of the currency with the euro could be broken. Allegedly, the alert was diffused by text, email and even Facebook messages, thus ensuring a very widespread outreach. In a country that in 1997 underwent a very serious banking crisis featuring all these characteristics – whose memory is still fresh – this was enough to spur panic.

    (tags: spam banking bulgaria banks euro panic facebook social-media)

  • New Russian Law To Forbid Storing Russians' Data Outside the Country - Slashdot

    On Friday Russia's parliament passed a law "which bans online businesses from storing personal data of Russian citizens on servers located abroad[.] ... According to ITAR-TASS, the changes to existing legislation will come into effect in September 2016, and apply to email services, social networks and search engines, including the likes of Facebook and Google. Domain names or net addresses not complying with regulations will be put on a blacklist maintained by Roskomnadzor (the Federal Supervision Agency for Information Technologies and Communications), the organisation which already has the powers to take down websites suspected of copyright infringement without a court order. In the case of non-compliance, Roskomnadzor will be able to impose 'sanctions,' and even instruct local Internet Service Providers (ISPs) to cut off access to the offending resource."

    (tags: russia privacy nsa censorship protectionism internet web)

Links for 2014-07-04

  • Irish parliament pressing ahead with increased access to retained telecoms data

    While much of the new bill is concerned with the dissolution of the Competition Authority and the National Consumer Agency and the formation of a new merged Competition and Consumer Protection Commission (CCPC) the new bill also proposed to extend the powers of the new CCPC to help it investigate serious anticompetitive behaviour. Strikingly the new bill proposes to give members of the CCPC the power to access data retained under the Communications (Retention of Data) Act 2011. As readers will recall this act implements Directive 2006/24/EC which obliges telecommunications companies to archive traffic and location data for a period of up to two years to facilitate the investigation of serious crime. Ireland chose to implement the maximum two year retention period and provided access to An Garda Siochana, The Defence Forces and the Revenue Commissioners. The current reform of Irish competition law now proposes to extend data access powers to the members of the CCPC for the purposes of investigating cartel offences.

    (tags: data-retention privacy surveillance competition ccpc ireland law dri)

  • NSA: Linux Journal is an "extremist forum" and its readers get flagged for extra surveillance

    DasErste.de has published the relevant XKEYSCORE source code, and if you look closely at the rule definitions, you will see linuxjournal.com/content/linux* listed alongside Tails and Tor. According to an article on DasErste.de, the NSA considers Linux Journal an "extremist forum". This means that merely looking for any Linux content on Linux Journal, not just content about anonymizing software or encryption, is considered suspicious and means your Internet traffic may be stored indefinitely.
    This is, sadly, entirely predictable -- that's what happens when you optimize the system for over-sampling, with poor oversight.

    (tags: false-positives linuxjournal linux terrorism tor tails nsa surveillance snooping xkeyscore selectors oversight)

  • stout

    a C++ library adding some modern language features like Option, Try, Stopwatch, and other Guava-ish things (via @cscotta)

    (tags: c++ library stout option try guava coding)

Links for 2014-07-03

Links for 2014-07-01

Links for 2014-06-30

  • Facebook Doesn't Understand The Fuss About Its Emotion Manipulation Study

    This is quite unethical, and I'm amazed it was published at all. Kashmir Hill at Forbes nails it:

    While many users may already expect and be willing to have their behavior studied — and while that may be warranted with “research” being one of the 9,045 words in the data use policy — they don’t expect that Facebook will actively manipulate their environment in order to see how they react. That’s a new level of experimentation, turning Facebook from a fishbowl into a petri dish, and it’s why people are flipping out about this.
    Shocking stuff. We need a new social publishing platform, built on ethical, open systems.

    (tags: ethics facebook privacy academia depression feelings emotion social-publishing social experimentation papers)

  • Building a Smarter Application Stack - DevOps Ireland

    This sounds like a very interesting Dublin meetup -- Engine Yard on Thursday night:

    This month, we'll have Tomas Doran from Yelp talking about Docker, service discovery, and deployments. 'There are many advantages to a container based, microservices architecture - however, as always, there is no silver bullet. Any serious deployment will involve multiple host machines, and will have a pressing need to migrate containers between hosts at some point. In such a dynamic world hard coding IP addresses, or even host names is not a viable solution. This talk will take a journey through how Yelp has solved the discovery problems using Airbnb’s SmartStack to dynamically discover service dependencies, and how this is helping unify our architecture, from traditional metal to EC2 ‘immutable’ SOA images, to Docker containers.'

    (tags: meetups talks dublin deployment smartstack ec2 docker yelp service-discovery)

  • Smart Integration Testing with Dropwizard, Flyway and Retrofit

    Retrofit in particular looks neat. Mind you, having worked with in-memory SQL databases for integration testing before, I'd never do that again -- too many interop glitches compared to "real world" MySQL/Postgres.

    (tags: testing integration-testing retrofit flyway dropwizard logentries)

  • Twitter's TSAR

    TSAR = "Time Series AggregatoR". Twitter's new event processor-style architecture for internal metrics. It's notable that Twitter and Google are now both apparently moving towards a model where the same code is designed to run equally well in realtime streaming and batch modes (Summingbird, MillWheel, Flume).

    (tags: analytics architecture twitter tsar aggregation event-processing metrics streaming hadoop batch)

  • 'Robust De-anonymization of Large Sparse Datasets' [pdf]

    paper by Arvind Narayanan and Vitaly Shmatikov, 2008. 'We present a new class of statistical de-anonymization attacks against high-dimensional micro-data, such as individual preferences, recommendations, transaction records and so on. Our techniques are robust to perturbation in the data and tolerate some mistakes in the adversary's background knowledge. We apply our de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix, the world's largest online movie rental service. We demonstrate that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber's record in the dataset. Using the Internet Movie Database as the source of background knowledge, we successfully identified the Netflix records of known users, uncovering their apparent political preferences and other potentially sensitive information.'

    (tags: anonymisation anonymization sanitisation databases data-dumps privacy security papers)

  • HSE data releases may be de-anonymisable

    Although the data has been kept anonymous, the increasing sophistication of computer-driven data-mining techniques has led to fears patients could be identified. A HSE spokesman confirmed yesterday that the office responded to requests for data from a variety of sources, including researchers, the universities, GPs, the media, health insurers and pharmaceutical companies. An average of about two requests a week was received. [...] The information provided by the HPO has significant patient identifiers removed, such as name and date of birth. According to the HSE spokesman, individual patient information is not provided and, where information is sought for a small group of patients, this is not provided where the number involved is under five. “In such circumstances, it is highly unlikely that anyone could be identified. Nevertheless, we will have another look at data releases from the office,” he said.
    I'd say this could be readily reversible, from the sounds of it.

    (tags: anonymisation sanitisation data-dumps hse health privacy via:tjmcintyre)

  • Beautiful algorithm visualisations from Mike Bostock

    This is a few days old, but unmissable. I swear, the 'Wilson's algorithm transformed into a tidy tree layout' viz brought tears to my eyes ;)

    (tags: dataviz algorithms visualization visualisation mazes trees sorting animation mike-bostock)

  • ByteArrayOutputStream is really, really slow sometimes in JDK6

    This leads us to the bug. The size of the array is determined by Math.max(buf.length << 1, newcount). Ordinarily, buf.length << 1 returns double buf.length, which would always be much larger than newcount for a 2 byte write. Why was it not? The problem is that for all integers larger than Integer.MAX_VALUE / 2, shifting left by one place causes overflow, setting the sign bit. The result is a negative integer, which is always less than newcount. So for all byte arrays larger than 1073741824 bytes (i.e. one GB), any write will cause the array to resize, and only to exactly the size required.
    Ouch.

    (tags: bugs java jdk6 bytearrayoutputstream impala performance overflow)
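
    The overflow is easy to reproduce outside the JVM. The sketch below simulates Java's 32-bit signed `int` arithmetic in Python; `to_int32` and `jdk6_new_capacity` are names invented here, and only the `Math.max(buf.length << 1, newcount)` expression comes from the quote.

```python
INT_MAX = 2**31 - 1  # Java's Integer.MAX_VALUE


def to_int32(n: int) -> int:
    """Wrap an arbitrary Python int to a Java-style 32-bit signed int."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n


def jdk6_new_capacity(buf_length: int, newcount: int) -> int:
    """Mimic Math.max(buf.length << 1, newcount) with int overflow."""
    return max(to_int32(buf_length << 1), newcount)
```

    For a 1KB buffer, `jdk6_new_capacity(1024, 1026)` doubles to 2048 as intended. For a 1GB buffer, `to_int32(2**30 << 1)` wraps to -2147483648, so `jdk6_new_capacity(2**30, 2**30 + 2)` grows the array by just the 2 bytes requested, and every later write triggers another full copy.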

  • Cory Doctorow on Thomas Piketty's 'Capital in the 21st Century'

    quite a leftie analysis

    (tags: history capitalism economics piketty capital finance taxation growth money cory-doctorow thomas-piketty)

  • ThreadSanitizer

    Google's purify/valgrind-like concurrency checking tool: 'As a bonus, ThreadSanitizer finds some other types of bugs: thread leaks, deadlocks, incorrect uses of mutexes, malloc calls in signal handlers, and more. It also natively understands atomic operations and thus can find bugs in lock-free algorithms. [...] The tool is supported by both Clang and GCC compilers (only on Linux/Intel64). Using it is very simple: you just need to add a -fsanitize=thread flag during compilation and linking. For Go programs, you simply need to add a -race flag to the go tool (supported on Linux, Mac and Windows).'

    (tags: concurrency bugs valgrind threadsanitizer threading deadlocks mutexes locking synchronization coding testing)

Links for 2014-06-27

  • Sandymount Repair Cafe

    'A repair café brings together people with things that need fixin' with people who have the skills to fix them in a social cafe style environment. It is an effort to move away from the throwaway culture that prevailed at the end of the twentieth century and move towards a more sustainable and enlightened approach to our relationship with consumer goods. Repair cafes are self organising events at a community level run by local volunteers with the support of local community groups, local agencies and other interested organisations. They are not-for-profit but not anti-profit and an important part of their goal is to promote local repair businesses and initiatives. www.repaircafe.ie is the online hub of a network of repair cafés across Ireland.' Sounds interesting: https://twitter.com/DubCityCouncil/status/481777655445204992 says they'll be doing it tomorrow from 2-5pm in Sandymount in Dublin.

    (tags: dublin sandymount repair fixing diy frugality repaircafe hardware)

  • Chef Vault

    A way to securely store secrets (auth details, API keys, etc.) in Chef

    (tags: chef storage knife authorisation api-keys security encryption)

  • Amazon EC2 Service Limits Report Now Available

    'designed to make it easier for you to view and manage your limits for Amazon EC2 by providing the latest information on service limits and links to quickly request limit increases. EC2 Service Limits Report displays all your service limit information in one place to help you avoid encountering limits on future EC2, EBS, Auto Scaling, and VPC usage.'

    (tags: aws ec2 vpc ebs autoscaling limits ops)

  • Delivery Notifications for Simple Email Service

    Today we are enhancing SES with the addition of delivery notifications. You can now elect to receive an Amazon SNS notification each time SES successfully delivers a message to a recipient's email server. These notifications give you increased visibility into the mail delivery process. With today's release, you can now track deliveries, bounces, and complaints, all via notification to the SNS topic or topics of your choice.

    (tags: delivery email smtp ses aws sns notifications ops)

  • How Emoji Get Lost In Translation

    I recently texted a friend to say how I was excited to meet her new boyfriend, and, because "excited" doesn't look so exciting on an iPhone screen, I editorialized with what seemed then like an innocent "[dancer]". (Translation: Can't wait for the fun night out!) On an Android phone, I realized later, that panache would have been a put-down: The dancers become "[playboy bunny]." (Translation: You’re a Playboy bunny who gets around!)

    (tags: emoji icons graphics text speech phones)

Links for 2014-06-26

Links for 2014-06-25

Links for 2014-06-24

Links for 2014-06-23

  • Startup equity gotcha

    'Two months ago, an early Uber employee thought that he had found a buyer for his vested stock, at $200 per share. But when his agent tried to seal the deal, Uber refused to sign off on the transfer. Instead, it offered to buy back the shares for around $135 a piece, which is within the same price range that Google Ventures and TPG Capital had paid to invest in Uber the previous July. Take it or hold it.' As rbranson on Twitter put it: 'reminder that startup equity is basically worthless unless you're a founder or investor, OR the company goes public.'

    (tags: startups uber stock stock-options shares share-option equity via:rbranson work)

Links for 2014-06-20

Links for 2014-06-19

Links for 2014-06-18

Links for 2014-06-17

  • FlatBuffers: Main Page

    A new serialization format from Google's Android gaming team, supporting C++ and Java, open source under the ASL v2. Reasons to use it:

    Access to serialized data without parsing/unpacking - What sets FlatBuffers apart is that it represents hierarchical data in a flat binary buffer in such a way that it can still be accessed directly without parsing/unpacking, while also still supporting data structure evolution (forwards/backwards compatibility). Memory efficiency and speed - The only memory needed to access your data is that of the buffer. It requires 0 additional allocations. FlatBuffers is also very suitable for use with mmap (or streaming), requiring only part of the buffer to be in memory. Access is close to the speed of raw struct access with only one extra indirection (a kind of vtable) to allow for format evolution and optional fields. It is aimed at projects where spending time and space (many memory allocations) to be able to access or construct serialized data is undesirable, such as in games or any other performance sensitive applications. See the benchmarks for details. Flexible - Optional fields means not only do you get great forwards and backwards compatibility (increasingly important for long-lived games: don't have to update all data with each new version!). It also means you have a lot of choice in what data you write and what data you don't, and how you design data structures. Tiny code footprint - Small amounts of generated code, and just a single small header as the minimum dependency, which is very easy to integrate. Again, see the benchmark section for details. Strongly typed - Errors happen at compile time rather than manually having to write repetitive and error prone run-time checks. Useful code can be generated for you. Convenient to use - Generated C++ code allows for terse access & construction code. Then there's optional functionality for parsing schemas and JSON-like text representations at runtime efficiently if needed (faster and more memory efficient than other JSON parsers).
    Looks nice, but it lacks the language coverage of protobuf. Definitely more practical than capnproto.

    (tags: c++ google java serialization json formats protobuf capnproto storage flatbuffers)

  • AWS SDK for Java Client Configuration

    turns out the AWS SDK has lots of tuning knobs: region selection, socket buffer sizes, and debug logging (including wire logging).

    (tags: aws sdk java logging ec2 s3 dynamodb sockets tuning)

  • Behind the loom band

    The simple woven multicoloured bracelet has made Cheong Choon Ng, a Malaysian immigrant to the US, a dollar millionaire. He invented the "Rainbow Loom" after watching his daughters making bracelets with rubber bands.
    So, really, it's his daughters that invented it. ;) My kids are massive fans. This is a 100% legit, Rubik's-Cube-style craze. (via Conor O'Neill)

    (tags: via:conoro loom-bands rubber-bands toys crazes)

  • lookout/ngx_borderpatrol

    BorderPatrol is an nginx module to perform authentication and session management at the border of your network. BorderPatrol makes the assumption that you have some set of services that require authentication and a service that hands out tokens to clients to access that service. You may not want those tokens to be sent across the internet, even over SSL, for a variety of reasons. To this end, BorderPatrol maintains a lookup table of session-id to auth token in memcached.

    (tags: borderpatrol nginx modules authentication session-management web-services http web authorization)

  • Use of Formal Methods at Amazon Web Services

    Chris Newcombe, Marc Brooker, et al. writing about their experience using formal specification and model-checking languages (TLA+) in production in AWS:

    The success with DynamoDB gave us enough evidence to present TLA+ to the broader engineering community at Amazon. This raised a challenge; how to convey the purpose and benefits of formal methods to an audience of software engineers? Engineers think in terms of debugging rather than ‘verification’, so we called the presentation “Debugging Designs”. Continuing that metaphor, we have found that software engineers more readily grasp the concept and practical value of TLA+ if we dub it 'Exhaustively-testable pseudo-code'. We initially avoid the words ‘formal’, ‘verification’, and ‘proof’, due to the widespread view that formal methods are impractical. We also initially avoid mentioning what the acronym ‘TLA’ stands for, as doing so would give an incorrect impression of complexity.
    More slides at http://tla2012.loria.fr/contributed/newcombe-slides.pdf ; proggit discussion at http://www.reddit.com/r/programming/comments/277fbh/use_of_formal_methods_at_amazon_web_services/

    (tags: formal-methods model-checking tla tla+ programming distsys distcomp ebs s3 dynamodb aws ec2 marc-brooker chris-newcombe)

  • Call me maybe: RabbitMQ

    We used Knossos and Jepsen to prove the obvious: RabbitMQ is not a lock service. That investigation led to a discovery hinted at by the documentation: in the presence of partitions, RabbitMQ clustering will not only deliver duplicate messages, but will also drop huge volumes of acknowledged messages on the floor. This is not a new result, but it may be surprising if you haven’t read the docs closely–especially if you interpreted the phrase “chooses Consistency and Partition Tolerance” to mean, well, either of those things.

    (tags: rabbitmq network partitions failure cap-theorem consistency ops reliability distcomp jepsen)

  • Jump Consistent Hash: A Fast, Minimal Memory, Consistent Hash Algorithm

    'a fast, minimal memory, consistent hash algorithm that can be expressed in about 5 lines of code. In comparison to the algorithm of Karger et al., jump consistent hash requires no storage, is faster, and does a better job of evenly dividing the key space among the buckets and of evenly dividing the workload when the number of buckets changes. Its main limitation is that the buckets must be numbered sequentially, which makes it more suitable for data storage applications than for distributed web caching.' Implemented in Guava. This is also noteworthy: 'Google has not applied for patent protection for this algorithm, and, as of this writing, has no plans to. Rather, it wishes to contribute this algorithm to the community.'

    (tags: hashing consistent-hashing google guava memory algorithms sharding)

  • Bike Wheel Spoke ABS Safety Reflective Tube Reflector

    Available in blue, orange, and grey for $2.84 from the insanely-cheap China-based DealExtreme.com. Also available: rim-based reflective stickers

    (tags: bikes cycling reflective safety dealextreme tat)

Links for 2014-06-16

Links for 2014-05-29

  • Tracedump

    a single application IP packet sniffer that captures all TCP and UDP packets of a single Linux process. It consists of the following elements: * ptrace monitor - tracks bind(), connect() and sendto() syscalls and extracts local port numbers that the traced application uses; * pcap sniffer - using information from the previous module, it captures IP packets on an AF_PACKET socket (with an appropriate BPF filter attached); * garbage collector - periodically reads /proc/net/{tcp,udp} files in order to detect the sockets that the application no longer uses. As the output, tracedump generates a PCAP file with SLL-encapsulated IP packets - readable by eg. Wireshark. This file can be later used for detailed analysis of the networking operations made by the application. For instance, it might be useful for IP traffic classification systems.
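
    The garbage-collector step is easy to picture. Here's a minimal sketch, assuming sockets are matched by local port alone (the description above doesn't say exactly how tracedump does it); the function names and the sample table are made up for illustration:

```python
from typing import Iterable, Set


def local_ports(proc_net_text: str) -> Set[int]:
    """Extract local port numbers from /proc/net/tcp-style text."""
    ports = set()
    for line in proc_net_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 2:
            continue
        # fields[1] is local_address, formatted as HEXIP:HEXPORT
        _, _, hex_port = fields[1].rpartition(":")
        ports.add(int(hex_port, 16))
    return ports


def stale_ports(tracked: Iterable[int], proc_net_text: str) -> Set[int]:
    """Ports we are still sniffing that no longer appear in the table."""
    return set(tracked) - local_ports(proc_net_text)


# Illustrative sample only; on a real system read /proc/net/tcp instead.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12345
   1: 00000000:0035 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12346
"""
```

    With that sample, a tracked set of {8080, 9000} would have 9000 flagged as stale, since only ports 0x1F90 (8080) and 0x0035 (53) are listed.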

    (tags: debugging networking linux strace ptrace tracedump tracing tcp udp sniffer ip tcpdump)

  • You Are Not a Digital Native: Privacy in the Age of the Internet

    an open letter from Cory Doctorow to teen readers re privacy. 'The problem with being a “digital native” is that it transforms all of your screw-ups into revealed deep truths about how humans are supposed to use the Internet. So if you make mistakes with your Internet privacy, not only do the companies who set the stage for those mistakes (and profited from them) get off Scot-free, but everyone else who raises privacy concerns is dismissed out of hand. After all, if the “digital natives” supposedly don’t care about their privacy, then anyone who does is a laughable, dinosauric idiot, who isn’t Down With the Kids.'

    (tags: children privacy kids teens digital-natives surveillance cory-doctorow danah-boyd)

  • Shutterbits replacing hardware load balancers with local BGP daemons and anycast

    Interesting approach. Potentially risky, though -- heavy use of anycast on a large-scale datacenter network could increase the scale of the OSPF graph, which scales exponentially. This can have major side effects on OSPF reconvergence time, which creates an interesting class of network outage in the event of OSPF flapping. Having said that, an active/passive failover LB pair will already announce a single anycast virtual IP anyway, so, assuming there are a similar number of anycast IPs in the end, it may not have any negative side effects. There's also the inherent limitation noted in the second-to-last paragraph; 'It comes down to what your hardware router can handle for ECMP. I know a Juniper MX240 can handle 16 next-hops, and have heard rumors that a software update will bump this to 64, but again this is something to keep in mind'. Taking a leaf from the LB design, and using BGP to load-balance across a smaller set of haproxy instances, would seem like a good approach to scale up.

    (tags: scalability networking performance load-balancing bgp exabgp ospf anycast routing datacenters scaling vips juniper haproxy shutterstock)

  • Tron: Legacy Encom Boardroom Visualization

    this is great. lovely, silly, HTML5 dataviz, with lots of spinning globes and wobbling sines on a black background

    (tags: demo github wikipedia dataviz visualisation mapping globes rob-scanlan graphics html5 animation tron-legacy tron movies)

  • CockroachDB

    a distributed key/value datastore which supports ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. Cockroach aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. Cockroach nodes are symmetric; a design goal is one binary with minimal configuration and no required auxiliary services. Cockroach implements a single, monolithic sorted map from key to value where both keys and values are byte strings (not unicode). Cockroach scales linearly (theoretically up to 4 exabytes (4E) of logical data). The map is composed of one or more ranges and each range is backed by data stored in RocksDB (a variant of LevelDB), and is replicated to a total of three or more cockroach servers. Ranges are defined by start and end keys. Ranges are merged and split to maintain total byte size within a globally configurable min/max size interval. Range sizes default to target 64M in order to facilitate quick splits and merges and to distribute load at hotspots within a key range. Range replicas are intended to be located in disparate datacenters for survivability (e.g. { US-East, US-West, Japan }, { Ireland, US-East, US-West}, { Ireland, US-East, US-West, Japan, Australia }). Single mutations to ranges are mediated via an instance of a distributed consensus algorithm to ensure consistency. We’ve chosen to use the Raft consensus algorithm. All consensus state is stored in RocksDB. A single logical mutation may affect multiple key/value pairs. Logical mutations have ACID transactional semantics. If all keys affected by a logical mutation fall within the same range, atomicity and consistency are guaranteed by Raft; this is the fast commit path. Otherwise, a non-locking distributed commit protocol is employed between affected ranges. 
    Cockroach provides snapshot isolation (SI) and serializable snapshot isolation (SSI) semantics, allowing externally consistent, lock-free reads and writes--both from an historical snapshot timestamp and from the current wall clock time. SI provides lock-free reads and writes but still allows write skew. SSI eliminates write skew, but introduces a performance hit in the case of a contentious system. SSI is the default isolation; clients must consciously decide to trade correctness for performance. Cockroach implements a limited form of linearizability, providing ordering for any observer or chain of observers.
    This looks nifty. One to watch.
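
    To make the "sorted map split into ranges" idea concrete, here's a toy lookup, assuming each range covers the half-open interval from its start key up to the next range's start key. The class and method names are hypothetical and have nothing to do with Cockroach's actual code:

```python
import bisect
from typing import Iterable


class RangeIndex:
    """Toy index mapping byte-string keys to the range that holds them."""

    def __init__(self, split_keys: Iterable[bytes]) -> None:
        # The first range starts at the empty key; each split key starts
        # a new range.
        self.starts = [b""] + sorted(split_keys)

    def lookup(self, key: bytes) -> int:
        """Return the index of the range whose [start, next_start) holds key."""
        return bisect.bisect_right(self.starts, key) - 1


index = RangeIndex([b"g", b"p"])  # three ranges: [b"", g), [g, p), [p, end)
```

    Splitting a range is then just inserting a new start key, and merging is removing one, which is presumably why keeping ranges small makes both operations cheap.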

    (tags: cockroachdb databases storage georeplication raft consensus acid go key-value-stores rocksdb)

  • Tuning LevelDB

    good docs from Riak

    (tags: leveldb tuning performance ops riak)

  • Proof of burn - Bitcoin

    method for bootstrapping one cryptocurrency off of another. The idea is that miners should show proof that they burned some coins - that is, sent them to a verifiably unspendable address. This is expensive from their individual point of view, just like proof of work; but it consumes no resources other than the burned underlying asset. To date, all proof of burn cryptocurrencies work by burning proof-of-work-mined cryptocurrencies, so the ultimate source of scarcity remains the proof-of-work-mined "fuel".

    (tags: bitcoin proof money mining cryptocurrency)

  • The programming error that cost Mt Gox 2609 bitcoins

    Digging into broken Bitcoin scripts in the blockchain. Fascinating:

    While analyzing coinbase transactions, I came across another interesting bug that lost bitcoins. Some transactions have the meaningless and unredeemable script: OP_IFDUP OP_IF OP_2SWAP OP_VERIFY OP_2OVER OP_DEPTH. That script turns out to be the ASCII text "script". Instead of putting the redemption script into the transaction, the P2Pool miners accidentally put in the literal word "script". The associated bitcoins are lost forever due to this error.
    (via Nelson)
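
    You can check the ASCII coincidence yourself. A quick sketch: the byte values are from the standard Bitcoin Script opcode table, while the dict and the function name are just mine, covering only the six opcodes involved:

```python
# Opcode byte values from the Bitcoin Script opcode table
# (only the six opcodes that appear in the broken script).
OPCODE_NAMES = {
    0x63: "OP_IF",
    0x69: "OP_VERIFY",
    0x70: "OP_2OVER",
    0x72: "OP_2SWAP",
    0x73: "OP_IFDUP",
    0x74: "OP_DEPTH",
}


def disassemble(script: bytes) -> str:
    """Render each byte of a script as its opcode name."""
    return " ".join(
        OPCODE_NAMES.get(byte, f"UNKNOWN_0x{byte:02x}") for byte in script
    )


print(disassemble(b"script"))
# prints: OP_IFDUP OP_IF OP_2SWAP OP_VERIFY OP_2OVER OP_DEPTH
```

    Each letter of "script" happens to fall in the opcode range, so the interpreter takes the word for a (useless) program instead of rejecting it outright.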

    (tags: programming script coding bitcoin mtgox via:nelson scripting dsls)

  • Moquette MQTT

    a Java implementation of an MQTT 3.1 broker. Its code base is small. At its core, Moquette is an events processor; this lets the code base be simple, avoiding thread sharing issues. The Moquette broker is lightweight and easy to understand so it could be embedded in other projects.

    (tags: mqtt moquette netty messaging queueing push-notifications iot internet push eclipse)

  • "Taking the hotdog"

    aka lock acquisition. ex-Amazon-Dublin lingo, observed in the wild ;)

    (tags: language hotdog archie-mcphee amazon dublin intercom coding locks synchronization)

Links for 2014-05-27

Links for 2014-05-26

Links for 2014-05-23

  • BPF - the forgotten bytecode

    'In essence Tcpdump asks the kernel to execute a BPF program within the kernel context. This might sound risky, but actually isn't. Before executing the BPF bytecode kernel ensures that it's safe: * All the jumps are only forward, which guarantees that there aren't any loops in the BPF program. Therefore it must terminate. * All instructions, especially memory reads are valid and within range. * The single BPF program has less than 4096 instructions. All this guarantees that the BPF programs executed within kernel context will run fast and will never infinitely loop. That means the BPF programs are not Turing complete, but in practice they are expressive enough for the job and deal with packet filtering very well.' Good example of a carefully-designed DSL allowing safe "programs" to be written and executed in a privileged context without security risk, or risk of running out of control.
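
    The three checks are simple enough to sketch. This is a toy verifier over a made-up instruction encoding, (opcode, operand) tuples with "jmp", "ldm" and "ret", not real BPF bytecode and not the kernel's checker; the 16-slot scratch memory is an assumption on my part, and the size limit just follows the quote:

```python
from typing import List, Tuple

Insn = Tuple[str, int]  # (opcode, operand): toy encoding, not real BPF

MAX_INSNS = 4096        # size limit as quoted above
SCRATCH_SLOTS = 16      # assumed size of the scratch memory store


def verify(program: List[Insn]) -> bool:
    """Accept only programs that must terminate and stay in bounds."""
    if not 0 < len(program) < MAX_INSNS:
        return False
    for pc, (op, arg) in enumerate(program):
        if op == "jmp":
            # Offsets are relative to the next instruction. Rejecting
            # negative offsets means jumps only go forward: no loops.
            if arg < 0 or pc + 1 + arg >= len(program):
                return False
        elif op == "ldm":
            # Memory reads must land inside the scratch store.
            if not 0 <= arg < SCRATCH_SLOTS:
                return False
        elif op != "ret":
            return False  # unknown instruction
    return True
```

    Since every jump moves strictly forward, a program of n instructions executes at most n of them, which is the whole termination argument.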

    (tags: coding dsl security via:oisin linux tcpdump bpf bsd kernel turing-complete configuration languages)

  • Handmade Kitchen Goods from Makers & Brothers - Cool Hunting

    lovely kitchen-gear design from local-boys-made-good Makers & Brothers

    (tags: makers-and-brothers design crafts kitchen nyc terrazo chopping-boards)

Links for 2014-05-22

  • 'Monitoring and detecting causes of failures of network paths', US patent 8,661,295 (B1)

    The first software patent in my name -- couldn't avoid it forever :(

    Systems and methods are provided for monitoring and detecting causes of failures of network paths. The system collects performance information from a plurality of nodes and links in a network, aggregates the collected performance information across paths in the network, processes the aggregated performance information for detecting failures on the paths, analyzes each of the detected failures to determine at least one root cause, and initiates a remedial workflow for the at least one root cause determined. In some aspects, processing the aggregated information may include performing a statistical regression analysis or otherwise solving a set of equations for the performance indications on each of a plurality of paths. In another aspect, the system may also include an interface which makes available for display one or more of the network topology, the collected and aggregated performance information, and indications of the detected failures in the topology.
    The patent describes an early version of Pimms, the network failure detection and remediation system we built for Amazon.

    (tags: amazon pimms swpats patents networking ospf autoremediation outage-detection)

Links for 2014-05-16

Links for 2014-05-14

Links for 2014-05-13

Links for 2014-05-12

Links for 2014-05-09