Justin's Linklog Posts

Links for 2016-03-27

  • Jenkins 2.0

    built-in support for CI/CD deployment pipelines, driven from a checked-in DSL file. great stuff, very glad to see them going this direction. (via Eric)

    (tags: via:eric jenkins ci cd deployment pipelines testing automation build)

  • Hey Microsoft, the Internet Made My Bot Racist, Too

    All machine learning algorithms strive to exaggerate and perpetuate the past. That is, after all, what they are learning from. The fundamental assumption of every machine learning algorithm is that the past is correct, and anything coming in the future will be, and should be, like the past. This is a fine assumption to make when you are Netflix trying to predict what movie you’ll like, but is immoral when applied to many other situations. For bots like mine and Microsoft’s, built for entertainment purposes, it can lead to embarrassment. But AI has started to be used in much more meaningful ways: predictive policing in Chicago, for example, has already led to widespread accusations of racial profiling. This isn’t a little problem. This is a huge problem, and it demands a lot more attention than it’s getting now, particularly in the community of scientists and engineers who design and apply these algorithms. It’s one thing to get cursed out by an AI, but wholly another when one puts you in jail, denies you a mortgage, or decides to audit you.

    (tags: machine-learning ml algorithms future society microsoft)

Links for 2016-03-25

  • Tahoe LAFS accidentally lose Bitcoin wallet with loads of donations in it, get it back

    But ECDSA private keys don’t trigger the same protective instincts that we’d apply to, say, a bar of gold. One sequence of 256 random bits looks just as worthless as any other. And the cold hard unforgeability of these keys means we can’t rely upon other humans to get our money back when we lose them. Plus, we have no experience at all with things that grow in value by four orders of magnitude, without any attention, in just three years. So we have a cryptocurrency-tool UX task in front of us: to avoid mistakes like the one we made, we must either move these digital assets into solid-feeling physical containers, or retrain our perceptions to attach value to the key strings themselves.

    (tags: backups cryptography bitcoin cryptocurrency ecdsa private-keys ux money)

  • Visual Representation of SQL Joins

    useful bookmark to have (via Nelson)

    (tags: sql joins mysql reference database)
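
    For a quick refresher without the diagram, here is a minimal sketch of the two joins that come up most often. It uses Python's built-in sqlite3 module rather than MySQL, and the authors/posts tables and the function name are made up purely for illustration.

```python
import sqlite3


def demo_joins():
    """Build two tiny (hypothetical) tables and compare join types."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
        -- post 11 points at an author id that does not exist
        INSERT INTO posts VALUES (10, 1, 'hello'), (11, 3, 'orphan');
        """
    )
    # INNER JOIN: only rows with a match on both sides.
    inner = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "INNER JOIN posts p ON p.author_id = a.id ORDER BY a.id"
    ).fetchall()
    # LEFT JOIN: every author, with NULL (None) where there is no post.
    left = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id ORDER BY a.id"
    ).fetchall()
    # LEFT JOIN ... IS NULL: authors that have no posts at all.
    left_only = conn.execute(
        "SELECT a.name FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id WHERE p.id IS NULL"
    ).fetchall()
    conn.close()
    return inner, left, left_only
```

    The inner join drops both 'bob' (no posts) and the orphaned post, while the left join keeps 'bob' with a NULL title.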

  • Interesting Lottery Terminal Hack – Schneier on Security

    Neat manual timing attack.

    An investigator for the Connecticut Lottery determined that terminal operators could slow down their lottery machines by requesting a number of database reports or by entering several requests for lottery game tickets. While those reports were being processed, the operator could enter sales for 5 Card Cash tickets. Before the tickets would print, however, the operator could see on a screen if the tickets were instant winners. If tickets were not winners, the operator could cancel the sale before the tickets printed.

    (tags: attacks security lottery connecticut kiosks)

Links for 2016-03-24

Links for 2016-03-23

Links for 2016-03-20

  • Modern Irish genome closely matches pre-Celt DNA, not Celtic

    Radiocarbon dating shows that the bones discovered at McCuaig’s go back to about 2000 B.C. That makes them hundreds of years older than the oldest artifacts generally considered to be Celtic — relics unearthed from Celt homelands of continental Europe, most notably around Switzerland, Austria and Germany. For a group of scholars who in recent years have alleged that the Celts, beginning from the middle of Europe, may never have reached Ireland, the arrival of the DNA evidence provides the biological certitude that the science has sometimes brought to criminal trials. “With the genetic evidence, the old model [of Celtic colonisation of Ireland] is completely shot,” said John Koch, a linguist at the Center for Advanced Welsh and Celtic Studies at the University of Wales.

    (tags: celts ireland history dna genetics genome carbon-dating bronze-age europe colonisation)

Links for 2016-03-18

  • GCM XMPP delivery receipt not always received – Google Groups

    Good to know:

    ‘GCM delivery receipts don’t have an SLA at this time. Having your connection open longer will increase the odds that delivery receipts will arrive. 10 seconds seems a bit short. I’m glad it works. I would recommend longer like 10 min or an hour. The real design of this system is for persistent connections, hence connections that setup and tear down frequently will have difficulty receiving delivery receipts.’

    (tags: gcm xmpp receipts messaging push-notifications google)

Links for 2016-03-16

  • The disturbingly simple way dozens of celebrities had their nude photos stolen

    Basic phishing: ‘Collins hacked over 100 people by sending emails that looked like they came from Apple and Google, such as “e-mail.protection318@icloud.com,” “noreply_helpdesk0118@outlook.com,” and “secure.helpdesk0019@gmail.com.” According to the government, Collins asked for his victims’ iCloud or Gmail usernames and passwords and “because of the victims’ belief that the email had come from their [Internet Service Providers], numerous victims responded by giving [them].”’

    (tags: security phishing nudes fappening celebs gmail icloud apple)

  • RFC 7754 – Technical Considerations for Internet Service Blocking and Filtering

    The Internet is structured to be an open communications medium. This openness is one of the key underpinnings of Internet innovation, but it can also allow communications that may be viewed as undesirable by certain parties. Thus, as the Internet has grown, so have mechanisms to limit the extent and impact of abusive or objectionable communications. Recently, there has been an increasing emphasis on “blocking” and “filtering”, the active prevention of such communications. This document examines several technical approaches to Internet blocking and filtering in terms of their alignment with the overall Internet architecture. When it is possible to do so, the approach to blocking and filtering that is most coherent with the Internet architecture is to inform endpoints about potentially undesirable services, so that the communicants can avoid engaging in abusive or objectionable communications. We observe that certain filtering and blocking approaches can cause unintended consequences to third parties, and we discuss the limits of efficacy of various approaches.
    (via Tony Finch)

    (tags: via:fanf blocking censorship filtering internet rfcs rfc isps)

  • The Three Go Landmines

    ‘There are three easy to make mistakes in go. I present them here in the way they are often found in the wild, not in the way that is easiest to understand. All three of these mistakes have been made in Kubernetes code, getting past code review at least once each that I know of.’

    (tags: k8s go golang errors coding bugs)

  • Health of purebred vs mixed breed dogs: the actual data – The Institute of Canine Biology

    This study found that purebred dogs have a significantly greater risk of developing many of the hereditary disorders examined in this study. No, mixed breed dogs are not ALWAYS healthier than purebreds; and also, purebreds are not “as healthy” as mixed breed dogs. The results of this study will surprise nobody who understands the basics of Mendelian inheritance. Breeding related animals increases the expression of genetic disorders caused by recessive mutations, and it also increases the probability of producing offspring that will inherit the assortment of genes responsible for a polygenic disorder. 
    In conclusion, go mutts.

    (tags: dogs breeding genetics hereditary-disorders science inheritance recessive-mutation data)

Links for 2016-03-15

  • DeepMind founder Demis Hassabis on how AI will shape the future | The Verge

    Good interview with Demis Hassabis on DeepMind, AlphaGo and AI:

    I’d like to see AI-assisted science where you have effectively AI research assistants that do a lot of the drudgery work and surface interesting articles, find structure in vast amounts of data, and then surface that to the human experts and scientists who can make quicker breakthroughs. I was giving a talk at CERN a few months ago; obviously they create more data than pretty much anyone on the planet, and for all we know there could be new particles sitting on their massive hard drives somewhere and no-one’s got around to analyzing that because there’s just so much data. So I think it’d be cool if one day an AI was involved in finding a new particle.

    (tags: ai deepmind google alphago demis-hassabis cern future machine-learning)

  • Before the Split

    Good post on Dublin City Council’s atrociously revisionist 1916-commemoration banner, celebrating Henry Grattan, Daniel O’Connell, Charles Stewart Parnell and John Redmond:

    The banner is not showing parliamentary nationalists who might be included in a history of 1916 (Redmond might have been joined by John Dillon and Tom Kettle, for instance), but displaying the parliamentarian tradition in Irish political history. The people chosen all worked for change via political means, whether obtaining an independent Irish parliament from 1782-1801 (Grattan), working for Catholic Emancipation (Grattan and O’Connell), land reform (Parnell), or trying to repeal the Act of Union and obtain Home Rule (O’Connell, Parnell, Redmond). All were MPs in Westminster at some point. None openly espoused physical force. None aimed at establishing an independent Irish Republic. Putting the history of parliamentarianism on a banner labelled 1916 suggests that 1916 was in the parliamentarian tradition. That suggestion is very far from the truth.

    (tags: parliamentarianism 1916 history revisionism dcc dublin politics)

  • Flow

    a static type checker for Javascript, from Facebook

    (tags: javascript code-analysis coding facebook types strong-types)

Links for 2016-03-08

  • lbzip2

    a free, multi-threaded compression utility with support for bzip2 compressed file format. lbzip2 can process standard bz2 files in parallel. It uses POSIX threading model (pthreads), which allows it to take full advantage of symmetric multiprocessing (SMP) systems. It has been proven to scale linearly, even to over one hundred processor cores. lbzip2 is fully compatible with bzip2 – both at file format and command line level. Files created by lbzip2 can be decompressed by all versions of bzip2 and other software supporting bz2 format. lbzip2 can decompress any bz2 files in parallel. All bzip2 command-line options are also accepted by lbzip2. This makes lbzip2 a drop-in replacement for bzip2.

    (tags: bzip2 gzip compression lbzip2 parallel cli tools)
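
    To get a feel for why bzip2 data can be handled in parallel at all, here is a rough sketch in Python. To be clear, this is not how lbzip2 itself is implemented; it is a simpler stand-in that compresses independent chunks on a thread pool and concatenates the resulting streams, relying on the fact that Python's bz2.decompress accepts multi-stream input. The function name, chunk size and worker count are arbitrary choices for the example.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_bz2_compress(data: bytes, chunk_size: int = 1 << 20, workers: int = 4) -> bytes:
    """Compress `data` as a series of independent bz2 streams, one per chunk."""
    # Split the input into fixed-size chunks; keep one empty chunk for empty
    # input so the output is still a valid bz2 stream.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Each chunk is compressed on its own, so the work can be spread over threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        streams = list(pool.map(bz2.compress, chunks))
    # Concatenated bz2 streams form a valid multi-stream file.
    return b"".join(streams)
```

    The trade-off of this chunked approach is a slightly worse compression ratio, since each chunk starts with no shared context.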

Links for 2016-03-07

Links for 2016-03-04

Links for 2016-03-03

  • Protect me, I am the Donnybrook laundry

    Mannix Flynn makes a persuasive case to preserve the last remaining Magdalene Laundry still standing:

    Memory is something that fights an eternal battle with the passage of time and forgetfulness.  Time is a great healer for those who can heal and those who are offered healing.  There is no healing here. Time stands still like a festering wound in a well-to-do suburb as somebody attempts to erase a grave and mortal wrong. The McAleese report, the Justice for the Magdalenes, the hundreds of women still alive and their families should know of this place.  Should be present here to witness what can only be witnessed by them.  So that they can understand what’s lost, what cannot be given.  What was taken from them for generations.

    (tags: magdalenes injustice ireland history catholic-church abuse mannix-flynn)

Links for 2016-03-01

Links for 2016-02-29

Links for 2016-02-26

  • Proportional Representation in Ireland: How it Works

    Excellent explanation of PR-STV and the Irish voting system. Don’t be a Plumper! (via John O’Shea)

    (tags: plumpers pr-stv pr voting ireland politics via:joshea)
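
    The one piece of arithmetic at the heart of a PR-STV count is the quota a candidate must reach to be elected (the Droop quota). A tiny sketch, with a made-up function name and made-up numbers:

```python
def droop_quota(valid_votes: int, seats: int) -> int:
    """Quota = floor(valid votes / (seats + 1)) + 1."""
    if valid_votes < 0 or seats < 1:
        raise ValueError("need a non-negative poll and at least one seat")
    return valid_votes // (seats + 1) + 1
```

    For a hypothetical four-seater with 50,000 valid votes, droop_quota(50000, 4) gives 10,001: five candidates cannot all reach that figure, so at most four can be elected on quota.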

  • Microsoft warns of risks to Irish operation in US search warrant case

    “Our concern is that if we lose the case more countries across Europe or elsewhere are going to be concerned about having their data in Ireland,” Mr Smith said, after testifying before the House judiciary committee. Asked what would happen to its Irish unit if the company loses the case or doesn’t convince Congress to pass updated legislation governing cross-border data held by American companies, the Microsoft executive said: “We’ll certainly face a new set of risks that we don’t face today.” He added that the issue could be resolved by an executive order by the White House or through international negotiations between the Irish Government or the European Union and the US.

    (tags: microsoft data privacy us-politics surveillance usa)

  • How To Implement Secure Bitcoin Vaults

    At the Bitcoin workshop in Barbados, Malte Möser will present our solution to the Bitcoin private key management problem. Specifically, our paper describes a way to create vaults, special accounts whose keys can be neutralized if they fall into the hands of attackers. Vaults are Bitcoin’s decentralized version of you calling your bank to report a stolen credit card — it renders the attacker’s transactions null and void. And here’s the interesting part: in so doing, vaults demotivate key theft in the first place. An attacker who knows that he will not be able to get away with theft is less likely to attack in the first place, compared to current Bitcoin attackers who are guaranteed that their hacking efforts will be handsomely rewarded.

    (tags: private-keys vaults bitcoin security crypto theft)

Links for 2016-02-25

  • Maglev: A Fast and Reliable Software Network Load Balancer

    Maglev is Google’s network load balancer. It is a large distributed software system that runs on commodity Linux servers. Unlike traditional hardware network load balancers, it does not require a specialized physical rack deployment, and its capacity can be easily adjusted by adding or removing servers. Network routers distribute packets evenly to the Maglev machines via Equal Cost Multipath (ECMP); each Maglev machine then matches the packets to their corresponding services and spreads them evenly to the service endpoints. To accommodate high and ever-increasing traffic, Maglev is specifically optimized for packet processing performance. A single Maglev machine is able to saturate a 10Gbps link with small packets. Maglev is also equipped with consistent hashing and connection tracking features, to minimize the negative impact of unexpected faults and failures on connection-oriented protocols. Maglev has been serving Google’s traffic since 2008. It has sustained the rapid global growth of Google services, and it also provides network load balancing for Google Cloud Platform.
    Something we argued for quite a lot in Amazon, back in the day….

    (tags: google paper scale ecmp load-balancing via:conall maglev lbs)
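
    The consistent hashing part is the bit worth playing with. Below is a rough Python sketch of a Maglev-style lookup table, based on my reading of the table-population scheme in the paper: each backend gets its own permutation of table slots, and backends take turns claiming their next free slot. The SHA-256 based hash, the table size of 251 and the function names are all my own assumptions for illustration, not Google's code.

```python
import hashlib


def _hash(name: str, salt: str) -> int:
    """Stable 64-bit hash (assumed here; any good hash function would do)."""
    digest = hashlib.sha256(f"{salt}:{name}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def build_table(backends, size=251):
    """Fill a lookup table of `size` slots; `size` must be prime."""
    if not backends:
        raise ValueError("need at least one backend")
    # Each backend's permutation is (offset + j * skip) mod size for j = 0, 1, ...
    offsets = [_hash(b, "offset") % size for b in backends]
    skips = [_hash(b, "skip") % (size - 1) + 1 for b in backends]
    next_j = [0] * len(backends)
    table = [None] * size
    filled = 0
    while True:
        # Round-robin: every backend claims one slot per round.
        for i, backend in enumerate(backends):
            slot = (offsets[i] + next_j[i] * skips[i]) % size
            # Skip slots already taken by other backends.
            while table[slot] is not None:
                next_j[i] += 1
                slot = (offsets[i] + next_j[i] * skips[i]) % size
            table[slot] = backend
            next_j[i] += 1
            filled += 1
            if filled == size:
                return table


def pick_backend(table, flow_key: str):
    """Map a connection identifier (e.g. a 5-tuple string) to a backend."""
    return table[_hash(flow_key, "flow") % len(table)]
```

    Because the backends take turns, each one ends up with an almost equal share of the table, and every machine that builds the table from the same backend list will send a given flow to the same backend.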

  • DIY DOG

    BrewDog releases their beer recipes for free. so cool! ‘So here it is. The keys to our kingdom. Every single BrewDog recipe, ever. So copy them, tear them to pieces, bastardise them, adapt them, but most of all, enjoy them. They are well travelled but with plenty of miles still left on the clock. Just remember to share your brews, and share your results. Sharing is caring.’

    (tags: brewing homebrew beer brewdog open-source free sharing)

  • National Children’s Science Centre due to open in 2018

    Good for science fans, not so hot for real tennis fans.

    The former real tennis court building close to the concert hall’s north wing would be used for temporary and visiting exhibitors, with a tunnel connecting it to the science centre. The National Children’s Science Centre is due to open in late 2018 and will also be known as the Exploration Station, said Dr Danny O’Hare, founding president of Dublin City University and chairman of the Exploration Station board since 2006.

    (tags: real-tennis tennis nch dublin science kids planetarium)

Links for 2016-02-18

  • Neutrino Software Load Balancer

    eBay’s software LB, supporting URL matching, comparable to haproxy, built using Netty and Scala. Used in their QA infrastructure it seems

    (tags: netty scala ebay load-balancing load-balancers url http architecture)

  • This is Why People Fear the ‘Internet of Things’

    Ugh. This is a security nightmare. Nice work Foscam…

    Imagine buying an internet-enabled surveillance camera, network attached storage device, or home automation gizmo, only to find that it secretly and constantly phones home to a vast peer-to-peer (P2P) network run by the Chinese manufacturer of the hardware. Now imagine that the geek gear you bought doesn’t actually let you block this P2P communication without some serious networking expertise or hardware surgery that few users would attempt. This is the nightmare “Internet of Things” (IoT) scenario for any system administrator: The IP cameras that you bought to secure your physical space suddenly turn into a vast cloud network designed to share your pictures and videos far and wide. The best part? It’s all plug-and-play, no configuration necessary!

    (tags: foscam cameras iot security networking p2p)

Links for 2016-02-16

  • The NSA’s SKYNET program may be killing thousands of innocent people

    Death by Random Forest: this project is a horrible misapplication of machine learning. Truly appalling, when a false positive means death:

    The NSA evaluates the SKYNET program using a subset of 100,000 randomly selected people (identified by their MSIDN/MSI pairs of their mobile phones), and a known group of seven terrorists. The NSA then trained the learning algorithm by feeding it six of the terrorists and tasking SKYNET to find the seventh. This data provides the percentages for false positives in the slide above. “First, there are very few ‘known terrorists’ to use to train and test the model,” Ball said. “If they are using the same records to train the model as they are using to test the model, their assessment of the fit is completely bullshit. The usual practice is to hold some of the data out of the training process so that the test includes records the model has never seen before. Without this step, their classification fit assessment is ridiculously optimistic.” The reason is that the 100,000 citizens were selected at random, while the seven terrorists are from a known cluster. Under the random selection of a tiny subset of less than 0.1 percent of the total population, the density of the social graph of the citizens is massively reduced, while the “terrorist” cluster remains strongly interconnected. Scientifically-sound statistical analysis would have required the NSA to mix the terrorists into the population set before random selection of a subset—but this is not practical due to their tiny number. This may sound like a mere academic problem, but, Ball said, is in fact highly damaging to the quality of the results, and thus ultimately to the accuracy of the classification and assassination of people as “terrorists.” A quality evaluation is especially important in this case, as the random forest method is known to overfit its training sets, producing results that are overly optimistic. The NSA’s analysis thus does not provide a good indicator of the quality of the method.

    (tags: terrorism surveillance nsa security ai machine-learning random-forests horror false-positives classification statistics)
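
    Ball's point about holding data out is easy to demonstrate. The sketch below is purely illustrative: it swaps in a 1-nearest-neighbour classifier (not a random forest) and feeds it synthetic records whose labels are pure noise, so there is nothing real to learn. All names and numbers are made up for the example.

```python
import random


def nearest_label(train, x):
    """Predict the label of the training record closest to x (1-NN)."""
    return min(train, key=lambda rec: abs(rec[0] - x))[1]


def accuracy(train, records):
    """Fraction of `records` whose label the model predicts correctly."""
    hits = sum(1 for x, label in records if nearest_label(train, x) == label)
    return hits / len(records)


def compare_evaluations(n=400, seed=0):
    """Return (accuracy on the training records, accuracy on held-out records)."""
    rng = random.Random(seed)
    # One random feature and a coin-flip label per record: no real signal.
    records = [(rng.random(), rng.choice([0, 1])) for _ in range(n)]
    split = n // 2
    train, held_out = records[:split], records[split:]
    return accuracy(train, train), accuracy(train, held_out)
```

    Scored on the records it was trained on, the model looks perfect because it has simply memorised them; scored on records it has never seen, it does about as well as a coin toss.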

Links for 2016-02-15

  • Lasers reveal ‘lost’ Roman roads

    UK open data success story, via Tony Finch:

    This LIDAR data bonanza has proved particularly helpful to archaeologists seeking to map Roman roads that have been ‘lost’, some for thousands of years. Their discoveries are giving clues to a neglected chapter in the history of Roman Britain: the roads built to help Rome’s legions conquer and control northern England.

    (tags: uk government lidar open-data data roman history mapping geodata)

Links for 2016-02-13

Links for 2016-02-10

Links for 2016-02-09

Links for 2016-02-08

Links for 2016-02-05

  • The science behind “don’t drink when pregnant” is rubbish

    As the economist Emily Oster pointed out in her 2013 book Expecting Better, there is also no “proven safe” level of Tylenol or caffeine, and yet both are fine in moderation during pregnancy. Oster pored through reams of research on alcohol and pregnancy for her book and concluded that there is simply no scientific evidence that light drinking during pregnancy impacts a baby’s health. (In one frequently cited 2001 study that suggested light drinking in pregnancy increases the chances of a child displaying aggressive behaviors, the drinkers were also significantly likelier to have taken cocaine during pregnancy.)
    My wife also followed the paper trail on this issue in the past. In the papers from which these recommendations were derived, effects were observed in babies only when women consumed at least *9 units every day* for the entire pregnancy. That’s an entire bottle of wine, daily!

    (tags: booze alcohol science facts papers medicine emily-oster babies pregnancy pre-pregnant research)

  • GCHQ’s Spam Problem

    ‘“Spam emails are a large proportion of emails seen in SIGINT [signals intelligence],” reads part of a dense document from the Snowden archive, published by Boing Boing on Tuesday. “GCHQ would like to reduce the impact of spam emails on data storage, processing and analysis.”’ (circa 2011). Steganography, anyone? (via Tony Finch)

    (tags: spam anti-spam gchq funny boing-boing sigint snowden surveillance)

  • ECHR: Websites not liable for readers’ comments

    ‘Lawyers for [a Hungarian news] site said the comments concerned had been taken down as soon as they were flagged. They said making their clients liable for everything readers posted “would have serious adverse repercussions for freedom of expression and democratic openness in the age of Internet”. The ECHR agreed. “Although offensive and vulgar, the incriminated comments did not constitute clearly unlawful speech; and they certainly did not amount to hate speech or incitement to violence,” the judges wrote.’

    (tags: echr law eu legal comments index-hu hungary)

  • research!rsc: Zip Files All The Way Down

    quine.zip, quine.gz, and quine.tar.gz. Here’s what happens when you mail it through bad AV software: https://twitter.com/FioraAeterna/status/694655296707297281

    (tags: zip algorithms compression quines fun hacks gzip)

  • The Nuclear Missile Sites of Los Angeles

    Great article by Geoff “bldgblog” Manaugh on the ruins of the Nike surface-to-air missile emplacements dotted around California. I had absolutely no idea that these — the 1958-era Nike-Hercules missiles, at least — carried 30-kiloton nuclear warheads, intended to be detonated at 50,000 feet *above* the cities they were defending, in order to destroy in-flight bomber formations. Nuclear war was truly bananas.

    (tags: war history la sf california nike-missiles missiles nuclear-war nike-hercules cold-war 1950s)

Links for 2016-02-03

  • Exclusive: Snowden intelligence docs reveal UK spooks’ malware checklist / Boing Boing

    This is an excellent essay from Cory Doctorow on mass surveillance in the post-Snowden era, and the difference between HUMINT and SIGINT. So much good stuff, including this (new to me) cite for “Goodhart’s law”, on secrecy as it affects adversarial classification:

    The problem with this is that once you accept this framing, and note the happy coincidence that your paymasters just happen to have found a way to spy on everyone, the conclusion is obvious: just mine all of the data, from everyone to everyone, and use an algorithm to figure out who’s guilty. The bad guys have a Modus Operandi, as anyone who’s watched a cop show knows. Find the MO, turn it into a data fingerprint, and you can just sort the firehose’s output into “terrorist-ish” and “unterrorist-ish.” Once you accept this premise, then it’s equally obvious that the whole methodology has to be kept from scrutiny. If you’re depending on three “tells” as indicators of terrorist planning, the terrorists will figure out how to plan their attacks without doing those three things. This even has a name: Goodhart’s law. “When a measure becomes a target, it ceases to be a good measure.” Google started out by gauging a web page’s importance by counting the number of links they could find to it. This worked well before they told people what they were doing. Once getting a page ranked by Google became important, unscrupulous people set up dummy sites (“link-farms”) with lots of links pointing at their pages.

    (tags: adversarial-classification classification surveillance nsa gchq cory-doctorow privacy snooping goodharts-law google anti-spam filtering spying snowden)
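
    The link-farm example is simple enough to sketch in a few lines. This is a toy version of the "count the inbound links" measure described in the quote, not Google's actual ranking; the page names and the function name are invented for illustration.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Order pages by how many (source, target) links point at them."""
    counts = Counter(target for _source, target in links)
    # Most-linked first; break ties alphabetically so the order is stable.
    return [page for page, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


# Honest links: three blogs like "good", one also links to "spam".
organic = [("blog1", "good"), ("blog2", "good"), ("blog3", "good"), ("blog1", "spam")]
# Once the measure is public, a link farm of dummy sites targets it.
farm = [(f"farm{i}", "spam") for i in range(10)]
```

    With only the organic links, "good" ranks first; add the farm and "spam" jumps to the top, which is Goodhart's law in miniature.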

Links for 2016-02-02

Links for 2016-01-30

  • Seesaw: scalable and robust load balancing from Google

    After evaluating a number of platforms, including existing open source projects, we were unable to find one that met all of our needs and decided to set about developing a robust and scalable load balancing platform. The requirements were not exactly complex – we needed the ability to handle traffic for unicast and anycast VIPs, perform load balancing with NAT and DSR (also known as DR), and perform adequate health checks against the backends. Above all we wanted a platform that allowed for ease of management, including automated deployment of configuration changes. One of the two existing platforms was built upon Linux LVS, which provided the necessary load balancing at the network level. This was known to work successfully and we opted to retain this for the new platform. Several design decisions were made early on in the project — the first of these was to use the Go programming language, since it provided an incredibly powerful way to implement concurrency (goroutines and channels), along with easy interprocess communication (net/rpc). The second was to implement a modular multi-process architecture. The third was to simply abort and terminate a process if we ended up in an unknown state, which would ideally allow for failover and/or self-recovery.

    (tags: seesaw load-balancers google load-balancing vips anycast nat lbs go ops networking)

Links for 2016-01-29