-
I like the sound of this — automated Java CMS GC tuning, kind of like a free version of JClarity’s Censum (via Miguel Ángel Pastor)
J. G. Ballard predicted social media in a 1977 essay for Vogue
‘In the intro essay to High Rise it says that J G Ballard predicted social media in a 1977 essay for Vogue. Here it is’
(tags: j-g-ballard social-media twitter instagram youtube future society vogue 1977 facebook media)
Justin's Linklog Posts
Hacked French network exposed its own passwords during TV interview
lols
(tags: passwords post-its fail tv5monde authentication security tv funny)
RADStack – an open source Lambda Architecture built on Druid, Kafka and Samza
‘In this paper we presented the RADStack, a collection of complementary technologies that can be used together to power interactive analytic applications. The key pieces of the stack are Kafka, Samza, Hadoop, and Druid. Druid is designed for exploratory analytics and is optimized for low latency data exploration, aggregation, and ingestion, and is well suited for OLAP workflows. Samza and Hadoop complement Druid and add data processing functionality, and Kafka enables high throughput event delivery.’
(tags: druid samza kafka streaming cep lambda-architecture architecture hadoop big-data olap)
-
an asynchronous Netty based graphite proxy. It protects Graphite from the herds of clients by minimizing context switches and interrupts; by batching and aggregating metrics. Gruffalo also allows you to replicate metrics between Graphite installations for DR scenarios, for example. Gruffalo can easily handle a massive amount of traffic, and thus increase your metrics delivery system availability. At Outbrain, we currently handle over 1700 concurrent connections, and over 2M metrics per minute per instance.
(tags: graphite backpressure metrics outbrain netty proxies gruffalo ops)
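A rough sketch of the batching-and-aggregating idea, for illustration only. This is not Gruffalo's code (which is Java on Netty); the function names and the "average per metric, keep the latest timestamp" policy are my own assumptions. The output uses Graphite's plaintext line format of path, value and timestamp.

```python
from collections import defaultdict


def aggregate_batch(samples):
    """Collapse (name, value, timestamp) samples into one sample per name.

    Hypothetical policy: average the values and keep the latest timestamp.
    """
    grouped = defaultdict(list)
    for name, value, timestamp in samples:
        grouped[name].append((value, timestamp))
    batch = []
    for name in sorted(grouped):
        values = [v for v, _ in grouped[name]]
        latest = max(t for _, t in grouped[name])
        batch.append((name, sum(values) / len(values), latest))
    return batch


def to_plaintext(batch):
    """Render a batch as newline-terminated 'path value timestamp' lines."""
    return "".join(f"{name} {value} {ts}\n" for name, value, ts in batch)
```

One write of the rendered batch replaces many small writes, which is where the reduction in context switches would come from.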
Privacy Security Talk in TOG – 22nd April @ 7pm – FREE
Dublin is lucky enough to have great speakers pass through town on occasion and on Wednesday the 22nd April 2015, Runa A. Sandvik (@runasand) and Per Thorsheim (@thorsheim) have kindly offered to speak in TOG from 7pm. The format for the evening is a general meet and greet, but both speakers have offered to give a presentation on a topic of their choice. Anyone interested in privacy, security, journalism, Tor and/or has previously attended a CryptoParty would be wise to attend. Doors are from 7pm and bring any projects you would like to share with other attendees. This is a free event, open to the public and no need to book. See you Wednesday. Runa A. Sandvik is an independent privacy and security researcher, working at the intersection of technology, law and policy. She contributes to The Tor Project, writes for Forbes, and is a technical advisor to both the Freedom of the Press Foundation and the TrueCrypt Audit project. Per Thorsheim is founder/organizer of PasswordsCon.org; his topic of choice is of course passwords, but in a much bigger context than most people imagine. Passwords, pins, biometrics, 2-factor authentication, security/usability and all the way into surveillance and protecting your health, kids and life itself.
(tags: privacy security runa-sandvik per-thorsheim passwords tor truecrypt tog via:oisin events dublin)
-
‘NSW officials seemed more interested in protecting their reputations than the integrity of elections. They sharply criticized Halderman and Teague, rather than commending them, for their discovery of the FREAK attack vulnerability. The Chief Information Officer of the Electoral Commission, Ian Brightwell, claimed Halderman and Teague’s discovery was part of efforts by “well-funded, well-managed anti-internet voting lobby groups,” an apparent reference to our friends at VerifiedVoting.org, where Halderman and Teague are voluntary Advisory Board members. Yet at the same time, Brightwell concluded that it was indeed possible that votes were manipulated.’
(tags: freak security vulnerabilities exploits nsw australia internet-voting vvat voting online-voting eff)
Sheets of Glass Cut into Layered Ocean Waves by Ben Young
I particularly love “Rough Waters” — amazing stuff from this kiwi artist
Working Time, Knowledge Work and Post-Industrial Society: Unpredictable Work – Aileen O’Carroll
my friend Aileen has written a book — looks interesting:
I will argue that a key feature of working time within high-tech industries is unpredictability, which alters the way time is experienced and perceived. It affects all aspects of time, from working hours to work organisation, to career, to the distinction between work and life. Although many desire variety in work and the ability to control working hours, unpredictability causes dissatisfaction.
On Amazon.co.uk at: http://www.amazon.co.uk/Working-Time-Knowledge-Post-Industrial-Society-ebook/dp/B00VILIN4U
(tags: books reading time work society tech working-hours job life sociology)
Introducing Vector: Netflix’s On-Host Performance Monitoring Tool
It gives pinpoint real-time performance metric visibility to engineers working on specific hosts — basically sending back system-level performance data to their browser, where a client-side renderer turns it into a usable dashboard. Essentially the idea is to replace having to ssh onto instances, run “top”, systat, iostat, and so on.
(tags: vector netflix performance monitoring sysstat top iostat netstat metrics ops dashboards real-time linux)
When S3’s eventual consistency is REALLY eventual
a consistency outage in S3 last year, resulting in about 40 objects failing read-after-write consistency for a duration of about 23 hours
(tags: s3 eventual-consistency aws consistency read-after-writes bugs outages stackdriver)
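For the usual short inconsistency windows, the standard client-side mitigation is to retry the read with backoff. A minimal sketch, assuming a hypothetical `fetch` callable that returns None while the object is not yet visible; no real S3 client API is used here, and the parameter names are mine.

```python
import time


def read_with_retry(fetch, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call fetch() until it returns a non-None value, backing off exponentially.

    fetch is a hypothetical callable that returns None while the object is
    not yet visible. Raises LookupError if the object never appears.
    """
    for attempt in range(attempts):
        value = fetch()
        if value is not None:
            return value
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    raise LookupError(f"object not visible after {attempts} attempts")
```

Obviously no sane retry budget covers a 23-hour window; at that point the caller needs to treat it as an outage rather than keep polling.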
What is maximum Amazon S3 replication time on file upload? – Stack Overflow
Netflix note a 7 hour consistency delay
(tags: netflix aws s3 consistency eventual-consistency bugs outages)
S3’s “s3-external-1.amazonaws.com” endpoint
public documentation of how to work around the legacy S3 multi-region replication behaviour in North America
(tags: aws s3 eventual-consistency consistency us-east replication workarounds legacy)
A collection of links for streaming algorithms and data structures
Good link-list from Debasish Ghosh
(tags: algorithms streaming big-data streams hll probabilistic data-structures frequency counting sketches cuckoo-filters bloom-filters minhash count-min)
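To give a flavour of what is on a list like this, here is a small count-min sketch, one of the structures named in the tags. It is a teaching sketch under my own assumptions (SHA-256 salted with the row number as the hash family, arbitrary default sizes), not code taken from any of the linked resources.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter; estimates never undercount."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _indexes(self, item):
        # One hash per row, derived by salting the item with the row number.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._indexes(item):
            self.rows[row][col] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the tightest bound.
        return min(self.rows[row][col] for row, col in self._indexes(item))
```

Memory stays fixed at width times depth counters no matter how many distinct items stream past, at the cost of occasional overestimates.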
(SEC307) Building a DDoS-Resilient Architecture with AWS
good slides on a “web application firewall” proxy service, deployable as an auto-scaling EC2 unit
(tags: ec2 aws ddos security resilience slides reinvent firewalls http elb)
Germanwings flight 4U9525: what’s it like to listen to a black box recording?
After every air disaster, finding the black box recorder becomes the first priority – but for the crash investigators who have to listen to the tapes of people’s final moments, the experience can be incredibly harrowing.
(tags: flight disasters metrics recording germanwings air-travel black-box-recorder flight-data-recorder death)
Small claims triumph as aerial photographer routs flagrant infringers
This is great news. Flagrant copyright infringement of an aerial photograph penalised to the order of UKP 2,716
(tags: copyright infringement small-claims law uk webb-aviation photography images)
Bad data PR: how the NSPCC sunk to a new low in data churnalism
when the NSPCC sent out a press release saying that one in ten 12-13 year olds [in the UK] are worried that they are addicted to porn and 12% have participated in sexually explicit videos, dozens of journalists appear to have simply played along – despite there being no report and little explanation of where the figures came from. [….] “It turns out the study was conducted by a “creative market research” [ie. pay-per-survey] group called OnePoll. “Generate content and news angles with a OnePoll PR survey, and secure exposure for your brand,” reads the company’s blurb. “Our PR survey team can help draft questions, find news angles, design infographics, write and distribute your story.” “The OnePoll survey included just 11 multiple-choice questions, which could be filled in online. Children were recruited via their parents, who were already signed up to OnePoll.”
The NSPCC spends 25 million UKP per year on “child protection advice and awareness”, so they have the money to do this right. Disappointing.
(tags: nspcc bad-science bad-data methodology surveys porn uk kids addiction onepoll pr market-research)
Stack Overflow Developer Survey 2015
wow, 52.5% of developers prefer a dark IDE theme?!
(tags: coding jobs work careers software stack-overflow surveys)
Gil Tene’s “usual suspects” to reduce system-level hiccups/latency jitters in a Linux system
Based on empirical evidence (across many tens of sites thus far) and note-comparing with others, I use a list of “usual suspects” that I blame whenever they are not set to my liking and system-level hiccups are detected. Getting these settings right from the start often saves a bunch of playing around (and no, there is no “priority” to this – you should set them all right before looking for more advice…).
(tags: performance latency hiccups gil-tene tuning mechanical-sympathy hyperthreading linux ops)
-
I think that materiality means what it says, and if people or algorithms do dumb things with trivial information that’s their problem. But markets are a lot faster and more literal than they were when the materiality standard was created, and I wonder whether regulators or courts will one day decide that materiality is too reasonable a standard for modern markets. The materiality standard depends on the reasonable investor, and in many important contexts the reasonable investor has been replaced by a computer.
(tags: algorithms trading stock stock-market sec materiality april-fools-day tesla investing jokes)
Time Series Metrics with Cassandra
slides from Chris Maxwell of Ubiquiti Networks describing what he had to do to get cyanite on Cassandra handling 30k metrics per second; an experimental “Date-tiered compaction” mode from Spotify was essential from the sounds of it. Very complex :(
(tags: cassandra spotify date-tiered-compaction metrics graphite cyanite chris-maxwell time-series-data)
-
you can use 2-liter carbonated drink bottles to build an inexpensive, reusable water rocket. The thrill factor is surprisingly high, and you can fly them all day long for the cost of a little air and water. It’s the perfect thing for those times when you just want to head down to the local soccer field and shoot off some rockets!
Outages, PostMortems, and Human Error 101
Good basic pres from John Allspaw, covering the basics of tier-one tech incident response — defining the 5 severity levels; root cause analysis techniques (to Five-Whys or not); and the importance of service metrics
(tags: devops monitoring ops five-whys allspaw slides etsy codeascraft incident-response incidents severity root-cause postmortems outages reliability techops tier-one-support)
Twitter’s new anti-harassment filter
Twitter is calling it a “quality filter,” and it’s been rolling out to verified users running Twitter’s iOS app since last week. It appears to work much like a spam filter, except instead of hiding bots and copy-paste marketers, it screens “threats, offensive language, [and] duplicate content” out of your notifications feed.
via Nelson
(tags: via:nelson harassment spam twitter gamergame abuse ml)
5% of Google visitors have ad-injecting malware installed
Ad injectors were detected on all operating systems (Mac and Windows), and web browsers (Chrome, Firefox, IE) that were included in our test. More than 5% of people visiting Google sites have at least one ad injector installed. Within that group, half have at least two injectors installed and nearly one-third have at least four installed.
via Nelson.
(tags: via:nelson ads google chrome ad-injectors malware scummy)
-
The horrors of monkey-patching:
I call out the Honeybadger gem specifically because it was the most recent time I’d been bit by a seemingly good thing promoted in the community: monkey patching third party code. Now I don’t fault Honeybadger for making their product this way. It provides their customers with direct business value: “just require ‘honeybadger’ and you’re done!” I don’t agree with this sort of practice. [….] I distrust everything [in Ruby] but a small set of libraries I’ve personally vetted or are authored by people I respect. Why is this important? Without a certain level of scrutiny you will introduce odd and hard to reproduce bugs. This is especially important because Ruby offers you absolutely zero guarantee whatever the state your program is when a given method is dispatched. Constants are not constants. Methods can be redefined at run time. Someone could have written a time sensitive monkey patch to randomly undefine methods from anything in ObjectSpace because they can. This example is so horribly bad that no one should ever do it, but the programming language allows this. Much worse, this code could be arbitrarily injected by some transitive dependency (do you even know what yours are?).
(tags: ruby monkey-patching coding reliability bugs dependencies libraries honeybadger sinatra)
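The quoted post is about Ruby, but the hazard is easy to reproduce in any language with open classes. A sketch in Python, purely for illustration: `Mailer`, `install_patch` and `remove_patch` are hypothetical names, standing in for a third-party class and for a library that patches it as a side effect of being loaded.

```python
class Mailer:
    """Stands in for a third-party class that your code depends on."""

    def send(self, to):
        return f"sent to {to}"


def install_patch():
    """Stands in for a library that patches Mailer as an import side effect."""
    original = Mailer.send

    def patched(self, to):
        # Silently changes behaviour for every caller in the process.
        return original(self, to).upper()

    Mailer.send = patched
    return original


def remove_patch(original):
    """Restore the method that was in place before install_patch() ran."""
    Mailer.send = original
```

Nothing at the call site changes, and existing instances are affected too, which is exactly why the resulting bugs are so hard to track down.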
Science is in crisis and scientists have lost confidence in Government policy
Excellent op-ed from Dr David McConnell, fellow emeritus of TCD’s Smurfit Institute of Genetics: ‘Ireland should once again foster, by competition, a good number of experienced, reputable people, of all ages, who have ideas about solving major scientific questions. These people are an essential part of the foundation of our science-based economy and society. Too many of them are no longer eligible for funding by SFI; too few are being appointed by the universities; and fewer PhDs are being awarded. The writing is on the wall.’
Salutin’ Putin: inside a Russian troll house | World news | The Guardian
file under grim meathook future
(tags: grim-meathook-future guardian russia trolls social-media media censorship livejournal ideology social-control)
-
As a result of a joint investigation of the events surrounding this incident by Google and CNNIC, we have decided that the CNNIC Root and EV CAs will no longer be recognized in Google products.
(tags: cnnic certs ssl tls security certificates pki chrome google)
Llamasoft 8-bit game images now available for download
legal! go Jeff Minter
(tags: jeff-minter llamasoft yaks games history c=64 commodore vic-20 emulation via:shane)
Cassandra remote code execution hole (CVE-2015-0225)
Ah now lads.
Under its default configuration, Cassandra binds an unauthenticated JMX/RMI interface to all network interfaces. As RMI is an API for the transport and remote execution of serialized Java, anyone with access to this interface can execute arbitrary code as the running user.
The Definitive Guide to the Music of The Big Lebowski | LA Weekly
definitive! (via Shero)
(tags: via:shero music the-big-lebowski la-weekly the-dude movies soundtracks)
Reactive Programming for a demanding world
“building event-driven and responsive applications with RxJava”, slides by Mario Fusco. Good info on practical Rx usage in Java
(tags: rxjava rx reactive coding backpressure streams observables)
Chinese authorities compromise millions in cyberattacks
“[The] Great Firewall [of China] has switched from being a passive, inbound filter to being an active and aggressive outbound one.”
(tags: china great-firewall censorship cyberwarfare github ddos baidu future)
Avro, mail # dev – bytes and fixed handling in Python implementation – 2014-09-04, 22:54
More Avro trouble with “bytes” fields! Avoid using “bytes” fields in Avro if you plan to interoperate with either of the Python implementations; they both fail to marshal them into JSON format correctly. This is the official “avro” library, which produces UTF-8 errors when a non-UTF-8 byte is encountered
tebeka / fastavro / issues / #11 – fastavro breaks dumping binary fixed [4] — Bitbucket
The Python “fastavro” library cannot correctly render “bytes” fields. This is a bug, and the maintainer is acting in a really crappy manner in this thread. Avoid this library
(tags: fastavro fail bugs utf-8 bytes encoding asshats open-source python)
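A short demonstration of why arbitrary "bytes" values and UTF-8 do not mix. As far as I understand the Avro specification, its JSON encoding represents bytes as a string in which each byte maps to the Unicode code point of the same number (effectively ISO-8859-1), so the helpers below follow that rule. The helper names are hypothetical and neither Python Avro library is used.

```python
import json

raw = bytes([0x00, 0x7F, 0xFF])  # 0xFF can never appear in valid UTF-8


def bytes_to_avro_json(value):
    """Map each byte to the code point with the same number (ISO-8859-1)."""
    return value.decode("iso-8859-1")


def avro_json_to_bytes(text):
    """Inverse mapping: code points 0-255 back to single bytes."""
    return text.encode("iso-8859-1")


def roundtrip(value):
    """Serialize to JSON text and back, returning the recovered bytes."""
    document = json.dumps({"payload": bytes_to_avro_json(value)})
    return avro_json_to_bytes(json.loads(document)["payload"])
```

Calling `raw.decode("utf-8")` on the same value raises UnicodeDecodeError, which looks like the failure mode described above.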
A Team of Biohackers Has Figured Out How to Inject Your Eyeballs With Night Vision
Did it work? Yes. It started with shapes, hung about 10 meters away. “I’m talking like the size of my hand,” Licina says. Before long, they were able to do longer distances, recognizing symbols and identifying moving subjects against different backgrounds. “The other test, we had people go stand in the woods,” he says. “At 50 meters, we could figure out where they were, even if they were standing up against a tree.” Each time, Licina had a 100% success rate. The control group, without being dosed with Ce6, only got them right a third of the time.
Well, that’s some risky biohacking. wow
(tags: biohacking scary night-vision eyes chlorin-e6 infravision sfm)
Tim Bray on one year as an xoogler
Seems pretty insightful; particularly “I do think the Internet economy would be better and more humane if it didn’t have a single white-hot highly-overprivileged center. Also, sooner or later that’ll stop scaling. Can’t happen too soon.”
(tags: google tim-bray via:nelson xoogler funding tech privacy ads internet)
How I doubled my Internet speed with OpenWRT
File under “silly network hacks”:
Comcast has an initiative called Xfinity WiFi. When you rent a cable modem/router combo from Comcast (as one of my nearby neighbors apparently does), in addition to broadcasting your own WiFi network, it is kind enough to also broadcast “xfinitywifi,” a second “hotspot” network metered separately from your own.
By using his Buffalo WZR-HP-AG300H router’s extra radio, he can load-balance across both his own paid-for connection, and the XFinity WiFi free one. ;)
(tags: comcast diy networking openwrt routing home-network hacks xfinity-wifi buffalo)
Unlocking the Power of Stable Teams with Twitter’s SVP of Engineering – First Round Review
Huh. we do this in Swrve — we call them “feature teams”
(tags: feature-team culture development teams coding twitter work teamwork)
How We Scale VividCortex’s Backend Systems – High Scalability
Excellent post from Baron Schwartz about their large-scale, 1-second-granularity time series database storage system
(tags: time-series tsd storage mysql sql baron-schwartz ops performance scalability scaling go)
-
if (creation && object of art && algorithm && one's own algorithm) { include * an algorist * } elseif (!creation || !object of art || !algorithm || !one's own algorithm) { exclude * not an algorist * }
(tags: algorism algorithm art algorists via:belongio)
Nelson’s advice on basic stock option questions
Good advice, and short
(tags: stock share-options shares stock-options via:nelson employment jobs compensation)
-
Race conditions, and errors at startup, seem to be particularly problematic
(tags: race-conditions startup bugs failure fault-tolerance hbase redis reliability ops papers concurrency exception-handling cassandra hdfs mapreduce)
You Cannot Have Exactly-Once Delivery
Cut out and keep:
Within the context of a distributed system, you cannot have exactly-once message delivery. Web browser and server? Distributed. Server and database? Distributed. Server and message queue? Distributed. You cannot have exactly-once delivery semantics in any of these situations.
(tags: distributed distcomp exactly-once-delivery networking outages network-partitions byzantine-generals reference)
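The practical workaround is usually at-least-once delivery plus an idempotent receiver. A minimal sketch of the receiving side, assuming each message carries a producer-assigned unique ID; the class and its fields are hypothetical, and a real system would need the seen-set to be durable and updated atomically with the side effect.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed by a producer-assigned ID."""

    def __init__(self):
        self.seen = set()  # in-memory only; must be durable in a real system
        self.total = 0

    def handle(self, message_id, amount):
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self.seen:
            return False  # redelivery: safe to acknowledge and ignore
        self.seen.add(message_id)
        self.total += amount
        return True
```

This does not give exactly-once delivery; it gives exactly-once processing as long as the IDs are trustworthy, which is the best you can hope for.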
What’s confusing about Kafka: a list
At a recent call, Neha said “The most confusing behavior we have is how producing to a topic can return errors for few seconds after the topic was already created”. As she said that, I remembered that indeed, this was once very confusing, but then I got used to it. Which got us thinking: What other things that Kafka does are very confusing to new users, but we got so used to them that we no longer even see the issue?
-
This is the second part of our guide on streaming data and Apache Kafka. In part one I talked about the uses for real-time data streams and explained our idea of a stream data platform. The remainder of this guide will contain specific advice on how to go about building a stream data platform in your organization.
tl;dr: limit the number of Kafka clusters; use Avro.
(tags: architecture kafka storage streaming event-processing avro schema confluent best-practices tips)
The Four Month Bug: JVM statistics cause garbage collection pauses (evanjones.ca)
Ugh, tying GC safepoints to disk I/O? bad idea:
The JVM by default exports statistics by mmap-ing a file in /tmp (hsperfdata). On Linux, modifying a mmap-ed file can block until disk I/O completes, which can be hundreds of milliseconds. Since the JVM modifies these statistics during garbage collection and safepoints, this causes pauses that are hundreds of milliseconds long. To reduce worst-case pause latencies, add the -XX:+PerfDisableSharedMem JVM flag to disable this feature. This will break tools that read this file, like jstat.
Gradle Team Perspective on Bazel
interesting.
(tags: gradle bazel build dependencies compilation coding java)
(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices
good slides for EMR tuning from re:Invent 2014
-
LOL. grepping commit logs for /bug|fix/ does the job, apparently:
In the literature, Rahman et al. found that a very cheap algorithm actually performs almost as well as some very expensive bug-prediction algorithms. They found that simply ranking files by the number of times they’ve been changed with a bug-fixing commit (i.e. a commit which fixes a bug) will find the hot spots in a code base. Simple! This matches our intuition: if a file keeps requiring bug-fixes, it must be a hot spot because developers are clearly struggling with it.
(tags: bugs rahman-algorithm heuristics source-code-analysis coding algorithms google static-code-analysis version-control)
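The heuristic as quoted is simple enough to sketch in a few lines. This assumes commits are already available as (message, list-of-paths) pairs and uses a crude regex to spot bug-fixing commits; both the input shape and the regex are my assumptions, not the implementation described in the article.

```python
import re
from collections import Counter

# Crude classifier: any word starting with "bug" or "fix" marks a bug-fix commit.
BUGFIX = re.compile(r"\b(bug|fix)", re.IGNORECASE)


def rank_hot_spots(commits):
    """Rank files by how many bug-fixing commits touched them.

    commits is an iterable of (message, [paths]) pairs (hypothetical shape).
    """
    counts = Counter()
    for message, paths in commits:
        if BUGFIX.search(message):
            counts.update(set(paths))  # count each file once per commit
    return counts.most_common()
```

The regex will also match words like "fixture", so a real tool would want a better signal, such as links to a bug tracker.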
Build in the Cloud: Accessing Source Code
Google reinvented ClearCase
Cross-Region Replication for Amazon S3
Amazing it took so long
(tags: s3 replication cross-region inter-region aws storage)
ECJ case debates EU citizens’ right to privacy
The US wields secretive and indiscriminate powers to collect data, he said, and had never offered Brussels any commitments to guarantee EU privacy standards for its citizens’ data. On the contrary, said [Max Schrems’ counsel] Mr Hoffmann, “Safe Harbour” provisions could be overruled by US domestic law at any time. Thus he asked the court for a full judicial review of the “illegal” Safe Harbour principles which, he said, violated the essence of privacy and left EU citizens “effectively stripped of any protection”. [Irish] DPC counsel Paul Anthony McDermott SC suggested that Mr Schrems had not been harmed in any way by the status quo. “This is not surprising, given that the NSA isn’t currently interested in the essays of law students in Austria,” he said. Mr Travers for Mr Schrems disagreed, saying “the breach of the right to privacy is itself the harm”.
(tags: ireland dpc data-protection privacy eu ec ecj law rights safe-harbour)
EU-US data pact skewered in court hearing
A lawyer for the European Commission told an EU judge on Tuesday (24 March) he should close his Facebook page if he wants to stop the US snooping on him, in what amounts to an admission that Safe Harbour, an EU-US data protection pact, doesn’t work.
(tags: safe-harbour privacy data-protection ecj eu ec surveillance facebook nsa gchq)
devbook/README.md at master · barsoom/devbook
How to avoid the shitty behaviour of ActiveRecord wrt migration safety, particularly around removing/renaming columns. ugh, ActiveRecord
(tags: activerecord fail rails mysql sql migrations databases schemas releasing)
Papa’s Maze 2.0: a father’s beautifully intricate puzzle for his daughter
Working in a similar fashion – drawing small portions each day – it took Mr. Nomura about 2 months to complete his new maze. And in our humble opinion, we think it’s actually just as beautiful, if not more. It’s not quite as dense and the crisper lines make it easier to perceive the interesting patterns that the maze forms. It’s stunning in graphic quality but it’s also a functioning solvable maze, just like its predecessor. Say hello to Papa’s Maze 2.0. It’s available as a print for $30.
The official REST Proxy for Kafka
The REST Proxy is an open source HTTP-based proxy for your Kafka cluster. The API supports many interactions with your cluster, including producing and consuming messages and accessing cluster metadata such as the set of topics and mapping of partitions to brokers. Just as with Kafka, it can work with arbitrary binary data, but also includes first-class support for Avro and integrates well with Confluent’s Schema Registry. And it is scalable, designed to be deployed in clusters and work with a variety of load balancing solutions. We built the REST Proxy first and foremost to meet the growing demands of many organizations that want to use Kafka, but also want more freedom to select languages beyond those for which stable native clients exist today. However, it also includes functionality beyond traditional clients, making it useful for building tools for managing your Kafka cluster. See the documentation for a more detailed description of the included features.
(tags: kafka rest proxies http confluent queues messaging streams architecture)
-
‘Caffeine is a Java 8 based concurrency library that provides specialized data structures, such as a high performance cache.’
(tags: cache java8 java guava caching concurrency data-structures coding)
Combining static model checking with dynamic enforcement using the Statecall Policy Language
This looks quite nice — a model-checker “for regular programmers”. Example model for ping(1):
automaton ping (int max_count, int count, bool can_timeout) {
  Initialize;
  during {
    count = 0;
    do {
      Transmit_Ping;
      either {
        Receive_Ping;
      } or (can_timeout) {
        Timeout_Ping;
      };
      count = count + 1;
    } until (count >= max_count);
  } handle {
    SIGINFO;
    Print_Summary;
  };
}
(tags: ping model-checking models formal-methods verification static dynamic coding debugging testing distcomp papers)
-
good review
(tags: cdt replication distcomp voldemort dynamo riak storage papers)
-
Google open sources a key part of their internal build system (internally called “Blaze” it seems for a while). Very nice indeed!
(tags: blaze bazel build-tools building open-source google coding packaging)
-
a Nix-based continuous build system, released under the terms of the GNU GPLv3 or (at your option) any later version. It continuously checks out sources of software projects from version management systems to build, test and release them. The build tasks are described using Nix expressions. This allows a Hydra build task to specify all the dependencies needed to build or test a project. It supports a number of operating systems, such as various GNU/Linux flavours, Mac OS X, and Windows.
-
“tees” all TCP traffic from one server to another. “widely used by companies in China”!
(tags: testing benchmarking performance tcp ip tcpcopy tee china regression-testing stress-testing ops)
Managing private Nix packages outside the Nixpkgs tree
Useful for private-repo Nix usage
Top 10 AWS Security Best Practices: #6 – Rotate all the Keys Regularly
Good doc on how to perform key rotation in AWS
[Nix-dev] Pulling a programs source code from a git repo
Nix supports building from git sha. excellent
Transparent huge pages implicated in Redis OOM
A nasty real-world prod error scenario worsened by THPs:
jemalloc(3) extensively uses madvise(2) to notify the operating system that it’s done with a range of memory which it had previously malloc’ed. The page size on this machine is 2MB because transparent huge pages are in use. As such, a lot of the memory which is being marked with madvise(…, MADV_DONTNEED) is within substantially smaller ranges than 2MB. This means that the operating system never was able to evict pages which had ranges marked as MADV_DONTNEED because the entire page has to be unneeded to allow a page to be reused. Despite initially looking like a leak, the operating system itself was unable to free memory because of madvise(2) and transparent huge pages. This led to sustained memory pressure on the machine and redis-server eventually getting OOM killed.
(tags: oom-killer oom linux ops thp jemalloc huge-pages madvise redis memory)
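The arithmetic behind that explanation can be sketched as follows. This is a simplified model of the quoted description (a page is only reclaimable if the advised range covers all of it), not a statement of exact kernel behaviour, and the helper name is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB


def whole_pages_covered(start, length, page_size):
    """Count pages of page_size bytes lying entirely inside [start, start+length)."""
    first = -(-start // page_size)        # round the start up to a page boundary
    last = (start + length) // page_size  # round the end down to a page boundary
    return max(0, last - first)
```

Under this model a 1 MiB range starting at offset 0 covers 256 whole 4 KiB pages but zero whole 2 MiB pages, so with huge pages nothing in that range would become reclaimable.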
AllCrypt hacked, via PHP, WordPress, and the marketing director’s email
critical flaw: gaining access to the MySQL db let the attacker manipulate account balances. oh dear
-
‘inspires kids to explore and learn about science, engineering, and technology—and have fun doing it. Every month, a new crate to help kids develop a tinkering mindset and creative problem solving skills.’ aimed at ages 9-14+
(tags: kids gifts tinkering stem education fun engineering science toys)
-
Some nice performance tricks; I particularly like the use of sljit:
Ag uses Pthreads to take advantage of multiple CPU cores and search files in parallel. Files are mmap()ed instead of read into a buffer. Literal string searching uses Boyer-Moore strstr. Regex searching uses PCRE’s JIT compiler (if Ag is built with PCRE >=8.21). Ag calls pcre_study() before executing the same regex on every file. Instead of calling fnmatch() on every pattern in your ignore files, non-regex patterns are loaded into arrays and binary searched.
(tags: jit cli grep search ack ag unix pcre sljit boyer-moore tools)
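The ignore-file trick is a nice one: literal patterns need no glob matching, so they can be sorted once and looked up by binary search. A sketch of that idea in Python, with hypothetical function names; Ag itself is written in C and its actual data structures may differ.

```python
from bisect import bisect_left


def build_ignore_index(patterns):
    """Sort the literal (non-glob) ignore entries once, up front."""
    return sorted(set(patterns))


def is_ignored(index, filename):
    """Binary-search the sorted list instead of testing every pattern."""
    pos = bisect_left(index, filename)
    return pos < len(index) and index[pos] == filename
```

Each lookup is O(log n) string comparisons instead of n fnmatch() calls, which adds up when every file in a large tree is checked.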
Richard Stallman’s GNU Manifesto Turns Thirty
nice New Yorker profile of rms
-
Thought-provoking article looking back to John Perry Barlow’s “A Declaration of the Independence of Cyberspace”, published in 1996:
Barlow once wrote that “trusting the government with your privacy is like having a Peeping Tom install your window blinds.” But the Barlovian focus on government overreach leaves its author and other libertarians blind to the same encroachments on our autonomy from the private sector. The bold and romantic techno-utopian ideals of “A Declaration” no longer need to be fought for, because they’re already gone.
(tags: john-perry-barlow 1990s history cyberspace internet surveillance privacy data-protection libertarianism utopian manifestos)
The Terrible Technical Interview
TechCrunch, very down on the traditional big-O-and-whiteboard tech interview. See also https://news.ycombinator.com/item?id=9243169 for some good comments at HN. To be honest I think a good comprehension of data structures and big-O is pretty vital though….
(tags: interviewing jobs management hr hiring techcrunch)
The myopia boom seems to be due to spending too much time indoors
via Tony Finch
(tags: eyes health neuroscience science vision nature myopia short-sightedness)
-
Some neat new features for Mark Fletcher’s mailing-lists-as-a-service site: Markdown support, manageable archives (GREAT feature!), subgroups, calendars, files and wiki.
(tags: wiki email mailman mailing-lists mlm markdown mark-fletcher groups.io collaboration)
Stairs to nowhere, trap streets, and other Toronto oddities
‘There’s a set of stairs on Greenwood Avenue that lead nowhere. At the top, a wooden fence at the end of someone’s back yard blocks any further movement, forcing the climber to turn around and descend back to the street. What’s remarkable about the pointless Greenwood stairs, which were built in 1959 as a shortcut to a now-demolished brickyard, is that someone still routinely maintains them: in winter, some kindly soul deposits a scattering of salt lest one of the stairs’ phantom users slip; in summer someone comes with a broom to sweep away leaves. These urban leftovers are lovingly called “Thomassons” after Gary Thomasson, a former slugger for the San Francisco Giants, Oakland As, Yankees, Dodgers, and, most fatefully, the Yomiuri Giants in Tokyo.’
(tags: trap-streets maps ip google via:bldgblog mapping copyright thomassons orphaned-roads)
President’s message gets lost in (automated) translation
In a series of bizarre translations, YouTube’s automated translation service took artistic licence with the [President’s] words of warmth. When the head of state sent St Patrick’s Day greetings to viewers, the video sharing site said US comedian Tina Fey was being “particular with me head”. As President Higgins spoke of his admiration for Irish emigrants starting new communities abroad, YouTube said the President referenced blackjack and how he “just couldn’t put the new iPhone” down. And, in perhaps the most unusual moment, as he talked of people whose hearts have sympathy, the President “explained” he was once on a show “that will bar a gift card”.
(via Daragh O’Brien)
(tags: lol president ireland michael-d-higgins automation translation machine-learning via:daraghobrien funny blackjack iphone tina-fey st-patrick fail)
Irish government under fire for turning its back on basic research : Nature News & Comment
Pretty much ALL of Ireland’s research scientists have put their names to an open letter to the Irish government, decrying the state of science funding, published this week in “Nature”. ‘Although total spending on research and development grew through the recession, helped by foreign investments, Ireland’s government has cut state spending on research (see ‘Celtic tiger tamed’). It also prioritized grants in 14 narrow areas — ones in which either large global markets exist, or in which Irish companies are competitive. These include marine renewable energy, smart grids, medical devices and computing. The effect has been to asphyxiate the many areas of fundamental science — including astrophysics, particle physics and areas of the life sciences — that have been deprived of funding, several researchers in Ireland told Nature. “The current policies are having a very significant detrimental effect on the health and viability of the Irish scientific ecosystem,” says Kevin Mitchell, a geneticist who studies the basis of neurological disorders at Trinity College Dublin. “Research that cannot be shoehorned into one of the 14 prioritized areas has been ineligible for most funding,” he says.’ That’s another fine mess Sean Sherlock has gotten us into :(
(tags: sean-sherlock fail ireland research government funding grants science tcd kevin-mitchell life-sciences nature)
Mars One finalist Dr. Joseph Roche rips into the project
So, here are the facts as we understand them: Mars One has almost no money. Mars One has no contracts with private aerospace suppliers who are building technology for future deep-space missions. Mars One has no TV production partner. Mars One has no publicly known investment partnerships with major brands. Mars One has no plans for a training facility where its candidates would prepare themselves. Mars One’s candidates have been vetted by a single person, in a 10-minute Skype interview. “My nightmare about it is that people continue to support it and give it money and attention, and it then gets to the point where it inevitably falls on its face,” said Roche. If, as a result, “people lose faith in NASA and possibly even in scientists, then that’s the polar opposite of what I’m about. If I was somehow linked to something that could do damage to the public perception of science, that is my nightmare scenario.”
(tags: science space mars-one tcd joseph-roche nasa mars exploration scams)
Stu Hood and Brian Degenhardt, Scala at Twitter, SF Scala @Twitter 20150217
‘Stu Hood and Brian Degenhardt talk about the history of Scala at Twitter, from inception until today, covering 2.10 migration, the original Alex Payne’s presentation from way back, pants, and more. The first five years of Scala at Twitter and the years ahead!’ Very positive indeed on the monorepo concept.
(tags: monorepo talks scala sfscala stu-hood twitter pants history repos build projects compilation gradle maven sbt)
demonstration of the importance of server-side request timeouts
from MongoDB, but similar issues often apply in many other TCP/HTTP-based systems
(tags: tcp http requests timeout mongodb reliability safety)
-
an open source stream processing software system developed by Mozilla. Heka is a “Swiss Army Knife” type tool for data processing, useful for a wide variety of different tasks, such as: Loading and parsing log files from a file system. Accepting statsd type metrics data for aggregation and forwarding to upstream time series data stores such as graphite or InfluxDB. Launching external processes to gather operational data from the local system. Performing real time analysis, graphing, and anomaly detection on any data flowing through the Heka pipeline. Shipping data from one location to another via the use of an external transport (such as AMQP) or directly (via TCP). Delivering processed data to one or more persistent data stores.
Via feylya on twitter. Looks potentially nifty.
(tags: heka mozilla monitoring metrics via:feylya ops statsd graphite stream-processing)
Real World Crypto 2015: Password Hashing according to Facebook
Very interesting walkthrough of how Facebook hash user passwords, including years of accreted practices
(tags: facebook passwords authentication legacy web security)
-
My account got hacked, running up over $600 in charges. Here’s the conclusion after running through the Sony support gauntlet. They can only refund up to $150. I can dispute the charges with my bank, but that will result in my account being banned. I cannot unban my account, and will thus lose my purchases (“but you only have the Last of Us and some of our free games, so it’s not a big deal”). Whoever hacked my account deactivated my PS4, and activated their own. Customer support will only permit one activation every 6 months. I’m locked out of logging into my own account on my PS4 for six months.
(tags: games sony psn playstation fail ps4 hacking security customer-support horror-stories)
Goodbye MongoDB, Hello PostgreSQL
Another core problem we’ve faced is one of the fundamental features of MongoDB (or any other schemaless storage engine): the lack of a schema. The lack of a schema may sound interesting, and in some cases it can certainly have its benefits. However, for many the usage of a schemaless storage engine leads to the problem of implicit schemas. These schemas aren’t defined by your storage engine but instead are defined based on application behaviour and expectations.
Well, don’t say we didn’t warn you ;)
(tags: mongodb mysql postgresql databases storage schemas war-stories)
Apple Appstore STATUS_CODE_ERROR causes worldwide service problems
Particularly notable for this horrific misfeature, noted by jgc:
I can’t commit code at CloudFlare because we use two-factor auth for the VPN (and everything else) and non-Apple apps on my iPhone are asking for my iTunes password. Tried airplane mode and apps simply don’t load at all!
That is a _disastrous_ policy choice by Apple. Does this mean Apple can shut down third-party app operation on iOS devices worldwide should they feel like it?
(tags: 2fa authy apps ios apple ownership itunes outages appstore fail jgc)
Correcting YCSB’s Coordinated Omission problem
excellent walkthrough of CO and how it affects the Yahoo! Cloud Serving Benchmark (YCSB)
(tags: coordinated-omission co yahoo ycsb benchmarks performance testing)
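For anyone new to coordinated omission, here is a rough sketch of the effect. It is my own toy simulation, not YCSB's code, and the function name, the 10 ms interval and the service times are all made up for illustration. A single-threaded load generator that is supposed to send one request per interval gets stuck behind a slow response; timing each request from when it was actually sent hides the queueing delay, while timing from when it was intended to be sent does not.

```python
def simulate(service_times_ms, interval_ms):
    """Toy single-threaded load generator aiming for one request per interval_ms."""
    naive, corrected = [], []
    clock = 0  # time at which the generator is free to send its next request
    for i, service in enumerate(service_times_ms):
        intended_start = i * interval_ms
        # The generator cannot send until the previous call has returned.
        actual_start = max(clock, intended_start)
        finish = actual_start + service
        naive.append(finish - actual_start)        # what a CO-affected benchmark records
        corrected.append(finish - intended_start)  # also counts time spent waiting to be sent
        clock = finish
    return naive, corrected


# The second request stalls for 100 ms; the rest are fast.
naive, corrected = simulate([1, 100, 1, 1, 1], interval_ms=10)
print(naive)      # [1, 100, 1, 1, 1]
print(corrected)  # [1, 100, 91, 82, 73]
```

The naive numbers suggest a single slow request; the corrected ones show that three more requests were effectively delayed by the same stall.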
Backblaze Vaults: Zettabyte-Scale Cloud Storage Architecture
Backblaze deliver their take on nearline storage: ‘Backblaze’s cloud storage Vaults deliver 99.99999% annual durability, horizontal scalability, and 20 Gbps of per-Vault performance, while being operationally efficient and extremely cost effective. Driven from the same mindset that we brought to the storage market with Backblaze Storage Pods, Backblaze Vaults continue our singular focus of building the most cost-efficient cloud storage around.’
(tags: architecture backup storage backblaze nearline offline reed-solomon error-correction)
Ireland accused of weakening data rules
Privacy campaign group Lobbyplag puts Ireland among the top three offenders in pushing for changes to EU privacy law
(tags: privacy data-protection lobbyplag ireland eu germany lobbying)
-
the stock-photo counterpart to “Women Eating Salad” has been found
Can Spark Streaming survive Chaos Monkey?
good empirical results on Spark’s resilience to network/host outages in EC2
(tags: ec2 aws emr spark resilience ha fault-tolerance chaos-monkey netflix)
-
Concourse is a CI system composed of simple tools and ideas. It can express entire pipelines, integrating with arbitrary resources, or it can be used to execute one-off builds, either locally or in another CI system.
(tags: ci concourse-ci build deployment continuous-integration continuous-deployment devops)
Epsilon Interactive breach the Fukushima of the Email Industry (CAUCE)
Upon gaining access to an ESP, the criminals then steal subscriber data (PII such as names, addresses, telephone numbers and email addresses, and in one case, Vehicle Identification Numbers). They then use ESPs’ mailing facility to send spam; to monetize their illicit acquisition, the criminals have spammed ads for fake Adobe Acrobat and Skype software. On March 30, the Epsilon Interactive division of Alliance Data Marketing (ADS on NASDAQ) suffered a massive breach that upped the ante, substantially. Email lists of at least eight financial institutions were stolen. Thus far, puzzlingly, Epsilon has refused to release the names of compromised clients. […] The obvious issue at hand is the ability of the thieves to now undertake targeted spear-phishing, a problem as critically serious as it could possibly be.
(tags: cauce epsilon-interactive esp email pii data-protection spear-phishing phishing identity-theft security ads)
In Ukraine, Tomorrow’s Drone War Is Alive Today
Drones, hackerspaces and crowdfunding:
The most sophisticated UAV that has come out of the Ukrainian side since the start of the conflict is called the PD-1 from developer Igor Korolenko. It has a wingspan of nearly 10 feet, a five-hour flight time, carries electro-optical and infrared sensors as well as a video camera that broadcasts on a 128-bit encrypted channel. Its most important feature is the autopilot software that allows the drone to return home in the event that the global positioning system link is jammed or lost. Drone-based intelligence gathering is often depicted as risk-free compared to manned aircraft or human intelligence gathering, but, says Korolenko, if the drone isn’t secure or the signature is too obvious, the human costs can be very, very high. “Russian military sometimes track locations of ground control stations,” he wrote Defense One in an email. “Therefore UAV squads have to follow certain security measures – to relocate frequently, to move out antennas and work from shelter, etc. As far as I know, two members of UAV squads were killed from mortar attacks after [their] positions were tracked by Russian electronic warfare equipment.”
(via bldgblog)(tags: via:bldgblog war drones uav future ukraine russia tech aircraft pd-1 crowdfunding)
-
a 303 and an 808 in your browser. this is deadly
Ubuntu To Officially Switch To systemd Next Monday – Slashdot
Jesus. This is going to be the biggest shitfest in the history of Linux…
-
A project to reduce systemd to a base initd, process supervisor and transactional dependency system, while minimizing intrusiveness and isolationism. Basically, it’s systemd with the superfluous stuff cut out, a (relatively) coherent idea of what it wants to be, support for non-glibc platforms and an approach that aims to minimize complicated design. uselessd is still in its early stages and it is not recommended for regular use or system integration.
This may be the best option to evade the horrors of systemd.
Japan’s Robot Dogs Get Funerals as Sony Looks Away
in July 2014, [Sony’s] repairs [of Aibo robot dogs] stopped and owners were left to look elsewhere for help. The Sony stiff has led not only to the formation of support groups–where Aibo enthusiasts can share tips and help each other with repairs–but has fed the bionic pet vet industry. “The people who have them feel their presence and personality,” Nobuyuki Narimatsu, director of A-Fun, a repair company for robot dogs, told AFP. “So we think that somehow, they really have souls.” While concerted repair efforts have kept many an Aibo alive, a shortage of spare parts means that some of their lives have come to an end.
(tags: sony aibo robots japan dogs pets weird future badiotday iot gadgets)
“Cuckoo Filter: Practically Better Than Bloom”
‘We propose a new data structure called the cuckoo filter that can replace Bloom filters for approximate set membership tests. Cuckoo filters support adding and removing items dynamically while achieving even higher performance than Bloom filters. For applications that store many items and target moderately low false positive rates, cuckoo filters have lower space overhead than space-optimized Bloom filters. Our experimental results also show that cuckoo filters outperform previous data structures that extend Bloom filters to support deletions substantially in both time and space.’
(tags: algorithms paper bloom-filters cuckoo-filters cuckoo-hashing data-structures false-positives big-data probabilistic hashing set-membership approximation)
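To get a feel for how a cuckoo filter supports deletion, here is a minimal sketch of the partial-key cuckoo hashing idea from the abstract above. It is not the authors' implementation: the class name, the one-byte fingerprints, the SHA-256 hashing, the 4-slot buckets and the 500-kick limit are all my own assumptions, chosen for readability rather than space efficiency.

```python
import hashlib
import random


class CuckooFilter:
    """Toy cuckoo filter storing one-byte fingerprints (illustrative parameters only)."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500):
        # A power-of-two bucket count keeps the XOR trick below reversible.
        assert num_buckets & (num_buckets - 1) == 0, "num_buckets must be a power of two"
        self.mask = num_buckets - 1
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]

    @staticmethod
    def _hash(data):
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    @staticmethod
    def _fingerprint(item):
        return hashlib.sha256(b"fp:" + item).digest()[:1]

    def _alt_index(self, index, fp):
        # The other candidate bucket depends only on the current index and the
        # fingerprint, so a stored fingerprint can be moved without the original item.
        return (index ^ self._hash(fp)) & self.mask

    def _indexes(self, item, fp):
        i1 = self._hash(item) & self.mask
        return i1, self._alt_index(i1, fp)

    def add(self, item):
        fp = self._fingerprint(item)
        i1, i2 = self._indexes(item, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a random resident and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = random.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # treated as full; the last evicted fingerprint is dropped

    def __contains__(self, item):
        fp = self._fingerprint(item)
        i1, i2 = self._indexes(item, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def remove(self, item):
        fp = self._fingerprint(item)
        for i in self._indexes(item, fp):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


f = CuckooFilter()
f.add(b"alice")
print(b"alice" in f)  # True
f.remove(b"alice")
print(b"alice" in f)  # False
```

As with a Bloom filter, lookups can return false positives; with fingerprints this short they will be fairly common, so a real deployment would want longer ones.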
Amazing cutting from Vanity Fair, 1896, for International Women’s Day
“The sisters make a pretty picture on the platform ; but it is not women of their type who need to assert themselves over Man. However, it amuses them–and others ; and I doubt if the tyrant has much to fear from their little arrows.” Constance Markievicz was one of those sisters, and the other was Eva Gore-Booth.
(tags: markievicz history ireland sligo vanity-fair 19th-century dismissal sexism iwd women)
-
Authy doesn’t come off well here: ‘Authy should have been harder to break. It’s an app, like Authenticator, and it never left Davis’ phone. But Eve simply reset the app on her phone using a mail.com address and a new confirmation code, again sent by a voice call. A few minutes after 3AM, the Authy account moved under Eve’s control.’
(tags: authy security hacking mfa authentication google apps exploits)
Ask the Decoder: Did I sign up for a global sleep study?
How meaningful is this corporate data science, anyway? Given the tech-savvy people in the Bay Area, Jawbone likely had a very dense sample of Jawbone wearers to draw from for its Napa earthquake analysis. That allowed it to look at proximity to the epicenter of the earthquake from location information. Jawbone boasts its sample population of roughly “1 million Up wearers who track their sleep using Up by Jawbone.” But when looking into patterns county by county in the U.S., Jawbone states, it takes certain statistical liberties to show granularity while accounting for places where there may not be many Jawbone users. So while Jawbone data can show us interesting things about sleep patterns across a very large population, we have to remember how selective that population is. Jawbone wearers are people who can afford a $129 wearable fitness gadget and the smartphone or computer to interact with the output from the device. Jawbone is sharing what it learns with the public, but think of all the public health interests or other third parties that might be interested in other research questions from a large scale data set. Yet this data is not collected with scientific processes and controls and is not treated with the rigor and scrutiny that a scientific study requires. Jawbone and other fitness trackers don’t give us the option to use their devices while opting out of contributing to the anonymous data sets they publish. Maybe that ought to change.
(tags: jawbone privacy data-protection anonymization aggregation data medicine health earthquakes statistics iot wearables)
Pinterest’s highly-available configuration service
Stored on S3, update notifications pushed to clients via Zookeeper
A Journey into Microservices | Hailo Tech Blog
Excellent three-parter from Hailo, describing their RabbitMQ+Go-based microservices architecture. Very impressive!
(tags: hailo go microservices rabbitmq amqp architecture blogs)
-
The Large Hadron Migrator is a tool to perform live database migrations in a Rails app without locking.
The basic idea is to perform the migration online while the system is live, without locking the table. In contrast to OAK and the facebook tool, we only use a copy table and triggers. The Large Hadron is a test driven Ruby solution which can easily be dropped into an ActiveRecord or DataMapper migration. It presumes a single auto incremented numerical primary key called id as per the Rails convention. Unlike the twitter solution, it does not require the presence of an indexed updated_at column.
(tags: migrations database sql ops mysql rails ruby lhm soundcloud activerecord)
Biased Locking in HotSpot (David Dice’s Weblog)
This is pretty nuts. If biased locking in the HotSpot JVM is causing performance issues, it can be turned off:
You can avoid biased locking on a per-object basis by calling System.identityHashCode(o). If the object is already biased, assigning an identity hashCode will result in revocation, otherwise, the assignment of a hashCode() will make the object ineligible for subsequent biased locking.
(tags: hashcode jvm java biased-locking locking mutex synchronization locks performance)
A Zero-Administration Amazon Redshift Database Loader – AWS Big Data Blog
nifty!
Archie Markup Language (ArchieML)
ArchieML (or “AML”) was created at The New York Times to make it easier to write and edit structured text on deadline that could be rendered in web pages, or more specifically, rendered in interactive graphics. One of the main goals was to make it easy to tag text as data, without having to type a lot of special characters. Another goal was to allow the document to contain lots of notes and draft text that would not be read into the data. And finally, because we make extensive use of Google Documents’s concurrent-editing features — while working on a graphic, we can have several reporters, editors and developers all pouring information into a single document — we wanted to have a format that could survive being edited by users who may never have seen ArchieML or any other markup language at all before.
California Says Motorcycle Lane-Splitting Is Hella Safe
A recent yearlong study by the California Office of Traffic Safety has found motorcycle lane-splitting to be a safe practice on public roads. The study looked at collisions involving 7836 motorcyclists reported by 80 police departments between August 2012 and August 2013. “What we learned is, if you lane-split in a safe or prudent manner, it is no more dangerous than motorcycling in any other circumstance,” state spokesman Chris Cochran told the Sacramento Bee. “If you are speeding or have a wide speed differential (with other traffic), that is where the fatalities came about.”
(tags: lane-splitting cycling motorcycling bikes road-safety driving safety california)
-
Good terminology for this concept:
The try server runs a similar configuration to the continuous integration server, except that it is triggered not on commits but on “try job request”, in order to test code pre-commit.
See also https://wiki.mozilla.org/ReleaseEngineering/TryServer for the Moz take on it.
(tags: build ci integration try-server jenkins buildbot chromium development)
-
A Dropwizard Metrics extension to instrument JDBC resources and measure SQL execution times.
(tags: metrics sql jdbc instrumentation dropwizard)
HP is trying to patent Continuous Delivery
This is appalling bollocks from HP:
On 1st March 2015 I discovered that in 2012 HP had filed a patent (WO2014027990) with the USPTO for ‘Performance tests in a continuous deployment pipeline‘ (the patent was granted in 2014). […] HP has filed several patents covering standard Continuous Delivery (CD) practices. You can help to have these patents revoked by providing ‘prior art’ examples on Stack Exchange.
In fairness, though, this kind of shit happens in most big tech companies. This is what happens when you have a broken software patenting system, with big rewards for companies who obtain shitty troll patents like these, and in turn have companies who reward the engineers who sell themselves out to write up concepts which they know have prior art. Software patents are broken by design!
(tags: cd devops hp continuous-deployment testing deployment performance patents swpats prior-art)
Exponential Backoff And Jitter
Great go-to explainer blog post for this key distributed-systems reliability concept, from the always-solid Marc Brooker
(tags: marc-brooker distsys networking backoff exponential jitter retrying retries reliability occ)
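For reference, here is a minimal sketch of the "full jitter" variant: sleep for a random time between zero and the capped exponential delay. The function names, the defaults (0.1 s base, 10 s cap, 5 attempts) and the blanket `except Exception` are my own assumptions for illustration; a real client would only retry errors it knows to be transient.

```python
import random
import time


def full_jitter_delay(attempt, base=0.1, cap=10.0):
    """Random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retries(operation, attempts=5, base=0.1, cap=10.0, sleep=time.sleep):
    """Call operation(), retrying on failure with a jittered, capped backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(full_jitter_delay(attempt, base, cap))
```

The randomness is the important bit: without it, clients that failed together retry together, and the thundering herd just arrives again a little later.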
17 Things Everyone Must Eat In Dublin
actually a fairly sane list of lunchy options — the SMS fish finger butty is a lunch staple for us Swrvers
VividCortex uses K-Means Clustering to discover related metrics
After selecting an interesting spike in a metric, the algorithm can automate picking out a selection of other metrics which spiked at the same time. I can see that being pretty damn useful
(tags: metrics k-means-clustering clustering algorithms discovery similarity vividcortex analysis data)
Alibaba’s cloud service launches in US, wants to rain all over Amazon
server-hosting only for now. Interesting!
Alibaba’s cloud platform already competes with the likes of AWS in China. Aliyun’s Chinese data centers are in Beijing, Hangzhou, Qingdao, Hong Kong, and Shenzhen. “For the time being, we are just testing the water,” Yu said today. That means Aliyun will focus first on Chinese companies doing business in the US. “We know well what Chinese clients need, and now it’s time for us to learn what US clients need,” he added.
-
the following guidelines maximize bandwidth usage: Optimizing the sizes of the file parts, whether they are part of a large file or an entire small file; Optimizing the number of parts transferred concurrently. Tuning these two parameters achieves the best possible transfer speeds to [S3].
-
Excellent web-based ASCII-art editor (via Craig)
(tags: via:craig design ascii diagrams editor ascii-art art asciiflow drawing)
Services Engineering Reading List
good list of papers/articles for fans of scalability etc.
(tags: architecture papers reading reliability scalability articles to-read)
-
nice, free-during-beta Mac app to draw ASCII-art diagrams
-
“Open source APM for Java” — profiling in production, with a demo benchmark showing about a 2% performance impact. Wonder about effects on memory/GC, though
(tags: apm java metrics measurement new-relic profiling glowroot)
“Everything you’ve ever said to Siri/Cortana has been recorded…and I get to listen to it”
This should be a reminder.
At first, I thought these sound bites were completely random. Then I began to notice a pattern. Soon, I realized that I was hearing people’s commands given to their mobile devices. Guys, I’m telling you, if you’ve said it to your phone, it’s been recorded…and there’s a damn good chance a 3rd party is going to hear it.
(tags: privacy google siri cortana android voice-recognition outsourcing mobile)
-
Fantastic 1997-era book of interviews with the programmers behind some of the greatest games in retrogaming history:
Halcyon Days: Interviews with Classic Computer and Video Game Programmers was released as a commercial product in March 1997. At the time it was one of the first retrogaming projects to focus on lost history rather than game collecting, and certainly the first entirely devoted to the game authors themselves. Now a good number of the interviewees have their own web sites, but none of them did when I started contacting them in 1995. […] If you have any of the giddy anticipation that I did whenever I picked up a magazine containing an interview with Mark Turmell or Dan [M.U.L.E.] Bunten, then you want to start reading.
(tags: book games history coding interviews via:walter)
Pub Table Quiz – In Aid of Digital Rights Ireland
Jason Roe is organising a Table Quiz in Dublin on March 26th to support fundraising efforts by Digital Rights Ireland. We will supply tables, questions and a ready supply of beer and maybe finger food.
Why are transhumanists such dicks?
Good discussion from a transhumanist forum (via Boing Boing):
“I’ve been around and interviewed quite a lot of self-identified transhumanists in the last couple of years, and I’ve noticed many of them express a fairly stark ideology that is at best libertarian, and at worst Randian. Very much “I want super bionic limbs and screw the rest of the world”. They tend to brush aside the ethical, environmental, social and political ramifications of human augmentation so long as they get to have their toys. There’s also a common expression that if sections of society are harmed by transhumanist progress, then it is unfortunate but necessary for the greater good (the greater good often being bestowed primarily upon those endorsing the transhumanism). That attitude isn’t prevalent on this forum at all – I think the site tends to attract more practical body-modders than theoretical transhumanists – but I wondered if anyone else here had experienced the same attitudes in their own circles? What do you make of it?”
(tags: transhumanism evolution body-modding surgery philosophy via:boingboing libertarianism society politics)
Release Protocol Buffers v3.0.0-alpha-2 · google/protobuf
New major-version track for protobuf, with some interesting new features: Removal of field presence logic for primitive value fields, removal of required fields, and removal of default values. This makes proto3 significantly easier to implement with open struct representations, as in languages like Android Java, Objective C, or Go. Removal of unknown fields. Removal of extensions, which are instead replaced by a new standard type called Any. Fix semantics for unknown enum values. Addition of maps. Addition of a small set of standard types for representation of time, dynamic data, etc. A well-defined encoding in JSON as an alternative to binary proto encoding.
(tags: protobuf binary marshalling serialization google grpc proto3 coding open-source)
RIPQ: Advanced photo caching on flash for Facebook
Interesting priority-queue algorithm optimised for caching data on SSD
(tags: priority-queue algorithms facebook ssd flash caching ripq papers)
-
Performance-diagnosis-as-a-service. Cool.
Users download and install an Illuminate Daemon using a simple installer which starts up a small stand alone Java process. The Daemon sits quietly unless it is asked to start gathering SLA data and/or to trigger a diagnosis. Users can set SLA’s via the dashboard and can opt to collect latency measurements of their transactions manually (using our library) or by asking Illuminate to automatically instrument their code (Servlet and JDBC based transactions are currently supported). SLA latency data for transactions is collected on a short cycle. When the moving average of latency measurements goes above the SLA value (e.g. 150ms), a diagnosis is triggered. The diagnosis is very quick, gathering key data from O/S, JVM(s), virtualisation and other areas of the system. The data is then run through the machine learned algorithm which will quickly narrow down the possible causes and gather a little extra data if needed. Once Illuminate has determined the root cause of the performance problem, the diagnosis report is sent back to the dashboard and an alert is sent to the user. That alert contains a link to the result of the diagnosis which the user can share with colleagues. Illuminate has all sorts of backoff strategies to ensure that users don’t get too many alerts of the same type in rapid succession!
(tags: illuminate jclarity java jvm scala latency gc tuning performance)
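The trigger logic described there (a moving average of latency crossing the SLA, plus some backoff so alerts don't repeat in quick succession) could look something like the sketch below. This is purely my guess at the shape of it, not Illuminate's code: the class name, the window size and the sample-count cooldown are all assumptions.

```python
from collections import deque


class SlaMonitor:
    """Toy SLA watcher: fires when the moving average of recent latencies exceeds the SLA."""

    def __init__(self, sla_ms=150.0, window=5, cooldown=10):
        self.sla_ms = sla_ms
        self.samples = deque(maxlen=window)  # only the most recent `window` samples count
        self.cooldown = cooldown             # samples to stay quiet for after firing
        self.quiet_for = 0

    def record(self, latency_ms):
        """Add a latency sample; return True if a diagnosis should be triggered."""
        self.samples.append(latency_ms)
        if self.quiet_for > 0:
            self.quiet_for -= 1
            return False
        average = sum(self.samples) / len(self.samples)
        if average > self.sla_ms:
            self.quiet_for = self.cooldown
            return True
        return False


monitor = SlaMonitor(sla_ms=150, window=3, cooldown=2)
print([monitor.record(ms) for ms in [100, 120, 400, 400, 400, 400, 100]])
# [False, False, True, False, False, True, False]
```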
-
Binary message marshalling, client/server stubs generated by an IDL compiler, bidirectional binary protocol. CORBA is back from the dead! Intro blog post: http://googledevelopers.blogspot.ie/2015/02/introducing-grpc-new-open-source-http2.html Relevant: Steve Vinoski’s commentary on protobuf-rpc back in 2008: http://steve.vinoski.net/blog/2008/07/13/protocol-buffers-leaky-rpc/
(tags: http rpc http2 netty grpc google corba idl messaging)
Bloom Cookies: web search personalization without user tracking
Interesting paper
(tags: bloom-cookies bloom-filters data-structures cookies privacy personalization user-tracking http)
Why we run an open source program – Walmart Labs
This is a great exposition of why it’s in a company’s interest to engage with open source. Not sure I agree with ‘engineers are the artists of our generation’ but the rest are spot on
(tags: development open-source walmart node coding via:hn hiring)
-
MQTT definitely has a smaller size on the wire. It’s also simpler to parse (let’s face it, Huffman isn’t that easy to implement) and provides guaranteed delivery to cater to shaky wireless networks. On the other hand, it’s also not terribly extensible. There aren’t a whole lot of headers and options available, and there’s no way to make custom ones without touching the payload of the message. It seems that HTTP/2 could definitely serve as a reasonable replacement for MQTT. It’s reasonably small, supports multiple paradigms (pub/sub & request/response) and is extensible. It’s also supported by the IETF (whereas MQTT is hosted by OASIS). From conversations I’ve had with industry leaders in the embedded software and chip manufacturing, they only want to support standards from the IETF. Many of them are still planning to support MQTT, but they’re not happy about it. I think MQTT is better at many of the things it was designed for, but I’m interested to see over time if those advantages are enough to outweigh the benefits of HTTP. Regardless, MQTT has been gaining a lot of traction in the past year or two, so you may be forced into using it while HTTP/2 catches up.
(tags: http2 mqtt iot pub-sub protocols ietf embedded push http)
Automatically Deploy from GitHub Using AWS CodeDeploy – Application Management Blog
I like this
(tags: github aws ec2 codedeploy deployment ops)
Programmer IS A Career Path, Thank You
Well said — Amazon had a good story around this btw
(tags: programming coding career work life)
how Curator fixed issues with the Hive ZooKeeper Lock Manager Implementation
Ugh, ZK is a bear to work with.
Apache Curator is open source software which is able to handle all of the above scenarios transparently. Curator is a Netflix ZooKeeper Library and it provides a high-level API, CuratorFramework, that simplifies using ZooKeeper. By using a singleton CuratorFramework instance in the new ZooKeeperHiveLockManager implementation, we not only fixed the ZooKeeper connection issues, but also made the code easy to understand and maintain.
(tags: zookeeper apis curator netflix distributed-locks coding hive)
Advanced cryptographic ratcheting
Forward secrecy and in-session key “ratcheting”
(tags: crypto privacy key-management forward-secrecy pfs key-ratcheting key-rotation)
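As a rough illustration of what "ratcheting" means, here is a sketch of the simplest kind, a one-way hash ratchet: each step derives a message key and the next chain key from the current chain key, and the old chain key is then thrown away. The use of HMAC-SHA256 and the two label bytes are my own assumptions, and this covers only the hash-chain idea; it is not the full protocol the post describes, and certainly not vetted crypto.

```python
import hashlib
import hmac


def ratchet(chain_key):
    """Derive (message_key, next_chain_key) from the current chain key."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key


chain_key = bytes(32)  # placeholder; a real session would start from a negotiated secret
for _ in range(3):
    message_key, chain_key = ratchet(chain_key)  # the old chain key is discarded here
    print(message_key.hex()[:16])
```

Because HMAC is one-way, someone who steals the current chain key cannot work backwards to earlier message keys, which is the forward-secrecy property. They can still derive every later key, though, so a hash ratchet on its own does not recover from a compromise.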
-
What a mess.
What’s faster: PV, HVM, HVM with PV drivers, PVHVM, or PVH? Cloud computing providers using Xen can offer different virtualization “modes”, based on paravirtualization (PV), hardware virtual machine (HVM), or a hybrid of them. As a customer, you may be required to choose one of these. So, which one?
(tags: ec2 linux performance aws ops pv hvm xen virtualization)
Proving that Android’s, Java’s and Python’s sorting algorithm is broken (and showing how to fix it)
Wow, this is excellent work. A formal verification of Tim Peters’ TimSort failed, resulting in a bugfix:
While attempting to verify TimSort, we failed to establish its instance invariant. Analysing the reason, we discovered a bug in TimSort’s implementation leading to an ArrayIndexOutOfBoundsException for certain inputs. We suggested a proper fix for the culprit method (without losing measurable performance) and we have formally proven that the fix actually is correct and that this bug no longer persists.
(tags: timsort algorithms android java python sorting formal-methods proofs openjdk)
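A rough sketch of the invariant problem, modelling only TimSort's stack of pending run lengths rather than any actual sorting. The helper names and the `check_deeper` flag are hypothetical, and the run lengths are an illustrative input. The assumption here is that the original merge step inspected only the top three runs, while the fix also inspects the triple one level down.

```python
def merge_at(runs, i):
    """Merge the adjacent runs at positions i and i+1 (lengths only)."""
    runs[i:i + 2] = [runs[i] + runs[i + 1]]


def merge_collapse(runs, check_deeper):
    """Merge runs at the top of the stack until the checked conditions hold.

    check_deeper=False models the pre-fix behaviour (top three runs only);
    check_deeper=True also tests the next triple down, as in the fix.
    """
    while len(runs) > 1:
        n = len(runs) - 2
        top3 = n > 0 and runs[n - 1] <= runs[n] + runs[n + 1]
        next3 = check_deeper and n > 1 and runs[n - 2] <= runs[n - 1] + runs[n]
        if top3 or next3:
            if runs[n - 1] < runs[n + 1]:
                n -= 1
            merge_at(runs, n)
        elif runs[n] <= runs[n + 1]:
            merge_at(runs, n)
        else:
            break


def invariant_holds(runs):
    """Each run must exceed its successor and the sum of its next two successors."""
    pairs_ok = all(runs[i - 1] > runs[i] for i in range(1, len(runs)))
    triples_ok = all(runs[i - 2] > runs[i - 1] + runs[i] for i in range(2, len(runs)))
    return pairs_ok and triples_ok


def push_runs(lengths, check_deeper):
    """Push run lengths one at a time, collapsing after each push."""
    runs = []
    for length in lengths:
        runs.append(length)
        merge_collapse(runs, check_deeper)
    return runs


lengths = [120, 80, 25, 20, 30]
before_fix = push_runs(lengths, check_deeper=False)
after_fix = push_runs(lengths, check_deeper=True)
```

With these lengths the pre-fix variant leaves a stack whose bottom triple breaks the invariant, while the variant with the extra check keeps merging until it holds. In the real implementation the stack is a fixed-size array sized on the assumption that the invariant always holds, which is how a broken invariant can turn into an out-of-bounds write.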
-
“Cheap SSL certs from $4.99/yr” — apparently recommended for cheap, low-end SSL certs
-
Erasure codes, such as Reed-Solomon (RS) codes, are increasingly being deployed as an alternative to data-replication for fault tolerance in distributed storage systems. While RS codes provide significant savings in storage space, they can impose a huge burden on the I/O and network resources when reconstructing failed or otherwise unavailable data. A recent class of erasure codes, called minimum-storage-regeneration (MSR) codes, has emerged as a superior alternative to the popular RS codes, in that it minimizes network transfers during reconstruction while also being optimal with respect to storage and reliability. However, existing practical MSR codes do not address the increasingly important problem of I/O overhead incurred during reconstructions, and are, in general, inferior to RS codes in this regard. In this paper, we design erasure codes that are simultaneously optimal in terms of I/O, storage, and network bandwidth. Our design builds on top of a class of powerful practical codes, called the product-matrix-MSR codes. Evaluations show that our proposed design results in a significant reduction in the number of I/Os consumed during reconstructions (a 5× reduction for typical parameters), while retaining optimality with respect to storage, reliability, and network bandwidth.
(tags: erasure-coding reed-solomon compression reliability reconstruction replication fault-tolerance storage bandwidth usenix papers)
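To make the reconstruction cost concrete, here is a toy sketch using a single XOR parity block. This is deliberately not Reed-Solomon or an MSR code, just the simplest erasure code available, and every name in it is made up for illustration. The point is that rebuilding one lost block means reading every surviving block, which is the kind of I/O and network burden the abstract describes.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Three hypothetical data blocks plus one parity block.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data)

# Simulate losing block 1 and rebuilding it from everything that survived.
lost = 1
survivors = [block for i, block in enumerate(data) if i != lost] + [parity]
rebuilt = xor_blocks(survivors)
blocks_read = len(survivors)  # every surviving block has to be read
```

Here `blocks_read` equals the number of data blocks, so repairing one block costs as many reads as there are data blocks in the stripe.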
Everyday I’m Shuffling – Tips for Writing Better Spark Programs [slides]
Two Spark experts from Databricks provide some good tips
Cowen went golfing and officials dithered as country burned in 2008 – Independent.ie
Lest we forget, the sheer bullshitting ineptitude of Fianna Fail as they managed to shamble into destroying Ireland’s economy in 2008:
Once that nasty bit of business was done, the Cabinet departed en masse for six weeks on their summer holidays, despite the emerging economic and financial tsunami. Cowen and family famously took up residence in a caravan park in Connemara as opposed to his ‘official’ residence at the Mannin Bay Hotel nearby. When pressed by our reporter Niamh Horan as to why he was not at his station, he defensively replied: “I don’t understand it. First the media have a go at me because I’m taking a holiday with my family and then they come down to see if I’m having a good time!” he exclaimed.
(tags: 2008 meltdown ireland brian-cowen connemara politics history fianna-fail)
How I Became A Minor Celebrity In China (After My Stolen Phone Ended Up There)
Phone is stolen, shipped to China, and winds up being bought by “Brother Orange” — then the story becomes China’s biggest viral hit
-
40 minutes of multi-zone network outage for majority of instances. ‘The internal software system which programs GCE’s virtual network for VM egress traffic stopped issuing updated routing information. The cause of this interruption is still under active investigation. Cached route information provided a defense in depth against missing updates, but GCE VM egress traffic started to be dropped as the cached routes expired.’ I wonder if Google Pimms fired the alarms for this ;)
(tags: google outages gce networking routing pimms multi-az cloud)
Listen to a song made from data lost during MP3 conversion
Ryan McGuire, a PhD student in Composition and Computer Technologies at the University of Virginia Center for Computer Music, has created the project The Ghost In The MP3 […] For his first trick, McGuire took Suzanne Vega’s ‘Tom’s Diner’ and drained it into a vaporous piece titled ‘moDernisT.’ McGuire chose the track, he explains on his site, because it was famously used as one of the main controls in the listening tests used to develop the MP3 algorithm.
(tags: mp3 music suzanne-vega compression)