‘Addressing the rebalancing problem in bike-sharing systems’ [paper]
Many of the bike-sharing systems introduced around the world in the past 15 years have the same problem: Riders tend to take some routes and not others. As a result, the bikes tend to collect in a few places, which is a drag for users and a costly problem for the operators, who “rebalance” the system using trucks that take bikes from full stations to empty ones. Now, scientists are coming up with special algorithms to improve this process. One of them, developed by scientists at the Vienna University of Technology and the Austrian Institute of Technology, is now being tested in Vienna’s bike-sharing system; another, developed at Cornell University, is already in use in New York City.
Timely: here’s what Dublin Bikes looked like this morning: https://twitter.com/jmason/status/503828246086295552 (via Andrew Caines)
(tags: cycling bike-sharing borisbikes dublinbikes rebalancing fleet availability optimization maths papers toread algorithms)
‘Join-Idle-Queue: A Novel Load Balancing Algorithm for Dynamically Scalable Web Services’ [paper]
We proposed the JIQ algorithms for web server farms that are dynamically scalable. The JIQ algorithms significantly outperform the state-of-the-art SQ(d) algorithm in terms of response time at the servers, while incurring no communication overhead on the critical path. The overall complexity of JIQ is no greater than that of SQ(d). The extension of the JIQ algorithms proves to be useful at very high load. It will be interesting to acquire a better understanding of the algorithm with a varying reporting threshold. We would also like to understand better the relationship of the reporting frequency to response times, as well as an algorithm to further reduce the complexity of the JIQ-SQ(2) algorithm while maintaining its superior performance.
(tags: join-idle-queue algorithms scheduling load-balancing via:norman-maurer jiq microsoft load-balancers performance)
3 Rules of thumb for Bloom Filters
I often need to do rough back-of-the-envelope reasoning about things, and I find that doing a bit of work to develop an intuition for how a new technique performs is usually worthwhile. So, here are three broad rules of thumb to remember when discussing Bloom filters down the pub:
1 – One byte per item in the input set gives about a 2% false positive rate.
2 – The optimal number of hash functions is about 0.7 times the number of bits per item.
3 – The number of hashes dominates performance.
(tags: bloom-filters algorithm probabilistic rules reasoning via:norman-maurer false-positives hashing coding)
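The first two rules of thumb above are easy to sanity-check with the standard Bloom filter approximation p = (1 - e^(-k/b))^k, where b is bits per item and k is the number of hash functions. This is a minimal sketch using that textbook formula, not code from the linked post; the function names are made up for illustration.

```python
import math


def bloom_false_positive_rate(bits_per_item: float, num_hashes: int) -> float:
    """Textbook approximation: p = (1 - e^(-k/b))^k, with b = bits per item."""
    return (1.0 - math.exp(-num_hashes / bits_per_item)) ** num_hashes


def optimal_num_hashes(bits_per_item: float) -> float:
    """Optimal k = ln(2) * bits per item; ln(2) is roughly 0.693, i.e. 'about 0.7'."""
    return math.log(2) * bits_per_item


bits = 8  # one byte per item
k = optimal_num_hashes(bits)
print(f"optimal k for {bits} bits/item: {k:.2f}")

# k must be an integer in practice, so try the two nearest values.
for hashes in (math.floor(k), math.ceil(k)):
    rate = bloom_false_positive_rate(bits, hashes)
    print(f"k={hashes}: false positive rate ~ {rate:.2%}")
```

With one byte per item, both of the nearest integer values of k land close to the 2% figure quoted in the first rule.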
Logentries Announces Machine Learning Analytics for IT Ops Monitoring and Real-time Alerting
This sounds pretty neat:
With Logentries Anomaly Detection, users can: Set-up real-time alerting based on deviations from important patterns and log events. Easily customize Anomaly thresholds and compare different time periods. With Logentries Inactivity Alerting, users can: Monitor standard, incoming events such as an application heart beat. Receive real-time alerts based on log inactivity (i.e. receive alerts when something does not occur).
(tags: logging syslog logentries anomaly-detection ops machine-learning inactivity alarms alerting heartbeats)
A beginner’s guide to drills and bits – Boing Boing
This is actually quite educational
(tags: diy boing-boing drills bits tools construction)
Justin's Linklog Posts
-
Some vague details of the antispam system in use at Twitter.
The main challenges in supporting this type of system are evaluating rules with low enough latency that they can run on the write path for Twitter’s main features (i.e., Tweets, Retweets, favorites, follows and messages), supporting computationally intense machine learning based rules, and providing Twitter engineers with the ability to modify and create new rules instantaneously.
(tags: spam realtime scaling twitter anti-spam botmaker rules)
EcoJel jellyfish identification card
To identify the jellyfish found in Irish waters — good, recognisable photos
(tags: jellyfish identification ecojel ireland sea swimming safety id-cards)
DealExtreme are now selling a Google Cardboard kit
$10 with free shipping. You can’t go wrong!
The Double Identity of an “Anti-Semitic” Commenter
Hasbara out of control. This is utterly nuts.
His intricate campaign, which he has admitted to Common Dreams, included posting comments by a screen name, “JewishProgressive,” whose purpose was to draw attention to and denounce the anti-Semitic comments that he had written under many other screen names. The deception was many-layered. At one point he had one of his characters charge that the anti-Semitic comments and the criticism of the anti-Semitic comments must be written by “internet trolls who have been known to impersonate anti-Semites in order to then double-back and accuse others of supporting anti-Semitism”–exactly what he was doing.
(tags: hasbara israel trolls propaganda web racism comments anonymity commondreams)
WWN’S Guide To Abortion In Ireland
“Why are you still reading this? Go to England!” funny because it’s (horribly) true.
(tags: abortion ireland politics women rights wwn england ovaries rosaries religion)
Java tip: optimizing memory consumption
Good tips on how to tell if object allocation rate is a bottleneck in your JVM-based code
(tags: yourkit memory java jvm allocation gc bottlenecks performance)
-
The way that [problems with the PGP bootstrapping] are supposed to be resolved is with an authentication model called the Web of Trust where users sign keys of other users after verifying that they are who they say they are. In theory, if some due diligence is applied in signing other people’s keys and a sufficient number of people participate you’ll be able to follow a short chain of signatures from people you already know and trust to new untrusted keys you download from a key server. In practice this has never worked out very well as it burdens users with the task of manually finding people to sign their keys and even experts find the Web of Trust model difficult to reason about. This also reveals the social graph of certain communities which may place users at risk for their associations. Such signatures also reveal metadata about times and thus places for meetings for key signings. The Nyms Identity Directory is a replacement for all of this. Keyservers are replaced with an identity directory that gives users full control over publication of their key information and web of trust is replaced with a distributed network of trusted notaries which validate user keys with an email verification protocol.
(tags: web-of-trust directories nyms privacy crypto identity trust pgp gpg security via:ioerror keyservers notaries)
-
Frogsort as an exam question (via qwghlm)
(tags: via:qwghlm frogsort sorting big-o algorithms funny comics smbc)
Punished for Being Poor: Big Data in the Justice System
This is awful. Totally the wrong tool for the job: a false positive rate which is minuscule for something like spam filtering could translate to a really horrible outcome for a human life.
Currently, over 20 states use data-crunching risk-assessment programs for sentencing decisions, usually consisting of proprietary software whose exact methods are unknown, to determine which individuals are most likely to re-offend. The Senate and House are also considering similar tools for federal sentencing. These data programs look at a variety of factors, many of them relatively static, like criminal and employment history, age, gender, education, finances, family background, and residence. Indiana, for example, uses the LSI-R, the legality of which was upheld by the state’s supreme court in 2010. Other states use a model called COMPAS, which uses many of the same variables as LSI-R and even includes high school grades. Others are currently considering the practice as a way to reduce the number of inmates and ensure public safety. (Many more states use or endorse similar assessments when sentencing sex offenders, and the programs have been used in parole hearings for years.) Even the American Law Institute has embraced the practice, adding it to the Model Penal Code, attesting to the tool’s legitimacy.
(via stroan)
(tags: via:stroan statistics false-positives big-data law law-enforcement penal-code risk sentencing)
Microservices – Not a free lunch! – High Scalability
Some good reasons not to adopt microservices blindly. Testability and distributed-systems complexity are my biggest fears
(tags: microservices soa devops architecture testing distcomp)
Richard Clayton – Failing at Microservices
Solid warts-and-all confessional blogpost about a team failing to implement a microservices architecture. I’d put most of the blame on insufficient infrastructure to support them (at a code level), inter-personal team problems, and inexperience with large-scale complex multi-service production deployment and the work it was going to require
(tags: microservices devops collaboration architecture fail team deployment soa)
Box Tech Blog » A Tale of Postmortems
How Box introduced COE-style dev/ops outage postmortems, and got them working. This PIE metric sounds really useful to head off the dreaded “it’ll all have to come out missus” action item:
The picture was getting clearer, and we decided to look into individual postmortems and action items and see what was missing. As it was, action items were wasting away with no owners. Digging deeper, we noticed that many action items entailed massive refactorings or vague requirements like “make system X better” (i.e. tasks that realistically were unlikely to be addressed). At a higher level, postmortem discussions often devolved into theoretical debates without a clear outcome. We needed a way to lower and focus the postmortem bar and a better way to categorize our action items and our technical debt. Out of this need, PIE (“Probability of recurrence * Impact of recurrence * Ease of addressing”) was born. By ranking each factor from 1 (“low”) to 5 (“high”), PIE provided us with two critical improvements: 1. A way to police our postmortems discussions. I.e. a low probability, low impact, hard to implement solution was unlikely to get prioritized and was better suited to a discussion outside the context of the postmortem. Using this ranking helped deflect almost all theoretical discussions. 2. A straightforward way to prioritize our action items. What’s better is that once we embraced PIE, we also applied it to existing tech debt work. This was critical because we could now prioritize postmortem action items alongside existing work. Postmortem action items became part of normal operations just like any other high-priority work.
(tags: postmortems action-items outages ops devops pie metrics ranking refactoring prioritisation tech-debt)
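The PIE ranking from the Box post above (Probability of recurrence * Impact of recurrence * Ease of addressing, each ranked from 1 to 5) is simple enough to sketch. The `ActionItem` class, the `prioritise` helper and the sample items below are hypothetical illustrations, not anything Box published.

```python
from dataclasses import dataclass


@dataclass
class ActionItem:
    title: str
    probability: int  # probability of recurrence, 1 ("low") to 5 ("high")
    impact: int       # impact of recurrence, 1 to 5
    ease: int         # ease of addressing, 1 to 5

    def pie(self) -> int:
        """PIE score: the product of the three 1-5 rankings."""
        for value in (self.probability, self.impact, self.ease):
            if not 1 <= value <= 5:
                raise ValueError(f"ranking out of range 1-5: {value}")
        return self.probability * self.impact * self.ease


def prioritise(items):
    """Highest PIE score first."""
    return sorted(items, key=lambda item: item.pie(), reverse=True)


# Hypothetical action items from an imaginary postmortem.
items = [
    ActionItem("Refactor system X", probability=2, impact=3, ease=1),
    ActionItem("Add disk-space alert", probability=4, impact=4, ease=5),
    ActionItem("Document failover steps", probability=3, impact=2, ease=4),
]

for item in prioritise(items):
    print(item.pie(), item.title)
```

A vague, hard-to-do refactoring ends up at the bottom of the list, which is the effect the post describes.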
NTP’s days are numbered for consumer devices
An accurate clock is required to negotiate SSL/TLS, so clock sync is important for internet-of-things usage. But:
Unfortunately for us, the traditional and most widespread method for clock synchronisation (NTP) has been caught up in a DDoS issue which has recently caused some ISPs to start blocking all NTP communication. [….] Because the DDoS attacks are so widespread, and the lack of obvious commercial pressure to fix the issue, it’s possible that the days of using NTP as a mechanism for setting clocks may well be numbered. Luckily for us there is a small but growing project that replaces it. tlsdate was started by Jacob Appelbaum of the Tor project in 2012, making use of the SSL handshake in order to extract time from a remote server, and its usage is on the rise. [….] Since we started encountering these problems, we’ve incorporated tlsdate into an over-the-air update, and have successfully started using this in situations where NTP is blocked.
(tags: tlsdate ntp clocks time sync iot via:gwire ddos isps internet protocols security)
Cloudwash – Creating the Technical Prototype
This is a lovely demo of integrating modern IoT connectivity functionality (remote app control, etc.) with a washing machine using Bergcloud’s hardware and backend, and a little logic-analyzer reverse engineering.
(tags: arduino diy washing-machines iot bergcloud hacking reversing logic-analyzers hardware)
Systemd: Harbinger of the Linux apocalypse
While there are many defensible aspects of Systemd, other aspects boggle the mind. Not the least of these was that, as of a few months ago, trying to debug the kernel from the boot line would cause the system to crash. This was because of Systemd’s voracious logging and the fact that Systemd responds to the “debug” flag on the kernel boot line — a flag meant for the kernel, not anything else. That, straight up, is a bug. However, the Systemd developers didn’t see it that way and actively fought with those experiencing the problem. Add the fact that one of the Systemd developers was banned by Linus Torvalds for poor attitude and bad design and another was responsible for causing significant issues with Linux audio support, but blamed the problem on everything else but his software, and you have a bad situation on your hands. There’s no shortage of egos in the open source development world. There’s no shortage of new ideas and veteran developers and administrators pooh-poohing something new simply because it’s new. But there are also 45 years of history behind Unix and extremely good reasons it’s still flourishing. Tools designed like Systemd do not fit the Linux mold, to their own detriment. Systemd’s design has more in common with Windows than with Unix — down to the binary logging.
The link re systemd consuming the “debug” kernel boot arg is a canonical example of inflexible coders refusing to fix their own bugs. (via Jason Dixon)
(tags: systemd linux red-hat egos linus-torvalds unix init booting debugging logging design software via:obfuscurity)
-
The mining operation resides on an old, repurposed factory floor, and contains 2500 machines hashing away at 230 Gh/s, each. (That’s 230 billion calculations per second, per unit). […] The operators told me that the power bill of this specific operation is in excess of ¥400,000 per month [..] about $60,000 USD.
(tags: currency china economics bitcoin power environment green mining datacenters)
Moving Big Data into the Cloud with Tsunami UDP – AWS Big Data Blog
Pretty serious speedup. 81 MB/sec with Tsunami UDP, compared to 9 MB/sec with plain old scp. Probably kills internet performance for everyone else though!
(tags: tsunami-udp udp scp copying transfers internet long-distance performance speed)
-
Ha, great name. We use this (in the form of Smartstack).
For what it is worth, we faced a similar challenge in earlier services (mostly due to existing C/C++ applications) and we created what was called a “sidecar”. By sidecar, what I mean is a second process on each node/instance that did Cloud Service Fabric operations on behalf of the main process (the side-managed process). Unfortunately those sidecars all went off and created one-offs for their particular service. In this post, I’ll describe a more general sidecar that doesn’t force users to have these one-offs. Sidenote: For those not familiar with sidecars, think of the motorcycle sidecar below. Snoopy would be the main process with Woodstock being the sidecar process. The main work on the instance would be the motorcycle (say serving your users’ REST requests). The operational control is the sidecar (say serving health checks and management plane requests of the operational platform).
(tags: netflix sidecars architecture patterns smartstack netflixoss microservices soa)
Six things we know from the latest FinFisher documents | Privacy International
The publishing of materials from a support server belonging to surveillance-industry giant Gamma International has provided a trove of information for technologists, security researchers and activists. This has given the world a direct insight into a tight-knit industry, which demands secrecy for themselves and their clients, but ultimately assists in the violation of human rights of ordinary people without care or reproach. Now for the first time, there is solid confirmation of Gamma’s activities from inside the company’s own files, despite their denials, on their clients and support provided to a range of governments.
(tags: finfisher gamma-international privacy surveillance iphone android rootkits wiretapping germany privacy-international spying bahrain turkmenistan arab-spring egypt phones mobile)
BAI says Mooney Show was wrong to broadcast programme supporting same-sex marriage
This is a terrible decision. As Fintan O’Toole wrote afterwards: [The] ‘BAI decision actually makes the point: a gay couple is a political “issue”; a straight couple is just a couple’
(tags: ireland law bai radio derek-mooney same-sex-marriage gay equal-rights)
The Internet’s Original Sin – The Atlantic
Ethan Zuckerman: ‘It’s not too late to ditch the ad-based business model and build a better web.’
(tags: advertising business internet ads business-models the-atlantic ethan-zuckerberg via:anildash web privacy surveillance google)
Comment #28 : Bug #255161 : Bugs : “cupsys” package : Ubuntu
file(1) bug causes the input Postscript file to be misidentified as an Erlang JAM file if it contains the string ‘Tue’ starting at byte 4.
(tags: via:hackernews file unix cups printing funny bugs fail ubuntu linux)
Syria’s 2012 internet disconnection wasn’t on purpose
According to Edward Snowden, it was a side-effect of the NSA attempting to install an exploit in one of the core routers at a major Syrian ISP, and accidentally bricking the router
(tags: routers exploits hacking software tao nsa edward-snowden syria internet privacy)
Edward Snowden: The Untold Story | Threat Level | WIRED
Snowden interviewed by James “The Puzzle Palace” Bamford, no less
(tags: james-bamford nsa edward-snowden wired interviews toread leaks whistleblowers us-politics)
Profiling Hadoop jobs with Riemann
I’ve built a very simple distributed profiler for soft-real-time telemetry from hundreds to thousands of JVMs concurrently. It’s nowhere near as comprehensive in its analysis as, say, Yourkit, but it can tell you, across a distributed system, which functions are taking the most time, and what their dominant callers are.
Potentially useful.
(tags: riemann profiling aphyr hadoop emr performance monitoring)
-
the world’s largest permanent scale model of the Solar System. The Sun is represented by the Ericsson Globe in Stockholm, the largest hemispherical building in the world. The inner planets can also be found in Stockholm but the outer planets are situated northward in other cities along the Baltic Sea. The system was started by Nils Brenning and Gösta Gahm and is on the scale of 1:20 million.
(via JK)
(tags: scale models solar-system astronomy sun sweden science cool via:jk)
All Data Are Belong to AWS: Streaming upload via Fluentd
Fluentd looks like a decent foundation for tailing/streaming event processing in Ruby, supporting batched output to S3 and a bunch of other AWS services, Kafka, and RabbitMQ for output. Claims to have ok performance, despite its Rubbitude. However, its high-availability story is shite, so not to be used where availability is important
(tags: ruby rabbitmq kafka tail event-streaming cep event-processing s3 aws sqs fluentd)
Twitter / mzmyslowski: Why Nigerian scam emails are so poorly written
Great explanation from MS Research’s Cormac Herley
(tags: corman-herley microsoft research spam nigerian-scam 419 scams conversion targeting mugus twitter)
-
install inotify-tools, then: ‘while true; do inotifywait -r -e modify -e create -e close . ; ./run.sh; done’ #opscookie
(tags: inotify al-tobey one-liners unix hacks opscookie twitter)
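Where inotify-tools isn't available, the same "re-run on change" loop as the one-liner above can be approximated by polling. To be clear, this swaps inotify for periodic `os.stat` snapshots, so it is slower to notice changes and not equivalent; `snapshot`, `changed_paths` and `watch` are names invented for this sketch.

```python
import os
import subprocess
import time


def snapshot(root):
    """Map every file path under root to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between walk and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def changed_paths(before, after):
    """Paths that were created, modified or removed between two snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )


def watch(root, command, interval=1.0):
    """Poll root forever, running command whenever something under it changes."""
    before = snapshot(root)
    while True:
        time.sleep(interval)
        after = snapshot(root)
        if changed_paths(before, after):
            subprocess.call(command)
        # Re-snapshot so files written by the command don't retrigger it.
        before = snapshot(root)
```

Calling `watch(".", ["./run.sh"])` would then play the role of the shell loop, at the cost of a one-second polling delay.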
How Stewart “Whole Earth Catalog” Brand helped kill off the metric system in the US
In May of 1981, party people gathered for one of the nerdiest soirees ever to grace lower Manhattan. Billed as the “Foot Ball,” the event was an anti-metric shindig. Its revelers—including author Tom Wolfe and Whole Earth Catalog founder Stewart Brand—had joined to protest the encroachment of the metric system into modern American life. They threw shade on the meter and kilogram, and toasted the simple beauty of old classics like the yard and the pound.
Crazy. (via _stunned)
(tags: via:_stunned us-politics tom-wolfe stewart-brand luddism metric imperial feet path-dependence)
-
Facebook’s Autoscale service, which scales up/down the fleet in order to optimize power consumption; see also Google’s Pegasus (http://csl.stanford.edu/~christos/publications/2014.pegasus.isca.pdf)
(tags: scaling via:eoinbrazil facebook autoscaling power optimization)
A tick bite can make you allergic to red meat
The bugs harbor a sugar that humans don’t have, called alpha-gal. The sugar is also found in red meat (beef, pork, venison, rabbit) and even some dairy products. It’s usually fine when people encounter it through food that gets digested. But a tick bite triggers an immune system response, and in that high-alert state, the body perceives the sugar the tick transmitted to the victim’s bloodstream and skin as a foreign substance, and makes antibodies to it. That sets the stage for an allergic reaction the next time the person eats red meat and encounters the sugar.
Via Shane Naughton
(tags: ticks meat food allergies immune-system health via:inundata sugar alpha-gal red-meat)
Real time analytics with Netty, Storm, Kafka
Arch of a fairly typical Kafka/Storm realtime ad-tracking setup, from eClick/mc2ads, via Trustin Lee
(tags: via:trustinlee kafka storm netty architecture ad-tracking ads realtime)
AWS Speed Test: What are the Fastest EC2 and S3 Regions?
My god, this test is awful — this is how NOT to test networked infrastructure. (1) testing from a single EC2 instance in each region; (2) uploading to a single test bucket for each test; (3) results don’t include min/max or percentiles, just an averaged measurement for each test. FAIL
(tags: fail testing networking performance ec2 aws s3 internet)
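To illustrate complaint (3) in the AWS speed test entry above: an average on its own hides the tail. Here's a minimal sketch with made-up latency samples (these numbers are hypothetical, not from the linked test), reporting min, max and percentiles alongside the mean.

```python
import statistics


def summarise(samples_ms):
    """Min, mean, selected percentiles and max for a list of latencies."""
    ordered = sorted(samples_ms)
    # 99 cut points; cuts[i - 1] is the i-th percentile.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "min": ordered[0],
        "mean": statistics.fmean(ordered),
        "p50": cuts[49],
        "p90": cuts[89],
        "p99": cuts[98],
        "max": ordered[-1],
    }


# Hypothetical data: 95 fast requests at 20 ms and 5 slow ones at 2000 ms.
samples = [20] * 95 + [2000] * 5
for name, value in summarise(samples).items():
    print(f"{name}: {value:.0f} ms")
```

With these samples the mean comes out at 119 ms, a latency no request actually saw, while the median is 20 ms and the 99th percentile is 2000 ms.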
Hacker Redirects Traffic From 19 Internet Providers to Steal Bitcoins | Threat Level | WIRED
‘The attacker specifically targeted a collection of bitcoin mining “pools”–bitcoin-producing cooperatives in which users contribute their computers’ processing power and are rewarded with a cut of the resulting cryptocurrency the pool produces. The redirection technique tricked the pools’ participants into continuing to devote their processors to bitcoin mining while allowing the hacker to keep the proceeds. At its peak, according to the researchers’ measurements, the hacker’s scam was pocketing a flow of bitcoins and other digital currencies including dogecoin and worldcoin worth close to $9,000 a day. “With this kind of hijacking, you can quite easily grab a large collection of clients,” says Pat Litke, one of the Dell researchers. “It takes less than a minute, and you end up with a lot of mining traffic under your control.”’ ‘In total, Stewart and Litke were able to measure $83,000 worth of cryptocurrency stolen in the BGP attack […] but the total haul could be larger’
(tags: bitcoin mining fraud internet bgp routing security attacks hacking)
UK piracy police arrest man suspected of running proxy server (Wired UK)
The site, Immunicity.org, offers a proxy server and a proxy autoconfiguration file (PAC) to tell browsers to access various blocked sites (PirateBay, KickassTorrents et al) via the proxy.
The Police Intellectual Property Crime Unit has arrested a 20-year-old man in Nottingham on suspicion of copyright infringement for running a proxy server providing access to other sites subject to legal blocking orders.
Is operating a proxy server illegal? Interesting. Seems unlikely that this will go to court though. (Via TJ McIntyre)
(tags: immunicity via:tjmcintyre police uk piracy proxies http pac pipcu copyright)
-
Brilliant. A great Threadless submission from user NickOG back in 2012
(tags: worf star-trek joy-division tee-shirts threadless funny)
-
Excellent: ‘a Twitter-fueled link aggregator that favors new projects/sites over news/articles’ from Andy Baio.
Announcing UberPool, Carpooling with Uber
Ah, I was waiting for this; rest-of-world-style carpooling on demand, in an app. Great stuff
(tags: via:belong.io carpooling uber ride-sharing apps taxi travel uberpool)
Painless, effective peer reviews
This sounds like a nice way to do effective peer-driven team reviews without herculean effort, which were one of the most effective reviewing techniques (along with upwards reviewing of management) I encountered at Amazon. (Yes, the Amazon approach was very time-consuming and universally loathed.) The potential downside I can see is that it doesn’t give the reviewer enough time to revise any review comments they have second thoughts about, whereas written reviews do, but that would be an easy fix at the end of the process. Also, it’s worth noting that in most cases, a good review requires a bit of time to marshal thoughts and come up with a coherent review of a peer, so this doesn’t completely avoid the impact on effort. Still, a definite improvement I would say.
(tags: hr management reviews performance peer-driven-review 360-reviews staff peers work teams amazon)
The problem with OKCupid is the problem with the social web
This is why it really stings whenever somebody turns around and says, “well actually, the terms you’ve signed give us permission to do whatever we want. Not just the thing you were afraid of, but a huge range of things you never thought of.” You can’t on one hand tell us to pay no attention when you change these things on us, and with the other insist that this is what we’ve really wanted to do all along. I mean, fuck me over, but don’t tell me that I really wanted you to fuck me over all along. Because ultimately, the reason you needed me to agree in the first place isn’t just because I’m using your software, but because you’re using my stuff. And the reason I’m letting you use my stuff, and spending all this time working on it, is so that you can show it to people. I’m not just a user of your service, somebody who reads the things that you show to me: I’m one of the reasons you have anything that you can show to anyone at all.
(tags: users web facebook okcupid terms-of-service jason-kottke privacy a-b-testing experiments ethics)
-
A Java-oriented practical intro to the MinHash duplicate-detection shingling algo
(tags: shingling algorithms minhash hashing duplicates duplicate-detection fuzzy-matching java)
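For the MinHash entry above, here's a minimal sketch of the general technique in Python rather than Java. It is not the linked article's code; the word-level shingle size, the use of salted BLAKE2b as the hash family and all function names are assumptions made for this example.

```python
import hashlib


def shingles(text, k=3):
    """Set of overlapping k-word shingles from a text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(shingle, seed):
    """64-bit hash of a shingle; each seed acts as a different hash function."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=128):
    """For each hash function, keep the minimum hash over all shingles."""
    return [min(_hash(s, seed) for s in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


doc_a = "the quick brown fox jumps over the lazy dog near the river bank"
doc_b = "the quick brown fox jumps over the lazy dog near the river mouth"
doc_c = "completely unrelated words about databases and network latency today"

sig_a, sig_b, sig_c = (minhash_signature(shingles(d)) for d in (doc_a, doc_b, doc_c))
print("a vs b:", estimated_jaccard(sig_a, sig_b))
print("a vs c:", estimated_jaccard(sig_a, sig_c))
```

The point of the signatures is that near-duplicates can be compared using a fixed 128 numbers per document instead of the full shingle sets.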
-
The two charts indicate that current EU copyright is very unbalanced. When one side is completely satisfied with the status quo and the other is very unhappy then this is not a balanced situation. Given that a good compromise should leave everybody equally unhappy, the results of the consultation also show the direction for copyright reform efforts of the new EU Commission: re-balancing copyright requires at least some reform as demanded by end users and institutional users, most importantly a more harmonized and flexible system of exceptions and limitations.
‘TCP And The Lower Bound of Web Performance’ [pdf, slides]
John Rauser, Velocity, June 2010. Good data on real-world web perf based on the limitations which TCP and the speed of light impose
(tags: tcp speed-of-light performance web optimization john-rauser)
-
This is yet another Java collections library of primitive specializations. Java 6+. Apache 2.0 license. Currently only hash sets and hash maps are implemented.
(tags: openhft performance java jvm collections asl hashsets hashmaps data-structures)
China detains 1,530 in telecom spam crackdown
via Christopher Soghoian: ‘IMSI catchers/fake base stations are out of control in China. The gov shut down 24 IMSI catcher factories, 1500+ people were arrested.’
(tags: privacy spam china imsi-catchers mobile 3g gsm phones)
Does This Soldier’s Instagram Account Prove Russia Is Covertly Operating In Ukraine?
“sitting around, working on a buk, listening to music, basically a good sunday”
(tags: ukraine buzzfeed politics sam missiles mh-17 war-crimes russia facebook instagram social-media whoops)
UK private copying exception plans face possible legal action
Under the proposed private copying exception, individuals in the UK would be given a new right to make a copy of copyrighted material they have lawfully and permanently acquired for their private use, provided it was not for commercial ends. Making a private copy of the material in these circumstances would not be an act of copyright infringement, although making a private copy of a computer program would still be prohibited under the plans. There is no mechanism envisaged in the draft legislation for rights holders to be specifically compensated for the act of private copying. This prompted the Joint Committee on Statutory Instruments (JCSI), tasked with scrutinising the proposals, to warn parliamentarians that the rules may be deemed to be in breach of EU copyright laws as a result of the lack of ‘fair compensation’ mechanism. […] “We are disappointed that the private copying exception will be introduced without providing fair compensation for British songwriters, performers and other rights holders within the creative sector. A mechanism for fair compensation is a requirement of European law. In response we are considering our legal options,” [UK Music] said.
(tags: uk law copyright music copying private-copying personal infringement piracy transcoding backup)
Moominvalley Map Print | Magic Pony
Lovely print! Shipping would be a bit crazy, though. There has to be an English-language print of one of Tove Jansson’s maps on sale somewhere in Europe…
(tags: prints moomins moominvalley maps hattifatteners magic-pony tove-jannson art)
-
Ladyada’s intro to electronics and microcontrollers using Arduino. Some day I’ll get around to refreshing my memory, it’s been years since I fiddled with a resistor ;)
(tags: electronics arduino hardware gadgets learning tutorial microcontrollers embedded-systems ladyada)
How to take over the computer of any JVM developer
To prove how easy [MITM attacking Mavencentral JARs] is to do, I wrote dilettante, a man-in-the-middle proxy that intercepts JARs from maven central and injects malicious code into them. Proxying HTTP traffic through dilettante will backdoor any JARs downloaded from maven central. The backdoored version will retain their functionality, but display a nice message to the user when they use the library.
(tags: jars dependencies java build clojure security mitm http proxies backdoors scala maven gradle)
Spain pushes for ‘Google tax’ to restrict linking
The government wants to put a tax on linking on the internet. They say that if you want to link to some newspaper’s content, you have to pay a tax. The primary targets of this law are Google News and other aggregators. It would be absurd enough just like that, but the law goes further: they declared it an “inalienable right” so even if I have a blog or a new small digital media publication and I want to let people freely link to my content, I can’t opt-out–they are charging the levy, and giving it to the big press media. It was just the last and only way that the old traditional media companies can get some money from the government, and they strongly lobbied for it. The bill has passed in the Congress where the party in the government has majority (PP, Partido Popular) and it’s headed to the Senate, where they have a majority also.
(tags: spain stupidity law via:boingboing linking links web news google google-news newspapers old-media taxes)
Keyes New Starter Kit for Arduino Fans
$53 for a reasonable-looking Arduino starter kit, from DealExtreme. cheap cheap! In the inimitable DX style:
Keyes new beginner starter kit, pay more attention to beginners learning. Users can get rid of the difficult technological learning, from module used to quick start production.
(tags: learning arduino hardware hacking robotics toys dealextreme tobuy)
Check If A Hotel’s WiFi Sucks Before It’s Too Late
http://www.hotelwifitest.com/ and http://speedspot.org/ .
-
a nice summarisation of the state of pipe/stream-oriented collection operations in various languages, from Martin Fowler
(tags: martin-fowler patterns coding ruby clojure streams pipelines pipes unix lambda fp java languages)
REST Commander: Scalable Web Server Management and Monitoring
We dynamically monitor and manage a large and rapidly growing number of web servers deployed on our infrastructure and systems. However, existing tools present major challenges when making REST/SOAP calls with server-specific requests to a large number of web servers, and then performing aggregated analysis on the responses. We therefore developed REST Commander, a parallel asynchronous HTTP client as a service to monitor and manage web servers. REST Commander on a single server can send requests to thousands of servers with response aggregation in a matter of seconds. And yes, it is open-sourced at http://www.restcommander.com. Feature highlights: Click-to-run with zero installation; Generic HTTP request template supporting variable-based replacement for sending server-specific requests; Ability to send the same request to different servers, different requests to different servers, and different requests to the same server; Maximum concurrency control (throttling) to accommodate server capacity; Commander itself is also “as a service”: with its powerful REST API, you can define ad-hoc target servers, an HTTP request template, variable replacement, and a regular expression all in a single call. In addition, intuitive step-by-step wizards help you achieve the same functionality through a GUI.
(tags: rest http clients load-testing ebay soap async testing monitoring)
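To make the feature list a bit more concrete, here is a minimal sketch of the same idea (templated per-server requests, a concurrency cap, regex-based aggregation) using plain asyncio. It is not REST Commander's API: `fan_out`, `fake_send`, the `$host` template and the host names are all made up for illustration, and `fake_send` stands in for a real HTTP call so the snippet runs without a network.

```python
import asyncio
import re
from collections import Counter
from string import Template


async def fake_send(url):
    """Hypothetical stand-in for a real HTTP GET; returns a canned body."""
    await asyncio.sleep(0)
    status = "DOWN" if url.endswith("web3/health") else "UP"
    return f'{{"status": "{status}"}}'


async def fan_out(template, targets, pattern, max_concurrency=2, send=fake_send):
    """Send one templated request per target and aggregate the regex matches."""
    limit = asyncio.Semaphore(max_concurrency)  # throttle in-flight requests
    regex = re.compile(pattern)

    async def one(variables):
        # Variable-based replacement: fill the template for this server.
        url = Template(template).substitute(variables)
        async with limit:
            body = await send(url)
        match = regex.search(body)
        return match.group(1) if match else "NO_MATCH"

    results = await asyncio.gather(*(one(v) for v in targets))
    return Counter(results)


targets = [{"host": f"web{i}"} for i in range(1, 5)]
summary = asyncio.run(
    fan_out("http://$host/health", targets, r'"status": "(\w+)"')
)
print(summary)
```

Swapping `fake_send` for a real async HTTP client is the obvious next step; the semaphore is what keeps a large target list from swamping either side.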
South Downs litter picker has truck named after him – West Sussex County Times
This is amazing. In http://www.newyorker.com/magazine/2014/06/30/stepping-out-3 , David Sedaris had written: ‘in recognition of all the rubbish I’ve collected since getting my Fitbit, my local council is naming a garbage truck after me’; naturally, I assumed he was joking, but it looks like he wasn’t:
Horsham District Council has paid thanks to a volunteer who devotes a great deal of time and energy to walking many miles clearing litter from near where he lives as well as surrounding areas. David Sedaris litter picks in areas including Parham, Coldwaltham, Storrington and beyond. In recognition for all his fantastic work and dedication and as a token of Horsham District Council’s appreciation, the council has named one of their waste vehicles after him. The vehicle, bedecked with its bespoke ‘Pig Pen Sedaris’ sign was officially unveiled by the Lord-Lieutenant of West Sussex Mrs Susan Pyper at an outdoor ceremony on July 23.
Best of all, the article utterly fails to mention who he is. Amazing. (via John Braine)
(tags: via:john-braine funny david-sedaris litter uk horsham rubbish garbage cleaning volunteering walking)
-
Heapster provides an agent library to do heap profiling for JVM processes with output compatible with Google perftools. The goal of Heapster is to be able to do meaningful (sampled) heap profiling in a production setting.
Used by Twitter in production, apparently.
(tags: heap monitoring memory jvm java performance)
The Network is Reliable – ACM Queue
Peter Bailis and Kyle Kingsbury compile a comprehensive, informal survey of real-world network failures observed in production. I remember that April 2011 EBS outage…
(tags: ec2 aws networking outages partitions jepsen pbailis aphyr acm-queue acm survey ops)
This tree produces 40 different types of fruit
An art professor from Syracuse University in the US, Van Aken grew up on a family farm before pursuing a career as an artist, and has combined his knowledge of the two to develop his incredible Tree of 40 Fruit. In 2008, Van Aken learned that an orchard at the New York State Agricultural Experiment Station was about to be shut down due to a lack of funding. This single orchard grew a great number of heirloom, antique, and native varieties of stone fruit, and some of these were 150 to 200 years old. To lose this orchard would render many of these rare and old varieties of fruit extinct, so to preserve them, Van Aken bought the orchard, and spent the following years figuring out how to graft parts of the trees onto a single fruit tree. […] Aken’s Tree of 40 Fruit looks like a normal tree for most of the year, but in spring it reveals a stunning patchwork of pink, white, red and purple blossoms, which turn into an array of plums, peaches, apricots, nectarines, cherries and almonds during the summer months, all of which are rare and unique varieties.
(tags: fruit art amazing food agriculture grafting orchards sam-van-aken farming)
-
we believe MDD is equal parts engineering technique and cultural process. It separates the notion of monitoring from its traditional position of exclusivity as an operations thing and places it more appropriately next to its peers as an engineering process. Provided access to real-time production metrics relevant to them individually, both software engineers and operations engineers can validate hypotheses, assess problems, implement solutions, and improve future designs.
Broken down into the following principles: ‘Instrumentation-as-Code’, ‘Single Source of Truth’, ‘Developers Curate Visualizations and Alerts’, ‘Alert on What You See’, ‘Show me the Graph’, ‘Don’t Measure Everything (YAGNI)’. We do all of these at Swrve, naturally (a technique I happily stole from Amazon).
(tags: metrics coding graphite mdd instrumentation yagni alerting monitoring graphs)
Auto Scale DynamoDB With Dynamic DynamoDB
Nicely-packaged auto-scaler for DynamoDB
(tags: dynamodb autoscaling scalability provisioning aws ec2 cloudformation)
Google’s mighty mess-up on ‘right to be forgotten’ – Independent.ie
In this context, the search giant says that it has “a team of people reviewing each application individually”. Really? Did this team of people decide that redacting links to an article reporting a criminal conviction was consistent with an individual’s right to privacy and ‘right to be forgotten’? Either Google is deliberately letting egregious errors through to try and bait journalists and freedom of expression activists into protesting or its system at vetting ‘right to be forgotten’ applications is awfully flawed.
(tags: google right-to-be-forgotten privacy law ireland adrian-weckler journalism freedom-of-expression censorship redaction)
“Ark: A Real-World Consensus Implementation” [paper]
“an implementation of a consensus algorithm similar to Paxos and Raft, designed as an improvement over the existing consensus algorithm used by MongoDB and TokuMX.” It’ll be interesting to see how this gets on in review from the distributed-systems community. The phrase “similar to Paxos and Raft” is both worrying and promising ;)
(tags: paxos raft consensus algorithms distsys distributed leader-election mongodb tokumx)
A Japanese Artist Launches Plants Into Space
This is amazing.
though the vessel was found on the ground, the flowers were not.
(tags: japan art bonsai flowers space nevada black-rock-desert exobiotanica)
‘Identifying Back Doors, Attack Points and Surveillance Mechanisms in iOS Devices’
lots of scary stuff in this presentation from this year’s Hackers On Planet Earth conf. I’m mainly interested to find out that Jonathan “D-Spam” Zdziarski was also a jailbreak dev-team member until around iOS 4 ;)
(tags: d-spam jonathan-zdziarski security apple ios iphone surveillance bugging)
-
a Chrome extension to aid working with REST APIs. Formats XML and JSON responses, supports file uploads, key/value editors, autocomplete, open source under ASL2
(tags: open-source chrome extensions browser postman rest hateoas api xml json web-services via:eonnen)
-
A Go implementation of Greenwald-Khanna streaming quantiles: http://infolab.stanford.edu/~datar/courses/cs361a/papers/quantiles.pdf – ‘a new online algorithm for computing approximate quantile summaries of very large data sequences with a worst-case space requirement of O(1/ε log(εN))’
(tags: quantiles go algorithms greenwald-khanna percentiles streaming cep space-efficient)
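For a feel of how a Greenwald-Khanna summary works, here is a simplified sketch in Python. It is not a port of the Go library: the class and method names are my own, and the compress step leaves out the paper's "band" condition, so it keeps the ε·N rank-error guarantee but I make no claim about matching the paper's worst-case space bound.

```python
import math
import random


class GKSummary:
    """Simplified Greenwald-Khanna quantile summary (illustrative only)."""

    def __init__(self, epsilon):
        self.epsilon = epsilon
        self.n = 0
        self.tuples = []  # each entry is [value, g, delta], sorted by value
        self.compress_every = max(1, int(1 / (2 * epsilon)))

    def insert(self, value):
        # Index of the first stored value strictly greater than the new one.
        idx = next(
            (k for k, t in enumerate(self.tuples) if value < t[0]),
            len(self.tuples),
        )
        if idx == 0 or idx == len(self.tuples):
            delta = 0  # a new minimum or maximum has an exactly known rank
        else:
            succ = self.tuples[idx]
            delta = succ[1] + succ[2] - 1  # inherit the successor's uncertainty
        self.tuples.insert(idx, [value, 1, delta])
        self.n += 1
        if self.n % self.compress_every == 0:
            self._compress()

    def _compress(self):
        threshold = math.floor(2 * self.epsilon * self.n)
        i = len(self.tuples) - 2
        while i >= 1:  # never merge away the minimum or the maximum
            cur, nxt = self.tuples[i], self.tuples[i + 1]
            if cur[1] + nxt[1] + nxt[2] <= threshold:
                nxt[1] += cur[1]
                del self.tuples[i]
            i -= 1

    def query(self, phi):
        """Return a value whose rank is within epsilon * n of ceil(phi * n)."""
        rank = math.ceil(phi * self.n)
        margin = self.epsilon * self.n
        rmin = 0
        for value, g, delta in self.tuples:
            rmin += g
            if rank - rmin <= margin and rmin + delta - rank <= margin:
                return value
        return self.tuples[-1][0]


random.seed(1)
data = list(range(1, 10001))
random.shuffle(data)
summary = GKSummary(epsilon=0.01)
for x in data:
    summary.insert(x)
median = summary.query(0.5)
print(median, len(summary.tuples))
```

Each tuple records a value, the gap `g` to the previous tuple's minimum rank, and `delta`, the uncertainty in its own rank; merging neighbours whose combined uncertainty stays under 2εN is what keeps the summary small.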
-
Some great tips on managing a busy calendar, from Etsy’s managers. Block out time; refuse double-booked meetings by default; rely on apps; office hours. Thankfully I have a pretty slim calendar these days, but bookmarking for future use…
(tags: calendar etsy via:kellan google google-calendar office-hours life-hacks hacks tips managing managers scheduling)
Nanex: “The stock market is rigged” [by HFTs]
All this evidence points to one inescapable conclusion: the order cancellations and trade executions just before, and during the trader’s order were not a coincidence. This is premeditated, programmed theft, plain and simple. Michael Lewis probably said it best when he told 60 Minutes that the stock market is rigged.
Nanex have had enough, basically. Mad stuff.
(tags: hft stocks finance market trading nanex 60-minutes michael-lewis scams sec regulation low-latency exploits hacks)
Boundary’s new server monitoring free offering
‘High resolution, 1 second intervals for all metrics; Fluid analytics, drag any graph to any point in time; Smart alarms to cut down on false positives; Embedded graphs and customizable dashboards; Up to 10 servers for free’ Pre-registration is open now. Could be interesting, although the limit of 10 machines is pretty small for any production usage
(tags: boundary monitoring network ops metrics alarms tcp ip netstat)
-
A really excellent-looking workflow/orchestration engine for Hadoop, Pig, Hive, Redshift and other ETL jobs, featuring inter-job dependencies, cron-like scheduling, and failure handling. Open source, from Spotify
(tags: workflow orchestration scheduling cron spotify open-source luigi redshift pig hive hadoop emr jobs make dependencies)
Obama administration says the world’s servers are ours | Ars Technica
In its briefs filed last week, the US government said that content stored online doesn’t enjoy the same type of Fourth Amendment protections as data stored in the physical world. The government cited (PDF) the Stored Communications Act (SCA), a President Ronald Reagan-era regulation.
Michael McDowell has filed a declaration in support of MS’ position (attached to that article a couple of paras down) suggesting that the MLAT between the US and Ireland is the correct avenue.
(tags: privacy eu us-politics microsoft michael-mcdowell law surveillance servers sca internet)
-
‘This tool can be described as a Tiny Dirty Linux Only C command that looks for coreutils basic commands (cp, mv, dd, tar, gzip/gunzip, cat, …) currently running on your system and displays the percentage of copied data. It can now also display an estimated throughput (using -w flag).’
(tags: coreutils via:pixelbeat linux ops hacks procfs dataviz unix)
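Going by the procfs tag, the trick is presumably to compare each open file's current offset with its size. A rough Python sketch of that idea follows; it is not the tool's actual code, the function names are mine, and `scan` only works on Linux for processes you are allowed to inspect.

```python
import os


def parse_pos(fdinfo_text):
    """Return the file offset from the text of /proc/<pid>/fdinfo/<fd>."""
    for line in fdinfo_text.splitlines():
        if line.startswith("pos:"):
            return int(line.split()[1])
    raise ValueError("no pos: field found")


def percent_done(pos, size):
    """Percentage of a file already read or written, given offset and size."""
    return 100.0 * pos / size if size else 0.0


def scan(pid):
    """Yield (path, percent) for each regular file the process has open."""
    fd_dir = f"/proc/{pid}/fd"
    for fd in os.listdir(fd_dir):
        path = os.readlink(os.path.join(fd_dir, fd))
        if not os.path.isfile(path):
            continue  # skip pipes, sockets, terminals and deleted files
        with open(f"/proc/{pid}/fdinfo/{fd}") as f:
            pos = parse_pos(f.read())
        yield path, percent_done(pos, os.path.getsize(path))
```

Something like `for path, pct in scan(some_pid): print(path, pct)` against a long-running `cp` would give a crude progress readout, with `some_pid` being whatever PID you looked up yourself.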
“In Search of an Understandable Consensus Algorithm”
Diego Ongaro and John Ousterhout’s paper on the Raft algorithm, which won Best Paper at USENIX ATC 2014. (via Eoin Brazil)
(tags: raft consensus algorithms distcomp john-ousterhout via:eoinbrazil usenix atc papers paxos)
-
Great map-comparison tool from Jef Poskanzer
(tags: jef-poskanzer mapping maps comparison visualization geo world cities)
Divinity: Original Sin review | PC Gamer
I’ve become accustomed to RPGs that lock away combat and magic within their own part of the game. I’m used to the idea that a fireball won’t work unless it’s aimed at an enemy, or that every environmental hazard will be placed such that I’m guaranteed to be able to get past it. I’m used to the idea that some characters can be killed and some can’t, that some obstacles are destructible and others are ‘just furniture’. Divinity shrugs off those assumptions. Combat might be turn-based when you’re fighting an enemy, but there’s nothing stopping you from waving your sword around in the middle of town. Fling a fireball at some innocent barrels and you’ll start a fresh fire of your own, and this time the locals won’t be applauding when you rush to put it out.
wow, this sounds great. (via Paul Moloney)
(tags: games divinity-original-sin rpgs gaming via:oceanclub)
-
a client side IPC library that is battle-tested in cloud. It provides the following features: Load balancing; Fault tolerance; Multiple protocol (HTTP, TCP, UDP) support in an asynchronous and reactive model; Caching and batching.
I like the integration of Eureka and Hystrix in particular, although I would really like to read more about Eureka’s approach to availability during network partitions and CAP. https://groups.google.com/d/msg/eureka_netflix/LXKWoD14RFY/-5nElGl1OQ0J has some interesting discussion on the topic. It actually sounds like the Eureka approach is more correct than using ZK: ‘Eureka is available. ZooKeeper, while tolerant against single node failures, doesn’t react well to long partitioning events. For us, it’s vastly more important that we maintain an available registry than a necessary consistent registry. If us-east-1d sees 23 nodes, and us-east-1c sees 22 nodes for a little bit, that’s OK with us.’ See also http://ispyker.blogspot.ie/2013/12/zookeeper-as-cloud-native-service.html which corroborates this:
I went into one of the instances and quickly did an iptables DROP on all packets coming from the other two instances. This would simulate an availability zone continuing to function, but that zone losing network connectivity to the other availability zones. What I saw was that the two other instances noticed that the first server “going away”, but they continued to function as they still saw a majority (66%). More interestingly the first instance noticed the other two servers “going away” dropping the ensemble availability to 33%. This caused the first server to stop serving requests to clients (not only writes, but also reads). […] To me this seems like a concern, as network partitions should be considered an event that should be survived. In this case (with this specific configuration of zookeeper) no new clients in that availability zone would be able to register themselves with consumers within the same availability zone. Adding more zookeeper instances to the ensemble wouldn’t help considering a balanced deployment as in this case the availability would always be majority (66%) and non-majority (33%).
(tags: netflix ribbon availability libraries java hystrix eureka aws ec2 load-balancing networking http tcp architecture clients ipc)
The Myth of Schema-less [NoSQL]
We don’t seem to gain much in terms of database flexibility. Is our application more flexible? I don’t think so. Even without our schema explicitly defined in our database, it’s there… somewhere. You simply have to search through hundreds of thousands of lines to find all the little bits of it. It has the potential to be in several places, making it harder to properly identify. The reality of these codebases is that they are error prone and often lack the necessary documentation. This problem is magnified when there are multiple codebases talking to the same database. This is not an uncommon practice for reporting or analytical purposes. Finally, all this “flexibility” rears its head in the same way that PHP and Javascript’s “neat” weak typing stabs you right in the face. There are some things you can be cavalier about, and some things you should be strict about. Your data model is one you absolutely need to be strict on. If a field should store an int, it should store nothing else. Not a string, not a picture of a horse, but an integer. It’s nice to know that I have my database doing type checking for me and I can expect a field to be the same type across all records. All this leads us to an undeniable fact: There is always a schema. Wearing “I don’t do schema” as a badge of honor is a complete joke and encourages a terrible development practice.
(tags: nosql databases storage schema strong-typing)
-
from yesterday’s AWS Summit in NYC:
Cheat sheet of EBS-optimized instances. http://t.co/vmTlhUtpWk Optimize your queue depth to achieve lower latency & highest IOPS. http://t.co/EO48oa0D6X When configuring your RAID, use a stripe size of 128KB or 256KB. http://t.co/N0ldtFJ4t6 Use larger block size to speed up the pre-warming process. http://t.co/8UoIeWE2px
173 million 2013 NYC taxi rides shared on BigQuery : bigquery
Interesting! (a) there’s a subreddit for Google BigQuery, with links to interesting data sets, like this one; (b) the entire 173-million-row dataset for NYC taxi rides in 2013 is available for querying; and (c) the tip percentage histogram is cool.
(tags: datasets bigquery sql google nyc new-york taxis data big-data histograms tipping)
“Pitfalls of Object Oriented Programming”, SCEE R&D
Good presentation discussing “data-oriented programming” — the concept of optimizing memory access speed by laying out large data in a columnar format in RAM, rather than naively in the default layout that OOP design suggests
(tags: columnar ram memory optimization coding c++ oop data-oriented-programming data cache performance)
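The layout difference is easy to show even outside C++. Below is a small Python sketch contrasting an "array of objects" with a columnar "struct of arrays"; the `Particle` and `ParticleColumns` names are invented for the example, and Python only illustrates the memory layout here, it says nothing about the cache effects the slides measure.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:
    """Array-of-structs style: every particle is its own heap object."""
    x: float
    y: float
    mass: float


class ParticleColumns:
    """Struct-of-arrays style: one contiguous buffer of doubles per field."""

    def __init__(self, particles):
        self.x = array("d", (p.x for p in particles))
        self.y = array("d", (p.y for p in particles))
        self.mass = array("d", (p.mass for p in particles))

    def total_mass(self):
        # Touches only the mass column; x and y are never loaded.
        return sum(self.mass)


particles = [Particle(float(i), 0.0, 2.0) for i in range(1000)]
columns = ParticleColumns(particles)
print(columns.total_mass())
```

A pass over one field in the columnar form walks a single packed buffer, which is the access pattern the presentation argues for.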
Google’s Influential Papers for 2013
Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google. Below are some of the especially influential papers co-authored by Googlers in 2013.
(tags: google papers toread reading 2013 scalability machine-learning algorithms)
-
‘Leak of the secret German Internet Censorship URL blacklist BPjM-Modul’. Turns out there’s a blocklist of adult-only or prohibited domains issued by a German government department, The Federal Department for Media Harmful to Young Persons (German: “Bundesprüfstelle für jugendgefährdende Medien” or BPjM), issued in the form of a list of hashes of those domains. These were extracted from an AVM router, then the hashes were brute forced using several other plaintext URL blocklists and domain lists. Needless to say, there’s an assortment of silly false positives, such as the listing of the website for the 1997 3D Realms game “Shadow Warrior”: http://en.wikipedia.org/wiki/Shadow_Warrior
(tags: hashes reversing reverse-engineering germany german bpjm filtering blocklists blacklists avm domains censorship fps)
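The recovery step is just a dictionary attack against unsalted hashes. A minimal sketch, assuming MD5 over the bare domain name (the real list's hash function and normalisation rules may well differ), with made-up `.example` domains standing in for the leaked entries and the wordlists:

```python
import hashlib


def md5_hex(text):
    """Hex MD5 of a string; the choice of MD5 is an assumption of this sketch."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def recover_plaintext(hashed_entries, candidate_domains):
    """Map each hash we can explain back to the candidate that produced it."""
    lookup = {}
    for domain in candidate_domains:
        # Try a couple of obvious spellings of each candidate.
        for variant in (domain, "www." + domain):
            lookup[md5_hex(variant)] = variant
    return {h: lookup[h] for h in hashed_entries if h in lookup}


# Hypothetical "leaked" hashes and a hypothetical plaintext wordlist.
leaked = {
    md5_hex("blocked.example"),
    md5_hex("www.also-blocked.example"),
    md5_hex("never-guessed.example"),
}
wordlist = ["harmless.example", "blocked.example", "also-blocked.example"]
recovered = recover_plaintext(leaked, wordlist)
print(sorted(recovered.values()))
```

Anything not in the wordlists stays unrecovered, which is why the quality of the candidate lists matters more than raw compute.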
Brave Men Take Paternity Leave – Gretchen Gavett – Harvard Business Review
The use of paternity leave has a “snowball effect”:
In the end, Dahl says, “coworkers and brothers who were linked to a father who had his child immediately after the [Norwegian paid paternity leave] reform — versus immediately before the reform — were 3.5% and 4.7% more likely, respectively, to take parental leave.” But when a coworker actually takes parental leave, “the next coworker to have a child at his workplace is 11% more likely to take paternity leave.” Slightly more pronounced, the next brother to have a child is 15% more likely to take time off. And while any male coworker taking leave can reduce stigma, the effect of a manager doing so is more profound. Specifically, “the estimated peer effect is over two and a half times larger if the peer father is predicted to be a manager in the firm as opposed to a regular coworker.”
(tags: paternity-leave parenting leave work norway research)
-
by Jeffrey Dean and Luiz Andre Barroso, Google. A selection of Google’s architectural mechanisms used to defeat 99th-percentile latency spikes: hedged requests, tied requests, micro-partitioning, selective replication, latency-induced probation, canary requests.
(tags: google architecture distcomp soa http partitioning replication latency 99th-percentile canary-requests hedged-requests)
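Of those, hedged requests are the easiest to sketch: send the request, and if no reply has arrived after a short delay, send a second copy elsewhere and take whichever answers first. A minimal asyncio version follows; `hedged`, `slow_then_fast` and the timings are all invented for the example, not taken from the paper.

```python
import asyncio


async def hedged(make_request, hedge_after, max_copies=2):
    """Return the first result, launching a backup copy after hedge_after seconds."""
    tasks = [asyncio.ensure_future(make_request(0))]
    try:
        while True:
            # Only keep a timer running while we are still allowed to hedge.
            timeout = hedge_after if len(tasks) < max_copies else None
            done, _ = await asyncio.wait(
                tasks, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
            )
            if done:
                return done.pop().result()
            tasks.append(asyncio.ensure_future(make_request(len(tasks))))
    finally:
        for task in tasks:
            if not task.done():
                task.cancel()  # drop the copies that lost the race


async def slow_then_fast(attempt):
    """Hypothetical backend: the first replica is slow, the second is fast."""
    await asyncio.sleep(0.2 if attempt == 0 else 0.01)
    return f"replica-{attempt}"


winner = asyncio.run(hedged(slow_then_fast, hedge_after=0.05))
print(winner)
```

Choosing `hedge_after` near the high percentiles of normal latency keeps the extra load small, since only the slowest few requests ever get a second copy.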
Breaking Spotify DRM with PANDA
Reverse engineering a DRM implementation, by instrumenting a VM and performing entropy/compressibility analysis on function call inputs and outputs. Impressive.
(tags: reversing spotify drm panda vm compression entropy compressability qemu via:hn)
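The per-buffer measurements involved are simple enough. Here is a small sketch of the two statistics, byte-level Shannon entropy and a zlib compression ratio; the function names and sample buffers are mine, and this is only the measurement, not the post's PANDA tooling.

```python
import math
import os
import zlib
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def compression_ratio(data):
    """Compressed size over original size; near or above 1.0 means incompressible."""
    return len(zlib.compress(data)) / len(data)


plain = b"the quick brown fox jumps over the lazy dog " * 50
noise = os.urandom(len(plain))  # stands in for encrypted or compressed data
print(shannon_entropy(plain), compression_ratio(plain))
print(shannon_entropy(noise), compression_ratio(noise))
```

Encrypted or already-compressed buffers score high on entropy and barely shrink under zlib, while decoded media and text look very different, which is what makes the two numbers useful for sorting function inputs from outputs.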
-
Book a domestic cleaner online in 60 seconds; “like Hailo for cleaners” apparently. Live in Dublin, London, Manchester, Birmingham and Leeds. Use code HASSLEDUBLIN for 15% off
(tags: hailo cleaners hassle via:hailo domestic home services b2c)
Layered Glass Table Concept Creates a Cross-Section of the Ocean
beautiful stuff — and a snip at only UKP 5,800 ex VAT. it’d make a good DIY project though ;)
(tags: art tables glass layering 3d cross-sections water ocean sea mapping cartography layers this-is-colossal design furniture)
Two traps in iostat: %util and svctm
Marc Brooker:
As a measure of general IO busyness %util is fairly handy, but as an indication of how much the system is doing compared to what it can do, it’s terrible. Iostat’s svctm has even fewer redeeming strengths. It’s just extremely misleading for most modern storage systems and workloads. Both of these fields are likely to mislead more than inform on modern SSD-based storage systems, and their use should be treated with extreme care.
(tags: ioutil iostat svctm ops ssd disks hardware metrics stats linux)
New AWS Web Services region: eu-central-1 (soon)
Iiiinteresting. Sounds like new anti-NSA-snooping privacy laws will be driving a lot of new mini-regions in AWS. Hope Amazon have their new-region-standup process a little more streamlined by now than when I was there ;)
How A Spam Newsletter Caused a Bank Run in Bulgaria
According to the Bulgarian National Security Agency (see here, for a reporting in English), an investment company that “built a network of associated companies for marketing services” that was used to diffuse panic by means of an alert, uncomfortably titled “Information Bulletin of on the Risk of Deposits in Bulgarian Banks”. The “bulletin” claimed – Bloomberg reports – KTB was undergoing a liquidity shortage. The message apparently also said that the government deposit guarantee fund was under-capitalised to meet possible repayments, that banks could go bankrupt and that the peg of the currency with the euro could be broken. Allegedly, the alert was diffused by text, email and even Facebook messages, thus ensuring a very widespread outreach. In a country that in 1997 underwent a very serious banking crisis featuring all these characteristics – whose memory is still fresh – this was enough to spur panic.
(tags: spam banking bulgaria banks euro panic facebook social-media)
New Russian Law To Forbid Storing Russians’ Data Outside the Country – Slashdot
On Friday Russia’s parliament passed a law “which bans online businesses from storing personal data of Russian citizens on servers located abroad[.] … According to ITAR-TASS, the changes to existing legislation will come into effect in September 2016, and apply to email services, social networks and search engines, including the likes of Facebook and Google. Domain names or net addresses not complying with regulations will be put on a blacklist maintained by Roskomnadzor (the Federal Supervision Agency for Information Technologies and Communications), the organisation which already has the powers to take down websites suspected of copyright infringement without a court order. In the case of non-compliance, Roskomnadzor will be able to impose ‘sanctions,’ and even instruct local Internet Service Providers (ISPs) to cut off access to the offending resource.”
(tags: russia privacy nsa censorship protectionism internet web)
Irish parliament pressing ahead with increased access to retained telecoms data
While much of the new bill is concerned with the dissolution of the Competition Authority and the National Consumer Agency and the formation of a new merged Competition and Consumer Protection Commission (CCPC) the new bill also proposed to extend the powers of the new CCPC to help it investigate serious anticompetitive behaviour. Strikingly the new bill proposes to give members of the CCPC the power to access data retained under the Communications (Retention of Data) Act 2011. As readers will recall this act implements Directive 2006/24/EC which obliges telecommunications companies to archive traffic and location data for a period of up to two years to facilitate the investigation of serious crime. Ireland chose to implement the maximum two year retention period and provided access to An Garda Siochana, The Defence Forces and the Revenue Commissioners. The current reform of Irish competition law now proposes to extend data access powers to the members of the CCPC for the purposes of investigating cartel offences.
(tags: data-retention privacy surveillance competition ccpc ireland law dri)
NSA: Linux Journal is an “extremist forum” and its readers get flagged for extra surveillance
DasErste.de has published the relevant XKEYSCORE source code, and if you look closely at the rule definitions, you will see linuxjournal.com/content/linux* listed alongside Tails and Tor. According to an article on DasErste.de, the NSA considers Linux Journal an “extremist forum”. This means that merely looking for any Linux content on Linux Journal, not just content about anonymizing software or encryption, is considered suspicious and means your Internet traffic may be stored indefinitely.
This is, sadly, entirely predictable: that’s what happens when you optimize the system for over-sampling, with poor oversight.
(tags: false-positives linuxjournal linux terrorism tor tails nsa surveillance snooping xkeyscore selectors oversight)
-
a C++ library adding some modern language features like Option, Try, Stopwatch, and other Guava-ish things (via @cscotta)
Tor exit node operator prosecuted in Austria
‘The operator of an exit node is guilty of complicity, because he enabled others to transmit content of an illegal nature through the service.’ Via Tony Finch.
(tags: austria tor security law liability internet tunnelling eu via:fanf)
IRS says free software projects can’t be nonprofits – Boing Boing
In a disturbing precedent, the Yorba Foundation, which makes apps for [GNOME], has had its nonprofit status application rejected by the IRS because some of [its] projects may benefit for-profit entities.
(tags: law us gnome yorba-foundation linux gpl free-software oss nonprofits 501c3 tax)
How to perform a load/latency test, correcting for coordinated-omission error
Pseudocode from Gil Tene
(tags: gil-tene coordinated-omission measurement jmh latency testing errors code)
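This is not the linked pseudocode, but one well-known after-the-fact correction is easy to sketch: when a recorded latency is longer than the interval at which requests were supposed to be sent, back-fill the samples that a stalled load generator never got to issue. It is the same idea as HdrHistogram's expected-interval recording; the function name and the millisecond figures are mine.

```python
def correct_for_coordinated_omission(latencies, expected_interval):
    """Back-fill samples that a blocked load generator failed to record.

    Each latency longer than expected_interval implies requests that should
    have been sent while the generator was stuck waiting; they would have
    seen latencies of (latency - interval), (latency - 2 * interval), and so
    on, down to one interval.
    """
    corrected = []
    for latency in latencies:
        corrected.append(latency)
        missing = latency - expected_interval
        while missing >= expected_interval:
            corrected.append(missing)
            missing -= expected_interval
    return corrected


# One request every 10 ms; a single 100 ms stall hides nine slow samples.
raw = [1, 1, 100, 1]
fixed = correct_for_coordinated_omission(raw, expected_interval=10)
print(len(raw), len(fixed))
print(sorted(fixed))
```

The cleaner fix is to avoid the omission in the first place by timing each request from when it was scheduled to start rather than from when it was actually sent.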
Questioning the Lambda Architecture
Jay Kreps (Kafka, Samza) with a thought-provoking post on the batch/stream-processing dichotomy
(tags: jay-kreps toread architecture data stream-processing batch hadoop storm lambda-architecture)
-
Urban Airship with a new open-source Graphite front-end UI; similar enough to Grafana at a glance, no releases yet, ASL2-licensed
(tags: graphite metrics ui front-ends open-source ops)