The Magnetic Air Bonsai Creates Surreal Levitating Plants
this is amazing. $200 on Kickstarter!
(tags: kickstarter bonsai plants gardening levitation air-bonsai cool)
AWS Certificate Manager – Deploy SSL/TLS-Based Apps on AWS
Very nifty -- autodeploys free wildcard certs to ELBs and Cloudfront. HN discussion thread is pretty good: https://news.ycombinator.com/item?id=10947186
(tags: ssl tls certificates ops aws cloudfront elb)
AWS re:Invent 2015 | (NET403) Another Day, Another Billion Packets - YouTube
Eric Brandwine details the internal workings of Amazon VPC
(tags: eric-brandwine vpc aws amazon networking security vlans sdn)
Unikernels are unfit for production - Blog - Joyent
Bryan Cantrill gives unikernels a 10-point dismissal. This is great
(tags: unikernels flavour-of-the-month devops joyent bryan-cantrill docker containers ops)
"So you have a mess on your hands" [png]
Excellent flowchart of how to fix common git screwups (via ITC slack)
(tags: git reference flowchart troubleshooting help coding via:itc)
Journalists, this GSOC story isn’t all about you, you know
Karlin Lillington in the Irish Times, going through journos for a shortcut:
All the hand-wringing from journalists, unions and media companies – even politicians and ministers – over the GSOC’s accessing of journalist’s call records? Oh, please. What wilful ignorance, mixed with blatant hypocrisy. Where have you all been for the past decade and a half, as successive Irish governments and ministers for justice supported and then rammed through legislation for mandatory call data retention for one of the longest periods in the world, with some of the weakest legal constraints and oversight?
(tags: karlin-lillington privacy data-protection dri law journalists gsoc surveillance data-retention)
-
Good plug for emrfs for encryption
Gilt Groupe Is a Cautionary Tale for Startup Employees Banking on Stock Options | Re/code
Good explanation of why RSUs are becoming increasingly common
(tags: rsus startups shares share-options work)
Powering your Amazon ECS Clusters with Spot Fleet
sounds feasible
-
Ughhhh.
Amazon Echo sends your WiFi password to Amazon. No option to disable. Trust us it's in an "encrypted file"
(tags: amazon echo wifi passwords security data-privacy data-protection)
21 of the most Stoneybatter things that have ever happened
ah, <3 the 'batter
(tags: stoneybatter dublin hipsters funny)
-
This is absolutely appalling. IP law gone mad:
DNC Parks & Resorts at Yosemite, Inc (a division of one of the largest privately owned companies in the world) used to have the concessions to operate various businesses around Yosemite National Park. Now that they've been fired, they're using some decidedly dubious trademark to force the Park Service to change the names of buildings and locations that have stood for as much as a century, including some that have been designated national landmarks. The Parks Service has caved to these requests as it readies the park for its centennial celebration. It will not only change the names of publicly owned landmarks -- such as the Ahwahnee hotel, Yosemite Lodge, the Wawona Hotel, Curry Village, and Badger Pass ski area -- it will also have to change all its signs, maps and guidebooks.
(tags: yosemite ip trademarks law fiasco national-parks usa)
-
'THE DRAGNET: How a man accused of million-dollar fraud uncovered a never before seen, secret surveillance device'
(tags: stingrays crime fraud surveillance mobile police imsi-catchers)
Introducing: The World's First Fully Functional 3D Printed Watch: The Christoph Laimer Tourbillon
wow
(tags: watches 3d-printing clocks things via:bruces)
-
Online chart maker for CSV and Excel data; make charts and dashboards online. One really nice feature is that charts made this way get permalinks, and can be easily inlined as PNGs or HTML5 divs. (See https://www.vividcortex.com/blog/analyzing-sparks-mpp-scalability-with-the-usl for an example.)
(tags: data javascript python tools visualization dataviz charts graphing web plotly plots graphs)
CRISPR Patents Spark Fight to Control Genome Editing | MIT Technology Review
Patents ruin everything, CRISPR edition
(tags: crispr algorithms gene-editing genetics genomics genes patents)
-
Nchan is a scalable, flexible pub/sub server for the modern web, built as a module for the Nginx web server. It can be configured as a standalone server, or as a shim between your application and tens, thousands, or millions of live subscribers. It can buffer messages in memory, on-disk, or via Redis. All connections are handled asynchronously and distributed among any number of worker processes. It can also scale to many nginx server instances with Redis. Messages are published to channels with HTTP POST requests or websockets, and subscribed also through websockets, long-polling, EventSource (SSE), old-fashioned interval polling, and more. Each subscriber can listen to up to 255 channels per connection, and can be optionally authenticated via a custom application url. An events meta channel is also available for debugging.
Also now supports HTTP/2. This used to be called the Nginx HTTP Push Module, and I used it with great results in that form. This is the way to do HTTP push in all its forms....
(tags: nginx pubsub websockets sse http http-push http2 redis long-polling nchan)
David Bowie: Father Of The Sleng Teng Riddim
A great theory!
I don’t have contact information for Hiroko Okuda, but I am positive that the track she is referring to [as the source of the Casiotone MT-40 "rock" preset] is “Hang Onto Yourself” by David Bowie.
(tags: david-bowie sleng-teng-riddim riddims reggae casio presets samples history music trivia)
The Open Guide To Equity Compensation
A very US-oriented, but still useful, reference for all the aspects of stock options, RSUs, and other forms of equity compensation
(tags: equity startups money pay salary rsus stock-options stock)
About Microservices, Containers and their Underestimated Impact on Network Performance
shock horror, Docker-SDN layers have terrible performance. Still pretty lousy perf impacts from basic Docker containerization, presumably without "--net=host" (which is apparently vital)
(tags: docker performance network containers sdn ops networking microservices)
Netty @Apple: Large Scale Deployment/Connectivity
'Norman Maurer presents how Apple uses Netty for its Java based services and the challenges of doing so, including how they enhanced performance by participating in the Netty open source community. Maurer takes a deep dive into advanced topics like JNI, JVM internals, and others.'
(tags: apple netty norman-maurer java jvm async talks presentations)
-
excellent blueprint-style poster covering all the major cocktails
(tags: cocktails drinks engineering posters blueprints graphics pdf)
Introducing dumb-init, an init system for Docker containers
Yelp fixing one of the sillier shortcomings of Docker
(tags: docker tools yelp init containers signals unix linux dumb-init)
Can't sign in to Google calendar on my Samsung refrigerator
LOL, internet of broken things (via Dave Bolger)
(tags: internetofshit iot fail samsung google apis fridges connected future via:davebolger)
5 subtle ways you're using MySQL as a queue, and why it'll bite you
Excellent post from Percona. I particularly like that they don't just say "don't use MySQL" -- they give good advice on how it can be made work: 1) avoid polling; 2) avoid locking; and 3) avoid storing your queue in the same table as other data.
(tags: database mysql queueing queue messaging percona rds locking sql architecture)
BBC Digital Media Distribution: How we improved throughput by 4x
Replacing varnish with nginx. Nice deep-dive blog post covering kernel innards
The Importance of Tuning Your Thread Pools
Excellent blog post on thread pools, backpressure, Little's Law, and other Hystrix-related topics (PS: use Hystrix)
(tags: hystrix threadpools concurrency java jvm backpressure littles-law capacity)
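To make the Little's Law point concrete, here is a rough sizing sketch. It assumes steady-state traffic; the function name and the example numbers are illustrative and are not taken from the post.

```python
import math


def required_concurrency(arrival_rate_per_sec: float, mean_latency_sec: float) -> int:
    """Little's Law: L = lambda * W.

    L is the average number of requests in flight, lambda is the arrival
    rate and W is the mean time each request spends in the system.
    The result is a lower bound on pool size, not a tuned value.
    """
    if arrival_rate_per_sec < 0 or mean_latency_sec < 0:
        raise ValueError("rate and latency must be non-negative")
    return math.ceil(arrival_rate_per_sec * mean_latency_sec)


# Hypothetical service: 200 requests/sec, each taking 250 ms on average.
in_flight = required_concurrency(200, 0.25)
print(in_flight)
```

A pool smaller than this figure will queue or reject work at the stated load; how much headroom to add on top is a judgment call that depends on latency variance.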
-
good explanation of this new data structure for searching multidimensional data
(tags: search lucene bkd-trees searching data-structures)
The Guinness Brewer Who Revolutionized Statistics
William S. Gosset, discoverer of the Student's T-Test. Amazon should have taken note of this trick:
Upon completing his work on the t-distribution, Gosset was eager to make his work public. It was an important finding, and one he wanted to share with the wider world. The managers of Guinness were not so keen on this. They realized they had an advantage over the competition by using this method, and were not excited about relinquishing that leg up. If Gosset were to publish the paper, other breweries would be on to them. So they came to a compromise. Guinness agreed to allow Gosset to publish the finding, as long as he used a pseudonym. This way, competitors would not be able to realize that someone on Guinness’s payroll was doing such research, and figure out that the company’s scientifically enlightened approach was key to their success.
(tags: statistics william-gosset history guinness brewing t-test pseudonyms dublin)
How open-source software developers helped end the Ebola epidemic in Sierra Leone
Little known to the rest of the world, a team of open source software developers played a small but integral part in helping to stop the spread of Ebola in Sierra Leone, solving a payroll crisis that was hindering the fight against the disease. Emerson Tan from NetHope, a consortium of NGOs working in IT and development, told the tale at the Chaos Communications Congress in Hamburg, Germany. “These guys basically saved their country from complete collapse. I can’t overestimate how many lives they saved,” he said about his co-presenters, Salton Arthur Massally, Harold Valentine Mac-Saidu and Francis Banguara, who appeared over video link.
(tags: open-source software coding payroll sierra-leone ebola ccc)
-
A good review of RethinkDB! Hopefully not just because this test is contract work on behalf of the RethinkDB team ;)
I’ve run hundreds of tests against RethinkDB at majority/majority, at various timescales, request rates, concurrencies, and with different types of failures. Consistent with the documentation, I have never found a linearization failure with these settings. If you use hard durability, majority writes, and majority reads, single-document ops in RethinkDB appear safe.
(tags: rethinkdb databases stores storage ops availability cap jepsen tests replication)
-
Metrics integration for OkHttp. looks quite nice
How Completely Messed Up Practices Become Normal
on Normalization of Deviance, with a few anecdotes from Silicon Valley. “The gradual process through which unacceptable practice or standards become acceptable. As the deviant behavior is repeated without catastrophic results, it becomes the social norm for the organization.”
(tags: normalization-of-deviance deviance bugs culture ops reliability work workplaces processes norms)
A critical analysis of the Legacy Verified SSL/TLS proposal by CloudFlare & Facebook
The history of real-world CA-based PKI is pretty awful
Incredibly Rare Underwater Footage of a Stray Giant Squid Swimming Around Toyama Bay in Japan
wow, this is great footage
(tags: giant-squid squid cephalopods japan video youtube)
-
hooray, Docker registry here at last
How to inspect SSL/TLS traffic with Wireshark 2
turns out it's easy enough -- Mozilla standardised a debugging SSL session-key logging file format which Wireshark and Chrome support
ImperialViolet - Juniper: recording some Twitter conversations
Adam Langley on the Juniper VPN-snooping security hole:
... if it wasn't the NSA who did this, we have a case where a US government backdoor effort (Dual-EC) laid the groundwork for someone else to attack US interests. Certainly this attack would be a lot easier given the presence of a backdoor-friendly RNG already in place. And I've not even discussed the SSH backdoor. [...]
(tags: primes ecc security juniper holes exploits dual-ec-drbg vpn networking crypto prngs)
Excellent post from Matthew Green on the Juniper backdoor
For the past several years, it appears that Juniper NetScreen devices have incorporated a potentially backdoored random number generator, based on the NSA's Dual_EC_DRBG algorithm. At some point in 2012, the NetScreen code was further subverted by some unknown party, so that the very same backdoor could be used to eavesdrop on NetScreen connections. While this alteration was not authorized by Juniper, it's important to note that the attacker made no major code changes to the encryption mechanism -- they only changed parameters. This means that the systems were potentially vulnerable to other parties, even beforehand. Worse, the nature of this vulnerability is particularly insidious and generally messed up. [....] The end result was a period in which someone -- maybe a foreign government -- was able to decrypt Juniper traffic in the U.S. and around the world. And all because Juniper had already paved the road. One of the most serious concerns we raise during [anti-law-enforcement-backdoor] meetings is the possibility that encryption backdoors could be subverted. Specifically, that a back door intended for law enforcement could somehow become a backdoor for people who we don't trust to read our messages. Normally when we talk about this, we're concerned about failures in storage of things like escrow keys. What this Juniper vulnerability illustrates is that the danger is much broader and more serious than that. The problem with cryptographic backdoors is not that they're the only way that an attacker can break into our cryptographic systems. It's merely that they're one of the best. They take care of the hard work, the laying of plumbing and electrical wiring, so attackers can simply walk in and change the drapes.
(via Tony Finch)
(tags: via:fanf crypto backdoors politics juniper dual-ec-drbg netscreen vpn)
-
good thread of AWS' shortcomings -- so many services still don't handle VPC for instance
Big Brother is born. And we find out 15 years too late to stop him - The Register
During the passage of RIPA, and in many debates since 2000, Parliament was asked to consider and require data retention by telephone companies, claiming that the information was vital to fighting crime and terrorism. But Prime Minister Tony Blair and successive Home Secretaries David Blunkett and Jack Straw never revealed to Parliament that at the same time, the government was constantly siphoning up and storing all telephone call records at NTAC. As a result, MPs and peers spent months arguing about a pretence, and in ignorance of the cost and human rights implications of what successive governments were doing in secret.
(tags: ripa big-brother surveillance preston uk gchq mi5 law snooping)
How to host Hugo static website generator on AWS Lambda
seriously, AWS. editing JSON files in a browser text box is an awful, awful user experience
-
A German bank offering a worldwide(?) bank account, using your smartphone (with push notifications etc.) as the main UI
The mystery of the power bank phone taking over Ghana
tl;dr: it's being used as a cheap, portable power bank
(tags: africa ghana battery phones power recharging gadgets)
Gardai find 70 stolen bikes in one house being readied for export
The Limerick Leader quoted other unnamed gardai who said they believed those who had stolen the bikes were selling them to a third party for shipment abroad, most likely to another country in Europe. “It would seem that he has his own network on the Continent and has a lucrative market for the bikes he sends on,” said one of the sources quoted in the report. “Some of the racing bikes would fetch large sums of money on the Continent.” Trucks were seen arriving and departing the house in Castletroy where the find was made. And while it was unclear exactly how gardai were informed of the suspicious activity, when a team of officers went to search the property they found the bikes in the back garden.
(tags: bikes theft limerick crime bike-theft ireland castletroy)
-
“Statistical regression to the mean predicts that patients selected for abnormalcy will, on the average, tend to improve. We argue that most improvements attributed to the placebo effect are actually instances of statistical regression.”
(tags: medicine science statistics placebo evidence via:hn regression-to-the-mean)
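The regression effect in that quote is easy to reproduce with a toy simulation. This is only a sketch: the noise model (normally distributed true levels plus independent measurement noise), the 10% selection cutoff and all names are assumptions made for illustration, not anything from the paper.

```python
import random
import statistics


def simulate(n_patients: int = 10_000, seed: int = 42):
    """Measure everyone twice, with no treatment at all in between.

    Patients are selected for being "abnormal" (top 10%) on the first
    measurement; the function returns that group's mean score on the
    first and on the second measurement.
    """
    rng = random.Random(seed)
    true_level = [rng.gauss(100, 10) for _ in range(n_patients)]
    first = [t + rng.gauss(0, 10) for t in true_level]
    second = [t + rng.gauss(0, 10) for t in true_level]

    cutoff = sorted(first)[int(0.9 * n_patients)]
    selected = [i for i in range(n_patients) if first[i] >= cutoff]

    before = statistics.mean(first[i] for i in selected)
    after = statistics.mean(second[i] for i in selected)
    return before, after


before, after = simulate()
print(f"selected group, first measurement:  {before:.1f}")
print(f"selected group, second measurement: {after:.1f}")
```

The selected group scores lower the second time even though nothing was done to them, which is the "improvement" the authors argue gets credited to placebo.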
League of Legends win-rates vs latency analysed
It appears that more mechanically intensive champions are more affected by latency, while tankier champions or those with point-and-click abilities are less affected by latency.
(via Nelson)
(tags: games league-of-legends latency ping gaming internet via:nelson)
-
via Tony Finch. 'In this post I will demonstrate how to do reservoir sampling orders of magnitude faster than the traditional “naive” reservoir sampling algorithm, using a fast high-fidelity approximation to the reservoir sampling-gap distribution.'
(tags: statistics reservoir-sampling sampling algorithms poisson bernoulli performance)
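For reference, the traditional per-item reservoir sampling that the post treats as the slow baseline looks roughly like this. Note that this is the classic "Algorithm R", not the faster gap-distribution method the post describes; the function name and signature are my own.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return k items chosen uniformly at random from a stream of unknown length.

    Draws one random number per item after the first k, which is the
    cost the faster approaches try to avoid.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


print(reservoir_sample(range(1_000_000), 5))
```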
The Moral Failure of Computer Scientists - The Atlantic
Phillip Rogaway, a professor of CS at UC Davis, contends that computer scientists should stand up against the construction of surveillance states built using their work:
Waddell: In your paper, you compare the debate over nuclear science in the 1950s to the current debate over cryptography. Nuclear weapons are one of the most obvious threats to humanity today — do you think surveillance presents a similar type of danger? Rogaway: I do. It’s of a different nature, obviously. The threat is more indirect and more subtle. So with nuclear warfare, there was this visually compelling and frightening risk of going up in a mushroom cloud. And with the transition to a state of total surveillance, what we have is just the slow forfeiture of democracy.
(tags: ethics cryptography crypto surveillance politics phillip-rogaway morals speaking-out government)
-
This is basically terrifying. A catalog of race conditions and reliability horrors around the POSIX filesystem abstraction in Linux -- it's a wonder anything works. 'Where’s this documented? Oh, in some mailing list post 6-8 years ago (which makes it 12-14 years from today). The fs devs whose posts I’ve read are quite polite compared to LKML’s reputation, and they generously spend a lot of time responding to basic questions, but it’s hard for outsiders to troll [sic] through a decade and a half of mailing list postings to figure out which ones are still valid and which ones have been obsoleted! I don’t mean to pick on filesystem devs. In their OSDI 2014 talk, the authors of the paper we’re discussing noted that when they reported bugs they’d found, developers would often respond “POSIX doesn’t let filesystems do that”, without being able to point to any specific POSIX documentation to support their statement. If you’ve followed Kyle Kingsbury’s Jepsen work, this may sound familiar, except devs respond with “filesystems don’t do that” instead of “networks don’t do that”. I think this is understandable, given how much misinformation is out there. Not being a filesystem dev myself, I’d be a bit surprised if I don’t have at least one bug in this post.'
(tags: filesystems linux unix files operating-systems posix fsync osdi papers reliability)
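For context, this is the commonly cited "write a temp file, fsync, rename, fsync the directory" pattern for replacing a file safely. It is a sketch under assumptions (Linux or another POSIX system, temp file on the same filesystem as the target), the helper name is mine, and given the tone of the post above it should not be read as a guarantee on every filesystem.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the contents of `path` so readers see the old or the new data."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push file contents to stable storage
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory so the rename itself is made durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```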
[LUCENE-6917] Deprecate and rename NumericField/RangeQuery to LegacyNumeric - ASF JIRA
Interesting performance-related tweak going into Lucene -- based on the Bkd-Tree I think: https://users.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf . Being used for all numeric index types, not just multidimensional ones?
(tags: lucene performance algorithms patches bkd-trees geodata numeric indexing)
Kevin Lyda's mega pension post
Cutting and pasting from Facebook for posterity... there are some really solid tips in here.
'Some people plan their lives out and then there are people like me who randomly do things and suddenly, in retrospect, it looks like a grand plan has come together. In reality it's more like my subconscious pulls in useful info and pokes me to go learn things as required. If you live/work in Ireland, the following "grand plan" might be useful. This year has apparently been "figure out how to retire" year. It started late last year with finally organising all my private Irish pensions (2 from employers, 1 personal). In the process I learned the following:
* Many Irish pension plans allow you to start drawing down from them at age 50. There are downsides to this, but if you have several of them it allows you more room to avoid stock market downturns when you purchase annuities.
* You can get 25% of each pension as a tax-free lump sum.
I also learned a few property things. The key thing is that if you have a buy-to-let property you should *not* pay off its mortgage early. You can deduct 75% of the interest you pay against the taxes you'd owe for rental income. That means the interest you pay will essentially be close to or even under the rate of inflation. A residential mortgage might have a lower interest rate nominally, but the effective interest rate is higher.
The Irish state pension is changing. If you are 68 after 2020 the rules have changed - and they're now much simpler. Work for 10 years and you get the minimum state pension (1/3 of a full pension). Work for 20, you get 2/3 of a state pension. Work for 30, you get a full pension. But you can't collect it till you're 68 and remember that Irish employers can apparently force you to "retire" at 65 (ageism is legal). So you need to bridge those 3 years (or hope they change the law to stop employers from doing that).
When I "retired" I kept a part time job for a number of reasons, but one was because I suspected I needed more PRSI credits for a pension. And it turns out this was correct. Part-time work counts as long as you make more than €38/week. And self-employment counts as long as you make more than €5,000/year. You can also make voluntary PRSI contributions (around €500/year but very situation dependent). If you've worked in Europe or the US or Canada or a few other countries, you can get credits for social welfare payments in those countries. But if you have enough here and you have enough for some pension in the other country, you can draw a pension from both.
Lastly most people I've talked to about retirement this year have used the analogy of legs on a stool. Every source of post-retirement income is a leg on the stool - the more legs, the more secure your retirement. There are lots of options for legs:
* Rental income. This is a little wobbly as legs go at least for me. But if you have more than one rental property - and better yet some commercial rental property - this leg firms up a bit. Still, it's a bit more work than most.
* Savings. This isn't very tax-efficient, but it can help fill in blank spots some legs have (like rental income or age restrictions) or maximise another legs value (weathering downturns for stock-based legs). And in retirement you can even build savings up. Sell a house, the private pension lump sum, etc. But remember you're retired, go have fun. Savings won't do you much good when you're dead.
* Stocks. I've cashed all mine in, but some friends have been more restrained in cashing in stocks they might have gotten from employers. This is a volatile leg, but it can pay off rather well if you know what you're doing. But be honest with yourself. I know I absolutely don't know what I'm doing on this so stayed away.
* Government pension. This is generally a reliable source of income in retirement. It's usually not a lot, but it does tend to last from retirement to death and it shows up every month. You apply once and then it just shows up each month. If you've worked in multiple countries, you can hedge some bets by taking a pension in each country you qualify from. You did pay into them after all.
* Private pension. This can also give you a solid source of income but you need to pay into it. And paying in during your 20s and 30s really pays off later. But you need to make your investments less risky as you get into your late 50s - so make sure to start looking at them then. And you need to provide yourself some flexibility for starting to draw it down in order to survive market drops. The crash in 2007 didn't fully recover until 2012 - that's 5 years.
* Your home. Pay off your mortgage and your home can be a leg. Not having to pay rent/mortgage is a large expense removed and makes the other legs more effective. You can also "sell down" or look into things like reverse mortgages, but the former can take time and has costs while the latter usually seems to have a lot of fine print you should read up on.
* Part-time work. I know a number of people who took part-time jobs when they retired. If you can find something that doesn't take a huge amount of time that you'd enjoy doing and that people will pay you for, fantastic! Do that. And it gets you out of the house and keeping active.
For friends who are geeks and in my age cohort, I note that it will be 2037 around the time we hit 65. If you know why that matters, ka-ching!'
Another particularly useful page about the state pension: "Six things every woman needs to know about the State pension", Irish Times, Dec 1 2015, https://www.irishtimes.com/business/personal-finance/six-things-every-woman-needs-to-know-about-the-state-pension-1.2448981 , which links to this page to get your state pension contribution record: http://www.welfare.ie/en/pages/secure/RequestSIContributionRecord.aspx
(tags: pensions money life via:klyda stocks savings shares property ireland old-age retirement)
-
As Glyn Moody noted, if UK police, intelligence agencies, HMRC and others can all legally hack phones and computers, that also means that digital evidence can be easily and invisibly planted. This will undermine future court cases in the UK, which seems like a significant own goal...
(tags: hmrc police gchq uk hacking security law-enforcement evidence law)
Why We Chose Kubernetes Over ECS
3 months ago when we, at nanit.com, came to evaluate which Docker orchestration framework to use, we gave ECS the first priority. We were already familiar with AWS services, and since we already had our whole infrastructure there, it was the default choice. After testing the service for a while we had the feeling it was not mature enough and missing some key features we needed (more on that later), so we went to test another orchestration framework: Kubernetes. We were glad to discover that Kubernetes is far more comprehensive and had almost all the features we required. For us, Kubernetes won ECS on ECS’s home court, which is AWS.
(tags: kubernetes ecs docker containers aws ec2 ops)
Beachbum Berry — Latitude 29 Formula Orgeat
The legendary Jeff "Beachbum" Berry, tiki-cocktail wizard, has partnered with a Brooklyn-based orgeat maker to provide the key ingredient for an original Trader-Vic-style Mai Tai. may be a bit tricky to ship to Ireland though!
How to Spot Bitcoin Inventor Satoshi Nakamoto | MIT Technology Review
Emin Gün Sirer pours cold water on the "Craig Wright is Satoshi Nakamoto" theory
(tags: satoshi-nakamoto bitcoin anonymous nom-de-guerre crypto)
Dr TJ McIntyre: Fight against cybercrime needs funding, not more words - Independent.ie
Is the Irish policing system capable of tackling computer crime? A report this week from the Garda Inspectorate makes it clear that the answer is no. There is no Garda cybercrime unit, which is of serious concern given the threat posed by cybercrime to key national infrastructure such as energy, transport and telecommunications systems. [...] A combination of inadequate resources and increased workload have swamped the [Computer Crime Investigation Unit]. Today, almost every crime is a computer crime, in the sense that mobile phones, laptops and even devices such as game consoles are likely to contain evidence. The need to forensically inspect all these devices - using outdated equipment - has resulted in several-year delays and seem to have forced the unit into a position where it is running to stand still rather than responding to new developments.
(tags: via:tjmcintyre ireland cybercrime law policing hacking)
-
I keep having to google this, so here's a good one which works -- unlike Wolfram Alpha!
(tags: birthday birthday-paradox birthday-problem hashes hash-collision attacks security collisions calculators probability statistcs)
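Since I keep having to look it up, the underlying arithmetic is short enough to keep around. A sketch, assuming items are drawn uniformly and independently from a space of a given size; the function names are mine and have nothing to do with the linked calculator.

```python
import math


def collision_probability(n_items: int, space_size: int) -> float:
    """Exact probability that at least two of n_items collide."""
    if n_items > space_size:
        return 1.0
    # P(no collision) = prod_{i=0}^{n-1} (1 - i/d), summed in log space
    # so that large inputs do not underflow.
    log_no_collision = sum(math.log1p(-i / space_size) for i in range(n_items))
    return -math.expm1(log_no_collision)


def collision_probability_approx(n_items: int, space_size: int) -> float:
    """Usual approximation 1 - exp(-n(n-1) / (2d)); fine when n is much smaller than d."""
    return -math.expm1(-n_items * (n_items - 1) / (2 * space_size))


# Classic birthday case: 23 people, 365 days.
print(round(collision_probability(23, 365), 4))
# Hash case: a billion random 64-bit values.
print(collision_probability_approx(10**9, 2**64))
```

The exact version loops over every item, so for hash-sized inputs the approximation is the practical one to use.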
-
'At least for Europe it is obvious: All roads lead to Rome! You can reach the eternal city on almost 500.000 routes from all across the continent. Which road would you take? To approach one of the biggest unsolved quests of mobility, the first question we asked ourselves was: Where do you start, when you want to know every road to Rome? We aligned starting points in a 26.503.452 km² grid covering all of Europe. Every cell of this grid contains the starting point to one of our journeys to Rome. Now that we have our 486.713 starting points we need to find out how we could reach Rome as our destination. For this we created a algorithm that calculates one route for every trip. The more often a single street segment is used, the stronger it is drawn on the map. The maps as outcome of this project is somewhere between information visualization and data art, unveiling mobility and a very large scale.' Beautiful! Decent-sized prints available for 26 euros too.
Tools for debugging, testing and using HTTP/2
excellent, extensive list from Cloudflare
(tags: http http2 cloudflare tools cli ops testing debugging spdy)
AWS Api Gateway for Fun and Profit
good worked-through example of an API Gateway rewriting system
(tags: api-gateway aws api http services ops alerting alarming opsgenie signalfx)
EU counter-terror bill is 'indiscriminate' data sweep
"To identify if someone is travelling outside the EU, we don't need an EU PNR. This data are already easily available in the airline reservation system,” [Giovanni Buttarelli, the European data protection supervisor] said. EU governments want more information in the belief it will help law enforcement in tracking down terrorists and are demanding access to information, such as travel dates, travel itinerary, ticket information, contact details, baggage information, and payment information of anyone flying in or out of the EU. ... EU PNR data would be retained for up to five years
(tags: pnr eu law privacy data-protection europe counter-terrorism travel air-travel)
Fast Forward Labs: Fashion Goes Deep: Data Science at Lyst
this is more than just data science really -- this is proper machine learning, with deep learning and a convolutional neural network. serious business
(tags: lyst machine-learning data-science ml neural-networks supervised-learning unsupervised-learning deep-learning)
Why Percentiles Don’t Work the Way you Think
Baron Schwartz on metrics, percentiles, and aggregation. +1, although as an HN commenter noted, quantile digests are probably the better fix
(tags: performance percentiles quantiles statistics metrics monitoring baron-schwartz vividcortex)
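A minimal sketch of one aggregation pitfall in this area: averaging per-host percentiles does not give the percentile of the combined traffic. The nearest-rank definition, the host names and the latency figures are all made up for illustration.

```python
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile; p is an integer from 1 to 100."""
    ordered = sorted(values)
    # ceil(p * n / 100) done in integer arithmetic to avoid float rounding.
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical latencies in ms: a busy, fast host and a nearly idle, slow one.
busy_host = [10] * 1000
idle_host = [500] * 10

average_of_p99s = mean([percentile(busy_host, 99), percentile(idle_host, 99)])
true_p99 = percentile(busy_host + idle_host, 99)

print(average_of_p99s)  # average of the two per-host p99 values
print(true_p99)         # p99 over all requests together
```

The averaged figure weights both hosts equally even though one served a hundred times more requests, so it lands nowhere near the p99 that users actually experienced.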
-
Spotify wrote their own metrics store on ElasticSearch and Cassandra. Sounds very similar to Prometheus
(tags: cassandra elasticsearch spotify monitoring metrics heroic)
ELS: latency based load balancer, part 1
ELS measures the following things: Success latency and success rate of each machine; Number of outstanding requests between the load balancer and each machine. These are the requests that have been sent out but we haven’t yet received a reply; Fast failures are better than slow failures, so we also measure failure latency for each machine. Since users care a lot about latency, we prefer machines that are expected to answer quicker. ELS therefore converts all the measured metrics into expected latency from the client’s perspective.[...] In short, the formula ensures that slower machines get less traffic and failing machines get much less traffic. Slower and failing machines still get some traffic, because we need to be able to detect when they come back up again.
(tags: latency spotify proxies load-balancing els algorithms c3 round-robin load-balancers routing)
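The excerpt elides the actual formula, so the following is only an illustrative sketch of the idea, not Spotify's ELS. It assumes failed requests are retried until one succeeds and that outstanding requests delay a new one proportionally; `expected_latency`, `pick_machine`, and the sample stats are hypothetical.

```python
import random

def expected_latency(success_latency, failure_latency, success_rate, outstanding):
    """Illustrative estimate only; not the formula from the ELS post.

    With success rate s, on average (1 - s) / s failures precede each
    success, and every outstanding request is assumed to add one more wait.
    """
    s = max(success_rate, 0.01)  # floor so a failing machine keeps a tiny weight
    per_request = success_latency + (1 - s) / s * failure_latency
    return per_request * (outstanding + 1)

def pick_machine(machines, rng=random):
    """Weighted random choice; weight is the inverse of expected latency."""
    weights = [1.0 / expected_latency(**m["stats"]) for m in machines]
    return rng.choices(machines, weights=weights, k=1)[0]

# Hypothetical measurements, latencies in seconds.
machines = [
    {"name": "fast", "stats": dict(success_latency=0.010, failure_latency=0.005,
                                   success_rate=0.99, outstanding=2)},
    {"name": "slow", "stats": dict(success_latency=0.100, failure_latency=0.005,
                                   success_rate=0.99, outstanding=2)},
    {"name": "failing", "stats": dict(success_latency=0.010, failure_latency=1.0,
                                      success_rate=0.10, outstanding=2)},
]

chosen = pick_machine(machines, random.Random(1))
```

Because every weight stays above zero, slow and failing machines still see an occasional request, which is how their recovery would be noticed.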
Low-latency journalling file write latency on Linux
great research from LMAX: xfs/ext4 are the best choices, and they explain why in detail, referring to the code
(tags: linux xfs ext3 ext4 filesystems lmax performance latency journalling ops)
-
nice 3D printed maps from this Irish company
-
"Irish police have no cybercrime unit, and 1/3 of police have no email." ffs!
(tags: cybercrime policing ireland gardai fraud privacy phishing hacking internet law)
A Gulp Workflow for Amazon Lambda
'any nontrivial development of Lambda functions will require a simple, automated build/deploy process that also fills a couple of Lambda’s gaps such as the use of node modules and environment variables.' See also https://medium.com/@AdamRNeary/developing-and-testing-amazon-lambda-functions-e590fac85df4#.mz0a4qk3j : 'I am psyched about Amazon’s new Lambda service for asynchronous task processing, but the ideal development and testing cycle is really left to the engineer. While Amazon provides a web-based console, I prefer an approach that uses Mocha. Below you will find the gritty details using Kinesis events as a sample input.'
(tags: lambda aws services testing deployment ops mocha gulp javascript)
"Hidden Technical Debt in Machine-Learning Systems" [pdf]
Another great paper from Google, talking about the tradeoffs that must be considered in practice over the long term when running a complex ML system in production.
(tags: technical-debt ml machine-learning ops software production papers pdf google)
Introducing Netty-HTTP from Cask
netty-http library solves [Netty usability issues] by using JAX-RS annotations to build a HTTP path routing layer on top of netty. In addition, the library implements a guava service to manage the HTTP service. netty-http allows users of the library to just focus on writing the business logic in HTTP handlers without having to worry about the complexities of path routing or learning netty pipeline internals to build the HTTP service.
We've written something very similar, although I didn't even bother supporting JAX-RS annotations -- just a simple code-level DSL.
The Locals Xmas Gift Guide 2015
some nice local gift suggestions from small businesses around Dublin. I'd love to get some of these, but I guess I'll have to settle for giving them instead ;)
(tags: gifts dublin ireland shopping xmas christmas the-locals)
Topics in High-Performance Messaging
'We have worked together in the field of high-performance messaging for many years, and in that time, have seen some messaging systems that worked well and some that didn't. Successful deployment of a messaging system requires background information that is not easily available; most of what we know, we had to learn in the school of hard knocks. To save others a knock or two, we have collected here the essential background information and commentary on some of the issues involved in successful deployments. This information is organized as a series of topics around which there seems to be confusion or uncertainty. Please contact us if you have questions or comments.'
(tags: messaging scalability scaling performance udp tcp protocols multicast latency)
Intercom Engineering Insights - Scale and Reliability 2015
next Intercom hiring^Wevent coming up, Dec 10th in Dublin, talking about how they scale and ops their ElasticSearch and Mongo clusters
(tags: elasticsearch mongodb intercom engineering talks dublin)
Control theory meets machine learning
'DB: Is there a difference between how control theorists and machine learning researchers think about robustness and error? BR: In machine learning, we almost always model our errors as being random rather than worst-case. In some sense, random errors are actually much more benign than worst-case errors. [...] In machine learning, by assuming average-case performance, rather than worst-case, we can design predictive algorithms by averaging out the errors over large data sets. We want to be robust to fluctuations in the data, but only on average. This is much less restrictive than the worst-case restrictions in controls. DB: So control theory is model-based and concerned with worst case. Machine learning is data based and concerned with average case. Is there a middle ground? BR: I think there is! And I think there's an exciting opportunity here to understand how to combine robust control and reinforcement learning. Being able to build systems from data alone simplifies the engineering process, and has had several recent promising results. Guaranteeing that these systems won't behave catastrophically will enable us to actually deploy machine learning systems in a variety of applications with major impacts on our lives. It might enable safe autonomous vehicles that can navigate complex terrains. Or could assist us in diagnostics and treatments in health care. There are a lot of exciting possibilities, and that's why I'm excited about how to find a bridge between these two viewpoints.'
(tags: control-theory interviews machine-learning ml worst-case self-driving-cars cs)
-
This is my bet: the age of dynamic languages is over. There will be no new successful ones. Indeed we have learned a lot from them. We’ve learned that library code should be extendable by the programmer (mixins and meta-programming), that we want to control the structure (macros), that we disdain verbosity. And above all, we’ve learned that we want our languages to be enjoyable. But it’s time to move on. We will see a flourishing of languages that feel like you’re writing in a Clojure, but typed. Included will be a suite of powerful tools that we’ve never seen before, tools so convincing that only ascetics will ignore.
(tags: programming scala clojure coding types strong-types dynamic-languages languages)
-
'IRC without netsplits' using Raft consensus
(tags: raft irc netsplits resilience fault-tolerance)
Inside China's Memefacturing Factories, Where The Hottest New Gadgets Are Made - BuzzFeed News
On a humid afternoon, Zhou went shopping for some of those very parts at a Bao An market. As he pulled his maroon minivan into a crowded parking lot, the full scale of Depu Electronics came into view: a three-story concrete behemoth roughly bigger than a Costco and roughly smaller than the Pentagon. Inside, it looked like the world’s largest Radio Shack going out of business sale: an endless series of booths with cables and circuit boards and plugs and ports and buttons and machines piled so high on tables that the faces of the clerks who were selling them were hidden from view. Each booth seemed to argue: We have exactly what you want and we have enough of it for all of your customers. Short of motorized wheels and molding, the market offered nearly everything an ambitious factory owner would need to build a hoverboard, just waiting to be bought, assembled, and shipped.
(tags: hoverboards memes china manufacturing future gadgets tat bao-an electronics)
One of the Largest Hacks Yet Exposes Data on Hundreds of Thousands of Kids | Motherboard
VTech got hacked, and millions of parents and 200,000 kids had their privacy breached as a result. Bottom line is summed up by this quote from one affected parent:
“Why do you need know my address, why do you need to know all this information just so I can download a couple of free books for my kid on this silly pad thing? Why did they have all this information?”
Quite. Better off simply not to have the data in the first place!
(tags: vtech privacy data-protection data hacks)
Senior Anglo bondholders revealed in department note
In case you were wondering who Ireland's economy was wiped out for:
Among the major holders were a Dutch pension fund, ABP; another Dutch fund, PGGM; LGPI in Finland, which manages local government pensions; and a Swiss public entities pension. A number of major asset managers were also named, including JP Morgan in London; DeKA and ADIG, two German investment managers; and Robeco from the Netherlands. Big insurance companies, including Munich Re, Llmarinen from Finland and German giant Axa were also named, along with big banks such as BNP, SocGen, ING and Deutsche.
(tags: bondholders anglo economy ireland politics eu senior-bondholders)
-
a bunch of metrics for Dublin xmas-shopping capacity
re:Work - The five keys to a successful Google team
We learned that there are five key dynamics that set successful teams apart from other teams at Google: Psychological safety: Can we take risks on this team without feeling insecure or embarrassed? Dependability: Can we count on each other to do high quality work on time? Structure & clarity: Are goals, roles, and execution plans on our team clear? Meaning of work: Are we working on something that is personally important for each of us? Impact of work: Do we fundamentally believe that the work we’re doing matters?
(tags: teams google culture work management productivity hr)
-
75%. This is really quite tricky!
Art Meets Cartography: The 15,000-Year History of a River in Oregon Rendered in Data
this is really beautiful. Available as a printable, 17" x 38" PDF from http://www.oregongeology.org/pubs/ll/p-poster-willamette.htm
(tags: art data mapping geodata oregon rivers willamette-river history lidar)
Accretion Disc Series - Clint Fulkerson
available as prints -- vector art with a hint of the bacterial
(tags: algorithms art graphics vector bacteria petri-dish clint-fulkerson)
John Nagle on delayed ACKs and his algorithm
love it when things like this show up
(tags: networking performance scalability nagle tcp ip)
-
She thought they were a normal couple until she found a passport in a glovebox – and then her world shattered. Now she is finally getting compensation and a police apology for that surreal, state-sponsored deception. But she still lies awake and wonders: did he ever really love me?
I can't believe this was going on in the 2000s!
(tags: surveillance police uk undercover scandals policing environmentalism greens)
Just use /dev/urandom to generate random numbers
Using SHA-1 [to generate random numbers] in this way, with a random seed and a counter, is just building a (perfectly sound) CSPRNG with, I believe, an 80-bit security level. If you trust the source of the random seed, e.g. /dev/urandom, you may as well just use /dev/urandom itself. If you don't, you're already in trouble. And if you somehow need a userspace PRNG, the usual advice about not rolling your own crypto unless you know what you're doing applies. (Especially for database IDs, the risk of collisions should be considered a security problem, ergo this should be considered crypto, until proven otherwise.) In this case, using BLAKE2 instead of SHA-1 would get you a higher security level and faster hashing. Or, in tptacek's words: http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/
(tags: random randomness urandom uuids tptacek hackernews prng)
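In Python, the advice above comes down to a couple of standard-library calls. A minimal sketch; the function names and the 16-byte default are arbitrary choices for illustration.

```python
import os
import secrets

def random_id_hex(num_bytes=16):
    """Random identifier as hex, read from the operating system's CSPRNG."""
    return os.urandom(num_bytes).hex()

def random_token(num_bytes=16):
    """Same idea via the secrets module, which also draws on the OS source."""
    return secrets.token_hex(num_bytes)

print(random_id_hex())
print(random_token())
```

Neither function seeds or maintains a userspace generator, so there is no home-made construction to get wrong.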
Authenticated app packages on Sandstorm with PGP and Keybase
Nice approach to package authentication UX using Keybase/PGP.
When you go to install a package, Sandstorm verifies that the package is correctly signed by the Ed25519 key. It looks for a PGP signature in the metadata, and verifies that the PGP-signed assertion is for the correct app ID and the email address specified in the metadata. It queries the Keybase API to see what accounts the packager has proven ownership of, and lists them with their links on the app install page.
(tags: authentication auth packages sandstorm keybase pgp gpg security)
-
Floating car data (FCD), also known as floating cellular data, is a method to determine the traffic speed on the road network. It is based on the collection of localization data, speed, direction of travel and time information from mobile phones in vehicles that are being driven. These data are the essential source for traffic information and for most intelligent transportation systems (ITS). This means that every vehicle with an active mobile phone acts as a sensor for the road network. Based on these data, traffic congestion can be identified, travel times can be calculated, and traffic reports can be rapidly generated. In contrast to traffic cameras, number plate recognition systems, and induction loops embedded in the roadway, no additional hardware on the road network is necessary.
(tags: surveillance cars driving mobile-phones phones travel gsm monitoring anpr alpr traffic)
CiteSeerX — The Confounding Effect of Class Size on the Validity of Object-oriented Metrics
A lovely cite from @conor. Turns out the sheer size of an OO class is itself a solid fault-proneness metric
(tags: metrics coding static-analysis error-detection faults via:conor oo)
How a group of neighbors created their own Internet service | Ars Technica
Orcas Island, WA. impressive stuff
(tags: community diy internet wa wireless networking orcas-island)
Report: Everyone Should Get a Security Freeze
“Whether your personal information has been stolen or not, your best protection against someone opening new credit accounts in your name is the security freeze (also known as the credit freeze), not the often-offered, under-achieving credit monitoring. Paid credit monitoring services in particular are not necessary because federal law requires each of the three major credit bureaus to provide a free credit report every year to all customers who request one. You can use those free reports as a form of do-it-yourself credit monitoring.”
(tags: us credit credit-freeze security phishing brian-krebs)
Even the LastPass Will be Stolen, Deal with It
ugh, quite a long list of LastPass security issues
(tags: lastpass hacking security via:securitay exploits passwords)
Signs Point to Unencrypted Communications Between Terror Suspects
News emerging from Paris — as well as evidence from a Belgian ISIS raid in January — suggests that the ISIS terror networks involved were communicating in the clear, and that the data on their smartphones was not encrypted.
(tags: paris terrorism crypto via:schneier isis smartphones)
Global Continuous Delivery with Spinnaker
Netflix' CD platform, post-Atlas. looks interesting
(tags: continuous-delivery aws netflix cd devops ops atlas spinnaker)
-
by reordering items to optimize locality. Via aphyr's dad!
(tags: caches cache-friendly optimization data-locality performance coding algorithms)
Reporting Error Leads To Speculation That Terrorists Used PS4s To Plan Paris Attacks
lol. Nice work, Forbes
(tags: forbes fail ps4 crypto terrorism reporting msm speculation hysteria)
Did you know that Dublin Airport is recording your phone's data? - Newstalk
Ugh. Queue tracking using secret MAC address tracking in Dublin Airport:
"I think the fundamental issue is one of consent. Dublin Airport have been tracking individual MAC addresses since 2012 and there doesn't appear to be anywhere in the airport where they warn passengers that this is this occurring. "If they have to signpost CCTV, then mobile phone tracking should at a very minimum be sign-posted for passengers," he continues.
And how long are MAC addresses retained for, I wonder?
(tags: mac-addresses dublin-airport travel privacy surveillance tracking wifi phones cctv consent)
Pinboard on the Next Economy Conference (with tweets)
Maciej Ceglowski went to an O'Reilly SV-boosterish conference and produced these excellent tweets
(tags: twitter conferences oreilly silicon-valley new-economy future lyft uber unions maciej-ceglowski)
Our Generation Ships Will Sink / Boing Boing
Kim Stanley Robinson on the feasibility of interstellar colonization: 'There is no Planet B! Earth is our only possible home!'
(tags: earth future kim-stanley-robinson sf space)
The impact of Docker containers on the performance of genomic pipelines [PeerJ]
In this paper, we have assessed the impact of Docker containers technology on the performance of genomic pipelines, showing that container “virtualization” has a negligible overhead on pipeline performance when it is composed of medium/long running tasks, which is the most common scenario in computational genomic pipelines. Interestingly for these tasks the observed standard deviation is smaller when running with Docker. This suggests that the execution with containers is more “homogeneous,” presumably due to the isolation provided by the container environment. The performance degradation is more significant for pipelines where most of the tasks have a fine or very fine granularity (a few seconds or milliseconds). In this case, the container instantiation time, though small, cannot be ignored and produces a perceptible loss of performance.
(tags: performance docker ops genomics papers)
Three quarters of cars stolen in France 'electronically hacked' - Telegraph
The astonishing figures come two months after computer scientists in the UK warned that thousands of cars – including high-end brands such as Porsches and Maseratis - are at risk of electronic hacking. Their research was suppressed for two years by a court injunction for fear it would help thieves steal vehicles to order. The kit required to carry out such “mouse jacking”, as the French have coined the practice, can be freely purchased on the internet for around £700 and the theft of a range of models can be pulled off “within minutes,” motor experts warn.
(tags: hacking security security-through-obscurity mouse-jacking cars safety theft crime france smart-cars)
-
Awesome new mock DynamoDB implementation:
An implementation of Amazon's DynamoDB, focussed on correctness and performance, and built on LevelDB (well, @rvagg's awesome LevelUP to be precise). This project aims to match the live DynamoDB instances as closely as possible (and is tested against them in various regions), including all limits and error messages. Why not Amazon's DynamoDB Local? Because it's too buggy! And it differs too much from the live instances in a number of key areas.
We use DynamoDBLocal in our tests -- the availability of that tool is one of the key reasons we have adopted Dynamo so heavily, since we can safely test our code properly with it. This looks even better.
(tags: dynamodb testing unit-tests integration-testing tests ops dynalite aws leveldb)
Alarm design: From nuclear power to WebOps
Imagine you are an operator in a nuclear power control room. An accident has started to unfold. During the first few minutes, more than 100 alarms go off, and there is no system for suppressing the unimportant signals so that you can concentrate on the significant alarms. Information is not presented clearly; for example, although the pressure and temperature within the reactor coolant system are shown, there is no direct indication that the combination of pressure and temperature mean that the cooling water is turning into steam. There are over 50 alarms lit in the control room, and the computer printer registering alarms is running more than 2 hours behind the events. This was the basic scenario facing the control room operators during the Three Mile Island (TMI) partial nuclear meltdown in 1979. The Report of the President’s Commission stated that, “Overall, little attention had been paid to the interaction between human beings and machines under the rapidly changing and confusing circumstances of an accident” (p. 11). The TMI control room operator on the day, Craig Faust, recalled for the Commission his reaction to the incessant alarms: “I would have liked to have thrown away the alarm panel. It wasn’t giving us any useful information”. It was the first major illustration of the alarm problem, and the accident triggered a flurry of human factors/ergonomics (HF/E) activity.
A familiar topic for this ex-member of the Amazon network monitoring team...
(tags: ergonomics human-factors ui ux alarms alerts alerting three-mile-island nuclear-power safety outages ops)
An Analysis of Reshipping Mule Scams
We observed that the vast majority of the re-shipped packages end up in the Moscow, Russia area, and that the goods purchased with stolen credit cards span multiple categories, from expensive electronics such as Apple products, to designer clothes, to DSLR cameras and even weapon accessories. Given the amount of goods shipped by the reshipping mule sites that we analysed, the annual revenue generated from such operations can span between 1.8 and 7.3 million US dollars. The overall losses are much higher though: the online merchant loses an expensive item from its inventory and typically has to refund the owner of the stolen credit card. In addition, the rogue goods typically travel labeled as “second hand goods” and therefore custom taxes are also evaded. Once the items purchased with stolen credit cards reach their destination they will be sold on the black market by cybercriminals. [...] When applying for the job, people are usually required to send the operator copies of their ID cards and passport. After they are hired, mules are promised to be paid at the end of their first month of employment. However, from our data it is clear that mules are usually never paid. After their first month expires, they are never contacted back by the operator, who just moves on and hires new mules. In other words, the mules become victims of this scam themselves, by never seeing a penny. Moreover, because they sent copies of their documents to the criminals, mules can potentially become victims of identity theft.
(tags: crime law cybercrime mules shipping-scams identity-theft russia moscow scams papers)
No Harm, No Fowl: Chicken Farm Inappropriate Choice for Data Disposal
That’s a lesson that Spruce Manor Special Care Home in Saskatchewan had to learn the hard way (as surprising as that might sound). As a trustee with custody of personal health information, Spruce Manor was required under section 17(2) of the Saskatchewan Health Information Protection Act to dispose of its patient records in a way that protected patient privacy. So, when Spruce Manor chose a chicken farm for the job, it found itself the subject of an investigation by the Saskatchewan Information and Privacy Commissioner. In what is probably one of the least surprising findings ever, the commissioner wrote in his final report that “I recommend that Spruce Manor […] no longer use [a] chicken farm to destroy records”, and then for good measure added “I find using a chicken farm to destroy records unacceptable.”
(tags: data law privacy funny chickens farming via:pinboard data-protection health medical-records)
Caffeine cache adopts Window TinyLfu eviction policy
'Caffeine is a Java 8 rewrite of Guava's cache. In this version we focused on improving the hit rate by evaluating alternatives to the classic least-recently-used (LRU) eviction policy. In collaboration with researchers at Israel's Technion, we developed a new algorithm that matches or exceeds the hit rate of the best alternatives (ARC, LIRS). A paper of our work is being prepared for publication.' Specifically:
W-TinyLfu uses a small admission LRU that evicts to a large Segmented LRU if accepted by the TinyLfu admission policy. TinyLfu relies on a frequency sketch to probabilistically estimate the historic usage of an entry. The window allows the policy to have a high hit rate when entries exhibit a high temporal / low frequency access pattern which would otherwise be rejected. The configuration enables the cache to estimate the frequency and recency of an entry with low overhead. This implementation uses a 4-bit CountMinSketch, growing at 8 bytes per cache entry to be accurate. Unlike ARC and LIRS, this policy does not retain non-resident keys.
(tags: tinylfu caches caching cache-eviction java8 guava caffeine lru count-min sketching algorithms)
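A rough sketch of the frequency-sketch idea, not Caffeine's implementation: counters saturate at 15 as a 4-bit counter would, but are stored here as plain Python ints rather than packed, and the periodic aging step from the TinyLFU design is omitted. The class name, the `admit` helper, the table dimensions, and the use of salted BLAKE2b as the hash family are all assumptions.

```python
import hashlib

class FrequencySketch:
    """Count-min sketch whose counters are capped at 15 (4 bits' worth)."""

    MAX_COUNT = 15

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _indexes(self, key):
        data = repr(key).encode("utf-8")
        for row in range(self.depth):
            # A different salt per row gives one independent hash per row.
            digest = hashlib.blake2b(data, digest_size=8, salt=bytes([row])).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def increment(self, key):
        for row, col in self._indexes(key):
            if self.rows[row][col] < self.MAX_COUNT:
                self.rows[row][col] += 1

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum is the best guess.
        return min(self.rows[row][col] for row, col in self._indexes(key))

def admit(sketch, candidate, victim):
    """Admission test: the candidate enters only if it looks more popular."""
    return sketch.estimate(candidate) > sketch.estimate(victim)

sketch = FrequencySketch()
for _ in range(20):
    sketch.increment("hot")
sketch.increment("cold")
```

The sketch only stores counters, never the keys themselves, so memory use depends on the table size rather than on how many distinct keys have been seen.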
-
The ever-shitty Java serialization creates a security hole
(tags: java serialization security exploits jenkins)
-
Danish glassware artist making wonderful Wunderkammers -- cabinets of curiosities --- entirely from glass. Seeing as one of his works sold for UKP50,000 last year, I suspect these are a bit out of my league, sadly
(tags: art glassware steffen-dam wunderkammers museums)
London garden bridge users to have mobile phone signals tracked
If it goes ahead, people’s progress across the structure would be tracked by monitors detecting the Wi-Fi signals from their phones, which show up the device’s Mac address, or unique identifying code. The Garden Bridge Trust says it will not store any of this data and is only tracking phones to count numbers and prevent overcrowding.
(tags: london surveillance mobile-phones mac-trackers tracking)
Red lines and no-go zones - the coming surveillance debate
The Anderson Report to the House of Lords in the UK on RIPA introduces a concept of a "red line":
"Firm limits must also be written into the law: not merely safeguards, but red lines that may not be crossed." … "Some might find comfort in a world in which our every interaction and movement could be recorded, viewed in real time and indefinitely retained for possible future use by the authorities. Crime fighting, security, safety or public health justifications are never hard to find." [13.19] The Report then gives examples, such as a perpetual video feed from every room in every house, the police undertaking to view the record only on receipt of a complaint; blanket drone-based surveillance; licensed service providers, required as a condition of the licence to retain within the jurisdiction a complete plain-text version of every communication to be made available to the authorities on request; a constant data feed from vehicles, domestic appliances and health-monitoring personal devices; fitting of facial recognition software to every CCTV camera and the insertion of a location-tracking chip under every individual's skin. It goes on: "The impact of such powers on the innocent could be mitigated by the usual apparatus of safeguards, regulators and Codes of Practice. But a country constructed on such a basis would surely be intolerable to many of its inhabitants. A state that enjoyed all those powers would be truly totalitarian, even if the authorities had the best interests of its people at heart." [13.20] … "The crucial objection is that of principle. Such a society would have gone beyond Bentham's Panopticon (whose inmates did not know they were being watched) into a world where constant surveillance was a certainty and quiescence the inevitable result. There must surely come a point (though it comes at different places for different people) where the escalation of intrusive powers becomes too high a price to pay for a safer and more law abiding environment." [13.21]
(tags: panopticon jeremy-bentham law uk dripa ripa surveillance spying police drones facial-recognition future tracking cctv crime)
Dublin is a medium-density city
Comparable to Copenhagen or Amsterdam, albeit without sufficient cycling/public-transport infrastructural investment
(tags: infrastructure density housing dublin ireland cities travel commuting cycling)
-
I'm tired of this shit. Full stop tired. It's 2015 and these turds who grope their way around conferences and the like can make allegations like this, get a hand wave and an, "Oh, that's just crazy Raymond!" Fuck that. Fuck it from here to hell and back. Here's a man who really hasn't done anything all that special, is a totally crazy gun-toting misogynist of the highest order and, yet, he remains mostly unchallenged after the tempest dies down, time after time. [...] I'm sure ESR will still be haunting conferences when your daughters reach their professional years unless you get serious about outing the assholes like him and making the community a lot less toxic than it is now.
Amen to that.
(tags: esr toxic harassment conferences sexism misogyny culture)
User data plundering by Android and iOS apps is as rampant as you suspected
An app from Drugs.com, meanwhile, sent the medical search terms "herpes" and "interferon" to five domains, including doubleclick.net, googlesyndication.com, intellitxt.com, quantserve.com, and scorecardresearch.com, although those domains didn't receive other personal information.
(tags: privacy security google tracking mobile phones search pii)
Volkswagen emissions cheating was technical debt
Is this the first case of tech debt costing $18 billion?
"Perhaps the engineers told themselves that the cheat was a stopgap, and they’d address it later. If so, they didn’t."
(tags: tech-debt vw volkswagen management prioritisation planning)
Nobody Loves Graphite Anymore - VividCortex
Graphite has a place in our current monitoring stack, and together with StatsD will always have a special place in the hearts of DevOps practitioners everywhere, but it’s not representative of state-of-the-art in the last few years. Graphite is where the puck was in 2010. If you’re skating there, you’re missing the benefits of modern monitoring infrastructure. The future I foresee is one where time series capabilities (the raw power needed, which I described in my time series requirements blog post, for example) are within everyone’s reach. That will be considered table stakes, whereas now it’s pretty revolutionary.
Like I've been saying -- we need Time Series As A Service! This should be undifferentiated heavy lifting.
(tags: graphite tsd time-series vividcortex statsd ops monitoring metrics)
-
PICO-8 is a fantasy console for making, sharing and playing tiny games and other computer programs. When you turn it on, the machine greets you with a shell for typing in Lua commands and provides simple built-in tools for creating your own cartridges.
So cute! See also Voxatron, something similar for voxel-oriented 3D gaming.
Why Static Website Generators Are The Next Big Thing
Now _this_ makes me feel old. Alternative title: "why static website generators have been a good idea since WebMake, 15 years ago". WebMake does pretty well on the checklist of "key features of the modern static website generator", which are: 1. Templating (check); 2. Markdown support (well, EtText, which predated Markdown by several years); 3. Metadata (check); and 4. Javascript asset pipeline (didn't support this one, since complex front-end DHTML JS wasn't really a thing at the turn of the century. But I would have if it had ;). So I guess I was on the right track!
(tags: web html history webmake static-sites bake-dont-fry site-generators cms)
Food Trucks Are Great Incubators. Why Don't We Have More?
So is that kind of thriving food-truck scene something the city should work to encourage? Theresa Hernandez, one of the owners of K Chido Mexico, thinks so. “There’s a whole market there for a new culture,” she says. “There’s no doubt about it, the appetite is there. It’s just a matter for somebody who is innovative enough in Dublin City Council to say: ‘Right, let’s do this.’”
Amen to that.
wangle/Codel.h at master · facebook/wangle
Facebook's open-source implementation of the CoDel queue management algorithm applied to server request-handling capacity in their C++ service bootstrap library, Wangle.
(tags: wangle facebook codel services capacity reliability queueing)
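To give a feel for the technique, here is a loose Python sketch of CoDel-style load shedding for a request queue. It is not a port of Codel.h: the class name, the 5 ms target, the 100 ms interval, and the "shed anything that waited more than twice the target" rule are assumptions chosen for illustration.

```python
class CodelSketch:
    """Track the minimum queueing delay per interval; shed load when it stays high."""

    def __init__(self, target_delay=0.005, interval=0.100):
        self.target_delay = target_delay   # seconds
        self.interval = interval           # seconds
        self.interval_end = None
        self.min_delay = None
        self.overloaded = False

    def should_shed(self, queue_delay, now):
        """Record one request's queueing delay; return True to reject it."""
        if self.interval_end is None or now >= self.interval_end:
            # Interval over: if even the luckiest request waited longer than
            # the target, the queue never drained, so treat it as overload.
            if self.min_delay is not None:
                self.overloaded = self.min_delay > self.target_delay
            self.interval_end = now + self.interval
            self.min_delay = queue_delay
        else:
            self.min_delay = min(self.min_delay, queue_delay)
        return self.overloaded and queue_delay > 2 * self.target_delay

codel = CodelSketch()
decisions = [
    codel.should_shed(0.050, now=0.00),  # first interval, nothing known yet
    codel.should_shed(0.040, now=0.05),
    codel.should_shed(0.050, now=0.10),  # previous interval's minimum was 40 ms
    codel.should_shed(0.001, now=0.15),  # short wait, served even under overload
    codel.should_shed(0.050, now=0.20),  # previous minimum was 1 ms, recovered
]
```

Using the minimum rather than the average means a short burst does not trigger shedding; only a standing queue does.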
-
Despite its overarching abstractions, it is semantically non-uniform and its complicated transaction and job scheduling heuristics ordered around a dependently networked object system create pathological failure cases with little debugging context that would otherwise not necessarily occur on systems with less layers of indirection. The use of bus APIs complicate communication with the service manager and lead to duplication of the object model for little gain. Further, the unit file options often carry implicit state or are not sufficiently expressive. There is an imbalance with regards to features of an eager service manager and that of a lazy loading service manager, having rusty edge cases of both with non-generic, manager-specific facilities. The approach to logging and the circularly dependent architecture seem to imply that lots of prior art has been ignored or understudied.
(tags: analysis systemd linux unix ops init critiques software logging)
-
Great paper from Ben Maurer of Facebook in ACM Queue.
A "move-fast" mentality does not have to be at odds with reliability. To make these philosophies compatible, Facebook's infrastructure provides safety valves.
This is full of interesting techniques.
* Rapidly deployed configuration changes: Make everybody use a common configuration system; Statically validate configuration changes; Run a canary; Hold on to good configurations; Make it easy to revert.
* Hard dependencies on core services: Cache data from core services. Provide hardened APIs. Run fire drills.
* Increased latency and resource exhaustion: Controlled Delay (based on the anti-bufferbloat CoDel algorithm -- this is really cool); Adaptive LIFO (last-in, first-out) for queue busting; Concurrency Control (essentially a form of circuit breaker).
* Tools that Help Diagnose Failures: High-Density Dashboards with Cubism (horizon charts); What just changed?
* Learning from Failure: the DERP (!) methodology.
(tags: ben-maurer facebook reliability algorithms codel circuit-breakers derp failure ops cubism horizon-charts charts dependencies soa microservices uptime deployment configuration change-management)
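A minimal sketch of how the Controlled Delay and Adaptive LIFO items above could fit together in one request queue. The class name, the injected clock, and the 100 ms / 5 ms defaults are illustrative assumptions, not Facebook's code: arrivals get a short timeout once the queue has gone a full interval without draining, and the consumer serves newest-first while that backlog lasts.

```python
import collections
import time


class ControlledDelayQueue:
    """Toy request queue: CoDel-style timeouts plus adaptive LIFO (hypothetical)."""

    def __init__(self, interval=0.100, short_timeout=0.005, clock=time.monotonic):
        self.interval = interval            # normal timeout, also the "standing queue" window
        self.short_timeout = short_timeout  # timeout applied while backlogged
        self.clock = clock                  # injectable so the behaviour can be tested
        self.items = collections.deque()    # (deadline, request) pairs
        self.last_empty = clock()

    def _backlogged(self, now):
        # The queue has not been empty at any point in the last interval.
        return self.last_empty < now - self.interval

    def enqueue(self, request):
        now = self.clock()
        if not self.items:
            self.last_empty = now
        timeout = self.short_timeout if self._backlogged(now) else self.interval
        self.items.append((now + timeout, request))

    def dequeue(self):
        """Return the next live request, or None if nothing usable is waiting."""
        now = self.clock()
        while self.items:
            # Adaptive LIFO: newest first under backlog, plain FIFO otherwise.
            if self._backlogged(now):
                deadline, request = self.items.pop()
            else:
                deadline, request = self.items.popleft()
            if not self.items:
                self.last_empty = now
            if deadline >= now:
                return request
            # The request already timed out: drop it and keep looking.
        return None
```

Under a sustained backlog the oldest requests are likely to have been abandoned by their callers anyway, so serving the newest ones first spends capacity on work that can still succeed.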
Tesla Autopilot mode is learning
This is really impressive, but also a little scary. Drivers driving the Tesla Model S are "phoning home" training data as they drive:
A Model S owner by the username Khatsalano kept a count of how many times he had to “rescue” (meaning taking control after an alert) his Model S while using the Autopilot on his daily commute. He counted 6 “rescues” on his first day, by the fourth day of using the system on his 23.5 miles commute, he only had to take control over once. Musk said that Model S owners could add ~1 million miles of new data every day, which is helping the company create “high precision maps”.
Wonder if the data protection/privacy implications have been considered for EU use.
(tags: autopilot tesla maps mapping training machine-learning eu privacy data-protection)
-
For requesting a copy of an article that was legally obtained by a colleague from a paywalled source, Pazsowski found himself hit with around US$10,000-worth of damages. This completely disproportionate punishment for what is at most a minor case of copyright infringement is a perfect demonstration of where the anti-circumvention madness leads.
(tags: circumvention tpm copyright paywalls techdirt law canada)
-
Add another one to the "yay for DST" pile. (also yay for AWS using PST/PDT as default internal timezone instead of UTC...)
(tags: utc timezones fail bugs aws aws-cli dst daylight-savings time)
Google Cloud Platform HTTP/HTTPS Load Balancing
GCE's LB product is pretty nice -- HTTP/2 support, and a built-in URL mapping feature (presumably based on how Google approach that problem internally). I'm hoping AWS are taking notes for the next generation of ELB, if that ever happens
(tags: elb gce google load-balancing http https spdy http2 urls request-routing ops architecture cloud)
It's an Emulator, Not a Petting Zoo: Emu and Lambda
a Lambda emulator in Python, suitable for unit testing lambdas
(tags: lambda aws coding unit-tests dev)
Google tears Symantec a new one on its CA failure
Symantec are getting a crash course in how to conduct an incident post-mortem to boot:
More immediately, we are requesting of Symantec that they further update their public incident report with: A post-mortem analysis that details why they did not detect the additional certificates that we found. Details of each of the failures to uphold the relevant Baseline Requirements and EV Guidelines and what they believe the individual root cause was for each failure. We are also requesting that Symantec provide us with a detailed set of steps they will take to correct and prevent each of the identified failures, as well as a timeline for when they expect to complete such work. Symantec may consider this latter information to be confidential and so we are not requesting that this be made public.
(tags: google symantec ev ssl certificates ca security postmortems ops)
Google is Maven Central's New Best Friend
Google now mirroring Maven Central.
(tags: google maven maven-central jars hosting java packages build)
Apache Kafka, Purgatory, and Hierarchical Timing Wheels
In the new design, we use Hierarchical Timing Wheels for the timeout timer and DelayQueue of timer buckets to advance the clock on demand. Completed requests are removed from the timer queue immediately with O(1) cost. The buckets remain in the delay queue, however, the number of buckets is bounded. And, in a healthy system, most of the requests are satisfied before timeout, and many of the buckets become empty before pulled out of the delay queue. Thus, the timer should rarely have the buckets of the lower interval. The advantage of this design is that the number of requests in the timer queue is the number of pending requests exactly at any time. This allows us to estimate the number of requests need to be purged. We can avoid unnecessary purge operation of the watcher lists. As the result we achieve a higher scalability in terms of request rate with much better CPU usage.
(tags: algorithms timers kafka scheduling timing-wheels delayqueue queueing)
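To make the O(1) removal concrete, here is a minimal single-level timing wheel. It is a simplification of the hierarchical design described above, not Kafka's implementation: the class and method names are hypothetical, and where a hierarchy would overflow long delays to a coarser wheel, this sketch just rejects them.

```python
class TimingWheel:
    """Single-level timing wheel: O(1) add and cancel, tick-driven expiry."""

    def __init__(self, tick_ms, wheel_size):
        self.tick_ms = tick_ms
        self.wheel_size = wheel_size
        self.buckets = [set() for _ in range(wheel_size)]
        self.current_tick = 0
        self.bucket_of = {}  # request id -> bucket index, which makes cancel O(1)

    def add(self, request_id, delay_ms):
        ticks = max(1, -(-delay_ms // self.tick_ms))  # ceiling division
        if ticks >= self.wheel_size:
            # A hierarchical wheel would hand this to a coarser-grained wheel.
            raise ValueError("delay exceeds the span of this wheel")
        index = (self.current_tick + ticks) % self.wheel_size
        self.buckets[index].add(request_id)
        self.bucket_of[request_id] = index

    def cancel(self, request_id):
        """Remove a completed request; returns False if it is not pending."""
        index = self.bucket_of.pop(request_id, None)
        if index is None:
            return False
        self.buckets[index].discard(request_id)
        return True

    def advance(self):
        """Move the clock forward one tick and return the requests that timed out."""
        self.current_tick += 1
        index = self.current_tick % self.wheel_size
        expired, self.buckets[index] = self.buckets[index], set()
        for request_id in expired:
            del self.bucket_of[request_id]
        return expired
```

Because completed requests are cancelled immediately, `len(wheel.bucket_of)` is exactly the number of pending requests, which is the property the excerpt highlights.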
Open-sourcing PalDB, a lightweight companion for storing side data
a new LinkedIn open source data store, for write-once/read-mainly side data, Java, Apache licensed. RocksDB discussion: https://www.facebook.com/groups/rocksdb.dev/permalink/834956096602906/
(tags: linkedin open-source storage side-data data config paldb java apache databases)
Twins denied driver’s permit because DMV can’t tell them apart
"The computer can recognize faces, a feature that comes in handy if somebody’s is trying to get an illegal ID. It apparently is not programmed to detect twins." As Hilary Mason put it: "You do not want to be an edge case in this future we are building."
(tags: future grim bugs twins edge-cases coding fail dmv software via:hmason)
The Okinawa missiles of October | Bulletin of the Atomic Scientists
'By Bordne's account, at the height of the Cuban Missile Crisis, Air Force crews on Okinawa were ordered to launch 32 missiles, each carrying a large nuclear warhead. Only caution and the common sense and decisive action of the line personnel receiving those orders prevented the launches—and averted the nuclear war that most likely would have ensued.'
(tags: okinawa nukes launch-codes pal cold-war cuban-missile-crisis history accidents ui security horror via:mattblaze)
Amazon ECS CLI Tutorial - Amazon EC2 Container Service
super-basic ECS tutorial, using a docker-compose.yml to create a new ECS-managed service fleet
Net neutrality: EU votes in favour of Internet fast lanes and slow lanes | Ars Technica UK
:(
In the end, sheer political fatigue may have played a major part in undermining net neutrality in the EU. However, the battle is not quite over. As Anne Jellema, CEO of the Web Foundation, which was established by Berners-Lee in 2009, notes in her response to today's EU vote: "The European Parliament is essentially tossing a hot potato to the Body of European Regulators, national regulators and the courts, who will have to decide how these spectacularly unclear rules will be implemented. The onus is now on these groups to heed the call of hundreds of thousands of concerned citizens and prevent a two-speed Internet."
Analysing user behaviour - from histograms to random forests (PyData) at PyCon Ireland 2015 | Lanyrd
Swrve's own Dave Brodigan on game user-data analysis techniques:
The goal is to give the audience a roadmap for analysing user data using python friendly tools. I will touch on many aspects of the data science pipeline from data cleansing to building predictive data products at scale. I will start gently with pandas and dataframes and then discuss some machine learning techniques like kmeans and random forests in scikitlearn and then introduce Spark for doing it at scale. I will focus more on the use cases rather than detailed implementation. The talk will be informed by my experience and focus on user behaviour in games and mobile apps.
(tags: swrve talks user-data big-data spark hadoop machine-learning data-science)
-
fast, modern, zero-conf load balancing HTTP(S) router managed by consul; serves 15k reqs/sec, in Go, from eBay
(tags: load-balancing consul http https routing ebay go open-source fabio)
-
pretty conventional HTTP/1.1, WebSockets and HTTP/2 front-end services with modern Netty practices
RentTheRunway's Engineering Ladder
One of the best things about working at Amazon was having a clear, well-defined career progression, and it's something that's always been absent in startups. Career growth, levelling, and tech management are important, and also help in hiring by providing clear levels. This is the RentTheRunway engineering ladder, from Camille Fournier's team, which they open sourced back in March 2015
(tags: engineering hiring management career renttherunway camille-fournier amazon startups career-growth levelling ladder)
How a criminal ring defeated the secure chip-and-PIN credit cards | Ars Technica
Ingenious --
The stolen cards were still considered evidence, so the researchers couldn’t do a full tear-down or run any tests that would alter the data on the card, so they used X-ray scans to look at where the chip cards had been tampered with. They also analyzed the way the chips distributed electricity when in use and used read-only programs to see what information the cards sent to a Point of Sale (POS) terminal. According to the paper, the fraudsters were able to perform a man-in-the-middle attack by programming a second hobbyist chip called a FUN card to accept any PIN entry, and soldering that chip onto the card’s original chip. This increased the thickness of the chip from 0.4mm to 0.7mm, "making insertion into a PoS somewhat uneasy but perfectly feasible,” the researchers write. [....] The researchers explain that a typical EMV transaction involves three steps: card authentication, cardholder verification, and then transaction authorization. During a transaction using one of the altered cards, the original chip was allowed to respond with the card authentication as normal. Then, during card holder authentication, the POS system would ask for a user’s PIN, the thief would respond with any PIN, and the FUN card would step in and send the POS the code indicating that it was ok to proceed with the transaction because the PIN checked out. During the final transaction authentication phase, the FUN card would relay the transaction data between the POS and the original chip, sending the issuing bank an authorization request cryptogram which the card issuer uses to tell the POS system whether to accept the transaction or not.
(tags: security chip-and-pin hacking pos emv transactions credit-cards debit-cards hardware chips pin fun-cards smartcards)
How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code
using Spark, Tesseract, HBase, Solr and Leptonica. Actually pretty feasible
(tags: spark tesseract hbase solr leptonica pdfs scanning cloudera hadoop architecture)
Existential Consistency: Measuring and Understanding Consistency at Facebook
The metric is termed φ(P)-consistency, and is actually very simple. A read for the same data is sent to all replicas in P, and φ(P)-consistency is defined as the frequency with which that read returns the same result from all replicas. φ(G)-consistency applies this metric globally, and φ(R)-consistency applies it within a region (cluster). Facebook have been tracking this metric in production since 2012.
(tags: facebook eventual-consistency consistency metrics papers cap distributed-computing)
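The consistency metric described above reduces to a simple agreement rate. A minimal sketch, assuming each probe read is recorded as the list of values returned by the replicas in P (the function name and the data layout are hypothetical):

```python
def consistency_rate(read_results):
    """Fraction of probe reads for which every replica returned the same value.

    read_results: one entry per probe read, each a list of the (hashable)
    values returned by the replicas in the set P.
    """
    if not read_results:
        raise ValueError("need at least one read")
    agreeing = sum(1 for values in read_results if len(set(values)) == 1)
    return agreeing / len(read_results)


reads = [
    ["v1", "v1", "v1"],
    ["v2", "v1", "v2"],  # one replica is still serving the old value
    ["v3", "v3", "v3"],
    ["v4", "v4", "v4"],
]
print(consistency_rate(reads))  # 0.75
```

Passing only the replicas of one region gives the regional variant; passing every replica gives the global one.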
Holistic Configuration Management at Facebook
How FB push config changes from Git (where it is code reviewed, version controlled, and history tracked with strong auth) to Zeus (their Zookeeper fork) and from there to live production servers.
(tags: facebook configuration zookeeper git ops architecture)
-
a high-performance multiple regex matching library. Hyperscan uses hybrid automata techniques to allow simultaneous matching of large numbers (up to tens of thousands) of regular expressions and for the matching of regular expressions across streams of data.
Via Tony Finch
(tags: via:fanf regexps regex dpi hyperscan dfa nfa hybrid-automata text-matching matching text strings streams)
-
Hologram exposes an imitation of the EC2 instance metadata service on developer workstations that supports the [IAM Roles] temporary credentials workflow. It is accessible via the same HTTP endpoint to calling SDKs, so your code can use the same process in both development and production. The keys that Hologram provisions are temporary, so EC2 access can be centrally controlled without direct administrative access to developer workstations.
(tags: iam roles ec2 authorization aws adroll open-source cli osx coding dev)
AWS re:Invent 2015 Video & Slide Presentation Links with Easy Index
Andrew Spyker's roundup:
my quick index of all re:Invent sessions. Please wait for a few days and I'll keep running the tool to fill in the index. It usually takes Amazon a few weeks to fully upload all the videos and slideshares.
Pretty definitive, full text descriptions of all sessions (and there are an awful lot of 'em).
(tags: aws reinvent andrew-spyker scraping slides presentations ec2 video)
(ARC308) The Serverless Company: Using AWS Lambda
Describing PlayOn! Sports' Lambda setup. Sounds pretty productionizable
Your Relative's DNA Could Turn You Into A Suspect
Familial DNA searching has massive false positives, but is being used to tag suspects:
The bewildered Usry soon learned that he was a suspect in the 1996 murder of an Idaho Falls teenager named Angie Dodge. Though a man had been convicted of that crime after giving an iffy confession, his DNA didn’t match what was found at the crime scene. Detectives had focused on Usry after running a familial DNA search, a technique that allows investigators to identify suspects who don’t have DNA in a law enforcement database but whose close relatives have had their genetic profiles cataloged. In Usry’s case the crime scene DNA bore numerous similarities to that of Usry’s father, who years earlier had donated a DNA sample to a genealogy project through his Mormon church in Mississippi. That project’s database was later purchased by Ancestry, which made it publicly searchable—a decision that didn’t take into account the possibility that cops might someday use it to hunt for genetic leads. Usry, whose story was first reported in The New Orleans Advocate, was finally cleared after a nerve-racking 33-day wait — the DNA extracted from his cheek cells didn’t match that of Dodge’s killer, whom detectives still seek. But the fact that he fell under suspicion in the first place is the latest sign that it’s time to set ground rules for familial DNA searching, before misuse of the imperfect technology starts ruining lives.
(tags: dna familial-dna false-positives law crime idaho murder mormon genealogy ancestry.com databases biometrics privacy genes)
Cluster benchmark: Scylla vs Cassandra
ScyllaDB (the C* clone in C++) is now actually looking promising -- still need more reassurance about its consistency/reliability side though
_What We Know About Spreadsheet Errors_ [paper]
As we will see below, there has long been ample evidence that errors in spreadsheets are pandemic. Spreadsheets, even after careful development, contain errors in one percent or more of all formula cells. In large spreadsheets with thousands of formulas, there will be dozens of undetected errors. Even significant errors may go undetected because formal testing in spreadsheet development is rare and because even serious errors may not be apparent.
(tags: business coding maths excel spreadsheets errors formulas error-rate)
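A quick back-of-the-envelope check of the "one percent of formula cells" figure. This sketch assumes errors occur independently in each cell, which is a simplification and not something the paper claims; the function names are hypothetical.

```python
def expected_errors(formula_cells, cell_error_rate=0.01):
    """Expected number of erroneous formula cells."""
    return formula_cells * cell_error_rate


def prob_at_least_one_error(formula_cells, cell_error_rate=0.01):
    """Chance that a sheet contains at least one erroneous formula cell."""
    return 1.0 - (1.0 - cell_error_rate) ** formula_cells


for cells in (100, 1000, 3000):
    print(cells, expected_errors(cells), round(prob_at_least_one_error(cells), 4))
```

At a 1% rate, a sheet with 3,000 formulas would be expected to contain about 30 errors, consistent with the "dozens" quoted above, and even a 100-formula sheet is more likely than not to contain at least one.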
-
great post from Ross Duggan on avoiding developer burnout
(tags: coding burnout productivity work)
How is NSA breaking so much crypto?
If a client and server are speaking Diffie-Hellman, they first need to agree on a large prime number with a particular form. There seemed to be no reason why everyone couldn’t just use the same prime, and, in fact, many applications tend to use standardized or hard-coded primes. But there was a very important detail that got lost in translation between the mathematicians and the practitioners: an adversary can perform a single enormous computation to “crack” a particular prime, then easily break any individual connection that uses that prime. How enormous a computation, you ask? Possibly a technical feat on a scale (relative to the state of computing at the time) not seen since the Enigma cryptanalysis during World War II. Even estimating the difficulty is tricky, due to the complexity of the algorithm involved, but our paper gives some conservative estimates. For the most common strength of Diffie-Hellman (1024 bits), it would cost a few hundred million dollars to build a machine, based on special purpose hardware, that would be able to crack one Diffie-Hellman prime every year. Would this be worth it for an intelligence agency? Since a handful of primes are so widely reused, the payoff, in terms of connections they could decrypt, would be enormous. Breaking a single, common 1024-bit prime would allow NSA to passively decrypt connections to two-thirds of VPNs and a quarter of all SSH servers globally. Breaking a second 1024-bit prime would allow passive eavesdropping on connections to nearly 20% of the top million HTTPS websites. In other words, a one-time investment in massive computation would make it possible to eavesdrop on trillions of encrypted connections.
(via Eric)
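A toy illustration of why a shared prime matters. The parameters are deliberately tiny (p = 23, g = 5) and the lookup table is only an analogy: the real precomputation uses the number field sieve, not a table. The point it shows is the cost structure: one-off work per prime, then each connection using that prime is cheap to break. All names are hypothetical.

```python
import secrets

P, G = 23, 5  # toy parameters; 5 is a primitive root modulo 23


def keypair():
    """Return (private exponent, public value) for one side of the exchange."""
    private = secrets.randbelow(P - 2) + 1  # in 1..P-2
    return private, pow(G, private, P)


def shared_secret(own_private, peer_public):
    return pow(peer_public, own_private, P)


# One-off work per prime: map every public value back to its exponent.
DLOG_TABLE = {pow(G, x, P): x for x in range(1, P)}


def eavesdrop(public_a, public_b):
    """Recover the shared secret from the two public values alone."""
    private_a = DLOG_TABLE[public_a]
    return pow(public_b, private_a, P)


a_priv, a_pub = keypair()
b_priv, b_pub = keypair()
assert shared_secret(a_priv, b_pub) == shared_secret(b_priv, a_pub)
assert eavesdrop(a_pub, b_pub) == shared_secret(a_priv, b_pub)
```

Because `DLOG_TABLE` depends only on `P` and `G`, it is reusable against every pair of parties that picked the same parameters.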
AWS re:Invent 2015 | (CMP406) Amazon ECS at Coursera - YouTube
Coursera are running user-submitted code in ECS! interesting stuff about how they use Docker security/resource-limiting features, forking the ecs-agent code, to run user-submitted code. :O
(tags: coursera user-submitted-code sandboxing docker security ecs aws resource-limits ops)
How both TCP and Ethernet checksums fail
At Twitter, a team had an unusual failure where corrupt data ended up in memcache. The root cause appears to have been a switch that was corrupting packets. Most packets were being dropped and the throughput was much lower than normal, but some were still making it through. The hypothesis is that occasionally the corrupt packets had valid TCP and Ethernet checksums. One "lucky" packet stored corrupt data in memcache. Even after the switch was replaced, the errors continued until the cache was cleared.
YA occurrence of this bug. When it happens, it tends to _really_ screw things up, because it's so rare -- we had monitoring for this in Amazon, and when it occurred, it overwhelmingly occurred due to host-level kernel/libc/RAM issues rather than stuff in the network. Amazon design principles were to add app-level checksumming throughout, which of course catches the lot.
(tags: networking tcp ip twitter ethernet checksums packets memcached)
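A small demonstration of how weak the TCP-style checksum is, and what an application-level check adds. This is a sketch: `internet_checksum` follows the 16-bit ones' complement scheme from RFC 1071, the payload is made up, and `zlib.crc32` stands in for whatever end-to-end checksum or hash an application would actually choose.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement sum, the scheme TCP and IP use (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"key=1234;value=abcd"
# Corruption that swaps two aligned 16-bit words ("12" and "34").
corrupted = original[:4] + original[6:8] + original[4:6] + original[8:]

# The sum is order-independent, so the TCP-style checksum cannot see the swap...
assert internet_checksum(original) == internet_checksum(corrupted)
# ...while a CRC computed by the application over the payload does.
assert zlib.crc32(original) != zlib.crc32(corrupted)
```

The application-level check also covers corruption introduced on the host (kernel, libc, RAM), which no network-layer checksum ever sees.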
Designing the Spotify perimeter
How Spotify use nginx as a frontline for their sites and services
(tags: scaling spotify nginx ops architecture ssl tls http frontline security)
-
Supports Spotify -- totally getting one of these
Where do 'mama'/'papa' words come from?
The sounds came first — as experiments in vocalization — and parents adopted them as pet names for themselves. If you open your mouth and make a sound, it will probably be an open vowel like /a/ unless you move your tongue or lips. The easiest consonants are perhaps the bilabials /m/, /p/, and /b/, requiring no movement of the tongue, followed by consonants made by raising the front of the tongue: /d/, /t/, and /n/. Add a dash of reduplication, and you get mama, papa, baba, dada, tata, nana. That such words refer to people (typically parents or other guardians) is something we have imposed on the sounds and incorporated into our languages and cultures; the meanings don’t inhere in the sounds as uttered by babies, which are more likely calls for food or attention.
(tags: sounds voice speech babies kids phonetics linguist language)
-
'A fast build system for Docker images', open source, in Go, hooks into GitHub
England opens up 11TB of LiDAR data covering the entire country as open data
All 11 terabytes of our LIDAR data (that’s roughly equivalent to 2,750,000 MP3 songs) will eventually be available through our new Open LIDAR portal under an Open Government Licence, allowing it to be used for any purpose. We hope that by giving free access to our data businesses and local communities will develop innovative solutions to benefit the environment, grow our thriving rural economy, and boost our world-leading food and farming industry. The possibilities are endless and we hope that making LIDAR data open will be a catalyst for new ideas and innovation.
Are you reading, Ordnance Survey Ireland?
SuperChief: From Apache Storm to In-House Distributed Stream Processing
Another sorry tale of Storm issues:
Storm has been successful at Librato, but we experienced many of the limitations cited in the Twitter Heron: Stream Processing at Scale paper and outlined here by Adrian Colyer, including: Inability to isolate, reason about, or debug performance issues due to the worker/executor/task paradigm. This led to building and configuring clusters specifically designed to attempt to mitigate these problems (i.e., separate clusters per topology, only running a worker per server.), which added additional complexity to development and operations and also led to over-provisioning. Ability of tasks to move around led to difficult to trace performance problems. Storm’s work provisioning logic led to some tasks serving more Kafka partitions than others. This in turn created latency and performance issues that were difficult to reason about. The initial solution was to over-provision in an attempt to get a better hashing/balancing of work, but eventually we just replaced the work allocation logic. Due to Storm’s architecture, it was very difficult to get a stack trace or heap dump because the processes that managed workers (Storm supervisor) would often forcefully kill a Java process while it was being investigated in this way. The propensity for unexpected and subsequently unhandled exceptions to take down an entire worker led to additional defensive verbose error handling everywhere. This nasty bug STORM-404 coupled with the aforementioned fact that a single exception can take down a worker led to several cascading failures in production, taking down entire topologies until we upgraded to 0.9.4. Additionally, we found the performance we were getting from Storm for the amount of money we were spending on infrastructure was not in line with our expectations. Much of this is due to the fact that, depending upon how your topology is designed, a single tuple may make multiple hops across JVMs, and this is very expensive. 
For example, in our time series aggregation topologies a single tuple may be serialized/deserialized and shipped across the wire 3-4 times as it progresses through the processing pipeline.
(tags: scalability storm kafka librato architecture heron ops)
-
Librato's service discovery library using Zookeeper (so strongly consistent, but with the ZK downside that an AZ outage can stall service discovery updates region-wide)
(tags: zookeeper service-discovery librato java open-source load-balancing)
Tech companies like Facebook not above the law, says Max Schrems
“Big companies didn’t only rely on safe harbour: they also rely on binding corporate rules and standard contractual clauses. But it’s interesting that the court decided the case on fundamental rights grounds: so it doesn’t matter remotely what ground you transfer on, if that process is still illegal under 7 and 8 of charter, it can’t be done.”
Also: “Ireland has no interest in doing its job, and will continue not to, forever. Clearly it’s an investment issue – but overall the policy is: we don’t regulate companies here. The cost of challenging any of this in the courts is prohibitive. And the people don’t seem to care.”
:(
(tags: ireland guardian max-schrems privacy surveillance safe-harbor eu us nsa dpc data-protection)
After Bara: All your (Data)base are belong to us
Sounds like the CJEU's Bara decision may cause problems for the Irish government's wilful data-sharing:
Articles 10, 11 and 13 of Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995, on the protection of individuals with regard to the processing of personal data and on the free movement of such data, must be interpreted as precluding national measures, such as those at issue in the main proceedings, which allow a public administrative body of a Member State to transfer personal data to another public administrative body and their subsequent processing, without the data subjects having been informed of that transfer or processing.
(tags: data databases bara cjeu eu law privacy data-protection)
-
uses the techniques invented by the authors of Paris-traceroute to enumerate the paths of ECMP flow-based load balancing, but introduces a new technique for NAT detection.
Handy. Written by AWS SDE Andrea Barberio!
(tags: internet tracing traceroute networking ecmp nat ip)
-
'Seekable and Splittable Gzip', from eBay
(tags: ebay gzip compression seeking streams splitting logs gzinga)
Outage postmortem (2015-10-08 UTC) : Stripe: Help & Support
There was a breakdown in communication between the developer who requested the index migration and the database operator who deleted the old index. Instead of working on the migration together, they communicated in an implicit way through flawed tooling. The dashboard that surfaced the migration request was missing important context: the reason for the requested deletion, the dependency on another index’s creation, and the criticality of the index for API traffic. Indeed, the database operator didn’t have a way to check whether the index had recently been used for a query.
Good demo of how the Etsy-style chatops deployment approach would have helped avoid this risk.
(tags: stripe postmortem outages databases indexes deployment chatops deploy ops)
-
Wendy Grossman on where the Safe Harbor decision is leading.
One clause would require European companies to tell their relevant data protection authorities if they are being compelled to turn over data - even if they have been forbidden to disclose this under US law. Sounds nice, but doesn't mobilize the rock or soften the hard place, since companies will still have to pick a law to violate. I imagine the internal discussions there revolving around two questions: which violation is less likely to land the CEO in jail and which set of fines can we afford?
(via Simon McGarr)
(tags: safe-harbor privacy law us eu surveillance wendy-grossman via:tupp_ed)
-
bookmarking as a potential future addition to the back garden
Rebuilding Our Infrastructure with Docker, ECS, and Terraform
Good writeup of current best practices for a production AWS architecture
(tags: aws ops docker ecs ec2 prod terraform segment via:marc)
The Totally Managed Analytics Pipeline: Segment, Lambda, and Dynamo
notable mainly for the details of Terraform support for Lambda: that's a significant improvement to Lambda's production-readiness
(tags: aws pipelines data streaming lambda dynamodb analytics terraform ops)
Gene patents probably dead worldwide following Australian court decision
The court based its reasoning on the fact that, although an isolated gene such as BRCA1 was "a product of human action, it was the existence of the information stored in the relevant sequences that was an essential element of the invention as claimed." Since the information stored in the DNA as a sequence of nucleotides was a product of nature, it did not require human action to bring it into existence, and therefore could not be patented.
Via Tony Finch.
(tags: via:fanf australia genetics law ipr medicine ip patents)
-
client-side 'service discovery and routing system for microservices' -- another Smartstack, then
(tags: python router smartstack baker-street microservices service-discovery routing load-balancing http)
-
ugh, quite a bit of complexity here
(tags: docker osx dev ops building coding ifttt dns dnsmasq)
Fuzzing Raft for Fun and Publication
Good intro to fuzz-testing a distributed system; I've had great results using similar approaches in unit tests
EC2 Spot Blocks for Defined-Duration Workloads
you can now launch Spot instances that will run continuously for a finite duration (1 to 6 hours). Pricing is based on the requested duration and the available capacity, and is typically 30% to 45% less than On-Demand.
The Surveillance Elephant in the Room…
Very perceptive post on the next steps for safe harbor, post-Schrems.
And behind that elephant there are other elephants: if US surveillance and surveillance law is a problem, then what about UK surveillance? Is GCHQ any less intrusive than the NSA? It does not seem so – and this puts even more pressure on the current reviews of UK surveillance law taking place. If, as many predict, the forthcoming Investigatory Powers Bill will be even more intrusive and extensive than current UK surveillance laws this will put the UK in a position that could rapidly become untenable. If the UK decides to leave the EU, will that mean that the UK is not considered a safe place for European data? Right now that seems the only logical conclusion – but the ramifications for UK businesses could be huge. [....] What happens next, therefore, is hard to foresee. What cannot be done, however, is to ignore the elephant in the room. The issue of surveillance has to be taken on. The conflict between that surveillance and fundamental human rights is not a merely semantic one, or one for lawyers and academics, it’s a real one. In the words of historian and philosopher Quentin Skinner “the current situation seems to me untenable in a democratic society.” The conflict over Safe Harbor is in many ways just a symptom of that far bigger problem. The biggest elephant of all.
(tags: ec cjeu surveillance safe-harbor schrems privacy europe us uk gchq nsa)
ECJ ruling on Irish privacy case has huge significance
The only current way to comply with EU law, the judgment indicates, is to keep EU data within the EU. Whether those data can be safely managed within facilities run by US companies will not be determined until the US rules on an ongoing Microsoft case. Microsoft stands in contempt of court right now for refusing to hand over to US authorities, emails held in its Irish data centre. This case will surely go to the Supreme Court and will be an extremely important determination for the cloud business, and any company or individual using data centre storage. If Microsoft loses, US multinationals will be left scrambling to somehow, legally firewall off their EU-based data centres from US government reach.
(cough, Amazon)
(tags: aws hosting eu privacy surveillance gchq nsa microsoft ireland)
-
"@alexbfree @ThijsFeryn [ElasticSearch is] fine as long as data loss is acceptable. https://aphyr.com/posts/317-call-me-maybe-elasticsearch . We lose ~1% of all writes on average."
(tags: elasticsearch data-loss reliability data search aphyr jepsen testing distributed-systems ops)
Daragh O'Brien on the CJEU judgement on Safe Harbor
Many organisations I've spoken to have had the cunning plan of adopting model contract clauses as their fall back position to replace their reliance on Safe Harbor. [....] The best that can be said for Model Clauses is that they haven't been struck down by the CJEU. Yet.
(tags: model-clauses cjeu eu europe safe-harbor us nsa surveillance privacy law)
5 takeaways from the death of safe harbor – POLITICO
Reacting to the ruling, the [EC] stressed that data transfers between the U.S. and Europe can continue on the basis of other legal mechanisms. A lot rides on what steps the Commission and national data protection supervisors take in response. “It is crucial for legal certainty that the EC sends a clear signal,” said Nauwelaerts. That could involve providing a timeline for concluding an agreement with U.S. authorities, together with a commitment from national data protection authorities not to block data transfers while negotiations are on-going, he explained.
The New InfluxDB Storage Engine: A Time Structured Merge Tree
The new engine has similarities with LSM Trees (like LevelDB and Cassandra’s underlying storage). It has a write ahead log, index files that are read only, and it occasionally performs compactions to combine index files. We’re calling it a Time Structured Merge Tree because the index files keep contiguous blocks of time and the compactions merge those blocks into larger blocks of time. Compression of the data improves as the index files are compacted. Once a shard becomes cold for writes it will be compacted into as few files as possible, which yield the best compression.
(tags: influxdb storage lsm-trees leveldb tsm-trees data-structures algorithms time-series tsd compression)
Marvin.ie: Order Takeaway Food Online
new Dublin delivery service takes Bitcoin?!
(tags: bitcoin food delivery takeaway payment ireland dublin wtf)
qp tries: smaller and faster than crit-bit tries
interesting new data structure from Tony Finch. "Some simple benchmarks say qp tries have about 1/3 less memory overhead and are about 10% faster than crit-bit tries."
(tags: crit-bit popcount bits bitmaps tries data-structures via:fanf qp-tries crit-bit-tries hacks memory)
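A rough sketch of the popcount trick those tags point at: each node keeps a bitmap of which children exist plus a packed list of only those children, and counting the set bits below a slot gives the child's position in the list. This assumes 4-bit nibbles (a 16-way branch); `SparseNode` and its methods are hypothetical names for illustration, not Tony Finch's code.

```python
# Toy popcount-indexed sparse node. Assumption: 16-way branching on a
# 4-bit nibble. SparseNode is a made-up name for illustration only.

class SparseNode:
    def __init__(self):
        self.bitmap = 0      # bit n set => a child exists for nibble n
        self.children = []   # packed: only existing children, in nibble order

    def _index(self, nibble):
        # The number of set bits below this nibble is the child's
        # position in the packed list.
        mask = (1 << nibble) - 1
        return bin(self.bitmap & mask).count("1")

    def get(self, nibble):
        if not (self.bitmap >> nibble) & 1:
            return None
        return self.children[self._index(nibble)]

    def put(self, nibble, child):
        i = self._index(nibble)
        if (self.bitmap >> nibble) & 1:
            self.children[i] = child          # overwrite existing child
        else:
            self.children.insert(i, child)    # keep list in nibble order
            self.bitmap |= 1 << nibble
```

The memory saving comes from never allocating slots for absent children: a node with three children stores three entries and one small bitmap instead of a 16-entry array.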
Schneier on Automatic Face Recognition and Surveillance
When we talk about surveillance, we tend to concentrate on the problems of data collection: CCTV cameras, tagged photos, purchasing habits, our writings on sites like Facebook and Twitter. We think much less about data analysis. But effective and pervasive surveillance is just as much about analysis. It's sustained by a combination of cheap and ubiquitous cameras, tagged photo databases, commercial databases of our actions that reveal our habits and personalities, and -- most of all -- fast and accurate face recognition software. Don't expect to have access to this technology for yourself anytime soon. This is not facial recognition for all. It's just for those who can either demand or pay for access to the required technologies -- most importantly, the tagged photo databases. And while we can easily imagine how this might be misused in a totalitarian country, there are dangers in free societies as well. Without meaningful regulation, we're moving into a world where governments and corporations will be able to identify people both in real time and backwards in time, remotely and in secret, without consent or recourse. Despite protests from industry, we need to regulate this budding industry. We need limitations on how our images can be collected without our knowledge or consent, and on how they can be used. The technologies aren't going away, and we can't uninvent these capabilities. But we can ensure that they're used ethically and responsibly, and not just as a mechanism to increase police and corporate power over us.
(tags: privacy regulation surveillance bruce-schneier faces face-recognition machine-learning ai cctv photos)
-
China just introduced a universal credit score, where everybody is measured as a number between 350 and 950. But this credit score isn’t just affected by how well you manage credit – it also reflects how well your political opinions are in line with Chinese official opinions, and whether your friends’ are, too.
Measuring using online mass surveillance, naturally. This may be the most dystopian thing I've heard in a while....
(tags: via:raycorrigan dystopia china privacy mass-surveillance politics credit credit-score loans opinions)
Brand New Retro – The Book, November 2015
YESSSS. Joe and Brian have delivered -- going to be giving a lot of copies of this for xmas ;)
(tags: brand-new-retro blogs friends retro history dublin ireland books toget)
-
your command line environment in the [Google] Cloud. This feature enables you to connect to a shell environment on a virtual machine, pre-loaded with the tools you need to easily run commands to develop, deploy and manage your projects. Currently, Cloud Shell is an f1-micro Google Compute Engine machine that exposes a Debian-based development environment. You are also assigned 5 GB of standard persistent disk space as the home disk so you can store files between sessions.
It's also free. This is a great idea -- handy both for beginners getting to grips with GoogCloud and for experts looking for a quick dev env to hack with. I wish AWS had something similar.
Amaro: A Bittersweet Obsession - Food & Wine
"A Neapolitan-American friend of mine, who's in his mid-fifties, fondly remembers how his mother used to serve him an espresso with Fernet Branca and an egg yolk every morning before he went off to elementary school."
(tags: amari amaro bitters digestifs booze cocktails recipes)
-
come recommended by http://gearmoose.com/the-ten-best-minimalist-wallets-a-recap/ , looks pretty nice
(tags: wallets minimalism daily-carry pockets slimline gear toget)
Notes on Startup Engineering Management for Young Bloods
Below is a list of some lessons I’ve learned as a startup engineering manager that are worth being told to a new manager. Some are subtle, and some are surprising, and this being human beings, some are inevitably controversial. This list is for the new head of engineering to guide their thinking about the job they are taking on. It’s not comprehensive, but it’s a good beginning. The best characteristic of this list is that it focuses on social problems with little discussion of technical problems a manager may run into. The social stuff is usually the hardest part of any software developer’s job, and of course this goes triply for engineering managers.
(tags: engineering management camille-fournier teams dev)
Further reading on just culture and blameless post mortems
Some bookmarks around post-mortem activity
(tags: post-mortems culture etsy rafe-colburn rc3 john-allspaw ops coes)
Han Sung: Probably the Best Korean Food in Dublin
Han Sung is bizarrely located in the back of an Asian supermarket just off the Millennium Walk on Great Strand Street. [...] You’d see this a lot in Korea, I ask, a restaurant in the back of a supermarket? Not really, no, he says.
(tags: restaurants food eating dublin supermarkets korean nom)
Behold: The Ultimate Crowdsourced Map of Punny Businesses in America | Atlas Obscura
"Spex in the City", "Fidler on the Tooth", "Sight For Four Eyes", "Fried Egg I'm In Love", "Lice Knowing You" and many more
-
this is quite nice. PipelineDB allows direct hookup of a Kafka stream, and will ingest durably and reliably, and provide SQL views computed over a sliding window of the stream.
(tags: logging sql kafka pipelinedb streaming sliding-window databases search querying)
the impact of the economic crisis on public funding for universities in Europe
Ireland leading the pack with a drop of funding by 20% :(
(tags: universities ireland ucd tcd dcu funding public-funding europe history downturn)
CurrencyFair P2P International Money Transfers
recommended by Paul Hickey
(tags: via:phickey money money-transfer currency currency-conversion tools recommendations)
How the banks ignored the lessons of the crash
First of all, banks could be chopped up into units that can safely go bust – meaning they could never blackmail us again. Banks should not have multiple activities going on under one roof with inherent conflicts of interest. Banks should not be allowed to build, sell or own overly complex financial products – clients should be able to comprehend what they buy and investors understand the balance sheet. Finally, the penalty should land on the same head as the bonus, meaning nobody should have more reason to lie awake at night worrying over the risks to the bank’s capital or reputation than the bankers themselves. You might expect all major political parties to have come out by now with their vision of a stable and productive financial sector. But this is not what has happened.
(tags: banks banking guardian finance europe eu crash history)
The price of the Internet of Things will be a vague dread of a malicious world
So the fact is that our experience of the world will increasingly come to reflect our experience of our computers and of the internet itself (not surprisingly, as it’ll be infused with both). Just as any user feels their computer to be a fairly unpredictable device full of programs they’ve never installed doing unknown things to which they’ve never agreed to benefit companies they’ve never heard of, inefficiently at best and actively malignant at worst (but how would you know?), cars, street lights, and even buildings will behave in the same vaguely suspicious way. Is your self-driving car deliberately slowing down to give priority to the higher-priced models? Is your green A/C really less efficient with a thermostat from a different company, or it’s just not trying as hard? And your tv is supposed to only use its camera to follow your gestural commands, but it’s a bit suspicious how it always offers Disney downloads when your children are sitting in front of it. None of those things are likely to be legal, but they are going to be profitable, and, with objects working actively to hide them from the government, not to mention from you, they’ll be hard to catch.
(tags: culture bots criticism ieet iot internet-of-things law regulation open-source appliances)
excellent offline mapping app MAPS.ME goes open source
"MAPS.ME is an open source cross-platform offline maps application, built on top of crowd-sourced OpenStreetMap data. It was publicly released for iOS and Android."
(tags: maps.me mapping maps open-source apache ios android mobile)
Eircode cost the Irish government EUR38m
The C&AG has said it is not clear that the €38m scheme will achieve the data-matching benefits the Government had hoped.
Well, that's putting it mildly.
(tags: eircode fail ireland costs money geo mapping geocoding)
Let a 1,000 flowers bloom. Then rip 999 of them out by the roots
The Twitter tech-debt story.
Somewhere along the way someone decided that it would be easier to convert the Birdcage to use Pants which had since learned how to build Scala and to deal with a maven-style layout. However at some point prior Pants had been open sourced in throw it over the wall fashion and picked up by a few engineers at other companies, such as Square and Foursquare and moved forward. In the meantime, again because there weren’t enough people whose job it was to take care of these things, Science was still on the original internally developed version and had in fact evolved independently of the open source version. However by the time we wanted to move Birdcage onto Pants, the open source version had moved ahead so that’s the one the Birdcage folks chose.
(cries)
(tags: tech-debt management twitter productivity engineering monorepo build-systems war-stories dev)
-
Amazing. This is what happens when embedded software engineers make a UI, in my experience
(tags: embedded-software ui ux design graphics windows the-horror omgwtf atms)
EPA opposed rules that would have exposed VW's cheating
[...] Two months ago, the EPA opposed some proposed measures that would help potentially expose subversive code like the so-called “defeat device” software VW allegedly used by allowing consumers and researchers to legally reverse-engineer the code used in vehicles. EPA opposed this, ironically, because the agency felt that allowing people to examine the software code in vehicles would potentially allow car owners to alter the software in ways that would produce more emissions in violation of the Clean Air Act. The issue involves the 1998 Digital Millennium Copyright Act (DMCA), which prohibits anyone from working around “technological protection measures” that limit access to copyrighted works. The Library of Congress, which oversees copyrights, can issue exemptions to those prohibitions that would make it legal, for example, for researchers to examine the code to uncover security vulnerabilities.
(tags: dmca volkswagen vw law code open-source air-quality diesel cheating regulation us-politics)
From Radio to Porn, British Spies Track Web Users’ Online Identities
Inside KARMA POLICE, GCHQ's mass-surveillance operation aimed to record the browsing habits of "every visible user on the internet", including UK-to-UK internal traffic. More details on the other GCHQ mass surveillance projects at https://theintercept.com/gchq-appendix/
(tags: surveillance gchq security privacy law uk ireland karma-police snooping)
Streaming will soon pass traditional TV - Tech Insider
the percentage of people who say they stream video from services like Netflix, YouTube, and Hulu each day has increased dramatically over the last five years, from about 30% in 2010 to more than 50% this year. During the same period, the percentage of people who say they watch traditional TV [...] has dropped by about 10%. When the beige line surpasses the purple line [looks like 2016], it will mean that more people are streaming each day than are watching traditional TV.
Is there a CAP theorem for Durability?
Marc Brooker with another thought-provoking blogpost
(tags: databases storage marc-brooker cap-theorem cap durability pacelc nosql)
-
(via Aman)
(tags: via:akohli graphics ascii-art ascii visualization text boxes diagrams)
Scale it to Billions — What They Don’t Tell you in the Cassandra README
large-scale C* tips
(tags: cassandra configuration tuning scale ops)
Introduction to HDFS Erasure Coding in Apache Hadoop
How Hadoop did EC. Erasure Coding support ("HDFS-EC") is set to be released in Hadoop 3.0 apparently
(tags: erasure-coding reed-solomon algorithms hadoop hdfs cloudera raid storage)
-
some details on Netflix's Chaos Monkey, Chaos Kong and other aspects of their availability/failover testing
(tags: architecture aws netflix ops chaos-monkey chaos-kong testing availability failover ha)
-
Træfɪk is a modern HTTP reverse proxy and load balancer made to deploy microservices with ease. It supports several backends (Docker, Mesos/Marathon, Consul, Etcd, Rest API, file...) to manage its configuration automatically and dynamically.
Hot-reloading is notably much easier than with nginx/haproxy.
-
a proxy that mucks with your system and application context, operating at Layers 4 and 7, allowing you to simulate common failure scenarios from the perspective of an application under test; such as an API or a web application. If you are building a distributed system, Muxy can help you test your resilience and fault tolerance patterns.
(tags: proxy distributed testing web http fault-tolerance failure injection tcp delay resilience error-handling)
Petabyte-Scale Data Pipelines with Docker, Luigi and Elastic Spot Instances — AdRoll
nice approach
(tags: data-pipelines docker luigi containers workflow)
-
a tool which simplifies tracing and testing of Java programs. Byteman allows you to insert extra Java code into your application, either as it is loaded during JVM startup or even after it has already started running. The injected code is allowed to access any of your data and call any application methods, including where they are private. You can inject code almost anywhere you want and there is no need to prepare the original source code in advance nor do you have to recompile, repackage or redeploy your application. In fact you can remove injected code and reinstall different code while the application continues to execute. The simplest use of Byteman is to install code which traces what your application is doing. This can be used for monitoring or debugging live deployments as well as for instrumenting code under test so that you can be sure it has operated correctly. By injecting code at very specific locations you can avoid the overheads which often arise when you switch on debug or product trace. Also, you decide what to trace when you run your application rather than when you write it so you don't need 100% hindsight to be able to obtain the information you need.
(tags: tracing java byteman injection jvm ops debugging testing)
Henry Robinson on testing and fault discovery in distributed systems
'Let's talk about finding bugs in distributed systems for a bit. These chaos monkey-style fault testing systems are all well and good, but by being application independent they're a very blunt instrument. Particularly they make it hard to search the fault space for bugs in a directed manner, because they don't 'know' what the system is doing. Application-aware scripting of faults in a dist. systems seems to be rarely used, but allows you to directly stress problem areas. For example, if a bug manifests itself only when one RPC returns after some timeout, hard to narrow that down with iptables manipulation. But allow a script to hook into RPC invocations (and other trace points, like DTrace's probes), and you can script very specific faults. That way you can simulate cross-system integration failures, *and* write reproducible tests for the bugs they expose! Anyhow, I've been doing this in Impala, and it's been very helpful. Haven't seen much evidence elsewhere.'
(tags: henry-robinson testing fault-discovery rpc dtrace tracing distributed-systems timeouts chaos-monkey impala)
The Best Bourbon Cocktail You’ve Never Heard Of
The "Paper Plane", by Sam Ross of Chicago's "Violet Hour": .75 oz Bourbon .75 oz Aperol .75 oz Amaro Nonino .75 oz Fresh lemon juice ice-filled shaker, shake, strain.
(tags: bourbon drinks cocktails recipes aperol amaro-nonino lemon)
-
C++ high-performance app framework; 'currently focused on high-throughput, low-latency I/O intensive applications.' Scylla (Cassandra-compatible NoSQL store) is written in this.
(tags: c++ opensource performance framework scylla seastar latency linux shared-nothing multicore)
How VW tricked the EPA's emissions testing system
In July 2015, CARB did some follow up testing and again the cars failed—the scrubber technology was present, but off most of the time. How this happened is pretty neat. Michigan’s Stefanopolou says computer sensors monitored the steering column. Under normal driving conditions, the column oscillates as the driver negotiates turns. But during emissions testing, the wheels of the car move, but the steering wheel doesn’t. That seems to have been the signal for the “defeat device” to turn the catalytic scrubber up to full power, allowing the car to pass the test. Stefanopolou believes the emissions testing trick that VW used probably isn’t widespread in the automotive industry. Carmakers just don’t have many diesels on the road. And now that number may go down even more.
Depressing stuff -- but at least they think VW's fraud wasn't widespread.
(tags: fraud volkswagen vw diesel emissions air-quality epa carb catalytic-converters testing)
EU court adviser: data-share deal with U.S. is invalid | Reuters
The Safe Harbor agreement does not do enough to protect EU citizen's private information when it reached the United States, Yves Bot, Advocate General at the European Court of Justice (ECJ), said. While his opinions are not binding, they tend to be followed by the court's judges, who are currently considering a complaint about the system in the wake of revelations from ex-National Security Agency contractor Edward Snowden of mass U.S. government surveillance.
(tags: safe-harbor law eu ec ecj snowden surveillance privacy us data max-schrems)
Summary of the Amazon DynamoDB Service Disruption and Related Impacts in the US-East Region
Painful to read, but: tl;dr: monitoring oversight, followed by a transient network glitch triggering IPC timeouts, which increased load due to lack of circuit breakers, creating a cascading failure
(tags: aws postmortem outages dynamodb ec2 post-mortems circuit-breakers monitoring)
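For reference, the circuit-breaker pattern mentioned in that tl;dr looks roughly like the sketch below: after a run of failures, callers fail fast for a cool-off period instead of piling more load onto a struggling dependency. This is a generic illustration only. The class names, thresholds and injected clock are all made up, and nothing here reflects how AWS built or fixed DynamoDB.

```python
import time

class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is refused."""

class CircuitBreaker:
    # Hypothetical defaults; real systems tune these per dependency.
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock          # injectable so tests need not sleep
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open: failing fast")
            # Cool-off elapsed: allow one trial call (half-open). A single
            # failure from here re-opens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0           # any success closes the circuit fully
        return result
```

Wrapping each remote call in `breaker.call(...)` means a timeout storm turns into cheap local errors and not into retries that amplify the original glitch.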
What Happens Next Will Amaze You
Maciej Ceglowski's latest talk, on ads, the web, Silicon Valley and government:
'I went to school with Bill. He's a nice guy. But making him immortal is not going to make life better for anyone in my city. It will just exacerbate the rent crisis.'
(tags: talks slides funny ads advertising internet web privacy surveillance maciej silicon-valley)
Frame of Reference and Roaring Bitmaps
interesting performance-oriented algorithm tweak from Elastic/Lucene
(tags: lucene elasticsearch performance optimization roaring-bitmaps bitmaps frame-of-reference integers algorithms)
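Roughly, frame-of-reference encoding stores a block of integers as small offsets from a reference value, packed at the narrowest bit width that fits the largest offset. The toy sketch below shows just that idea. It is not Lucene's implementation (which, as far as I know, works on deltas in fixed-size blocks), and the function names are made up.

```python
# Toy frame-of-reference codec. Assumptions: non-empty blocks of
# non-negative ints; a Python int stands in for the packed bit buffer.

def for_encode(block):
    ref = min(block)                              # the "frame of reference"
    offsets = [v - ref for v in block]
    width = max(max(offsets).bit_length(), 1)     # bits per packed value
    packed = 0
    for i, off in enumerate(offsets):
        packed |= off << (i * width)
    return ref, width, len(block), packed

def for_decode(ref, width, count, packed):
    mask = (1 << width) - 1
    return [ref + ((packed >> (i * width)) & mask) for i in range(count)]
```

With `[1000, 1003, 1007, 1012]` the offsets are 0, 3, 7 and 12, so each value needs 4 bits where a plain 32-bit int array would spend 32.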
Uber Goes Unconventional: Using Driver Phones as a Backup Datacenter - High Scalability
Initially I thought they were just tracking client state on the phone, but it actually sounds like they're replicating other users' state, too. Mad stuff! Must cost a fortune in additional data transfer costs...
(tags: scalability failover multi-dc uber replication state crdts)
Brotli: a new compression algorithm for the internet from Google
While Zopfli is Deflate-compatible, Brotli is a whole new data format. This new format allows us to get 20–26% higher compression ratios over Zopfli. In our study ‘Comparison of Brotli, Deflate, Zopfli, LZMA, LZHAM and Bzip2 Compression Algorithms’ we show that Brotli is roughly as fast as zlib’s Deflate implementation. At the same time, it compresses slightly more densely than LZMA and bzip2 on the Canterbury corpus. The higher data density is achieved by a 2nd order context modeling, re-use of entropy codes, larger memory window of past data and joint distribution codes. Just like Zopfli, the new algorithm is named after Swiss bakery products. Brötli means ‘small bread’ in Swiss German.
(tags: brotli zopfli deflate gzip compression algorithms swiss google)
-
'The key thing about Ubiquiti gear is the high quality radios and antennas. It just seems much more reliable than most consumer WiFi gear. Their airOS firmware is good too, it’s a bit complicated to set up but very capable and flexible. And in addition to normal 802.11n or 802.11ac they also have an optional proprietary TDMA protocol called airMax that’s designed for serving several long haul links from a single basestation. They’re mostly marketing to business customers but the equipment is sold retail and well documented for ordinary nerds to figure out.'
(tags: ubiquiti wifi wireless 802.11 via:nelson ethernet networking prosumer hardware wan)
-
a specialized packet sniffer designed for displaying and logging HTTP traffic. It is not intended to perform analysis itself, but to capture, parse, and log the traffic for later analysis. It can be run in real-time displaying the traffic as it is parsed, or as a daemon process that logs to an output file. It is written to be as lightweight and flexible as possible, so that it can be easily adaptable to different applications.
via Eoin Brazil
(tags: via:eoinbrazil httpry http networking tools ops testing tcpdump tracing)
ustwo Reimagines the In-Car Cluster
Designers behind the cult mobile game, Monument Valley, take on the legacy-bound in-car UI
(tags: ux ui cars driving safety ustwo monument-valley speed)
-
'It's very easy: So long as you don't hear "The Little Drummer Boy," you're a contender. As soon as you hear it on the radio, on TV, in a store, wherever, you're out.'
Geographically-accurate version of the London underground map
as Boing Boing says: 'London's subway system switched early to an abstract map (PDF), and it became a legendary work of design. It just published an internally-used geographic version of map (PDF), however, for the first time in a century—and it's awesome.'
(tags: london maps mapping geography accuracy pdf subway underground)
Critiki's top 10 tiki bars in the world
not a one in Europe, of course! I need to hit up one of these sometime
(tags: tiki bars drinks polynesian midcentury trader-vic critiki)
What is the fastest way to clone a git repository over a fast network connection? - Stack Overflow
"git bundle create" -- neat trick
(tags: git distribution copying git-bundle cli)
-
a regex-based, Turing-complete programming language. Its main feature is taking some text via standard input and repeatedly applying regex operations to it (e.g. matching, splitting, and most of all replacing). Under the hood, it uses .NET's regex engine, which means that both the .NET flavour and the ECMAScript flavour are available.
Reminiscent of sed(1); see http://codegolf.stackexchange.com/a/58166 for an example Retina program
(tags: retina regexps regexes regular-expressions coding hacks dot-net languages)
Time on multi-core, multi-socket servers
Nice update on the state of System.currentTimeMillis() and System.nanoTime() in javaland. Bottom line: both are non-monotonic nowadays:
The conclusion I've reached is that except for the special case of using nanoTime() in micro benchmarks, you may as well stick to currentTimeMillis() —knowing that it may sporadically jump forwards or backwards. Because if you switched to nanoTime(), you don't get any monotonicity guarantees, it doesn't relate to human time any more —and may be more likely to lead you into writing code which assumes a fast call with consistent, monotonic results.
(tags: java time monotonic sequencing nanotime timers jvm multicore distributed-computing)
Anatomy of a Modern Production Stack
Interesting post, but I think it falls into a common trap for the xoogler or ex-Amazonian -- assuming that all the BigCo mod cons are required to operate, when some are luxuries that can be skipped for a few years to get some real products built
(tags: architecture ops stack docker containerization deployment containers rkt coreos prod monitoring xooglers)
How We Use AWS Lambda for Rapidly Intensifying Workloads · CloudSploit
impressive -- pretty much the entire workload is run from Lambda here
(tags: lambda aws ec2 autoscaling cloudsploit)
Introducing the Software Testing Cupcake (Anti-Pattern)
good post on the risks of overweighting towards manual testing rather than low-level automated tests (via Tony Byrne)
(tags: qa testing via:tonyjbyrne tests antipatterns dev)
Kate Heddleston: How Our Engineering Environments Are Killing Diversity
'[There are] several problem areas for [diversity in] engineering environments and ways to start fixing them. The problems we face aren't devoid of solutions; there are a lot of things that companies, teams, and individuals can do to fix problems in their work environment. For the month of March, I will be posting detailed articles about the problem areas I will cover in my talk: argument cultures, feedback, promotions, employee on-boarding, benefits, safety, engineering process, and environment adaptation.' via Baron Schwartz.
(tags: via:xaprb culture tech diversity sexism feminism engineering work workplaces feedback)
-
'Heavily tinted blue paintings form space stations, spacesuits, and rockets just after blast. Michael Kagan paints these large-scale works to celebrate the man-made object—machinery that both protects and holds the possibility of instantly killing those that operate the equipment from the inside. To paint the large works, Kagan utilizes an impasto technique with thick strokes that are deliberate and unique, showing an aggression in his application of oil paint on linen. The New York-based artist focuses on iconic images in his practice, switching back and forth between abstract and representational styles. “The painting is finished when it can fall apart and come back together depending on how it is read and the closeness to the work,” said Kagan about his work. “Each painting is an image, a snapshot, a flash moment, a quick read that is locked into memory by the iconic silhouettes.”' Via http://www.thisiscolossal.com/2015/08/michael-kagens-space-paintings/
(tags: paintings prints art michael-kagan space abstract-art tobuy)
-
I’m assuming, if you are on the Internet and reading kind of a nerdy blog, that you know what Unicode is. At the very least, you have a very general understanding of it — maybe “it’s what gives us emoji”. That’s about as far as most people’s understanding extends, in my experience, even among programmers. And that’s a tragedy, because Unicode has a lot of… ah, depth to it. Not to say that Unicode is a terrible disaster — more that human language is a terrible disaster, and anything with the lofty goals of representing all of it is going to have some wrinkles. So here is a collection of curiosities I’ve encountered in dealing with Unicode that you generally only find out about through experience. Enjoy.
(tags: unicode characters encoding emoji utf-8 utf-16 utf mysql text)
httpbin(1): HTTP Client Testing Service
Testing an HTTP Library can become difficult sometimes. RequestBin is fantastic for testing POST requests, but doesn't let you control the response. This exists to cover all kinds of HTTP scenarios. Additional endpoints are being considered.
-
amazing slideshow/WebGL demo talking about graphics programming, its maths, and GPUs
(tags: maths graphics webgl demos coding algorithms slides tflops gpus)
‘I wish to register a complaint’: know your consumer rights before the fight
Conor Pope on the basics of consumer law -- and how to complain -- in Ireland
(tags: consumer ireland irish-times articles law)
-
an object pooling library for Java. Use it to recycle objects that are expensive to create. The library will take care of creating and destroying your objects in the background. Stormpot is very mature, is used in production, and has done over a trillion claim-release cycles in testing. It is faster and scales better than any competing pool.
Apache-licensed, and extremely fast: https://medium.com/@chrisvest/released-stormpot-2-4-eeab4aec86d0
(tags: java stormpot object-pooling object-pools pools allocation gc open-source apache performance)
Evolution of Babbel’s data pipeline on AWS: from SQS to Kinesis
Good "here's how we found it" blog post:
Our new data pipeline with Kinesis in place allows us to plug new consumers without causing any damage to the current system, so it’s possible to rewrite all Queue Workers one by one and replace them with Kinesis Workers. In general, the transition to Kinesis was smooth and there were not so tricky parts. Another outcome was significantly reduced costs – handling almost the same amount of data as SQS, Kinesis appeared to be many times cheaper than SQS.
(tags: aws kinesis kafka streaming data-pipelines streams sqs queues architecture kcl)
You're probably wrong about caching
Excellent cut-out-and-keep guide to why you should think twice before adding a caching layer. I've been following this practice for the past few years, after I realised that #6 (recovering from a failed cache is hard) is a killer -- I've seen a few large-scale outages where a production system had gained enough scale that it required a cache to operate, and once that cache was damaged, bringing the system back online required a painful rewarming protocol. Better to design for the non-cached case if possible.
(tags: architecture caching coding design caches ops production scalability)
The Alternative Universe Of Soviet Arcade Games
Unlike machines in the West, every single machine that was produced during Soviet-era Russia had to align with Marxist ideology. [...] The most popular games were created to teach hand-eye coordination, reaction speed, and logical, focused thinking. Not unlike many American games, these games were influenced by military training, crafted to teach and instill patriotism for the state by making the human body better, stronger, and more willful. It also means no high scores, no adrenaline rushes, or self-serving feather-fluffing as you add your hard-earned initials to the list of the best. In Communist Russia, there was no overt competition.
(tags: high-scores communism russia cccp ussr arcade-games games history)
Large Java HashMap performance overview
Large HashMap overview: JDK, FastUtil, Goldman Sachs, HPPC, Koloboke, Trove – January 2015 version
(tags: java performance hashmap hashmaps optimization fastutil hppc jdk koloboke trove data-structures)
-
Is it too late to replace Eircode?
Addresses are hard. Who can remember street addresses or latitude/longitude pairs? You could do much better with three totally random English words, but then there’s that pesky language barrier. No system is perfect, except for emoji.
(tags: eircode maps parody via:nelson location geocoding mapping pile-of-poo)
Real Time Analytics With Spark Streaming and Cassandra
...and Kafka
(tags: spark-streaming kafka analytics cassandra architecture data batch)
Improvements to Kafka integration of Spark Streaming
looks decent as an approach
(tags: kafka spark spark-streaming data)
Diffy: Testing services without writing tests
Play requests against 2 versions of a service. A fair bit more complex than simply replaying logged requests, which took 10 lines of a shell script last time I did it
(tags: http testing thrift automation twitter diffy diff soa tests)
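A rough sketch of the core idea, not Diffy's actual implementation: send the same request to two versions of a service and list the fields that differ. The two responses below are hypothetical hard-coded dicts standing in for real calls, and `diff_responses` is a made-up helper name.

```python
def diff_responses(old, new, path=""):
    """Return the field paths whose values differ between two JSON-like responses."""
    diffs = []
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new)):
            child = f"{path}.{key}" if path else str(key)
            if key not in old or key not in new:
                # Field exists in only one of the two versions.
                diffs.append(child)
            else:
                diffs.extend(diff_responses(old[key], new[key], child))
    elif old != new:
        diffs.append(path or "<root>")
    return diffs


# Hypothetical responses from the current and the candidate version.
primary = {"id": 1, "user": {"name": "ann", "age": 30}}
candidate = {"id": 1, "user": {"name": "ann", "age": 31}, "extra": True}

print(diff_responses(primary, candidate))  # ['extra', 'user.age']
```

A real setup would also need some way to ignore fields that legitimately vary between calls, such as timestamps.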
Gmail supports animated emoji in e-mail subjects
Currently only used in spam, naturally. (via Hilary Mason)
-
The Algorithmist is a resource dedicated to anything algorithms - from the practical realm, to the theoretical realm. There are also links and explanation to problemsets.
A wiki for algorithms. Not sure if this is likely to improve on Wikipedia, which of course covers the same subject matter quite well, though
(tags: algorithms reference wikis coding data-structures)
-
analyzes Spot price history to help you determine a bid price that suits your needs.
(tags: ec2 aws spot spot-instances history)
What Are the Worst Airports in the World?
this is a great resource when picking a stopover for a 2-stop flight. Pity "best kids play area" isn't a criterion
(tags: airports comparison via:boingboing flying travel ranking world skytrax)
Using Samsung's Internet-Enabled Refrigerator for Man-in-the-Middle Attacks
Whilst the fridge implements SSL, it FAILS to validate SSL certificates, thereby enabling man-in-the-middle attacks against most connections. This includes those made to Google's servers to download Gmail calendar information for the on-screen display. So, MITM the victim's fridge from next door, or on the road outside and you can potentially steal their Google credentials.
The Internet of Insecure Things strikes again.
(tags: iot security fridges samsung fail mitm ssl tls google papers defcon)
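For illustration, here is the difference between a TLS client that validates certificates and one that doesn't, sketched with Python's standard `ssl` module. This is an assumption-laden analogy; the fridge's real client code is not described in the quote.

```python
import ssl

# Default context: verifies the certificate chain and the hostname.
safe = ssl.create_default_context()

# Roughly what a non-validating client does: traffic is encrypted,
# but any certificate is accepted, so a MITM can present its own.
unsafe = ssl.create_default_context()
unsafe.check_hostname = False      # must be disabled before setting CERT_NONE
unsafe.verify_mode = ssl.CERT_NONE

print(safe.verify_mode == ssl.CERT_REQUIRED, safe.check_hostname)
print(unsafe.verify_mode == ssl.CERT_NONE, unsafe.check_hostname)
```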
Malware infecting jailbroken iPhones stole 225,000 Apple account logins | Ars Technica
KeyRaider, as the malware family has been dubbed, is distributed through a third-party repository of Cydia, which markets itself as an alternative to Apple's official App Store. Malicious code surreptitiously included with Cydia apps is creating problems for people in China and at least 17 other countries, including France, Russia, Japan, and the UK. Not only has it pilfered account data for 225,941 Apple accounts, it has also disabled some infected phones until users pay a ransom, and it has made unauthorized charges against some victims' accounts.
Ouch. Not a good sign for Cydia
(tags: cydia apple security exploits jailbreaking ios iphone malware keyraider china)
-
'a simple command line tool that turns your CLI tools into web applications'
-
a file system that stores all its data online using storage services like Google Storage, Amazon S3, or OpenStack. S3QL effectively provides a hard disk of dynamic, infinite capacity that can be accessed from any computer with internet access running Linux, FreeBSD or OS-X. S3QL is a standard conforming, full featured UNIX file system that is conceptually indistinguishable from any local file system. Furthermore, S3QL has additional features like compression, encryption, data de-duplication, immutable trees and snapshotting which make it especially suitable for online backup and archival. S3QL is designed to favor simplicity and elegance over performance and feature-creep. Care has been taken to make the source code as readable and serviceable as possible. Solid error detection and error handling have been included from the very first line, and S3QL comes with extensive automated test cases for all its components.
(tags: filesystems aws s3 storage unix google-storage openstack)
3 Lessons From The Amazon Takedown - Fortune
They are: The leaders we admire aren’t always that admirable; Economic performance and costs trump employee well-being; and people participate in and rationalize their own subjugation. 'In the end, “Amazonians” are not that different from other people in their psychological dynamics. Their company is just a more extreme case of what many other organizations regularly do. And most importantly, let’s locate the problem, if there is one, and its solution where it most appropriately belongs—not with a CEO who is greatly admired (and wealthy beyond measure) running a highly admired company, but with a society where money trumps human well-being and where any price, maybe even lives, is paid for status and success.' (via Lean)
(tags: amazon work work-life-balance life us fortune via:ldoody ceos employment happiness)
What does it take to make Google work at scale? [slides]
50-slide summary of Google's stack, compared vs Facebook, Yahoo!, and open-source-land, with the odd interesting architectural insight
(tags: google architecture slides scalability bigtable spanner facebook gfs storage)
Scaling Analytics at Amplitude
Good blog post on Amplitude's lambda architecture setup, based on S3 and a custom "real-time set database" they wrote themselves. antirez' comment from a Redis angle on the set database: http://antirez.com/news/92 HN thread: https://news.ycombinator.com/item?id=10118413
(tags: lambda-architecture analytics via:hn redis set-storage storage databases architecture s3 realtime)
-
toxy is a fully programmatic and hackable HTTP proxy to simulate server failure scenarios and unexpected network conditions. It was mainly designed for fuzzing/evil testing purposes, when toxy becomes particularly useful to cover fault tolerance and resiliency capabilities of a system, especially in service-oriented architectures, where toxy may act as intermediate proxy among services. toxy allows you to plug in poisons, optionally filtered by rules, which essentially can intercept and alter the HTTP flow as you need, performing multiple evil actions in the middle of that process, such as limiting the bandwidth, delaying TCP packets, injecting network jitter latency or replying with a custom error or status code.
(tags: toxy proxies proxy http mitm node.js soa network failures latency slowdown jitter bandwidth tcp)
Drone Oversight Is Coming to Construction Sites
Grim Meathook Future
(tags: grim-meathook-future drones work panopticon future sacramento building-sites)
-
Open source security team has had enough of embedded-systems vendors taking the piss with licensing:
This announcement is our public statement that we've had enough. Companies in the embedded industry not playing by the same rules as every other company using our software violates users' rights, misleads users and developers, and harms our ability to continue our work. Though I've only gone into depth in this announcement on the latest trademark violation against us, our experience with two GPL violations over the previous year have caused an incredible amount of frustration. These concerns are echoed by the complaints of many others about the treatment of the GPL by the embedded Linux industry in particular over many years. With that in mind, today's announcement is concerned with the future availability of our stable series of patches. We decided that it is unfair to our sponsors that the above mentioned unlawful players can get away with their activity. Therefore, two weeks from now, we will cease the public dissemination of the stable series and will make it available to sponsors only. The test series, unfit in our view for production use, will however continue to be available to the public to avoid impact to the Gentoo Hardened and Arch Linux communities. If this does not resolve the issue, despite strong indications that it will have a large impact, we may need to resort to a policy similar to Red Hat's, described here or eventually stop the stable series entirely as it will be an unsustainable development model.
(tags: culture gpl linux opensource security grsecurity via:nelson gentoo arch-linux gnu)
London Calling: Two-Factor Authentication Phishing From Iran
some rather rudimentary anti-2FA attempts, presumably from Iranian security services
(tags: authentication phishing security iran activism 2fa mfa)
Vegemite May Power The Electronics Of The Future
Professor Marc in het Panhuis at the ARC Centre of Excellence for Electromaterials Science figured out that you can 3D print the paste and use it to carry current, effectively creating Vegemite bio-wires. What does this mean? Soon you can run electricity through your food. “The iconic Australian Vegemite is ideal for 3D printing edible electronics,” said the professor. “It contains water so it’s not a solid and can easily be extruded using a 3D printer. Also, it’s salty, so it conducts electricity.”
I'm sure the same applies for Marmite...
(tags: vegemite marmite 3d-printing electronics bread food silly)
Beoir.org Community - Recent Attack on McGargles
bizarre conspiracy theory going around about McGargles microbrewery being owned by Molson in an "astroturf craft beer" operation -- they apparently were set up by a bunch of ex-Molson employees. Their beer is getting stickered in off-licenses. Mental!
(tags: beer craft-beer ireland mcgargles conspiracy-theories bizarre beoir)
Mining High-Speed Data Streams: The Hoeffding Tree Algorithm
This paper proposes a decision tree learner for data streams, the Hoeffding Tree algorithm, which comes with the guarantee that the learned decision tree is asymptotically nearly identical to that of a non-incremental learner using infinitely many examples. This work constitutes a significant step in developing methodology suitable for modern ‘big data’ challenges and has initiated a lot of follow-up research. The Hoeffding Tree algorithm has been covered in various textbooks and is available in several public domain tools, including the WEKA Data Mining platform.
(tags: hoeffding-tree algorithms data-structures streaming streams cep decision-trees ml learning papers)
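The guarantee rests on the Hoeffding bound: after n observations of a variable with range R, the true mean lies within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of the observed mean with probability 1 - delta. A minimal sketch of the split test follows; the function names and example numbers are mine, not from the paper.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that the true mean is within epsilon of the observed
    mean with probability 1 - delta, after n observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split once the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Hypothetical numbers: gain range 1.0 (two classes), delta 1e-7, 1000 examples.
print(hoeffding_bound(1.0, 1e-7, 1000))
print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
print(should_split(0.30, 0.25, 1.0, 1e-7, 1000))
```

Epsilon shrinks as n grows, so a node that can't decide yet just waits for more examples.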
Chinese scammers are now using Stingray tech to SMS-phish
A Stingray-style false GSM base station, hidden in a backpack; presumably they detect numbers in the vicinity, and SMS-spam those numbers with phishing messages. Reportedly the scammers used this trick in "Guangzhou, Zhuhai, Shenzhen, Changsha, Wuhan, Zhengzhou and other densely populated cities". Dodgy machine translation:
March 26, Zhengzhou police telecommunications fraud cases together, for the first time seized a small backpack can hide pseudo station equipment, and arrested two suspects. Yesterday, the police informed of this case, to remind the general public to pay attention to prevention. “I am the landlord, I changed number, please rent my wife hit the bank card, card number ×××, username ××.” Recently, Jiefang Road, Zhengzhou City Public Security Bureau police station received a number of cases for investigation brigade area of the masses police said, frequently received similar phone scam messages. Alarm, the police investigators to determine: the suspect may be in the vicinity of twenty-seven square, large-scale use of mobile pseudo-base release fraudulent information. [...] Yesterday afternoon, the Jiefang Road police station, the reporter saw the portable pseudo-base is made up of two batteries, a set-top box the size of the antenna box and a chassis, as well as a pocket computer composed together at most 5 kg.
(via t byfield and Danny O'Brien)
(tags: via:mala via:tbyfield privacy scams phishing sms gsm stingray base-stations mobile china)
In search of performance - how we shaved 200ms off every POST request — GoCardless Blog
tl;dr: don't use Ruby's Net::HTTP and/or HAProxy prior to 1.4.19
(tags: http ruby tcp nagle performance rtt networking haproxy ack curl)
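Going by the tags, the culprit is the classic interaction between Nagle's algorithm and delayed ACKs, which bites when a client writes a request as several small segments. One common mitigation is to set `TCP_NODELAY` on the socket. The sketch below shows that option in Python and is not necessarily the fix the GoCardless post describes.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable Nagle's algorithm so small writes are sent immediately
# instead of waiting for the previous segment to be ACKed.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(bool(nodelay))
sock.close()
```

The other usual fix is to write headers and body in a single call, so there's only one segment to send in the first place.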
Non-Celiac Gluten Sensitivity May Not Exist
The data clearly indicated that a nocebo effect, the same reaction that prompts some people to get sick from wind turbines and wireless internet, was at work here. Patients reported gastrointestinal distress without any apparent physical cause. Gluten wasn't the culprit; the cause was likely psychological. Participants expected the diets to make them sick, and so they did.
(tags: gluten placebo nocebo food science health diet gluten-free fodmaps)
-
Some nice real-world experimentation around large-scale data processing in differential dataflow:
If you wanted to do an iterative graph computation like PageRank, it would literally be faster to sort the edges from scratch each and every iteration, than to use unsorted edges. If you want to do graph computation, please sort your edges. Actually, you know what: if you want to do any big data computation, please sort your records. Stop talking sass about how Hadoop sorts things it doesn't need to, read some papers, run some tests, and then sort your damned data. Or at least run faster than me when I sort your data for you.
(tags: algorithms graphs coding data-processing big-data differential-dataflow radix-sort sorting x-stream counting-sort pagerank)
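Sorting edges needn't mean a comparison sort. Since node IDs are small integers, a counting sort on the source node runs in linear time. A minimal sketch, assuming edges are (src, dst) tuples with IDs in 0..num_nodes-1; the function name is mine, not from the post.

```python
def sort_edges_by_source(edges, num_nodes):
    """Stable counting sort of (src, dst) edges by src, in O(E + V)."""
    counts = [0] * (num_nodes + 1)
    for src, _ in edges:
        counts[src + 1] += 1
    # Prefix sums: counts[s] becomes the output offset of the first edge from s.
    for i in range(num_nodes):
        counts[i + 1] += counts[i]
    out = [None] * len(edges)
    for edge in edges:
        out[counts[edge[0]]] = edge
        counts[edge[0]] += 1
    return out


edges = [(2, 0), (0, 1), (1, 2), (0, 2)]
print(sort_edges_by_source(edges, 3))  # [(0, 1), (0, 2), (1, 2), (2, 0)]
```

With edges grouped by source, each iteration of something like PageRank reads every node's outgoing edges as one contiguous run.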
Docker image creation, tagging and traceability in Shippable
this is starting to look quite impressive as a well-integrated Docker-meets-CI model; Shippable is basing its builds off Docker baselines and is automatically cutting Docker images of the post-CI stage. Must take another look