Why Disqus made the Python->Go switchover
for their realtime component, from the horse's mouth:
at higher contention, the CPU was choking everything. Switching over to Go removed that contention for us, which was the primary issue that we were seeing.
(tags: python languages concurrency go threading gevent scalability disqus realtime hn)
Database Migrations Done Right
The rule is simple. You should never tie database migrations to application deploys or vice versa. By minimising dependencies you enable faster, easier and cleaner deployments.
A solid description of why this is a good idea, from an ex-Guardian dev.
(tags: migrations database sql mysql postgres deployment ops dependencies loose-coupling)
-
some cute brooches/jewellery here, for the next time I need to pick a nice gift
(tags: julie-moon art magic-pony jewellery brooches gifts)
Building a large scale CDN with Apache Traffic Server
via Ilya Grigorik: 'Great under-the-hood look at how Comcast built and operates their internal CDN for delivering video (on-demand + live). Some highlights: switched to own (open-source) stack; ~250 servers pushing ~1.5Pb of data/day with ~5Pb of storage capacity.'
(tags: cdn comcast video presentations apache traffic-server vod)
An analysis of Facebook photo caching
excellent analysis of caching behaviour at scale, from the FB engineering blog (via Tony Finch)
(tags: via:fanf caching facebook architecture photos images cache fifo lru scalability)
-
good advice. next time I go over, I'll have to get a Clipper card. Also: 'Brunch is its own section because I have never encountered a place that takes brunch so seriously.'
(tags: brunch sf travel california tips san-francisco clipper-card)
Aleksey Shipilev on Java's System.nanoTime()
System.nanoTime is as bad as String.intern now: you can use it, but use it wisely. The latency, granularity, and scalability effects introduced by timers may and will affect your measurements if done without proper rigor. This is one of the many reasons why System.nanoTime should be abstracted from the users by benchmarking frameworks, monitoring tools, profilers, and other tools written by people who have time to track if the underlying platform is capable of doing what we want it to do. In some cases, there is no good solution to the problem at hand. Some things are not directly measurable. Some things are measurable with unpractical overheads. Internalize that fact, weep a little, and move on to building the indirect experiments. This is not the Wonderland, Alice. Understanding how the Universe works often needs side routes to explore. In all seriousness, we should be happy our $1000 hardware can measure 30 nanosecond intervals pretty reliably. This is roughly the time needed for the Internet packets originating from my home router to leave my apartment. What else do you want, you spoiled brats?
(tags: benchmarking jdk java measurement nanoseconds nsecs nanotime jvm alexey-shipilev jmh)
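The latency and granularity effects mentioned in the quote are easy to poke at yourself. A rough sketch, using Python's `time.perf_counter_ns()` as a stand-in for Java's `System.nanoTime()` (my substitution for illustration; the article measures the JVM call). The function names and sample counts are made up, and the numbers will vary by OS and hardware.

```python
import time


def timer_latency_ns(samples: int = 100_000) -> float:
    """Average cost of one timer call in nanoseconds (includes loop overhead)."""
    start = time.perf_counter_ns()
    for _ in range(samples):
        time.perf_counter_ns()
    elapsed = time.perf_counter_ns() - start
    return elapsed / samples


def timer_granularity_ns(samples: int = 10_000) -> int:
    """Smallest non-zero step observed between consecutive timer readings."""
    smallest = None
    for _ in range(samples):
        first = time.perf_counter_ns()
        second = time.perf_counter_ns()
        # Spin until the clock visibly advances.
        while second == first:
            second = time.perf_counter_ns()
        delta = second - first
        if smallest is None or delta < smallest:
            smallest = delta
    return smallest


if __name__ == "__main__":
    print("latency (ns/call):", timer_latency_ns())
    print("granularity (ns): ", timer_granularity_ns())
```

If the interval you want to measure is not much bigger than either number, the timer is a large part of what you are measuring, which is the article's argument for leaving this to a benchmarking framework.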
-
aka. "zero-shot learning". ok starting point
(tags: machine-learning zero-shot unsupervised algorithms ml)
-
Ilya Grigorik describes the design of the Bitcoin/altcoin block chain algorithm. Illuminating writeup
(tags: algorithms bitcoin security crypto blockchain ilya-grigorik)
-
The aim of the docker plugin is to be able to use a docker host to dynamically provision a slave, run a single build, then tear-down that slave. Optionally, the container can be committed, so that (for example) manual QA could be performed by the container being imported into a local docker provider, and run from there.
The holy grail of Jenkins/Docker integration. How cool is that...
(tags: jenkins docker ops testing ec2 hosting scaling elastic-scaling system-testing)
-
an OSI layer 6 presentation for encoding/decoding messages in binary format to support low-latency applications. [...] SBE follows a number of design principles to achieve this goal. Adhering to these design principles sometimes means features available in other codecs will not be offered. For example, many codecs allow strings to be encoded at any field position in a message; SBE only allows variable length fields, such as strings, as fields grouped at the end of a message. The SBE reference implementation consists of a compiler that takes a message schema as input and then generates language specific stubs. The stubs are used to directly encode and decode messages from buffers. The SBE tool can also generate a binary representation of the schema that can be used for the on-the-fly decoding of messages in a dynamic environment, such as for a log viewer or network sniffer. The design principles drive the implementation of a codec that ensures messages are streamed through memory without backtracking, copying, or unnecessary allocation. Memory access patterns should not be underestimated in the design of a high-performance application. Low-latency systems in any language especially need to consider all allocation to avoid the resulting issues in reclamation. This applies for both managed runtime and native languages. SBE is totally allocation free in all three language implementations. The end result of applying these design principles is a codec that has ~25X greater throughput than Google Protocol Buffers (GPB) with very low and predictable latency. This has been observed in micro-benchmarks and real-world application use. A typical market data message can be encoded, or decoded, in ~25ns compared to ~1000ns for the same message with GPB on the same hardware. XML and FIX tag value messages are orders of magnitude slower again.
The sweet spot for SBE is as a codec for structured data that is mostly fixed size fields which are numbers, bitsets, enums, and arrays. While it does work for strings and blobs, many may find some of the restrictions a usability issue. These users would be better off with another codec more suited to string encoding.
(tags: sbe encoding protobuf protocol-buffers json messages messaging binary formats low-latency martin-thompson xml)
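To make the "fixed-size fields first, variable-length fields at the end" rule concrete, here is a minimal sketch using Python's `struct` module. This is not SBE's wire format or its generated API; the message layout (price, quantity, side, then a length-prefixed symbol) is a hypothetical example, and unlike SBE this Python version allocates freely.

```python
import struct

# Hypothetical fixed-size block: price (int64), quantity (uint32), side (uint8).
# "<" selects little-endian with no padding, so every offset is known up front.
FIXED = struct.Struct("<qIB")
# The variable-length field is prefixed with a 16-bit length.
LENGTH_PREFIX = struct.Struct("<H")


def encode(price: int, quantity: int, side: int, symbol: str) -> bytes:
    """Write fixed fields at fixed offsets, then the string at the very end."""
    data = symbol.encode("utf-8")
    buf = bytearray(FIXED.size + LENGTH_PREFIX.size + len(data))
    FIXED.pack_into(buf, 0, price, quantity, side)
    LENGTH_PREFIX.pack_into(buf, FIXED.size, len(data))
    buf[FIXED.size + LENGTH_PREFIX.size:] = data
    return bytes(buf)


def decode(buf: bytes):
    """Read the message front to back in a single forward pass."""
    price, quantity, side = FIXED.unpack_from(buf, 0)
    (length,) = LENGTH_PREFIX.unpack_from(buf, FIXED.size)
    start = FIXED.size + LENGTH_PREFIX.size
    symbol = buf[start:start + length].decode("utf-8")
    return price, quantity, side, symbol
```

Because the string comes last, the numeric fields can be read at constant offsets without parsing anything before them, which is the property the restriction buys.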
Observations of an Internet Middleman
That leaves the remaining six [consumer ISPs peering with Level3] with congestion on almost all of the interconnect ports between us. Congestion that is permanent, has been in place for well over a year and where our peer refuses to augment capacity. They are deliberately harming the service they deliver to their paying customers. They are not allowing us to fulfil the requests their customers make for content. Five of those congested peers are in the United States and one is in Europe. There are none in any other part of the world. All six are large Broadband consumer networks with a dominant or exclusive market share in their local market. In countries or markets where consumers have multiple Broadband choices (like the UK) there are no congested peers.
Amazing that L3 are happy to publish this -- that's where big monopoly ISPs have led their industry.
(tags: net-neutrality networking internet level3 congestion isps us-politics)
interview with Google VP of SRE Ben Treynor
interviewed by Niall Murphy, no less ;). Some good info on what Google deems important from an ops/SRE perspective
(tags: sre ops devops google monitoring interviews ben-treynor)
Faster BAM Sorting with SAMtools and RocksDB
Now this is really really clever. Heap-merging a heavyweight genomics format, using RocksDB to speed it up.
There’s a problem with the single-pass merge described above when the number of intermediate files, N/R, is large. Merging the sorted intermediate files in limited memory requires constantly reading little bits from all those files, incurring a lot of disk seeks on rotating drives. In fact, at some point, samtools sort performance becomes effectively bound to disk seeking. [...] In this scenario, samtools rocksort can sort the same data in much less time, using no more memory, by invoking RocksDB’s background compaction capabilities. With a few extra lines of code we configure RocksDB so that, while we’re still in the process of loading the BAM data, it runs additional background threads to merge batches of existing sorted temporary files into fewer, larger, sorted files. Just like the final merge, each background compaction requires only a modest amount of working memory.
(via the RocksDB facebook group)
(tags: rocksdb algorithms sorting leveldb bam samtools merging heaps compaction)
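The general shape of the technique, sketched in Python with in-memory lists standing in for the sorted temporary files. This is neither samtools nor RocksDB code; `sorted_runs`, `merge_runs`, `compact` and the `fan_in` parameter are hypothetical names for illustration only.

```python
import heapq
from itertools import islice


def sorted_runs(records, run_size):
    """Split the input into sorted runs of at most run_size records."""
    it = iter(records)
    while True:
        run = sorted(islice(it, run_size))
        if not run:
            return
        yield run


def merge_runs(runs):
    """Lazily k-way merge the sorted runs using a heap (the final merge)."""
    return heapq.merge(*runs)


def compact(runs, fan_in):
    """Merge groups of fan_in runs into fewer, larger sorted runs."""
    runs = list(runs)
    return [list(heapq.merge(*runs[i:i + fan_in]))
            for i in range(0, len(runs), fan_in)]
```

With real files, every run in the final merge costs a seek each time its read buffer empties, so running something like `compact` in the background while data is still loading cuts the number of runs the final merge has to juggle.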
Coding For Life (Battery Life, That Is)
great presentation on Android mobile battery life, and what to avoid
(tags: presentations via:sergio android mobile battery battery-life 3g wifi gprs hardware)
Oisin's mobile app release checklist
'This form is to document the testing that has been done on each app version before submitting to the App Store. For each item, indicate Yes if the testing has been done, Not Applicable if the testing does not apply (eg testing audio for an app that doesn’t play any), or No if the testing has not been done for another reason.'
(tags: apps checklists release coding ios android mobile ohurley)
"A New Data Structure For Cumulative Frequency Tables"
paper by Peter M Fenwick, 1993. 'A new method (the ‘binary indexed tree’) is presented for maintaining the cumulative frequencies which are needed to support dynamic arithmetic data compression. It is based on a decomposition of the cumulative frequencies into portions which parallel the binary representation of the index of the table element (or symbol). The operations to traverse the data structure are based on the binary coding of the index. In comparison with previous methods, the binary indexed tree is faster, using more compact data and simpler code. The access time for all operations is either constant or proportional to the logarithm of the table size. In conjunction with the compact data structure, this makes the new method particularly suitable for large symbol alphabets.' via Jakob Buchgraber, who's implementing it right now in Netty ;)
(tags: netty frequency-tables data-structures algorithms coding binary-tree indexing compression symbol-alphabets)
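For reference, a minimal sketch of the binary indexed tree the abstract describes, in Python. The class and method names are my own, not taken from the paper or from Netty.

```python
class FenwickTree:
    """Cumulative frequency table with O(log n) update and prefix query."""

    def __init__(self, size):
        self.size = size
        self.tree = [0] * (size + 1)  # stored 1-based; slot 0 is unused

    def add(self, index, delta):
        """Add delta to the frequency of symbol `index` (0-based)."""
        i = index + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i  # step to the next node whose range covers this one

    def prefix_sum(self, count):
        """Cumulative frequency of the first `count` symbols."""
        total = 0
        i = count
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total
```

The `i & -i` trick isolates the lowest set bit of the index, which is the "decomposition that parallels the binary representation of the index" in the abstract: each slot holds the sum of a block whose length is that bit.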
-
Patent trolls have sued or threatened to sue tens of thousands of end-users. For example, Innovatio attacked cafes, bakeries, and even a funeral parlor for using off-the-shelf Wi-Fi routers. And the notorious scanner troll, MPHJ, targeted small businesses and nonprofits around the country for using ordinary office equipment. As a recent paper explained: “Mass suits against technology customers have become too common, involving building block technologies like wi-fi, scanning, email and website technologies.” The growth in patent suits against customers reveals the importance of the Limelight case. A ruling that made it even easier to sue customers (by allowing suits against someone who performs just some steps of a patent) would encourage patent trolls to launch more abusive litigation campaigns. We hope the Supreme Court will restore the sensible rule that only a single entity (or its agents) can infringe a patent.
(tags: patents uspto swpats eff consumer law legal patent-infringement scanners wifi printers)
Hanging on the telephone – has anyone got it right on the new ban on text driving?
Some good legal commentary on this new Irish law.
There has been much hand-wringing and concern about whether or not the 2014 Regulations prohibit the use of Google Maps or Hailo, for example. They don’t, but this does not mean that drivers should feel free to use non-texting functions of their phones while driving – holding a mobile phone (which could include a tablet) while driving remains prohibited, whatever the use it is being put to. Moreover, offences of dangerous and careless driving and driving without due care and attention could cover a wide range of bad driving, and could include, for example, driving while zooming in and out of maps on your phone or sending stickers on WhatsApp.
(tags: ireland law driving safety mobile-phones texting google-maps satnav)
-
'better dates and times for Python', to fix the absurd proliferation of slightly-incompatible Python date/time types and APIs. unfortunately, http://imgs.xkcd.com/comics/standards.png applies....
(tags: python libraries time dates timestamps timezones apis proliferation iso-8601)
Holdings: Guinness's Brewery Dublin
'Guinness's Brewery Dublin. Malt House, malt on floor; sign' - One of the photos taken by my great-grandfather, Thomas H. Mason, around the turn of the century from the NLI collection
(tags: nli ireland photos t-h-mason history dublin guinness maltings beer)
Published image: 'An Irish Village'.
'Cart, man/woman; 2 men and boy serving beer outside, + sign 'Rich King Spirits'. Ragged attire' - One of the photos taken by my great-grandfather, Thomas H. Mason, around the turn of the century from the NLI collection
-
One of the photos taken by my great-grandfather, Thomas H. Mason, around the turn of the century from the NLI collection
-
One of the photos taken by my great-grandfather, Thomas H. Mason, around the turn of the century from the NLI collection
(tags: ireland history science chemistry crystals t-h-mason photos)
-
One of the SmartStack developers at Airbnb responds to Consul.io's comments. FWIW, we use SmartStack at Swrve and it works pretty well...
(tags: smartstack airbnb ops consul serf load-balancing availability resiliency network-partitions outages)
A Closer Look At OC's Anti-Vaccination Cluster
In communities such as San Clemente, Laguna Beach, Laguna Niguel, Aliso Viejo, Mission Viejo and Capistrano Beach, where Dr. Bob Sears practices, there are clusters of unvaccinated children. Last year, at 15 of the 40 elementary schools in the Capistrano Unified School District, more than 10 percent of kindergartners had [Personal Belief exemptions], according to data from the California Department of Public Health. At one public charter school, Journey, 56 percent of kindergartners were unvaccinated, at least partially, due to their parents' beliefs.
This is going to end horribly. Typical OC
(tags: orange-county health vaccination laguna-beach oc dr-bob-sears kindergarten measles mumps rubella pertussis epidemiology)
-
Today we’re open sourcing Secor, a zero data loss log persistence service whose initial use case was to save logs produced by our monetization pipeline. Secor persists Kafka logs to long-term storage such as Amazon S3. It’s not affected by S3’s weak eventual consistency model, incurs no data loss, scales horizontally, and optionally partitions data based on date.
(tags: pinterest hadoop secor storm kafka architecture s3 logs archival)
Coping with the TCP TIME-WAIT state on busy Linux servers
extensive blog post
(tags: networking linux tcp performance time-wait sysctls tuning)
Flood IO Offering Network Emulation
Performance-testing-as-a-service company Flood.IO now offering emulation of various crappy end-user networks: GSM, DSL, GPRS, 3G, 4G etc. Great idea.
(tags: flood.io performance networking internet load-testing testing jmeter gatling tests gsm 3g mobile simulation)
-
Disqus' realtime architecture -- nginx PushStream module doing the heavy lifting, basically. See https://gist.github.com/dctrwatson/0b3b52050254e273ff11 for the production nginx configs they use. I am very impressed that push-stream has grown to be so solid; it's a great way to deal with push from the sounds of it. http://blog.disqus.com/post/51155103801/trying-out-this-go-thing now notes that some of the realtime backends are in Go. https://speakerdeck.com/dctrwatson/c1m-and-nginx ("C1M and Nginx") is a more up to date presentation. It notes that PushStream supports "EventSource, WebSocket, Long Polling, and forever iframe". More sysctls and nginx tuning in that prez.
(tags: sysctl nginx tuning go disqus realtime push eventsource websockets long-polling iframe python)
'The Design And Implementation Of Modern Column-Oriented Database Systems'
paper, PDF; Daniel Abadi et al.
(tags: papers pdf columnar-stores column-oriented databases storage architecture algorithms)
'Pickles & Spores: Improving Support for Distributed Programming in Scala'
'Spores are "small units of possibly mobile functional behavior". They're a closure-like abstraction meant for use in distributed or concurrent environments. Spores provide a guarantee that the environment is effectively immutable, and safe to ship over the wire. Spores aim to give library authors some confidence in exposing functions (or, rather, spores) in public APIs for safe consumption in a distributed or concurrent environment. The first part of the talk covers a simpler variant of spores as they are proposed for inclusion in Scala 2.11. The second part of the talk briefly introduces a current research project ongoing at EPFL which leverages Scala's type system to provide type constraints that give authors finer-grained control over spore capturing semantics. What's more, these type constraints can be composed during spore composition, so library authors are effectively able to propagate expert knowledge via these composable constraints. The last part of the talk briefly covers Scala/Pickling, a fast new, open serialization framework.'
(tags: pickling scala presentations spores closures fp immutability coding distributed distcomp serialization formats network)
BBC News - Microsoft 'must release' data held on Dublin server
Messy. I can't see this lasting beyond an appeal.
Law enforcement efforts would be seriously impeded and the burden on the government would be substantial if they had to co-ordinate with foreign governments to obtain this sort of information from internet service providers such as Microsoft and Google, Judge Francis said. In a blog post, Microsoft's deputy general counsel, David Howard, said: "A US prosecutor cannot obtain a US warrant to search someone's home located in another country, just as another country's prosecutor cannot obtain a court order in her home country to conduct a search in the United States. "We think the same rules should apply in the online world, but the government disagrees."
(tags: microsoft regions law us-law privacy google cloud international-law surveillance)
Russia passes bill requiring bloggers to register with government
A bill passed by the Russian parliament on Tuesday says that any blogger read by at least 3,000 people a day has to register with the government telecom watchdog and follow the same rules as those imposed by Russian law on mass media. These include privacy safeguards, the obligation to check all facts, silent days before elections and loose but threatening injunctions against "abetting terrorism" and "extremism."
Russian blogging platforms have responded by changing view-counter tickers to display "2500+" as a max.
(tags: russia blogs blogging terrorism extremism internet regulation chilling-effects censorship)
-
as used in the Apollo guidance computer systems -- hand-woven by "little old ladies". Amazing
(tags: core-memory memory rope-core guidance apollo space nasa history 1960s via:hn)
How I used Heartbleed to steal a site’s private crypto key
good writeup from Robin Xu
(tags: robin-xu heartbleed rsa private-keys openssl hacking security tls ssl)
Making runbooks more useful by exposing them through monitoring
Nice example of an ops runbook wiki for a service
(tags: runbooks ops devops monitoring sysadmin documentation wiki)
-
Ubuntu, C*, HAProxy, MySQL, RDS, multiple AWS regions.
(tags: hailo cassandra ubuntu mysql rds aws ec2 haproxy architecture)
They called it "big iron" for a reason: the Cray Motor-Generator Unit
I think the deal with the Motor-Generator Unit was that the Cray 1 needed not just enormous amounts of power (over a hundred kilowatts!), but also very stable power. So it ran from a huge electric generator connected directly to a huge electric motor, the motor running from dirty grid power and the generator, in turn, feeding the computer's own multi-voltage PSU. The Cray 1 itself weighed a mere 2.4 tonnes, but all this support stuff added several more tonnes.
via RobS.
(tags: via:rob-synnott cray history big-iron motors power electricity generators)
Go: Best Practices for Production Environments
how Soundcloud deploy their Go services, after 2.5 years of Go in production
(tags: go tips deployment best-practices soundcloud ops)
-
'Location Codes for Irish Addresses'. Looks like, as expected, this will not be available under no-cost licensing terms; companies and non-profit orgs will all have to pay Capita Business Support Services Ireland for access. boo.
(tags: eircode mapping addressing geocoding ireland open-source licensing postcodes)
Why your company shouldn’t use Git submodules
'It is not uncommon at all when working on any kind of larger-scale project with Git to find yourself wanting to share code between multiple different repositories – whether it be some core system among multiple different products built on top of that system, or perhaps a shared utility library between projects. At first glance, Git submodules seem to be the perfect answer for this: they come built-in with Git, they act like miniature repositories (so people are already familiar with how to change them), et cetera. They even support pointing at specific versions of the shared code, so if one project doesn’t want to deal with integrating the “latest and greatest” version, it doesn’t have to. It’s after you’ve actually worked with submodules for a while that you start to notice just how half-baked Git’s submodules system really is.'
(tags: git source-control revision-control submodules storage)
Eyes Over Compton: How Police Spied on a Whole City
The law-enforcement pervasive-surveillance CCTV PVR.
In a secret test of mass surveillance technology, the Los Angeles County Sheriff's Department sent a civilian aircraft* over Compton, California, capturing high-resolution video of everything that happened inside that 10-square-mile municipality. Compton residents weren't told about the spying, which happened in 2012. "We literally watched all of Compton during the times that we were flying, so we could zoom in anywhere within the city of Compton and follow cars and see people," Ross McNutt of Persistence Surveillance Systems told the Center for Investigative Reporting, which unearthed and did the first reporting on this important story. The technology he's trying to sell to police departments all over America can stay aloft for up to six hours. Like Google Earth, it enables police to zoom in on certain areas. And like TiVo, it permits them to rewind, so that they can look back and see what happened anywhere they weren't watching in real time.
(via New Aesthetic)
(tags: pvr cctv law-enforcement police compton los-angeles law surveillance future)
-
Blog post from the ES team. They use "evil tests" -- basically unit/system tests, particularly using randomized error-injecting mock infrastructure. Good practices; I've done the same myself quite recently for Swrve's realtime infrastructure
(tags: elasticsearch resiliency network-partitions reliability testing mocking error-injection)
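The basic shape of a randomized error-injecting test, sketched in Python. This is not Elasticsearch's test framework; `FlakyTransport`, `send_with_retry` and the failure rate are hypothetical stand-ins.

```python
import random


class FlakyTransport:
    """Hypothetical mock network layer that randomly fails sends."""

    def __init__(self, seed, failure_rate=0.3):
        self.rng = random.Random(seed)  # seeded, so any failure is reproducible
        self.failure_rate = failure_rate
        self.delivered = []

    def send(self, message):
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected failure")
        self.delivered.append(message)


def send_with_retry(transport, message, attempts=50):
    """Code under test: retry until the send succeeds or attempts run out."""
    for _ in range(attempts):
        try:
            transport.send(message)
            return True
        except ConnectionError:
            continue
    return False


def run_randomized_test(seed, messages=100):
    """One randomized run; the seed in the assertion message allows a replay."""
    transport = FlakyTransport(seed)
    for n in range(messages):
        assert send_with_retry(transport, n), f"seed {seed}: message {n} lost"
    assert transport.delivered == list(range(messages))
```

The important detail is logging the seed: when a run fails, rerunning with the same seed replays exactly the same sequence of injected errors.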
Meet Ireland’s first bitcoin politician
Ossian Smyth -- Green Party internet spokesman and representative for communications, energy, and natural resources, with a top wheeze: “I think it is one of the most transparent ways of receiving donations. No one would know how much money can be donated into a bank account, but with bitcoin anyone can go to the block chain and look at the wallet.” excellent ;)
(tags: ossian-smyth bitcoin fundraising greens politics ireland dublin green-party internet)
OpenPostCode demolishes the planned "Eircode" postcode system
Comprehensively ripped to shreds. Bottom line: 'Postcodes will be largely meaningless to anyone without access to the pay-walled database. It is another tax on business.'
(tags: postcodes ireland eircode addressing geocoding mapping maps open-data)
Netflix comes out strongly against Comcast
In sum, Comcast is not charging Netflix for transit service. It is charging Netflix for access to its subscribers. Comcast also charges its subscribers for access to Internet content providers like Netflix. In this way, Comcast is double dipping by getting both its subscribers and Internet content providers to pay for access to each other.
FIGHT!
(tags: netflix comcast network-neutrality cartels competition us-politics business isps)
co-founder of the Boston Beer Company swears by active dry yeast as a hangover-avoidance remedy
what [Joe] Owades knew was that active dry yeast has an enzyme in it called alcohol dehydrogenases (ADH). Roughly put, ADH is able to break alcohol molecules down into their constituent parts of carbon, hydrogen, and oxygen. Which is the same thing that happens when your body metabolizes alcohol in its liver. Owades realized if you also have that enzyme in your stomach when the alcohol first hits it, the ADH will begin breaking it down before it gets into your bloodstream and, thus, your brain.
Plausible!
(tags: beer science health yeast alcohol adh medicine enzymes stomach food)
-
At Comcast, our applications need convenient, low-latency access to important reference datasets. For example, our XfinityTV websites and apps need to use entertainment-related data to serve almost every API or web request to our datacenters: information like what year Casablanca was released, or how many episodes were in Season 7 of Seinfeld, or when the next episode of the Voice will be airing (and on which channel!). We traditionally managed this information with a combination of relational databases and RESTful web services but yearned for something simpler than the ORM, HTTP client, and cache management code our developers dealt with on a daily basis. As main memory sizes on commodity servers continued to grow, however, we asked ourselves: How can we keep this reference data entirely in RAM, while ensuring it gets updated as needed and is easily accessible to application developers? The Sirius distributed system library is our answer to that question, and we're happy to announce that we've made it available as an open source project. Sirius is written in Scala and uses the Akka actor system under the covers, but is easily usable by any JVM-based language.
Also includes a Paxos implementation with "fast follower" read-only slave replication. ASL2-licensed open source. The only thing I can spot to be worried about is speed of startup; they note that apps need to replay a log at startup to rebuild state, which can be slow if unoptimized in my experience. Update: in a twitter conversation at https://twitter.com/jon_moore/status/459363751893139456 , Jon Moore indicated they haven't had problems with this even with 'datasets consuming 10-20GB of heap', and have 'benchmarked a 5-node Sirius ingest cluster up to 1k updates/sec write throughput.' That's pretty solid!
(tags: open-source comcast paxos replication read-only datastores storage memory memcached redis sirius scala akka jvm libraries)
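To illustrate the startup concern: rebuilding in-memory state means replaying every logged update in order, so startup time grows with the length of the log, not the size of the live data. A minimal sketch in Python; the JSON log format and function names are hypothetical, not Sirius's API, and the compaction step is just one generic mitigation, not a claim about what Sirius does.

```python
import json


def apply_update(state, entry):
    """Apply one logged update to the in-memory state."""
    if entry["op"] == "put":
        state[entry["key"]] = entry["value"]
    elif entry["op"] == "delete":
        state.pop(entry["key"], None)


def replay(log_lines):
    """Rebuild state from scratch by replaying every update in order."""
    state = {}
    for line in log_lines:
        apply_update(state, json.loads(line))
    return state


def compact(log_lines):
    """Rewrite the log as one put per live key, so the next replay is shorter."""
    state = replay(log_lines)
    return [json.dumps({"op": "put", "key": k, "value": v})
            for k, v in state.items()]
```

A key that has been overwritten a thousand times costs a thousand replay steps in the raw log and one in the compacted log.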
AWS Elastic Beanstalk for Docker
This is pretty amazing. nice work, Beanstalk team. not sure how well it integrates with the rest of AWS though
(tags: aws amazon docker ec2 beanstalk ops containers linux)
TDD is dead. Long live testing
Oh god. I agree with DHH. shoot me now.
Test-first units leads to an overly complex web of intermediary objects and indirection in order to avoid doing anything that's "slow". Like hitting the database. Or file IO. Or going through the browser to test the whole system. It's given birth to some truly horrendous monstrosities of architecture. A dense jungle of service objects, command patterns, and worse. I rarely unit test in the traditional sense of the word, where all dependencies are mocked out, and thousands of tests can close in seconds. It just hasn't been a useful way of dealing with the testing of Rails applications. I test active record models directly, letting them hit the database, and through the use of fixtures. Then layered on top is currently a set of controller tests, but I'd much rather replace those with even higher level system tests through Capybara or similar. I think that's the direction we're heading. Less emphasis on unit tests, because we're no longer doing test-first as a design practice, and more emphasis on, yes, slow, system tests.
(tags: tdd rails testing unit-tests system-tests integration-testing ruby dhh mocks)
All at sea: global shipping fleet exposed to hacking threat | Reuters
Hackers recently shut down a floating oil rig by tilting it, while another rig was so riddled with computer malware that it took 19 days to make it seaworthy again; Somali pirates help choose their targets by viewing navigational data online, prompting ships to either turn off their navigational devices, or fake the data so it looks like they're somewhere else; and hackers infiltrated computers connected to the Belgian port of Antwerp, located specific containers, made off with their smuggled drugs and deleted the records.
(via Mikko Hypponen)
(tags: via:mikko security hacking oilrigs shipping ships maritime antwerp piracy malware)
Search Results - (Author:Thomas H Mason)
Photographs taken by my great-grandfather, Thomas H. Mason, in the National Library of Ireland's newly-digitized online collection
(tags: family thomas-h-mason history ireland photography archive nli)
Syria's lethal Facebook checkpoints
An anonymous tip from a highly reliable source: "There are checkpoints in Syria where your Facebook is checked for affiliation with the rebellious groups or individuals aligned with the rebellion. People are then disappeared or killed if they are found to be connected. Drivers are literally forced to load their Facebook/Twitter accounts and then they are riffled through. It's happening daily, and has been for a year at least."
(tags: boing-boing war facebook social-media twitter internet checkpoints syria)
The ancient Egyptian word for 'cat' was pronounced 'miaow'
and many other cool-but-true factoids
(tags: fun quora interesting via:dorothy factoids true urban-legends facts)
-
kellabyte's hack in progress -- 'an asynchronous HTTP server framework written in C. The goal of Haywire is to learn how to create a server with a minimal feature set that can handle a high rate of requests and connections with as low of latency and resource usage as possible. Haywire uses the event loop based libuv platform layer that node.js is built on top of (also written in C). libuv abstracts IOCP on Windows and epoll/kqueue/event ports/etc. on Unix systems to provide efficient asynchronous I/O on all supported platforms.' Outperforms libevent handily, it seems. Apache-licensed.
(tags: server http asynchronous libuv haywire kellabyte c events open-source asl2)
spoofing the samsung smart tv internet check
If this kind of bullshit -- a HTTP GET of an XML file from www.samsung.com -- is how the Samsung Smart TV firmware decides if the internet is working or not, I dread to think how crappy the rest of the code is. (At least in Netnote we performed a bunch of bigco-domain DNS lookups before giving up...)
(tags: smart-tv samsung fail xml http internet embedded-software firmware crap-code)
ImperialViolet - No, don't enable revocation checking
...because it doesn't stop attacks. Turning it on does nothing but slow things down. You can tell when something is security theater because you need some absurdly specific situation in order for it to be useful.
(tags: cryptography crypto heartbleed ssl security tls https internet revocation crls)
-
*Really* intriguing slide deck on how Asia and Africa have invented new ways of operating a business via the internet, and are turning globalisation upside down (via Yoz)
(tags: via:yoz africa asia globalisation internet web mobile payment business ecommerce global)
Using AWS in the context of Australian Privacy Considerations
interesting new white paper from Amazon regarding recent strengthening of the Aussie privacy laws, particularly w.r.t. geographic location of data and access by overseas law enforcement agencies...
(tags: amazon aws security law privacy data-protection ec2 s3 nsa gchq five-eyes)
For world’s biggest troll, first patent case ends up in tatters
Love it. Intellectual Ventures suffers a major bloody nose in IV/Capital One patent-trolling litigation
(tags: trolls patent-trolls patents swpats capital-one intellectual-ventures)
Notes On Concurrent Ring Buffer Queue Mechanics
great notes from Nitsan Wakart, who's been hacking on ringbuffers a lot in JAQ
(tags: jaq nitsanw atomic concurrency data-structures ring-buffers queueing queues algorithms)
Uplink Latency of WiFi and 4G Networks
It's high. WiFi in particular shows high variability and long latency tails.
-
vim-flake8 is a Vim plugin that runs the currently open file through Flake8, a static syntax and style checker for Python source code. It supersedes both vim-pyflakes and vim-pep8. Flake8 is a wrapper around PyFlakes (static syntax checker), PEP8 (style checker) and Ned's McCabe script (complexity checker).
Recommended by several pythonistas of my acquaintance!
(tags: vim python syntax error-checking errors flake8 editors ides coding)
-
Nice-looking new tool from Hashicorp; a service discovery and configuration service, built on Raft for leader election, Serf for gossip-based messaging, and Go. Some features:
* Gossip is performed over both TCP and UDP;
* gossip messages are encrypted symmetrically and therefore secure from eavesdropping, tampering, spoofing and packet corruption (like the incident which brought down S3 for hours: http://status.aws.amazon.com/s3-20080720.html );
* exposes both an HTTP interface and (even better) DNS;
* includes explicit support for long-distance WAN operation as well as on LANs.
It all looks very practical and usable. MPL-licensed.
The only potential risk I can see is that expecting to receive config updates from a blocking poll of the HTTP interface needs some good "best practice" docs, to ensure that people don't mishandle the scenario where there is a network partition between your calling code and the Consul server/agent. Without any heartbeating protocol behind the scenes, HTTP is vulnerable to "hung connections", which would result in a config change being silently missed by the client until the connection eventually times out, either in the calling code or in the client-side kernel. This could potentially take minutes to occur, which in some usage scenarios could be a big, unforeseen problem.
(tags: configuration service-discovery distcomp raft consensus-algorithms go mpl open-source dns http gossip-protocol hashicorp)
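To make the hung-connection worry concrete, here is a minimal sketch of a long-poll loop that always sets a client-side timeout slightly longer than the server-side wait, so a silently dead connection gets dropped and re-polled rather than sitting there for minutes. The `index`/`wait` query parameters and the `X-Consul-Index` response header are as I understand Consul's blocking queries to work; the `watch` and `http_fetch` helpers, the key name and all the timings are made up for illustration.

```python
import time
import urllib.request


def blocking_query_url(base, path, index, wait_s):
    """Build a blocking-query URL: the server holds the request open until
    the index changes or wait_s elapses."""
    return f"{base}{path}?index={index}&wait={wait_s}s"


def http_fetch(url, timeout_s):
    """GET url with a hard client-side timeout; returns (index, body)."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return int(resp.headers["X-Consul-Index"]), resp.read()


def watch(base, path, on_change, fetch=http_fetch, wait_s=60, slack_s=10,
          retry_delay_s=1.0, max_polls=None):
    """Long-poll for changes; never block longer than wait_s + slack_s."""
    index, polls = 0, 0
    while max_polls is None or polls < max_polls:
        polls += 1
        url = blocking_query_url(base, path, index, wait_s)
        try:
            new_index, body = fetch(url, wait_s + slack_s)
        except OSError:
            # Timeout or connection failure: the connection may have hung,
            # so drop it and poll again from the last index we saw.
            time.sleep(retry_delay_s)
            continue
        if new_index != index:
            index = new_index
            on_change(body)
    return index
```

Usage would be something like `watch("http://127.0.0.1:8500", "/v1/kv/app/config", handle_update)`. The point is only that the worst-case delay before noticing a missed change is bounded by `wait_s + slack_s`, instead of whatever the kernel's TCP timeouts happen to be.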
Druid | How We Scaled HyperLogLog: Three Real-World Optimizations
3 optimizations Druid.io have made to the HLL algorithm to scale it up for production use in Metamarkets: compacting registers (fixes a bug with unions of multiple HLLs); a sparse storage format (to optimize space); faster lookups using a lookup table.
(tags: druid.io metamarkets scaling hyperloglog hll algorithms performance optimization counting estimation)
HyperLogLog - Intersection Arithmetic
'In general HLL intersection in StreamLib works. |A INTERSECT B| = |A| + |B| - |A UNION B|. Timon's article on intersection is important to read though. The usefulness of HLL intersection depends on the features of the HLLs you are intersecting.'
(tags: hyperloglog hll hyperloglogplus streamlib intersections sets estimation algorithms)
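A quick sketch of why "the usefulness of HLL intersection depends on the features of the HLLs": the inclusion-exclusion formula subtracts large estimates to get a possibly small result, so the errors of all three estimates pile up on the intersection. The function names and the crude "errors simply add" bound are my own assumptions, not StreamLib's API.

```python
def intersection_estimate(card_a, card_b, card_union):
    """Inclusion-exclusion: |A INTERSECT B| = |A| + |B| - |A UNION B|.
    Clamped at zero, since noisy estimates can make the raw value negative."""
    return max(0.0, card_a + card_b - card_union)


def worst_case_relative_error(card_a, card_b, card_union, rel_err):
    """Crude bound: assume each of the three estimates may be off by
    rel_err of its true value, and that the errors all add up."""
    true_intersection = card_a + card_b - card_union
    absolute_error = rel_err * (card_a + card_b + card_union)
    return absolute_error / true_intersection


# Two sets of a million items each, with 1% error per estimate:
small_overlap = worst_case_relative_error(1_000_000, 1_000_000, 1_990_000, 0.01)
large_overlap = worst_case_relative_error(1_000_000, 1_000_000, 1_100_000, 0.01)
print(f"10k overlap: up to {small_overlap:.0%} off; 900k overlap: up to {large_overlap:.1%} off")
```

So with a tiny overlap between two big sets, the estimate can be off by several multiples of the true answer, while a large overlap stays within a few percent.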
Structural Integrity | 99% Invisible
'The student (who has since been lost to history) was studying Citicorp Center as part of his thesis and had found that the building was particularly vulnerable to quartering winds (winds that strike the building at its corners). Normally, buildings are strongest at their corners, and it’s the perpendicular winds (winds that strike the building at its face) that cause the greatest strain. But this was not a normal building. LeMessurier had accounted for the perpendicular winds, but not the quartering winds. He checked the math, and found that the student was right. He compared what velocity winds the building could withstand with weather data, and found that a storm strong enough to topple Citicorp Center hits New York City every 55 years. But that’s only if the tuned mass damper, which keeps the building stable, is running. LeMessurier realized that a major storm could cause a blackout and render the tuned mass damper inoperable. Without the tuned mass damper, LeMessurier calculated that a storm powerful enough to take out the building hits New York every sixteen years.'
(tags: william-lemessurier architecture danger risk buildings nyc citicorp-center wind mass-dampers physics)
Linode announces new instance specs
'TL;DR: SSDs + Insane network + Faster processors + Double the RAM + Hourly Billing'
(tags: hosting linode ssd performance linux ops datacenters)
-
Fcron is a scheduler. It aims at replacing Vixie Cron, so it implements most of its functionalities. But contrary to Vixie Cron, fcron does not need your system to be up 7 days a week, 24 hours a day: it also works well with systems which are running only occasionally (contrary to anacrontab). In other words, fcron does both the job of Vixie Cron and anacron, but does even more and better :)) ...
Thanks Craig!
(tags: via:chughes cron fcron unix linux ops scheduler automation scripts)
-
They've done the classic website-redesign screwup -- omitted redirects from the old URLs.
Sam Silverwood-Cope, director of Intelligent Positioning, said: "They've ignored the legacy of the old Ryanair.com. It's quite startling. They are doing it just before their busiest time of the year." A change in [URLs] without proper redirects means many results found by Google now simply return error pages, he added. "Unless redirects get put in pretty soon, the position is going to get worse and worse."
(tags: ryanair inept fail funny via:christinebohan web google search redirects)
-
Scarfolk is a town in North West England that did not progress beyond 1979. Instead, the entire decade of the 1970s loops ad infinitum. Here in Scarfolk, pagan rituals blend seamlessly with science; hauntology is a compulsory subject at school, and everyone must be in bed by 8pm because they are perpetually running a slight fever. "Visit Scarfolk today. Our number one priority is keeping rabies at bay." For more information please reread.
(tags: scarfolk 1970s england history funny humour public-information pagan morbid)
-
OpenBSD are going wild ripping out "arcane VMS hacks" in an attempt to render OpenSSL's source code comprehensible, and finding amazing horrors like this: 'Well, even if time() isn't random, your RSA private key is probably pretty random. Do not feed RSA private key information to the random subsystem as entropy. It might be fed to a pluggable random subsystem…. What were they thinking?!'
(tags: random security openssl openbsd coding horror rsa private-keys entropy)
-
This is something Jenkins have come up with to randomize and distribute load, in order to avoid the "thundering-herd" bug. Good call.
(tags: jenkins randomization load-balancing load thundering-herd ops capacity sleep)
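One common way to get this kind of spreading, in the spirit of Jenkins's `H` token in cron specs, is to hash the job name into a stable slot rather than picking a fresh random delay on each run. This is a sketch of the general idea only, not Jenkins's actual implementation; `stable_offset` and `hourly_cron_line` are hypothetical names.

```python
import hashlib


def stable_offset(job_name, period):
    """Map a job name to a fixed slot in [0, period).
    Hash-based, so it is the same on every run and every host."""
    digest = hashlib.sha256(job_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % period


def hourly_cron_line(job_name):
    """Cron schedule that runs once an hour, at this job's own minute."""
    return f"{stable_offset(job_name, 60)} * * * *"


for name in ("build-website", "nightly-backup", "rotate-logs"):
    print(name, "->", hourly_cron_line(name))
```

Each job keeps a predictable schedule, but a few hundred jobs configured as "hourly" no longer all fire at minute 0.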
Shared Space and other bad junction designs lead to crashes and injuries
Just because something is "Dutch", that doesn't mean it's good. The Netherlands has many excellent examples, but you have to be very selective about what serves as a model. Cyclists fare best where their interactions with motor vehicles are limited and controlled. They fare best where infrastructure ensures that minor mistakes do not result in injuries. Anywhere that we rely upon everyone behaving perfectly but where we do not protect the most vulnerable, there will be injuries. Good design takes human nature into account and removes the causes of danger from those who are most vulnerable.
via Tony Finch
(tags: cycling design junctions shared-space dutch holland roads safety crashes)
-
A sane Google Protocol Buffers library for Ruby. It's all about being Buf; ProtoBuf.
(tags: protobuf google protocol-buffers ruby coding libraries gems open-source)
-
When I said that we expected better of OpenSSL, it’s not merely that there’s some sense that security-driven code should be of higher quality. (OpenSSL is legendary for being considered a mess, internally.) It’s that the number of systems that depend on it, and then expose that dependency to the outside world, are considerable. This is security’s largest contributed dependency, but it’s not necessarily the software ecosystem’s largest dependency. Many, maybe even more systems depend on web servers like Apache, nginx, and IIS. We fear vulnerabilities significantly more in libz than libbz2 than libxz, because more servers will decompress untrusted gzip over bzip2 over xz. Vulnerabilities are not always in obvious places – people underestimate just how exposed things like libxml and libcurl and libjpeg are. And as HD Moore showed me some time ago, the embedded space is its own universe of pain, with 90’s bugs covering entire countries. If we accept that a software dependency becomes Critical Infrastructure at some level of economic dependency, the game becomes identifying those dependencies, and delivering direct technical and even financial support. What are the one million most important lines of code that are reachable by attackers, and least covered by defenders? (The browsers, for example, are very reachable by attackers but actually defended pretty zealously – FFMPEG public is not FFMPEG in Chrome.) Note that not all code, even in the same project, is equally exposed. It’s tempting to say it’s a needle in a haystack. But I promise you this: Anybody patches Linux/net/ipv4/tcp_input.c (which handles inbound network for Linux), a hundred alerts are fired and many of them are not to individuals anyone would call friendly. One guy, one night, patched OpenSSL. Not enough defenders noticed, and it took Neel Mehta to do something.
(tags: development openssl heartbleed ssl security dan-kaminsky infrastructure libraries open-source dependencies)
-
'a command line tool for Amazon's Simple Storage Service (S3). Written in Python, easy_install the package to install as an egg. Supports multithreaded operations for large volumes. Put, get, or delete many items concurrently, using a fixed-size pool of threads. Built on workerpool for multithreading and boto for access to the Amazon S3 API. Unix-friendly input and output. Pipe things in, out, and all around.' MIT-licensed open source. (via Paul Dolan)
(tags: via:pdolan s3 s3funnel tools ops aws python mit open-source)
-
The intuition behind Hydra is something like this: "I have a lot of data, and there are a lot of things I could try to learn about it -- so many that I'm not even sure what I want to know." It's about the curse of dimensionality -- more dimensions means exponentially more cost for exhaustive analysis. Hydra tries to make it easy to reduce the number of dimensions, or the cost of watching them (via probabilistic data structures), to just the right point where everything runs quickly but can still answer almost any question you think you might care about.
Code: https://github.com/addthis/hydra Getting Started blog post: https://www.addthis.com/blog/2014/02/18/getting-started-with-hydra/
(tags: hydra hadoop data-processing big-data trees clusters analysis)
Stalled SCP and Hanging TCP Connections
a Cisco fail.
It looks like there’s a firewall in the middle that’s doing additional TCP sequence randomisation which was a good thing, but has been fixed in all current operating systems. Unfortunately, it seems that firewall doesn’t understand TCP SACK, which when coupled with a small amount of packet loss and a stateful host firewall that blocks invalid packets results in TCP connections that stall randomly. A little digging revealed that firewall to be the Cisco Firewall Services Module on our Canterbury network border.
(via Tony Finch)
(tags: via:fanf cisco networking firewalls scp tcp hangs sack tcpdump)
Akamai's "Secure Heap" patch wasn't good enough
'Having the private keys inaccessible is a good defense in depth move. For this patch to work you have to make sure all sensitive values are stored in the secure area, not just check that the area looks inaccessible. You can't do that by keeping the private key in the same process. A review by a security engineer would have prevented a false sense of security. A version where the private key and the calculations are in a separate process would be more secure. If you decide to write that version, I'll gladly see if I can break that too.' Akamai's response: https://blogs.akamai.com/2014/04/heartbleed-update-v3.html -- to their credit, they recognise that they need to take further action. (via Tony Finch)
(tags: via:fanf cryptography openssl heartbleed akamai security ssl tls)
-
Colm MacCarthaigh writes about a simple sharding/load-balancing algorithm which uses randomized instance selection and optional additional compartmentalization. See also: consistent hashing, and http://aphyr.com/posts/278-timelike-2-everything-fails-all-the-time
(tags: hashing load-balancing sharding partitions dist-sys distcomp architecture coding)
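Roughly how I picture the randomized instance selection: each customer gets a small pseudo-random subset of the fleet, chosen deterministically from their ID, so two customers rarely share their whole subset and one bad customer can only take out its own few instances. A minimal sketch under those assumptions; `shuffle_shard` is a made-up name, and a real system would probably derive the picks from a hash rather than lean on Python's `random` internals staying stable across versions.

```python
import random


def shuffle_shard(customer_id, instances, shard_size):
    """Pick shard_size instances for this customer.
    Seeding the RNG with the customer ID makes the choice repeatable."""
    rng = random.Random(customer_id)
    return sorted(rng.sample(instances, shard_size))


fleet = [f"node-{i}" for i in range(8)]
for customer in ("alice", "bob", "carol"):
    print(customer, "->", shuffle_shard(customer, fleet, 2))
```

With 8 instances and shards of 2 there are 28 possible shards, so any two customers have only a small chance of landing on exactly the same pair.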
Open Crypto Audit Project: TrueCrypt
phase I, a source code audit by iSEC Partners, is now complete. Bruce Schneier says: "I'm still using it".
(tags: encryption security crypto truecrypt audits source-code isec matthew-green)
-
In the PNAS paper, Brad Bushman and colleagues looked at 107 couples over 21 days and found that people experiencing uncharacteristically low blood sugar were more likely to display anger toward their spouse. (The researchers measured this by having subjects stick needles into voodoo dolls representing their significant others.)
(tags: hangry hunger food eating science health blood-sugar voodoo-dolls glucose)
insane ESB health and safety policy
Where it is not possible to avoid reversing, it is ESB policy that staff driving on behalf of the company or anybody on company premises should reverse into car spaces/bays, allowing them to drive out subsequently.
BUT WHYYYYYYYYYY
(tags: esb health-n-safety policies crazy funny driving reversing lol safety)
Cloudflare demonstrate Heartbleed key extraction
from nginx. 'Based on the findings, we recommend everyone reissue + revoke their private keys.'
(tags: security nginx heartbleed ssl tls exploits private-keys)
When two-factor authentication is not enough
Fastmail.FM nearly had their domain stolen through an attack exploiting missing 2FA protection at Gandi.
An important lesson learned is that just because a provider has a checkbox labelled “2 factor authentication” in their feature list, the two factors may not be protecting everything – and they may not even realise that fact themselves. Security risks always come on the unexpected paths – the “off label” uses that you didn’t think about, and the subtle interaction of multiple features which are useful and correct in isolation.
(tags: gandi 2fa fastmail authentication security mfa two-factor-authentication mail)
Of Money, Responsibility, and Pride
Steve Marquess of the OpenSSL Foundation on their funding, and lack thereof:
I stand in awe of their talent and dedication, that of Stephen Henson in particular. It takes nerves of steel to work for many years on hundreds of thousands of lines of very complex code, with every line of code you touch visible to the world, knowing that code is used by banks, firewalls, weapons systems, web sites, smart phones, industry, government, everywhere. Knowing that you’ll be ignored and unappreciated until something goes wrong. The combination of the personality to handle that kind of pressure with the relevant technical skills and experience to effectively work on such software is a rare commodity, and those who have it are likely to already be a valued, well-rewarded, and jealously guarded resource of some company or worthy cause. For those reasons OpenSSL will always be undermanned, but the present situation can and should be improved. There should be at least a half dozen full time OpenSSL team members, not just one, able to concentrate on the care and feeding of OpenSSL without having to hustle commercial work. If you’re a corporate or government decision maker in a position to do something about it, give it some thought. Please. I’m getting old and weary and I’d like to retire someday.
(tags: funding open-source openssl heartbleed internet security money)
-
a system for building agents that perform automated tasks for you online. They can read the web, watch for events, and take actions on your behalf. Huginn's Agents create and consume events, propagating them along a directed event flow graph. Think of it as Yahoo! Pipes plus IFTTT on your own server. You always know who has your data. You do.
MIT-licensed open source, built on Rails.
(tags: ifttt automation huginn ruby rails open-source agents)
Why no SSL ? — Varnish version 4.0.0 documentation
Poul-Henning Kemp details why Varnish doesn't do SSL -- basically due to the quality and complexity of open-source SSL implementations:
There is no other way we can guarantee that secret krypto-bits do not leak anywhere they should not, than by fencing in the code that deals with them in a child process, so the bulk of varnish never gets anywhere near the certificates, not even during a core-dump.
Now looking pretty smart, post-Heartbleed.
(tags: ssl tls varnish open-source poul-henning-kemp https http proxies security coding)
Basho LevelDB supports tiered storage
Tiered storage is turning out to be a pretty practical trick to take advantage of SSDs:
The justification for two types/speeds of storage arrays is simple. leveldb is extremely write intensive in its lower levels. The write intensity drops off as the level number increases. Similarly, current and frequently updated data tends to be in lower levels while archival data tends to be in higher levels. These leveldb characteristics create a desire to have faster, more expensive storage arrays for the high intensity lower levels. This branch allows the high intensity lower levels to be on expensive storage arrays while slower, less expensive storage arrays to hold the higher level data to reduce costs.
(tags: caching tiered-storage storage ssds ebs leveldb basho patches riak iops)
Forbes on the skeleton crew nature of OpenSSL
This is a great point:
Obviously, those tending to the security protocols that support the rest of the Web need better infrastructure and more funding. “Large portions of the software infrastructure of the Internet are built and maintained by volunteers, who get little reward when their code works well but are blamed, and sometimes savagely derided, when it fails,” writes Foster in the New Yorker. [...] "money and support still tend to flow to the newest and sexiest projects, while boring but essential elements like OpenSSL limp along as volunteer efforts,” he writes. “It’s easy to take open-source software for granted, and to forget that the Internet we use every day depends in part on the freely donated work of thousands of programmers.” We need to find ways to pay for work that is currently essentially donated freely. One promising project is Bithub, from Whisper Systems, where people who make valuable contributions to open source projects are rewarded (with Bitcoin of course). But the pool of Bitcoin is still donation based. The Internet has helped create a culture of free, but what we may need to recognize is that we get what we pay for. Well-funded companies pulling critical code from open source projects for their sites should have formal fee arrangements, rather than the volunteer group simply hoping these users will pony up some Benjamins for “prominent logo placement” on a website most people had never heard of before Heartbleed.
(tags: open-source openssl free sponsorship forbes via:karl-whelan)
The Heartbleed Hit List: The Passwords You Need to Change Right Now
Good list
-
An excellent list of aspects of the Heartbleed OpenSSL bug which need to be thought about/talked about/considered
(tags: heartbleed openssl bugs exploits security ssl tls web https)
Does the heartbleed vulnerability affect clients as severely?
'Yes, clients are vulnerable to attack. A malicious server can use the Heartbleed vulnerability to compromise an affected client.' Ouch.
R3: Announcing the next generation of Amazon EC2 Memory-optimized instances
pricing announced, and available in most regions
Scalable Atomic Visibility with RAMP Transactions
Great new distcomp protocol work from Peter Bailis et al:
We’ve developed three new algorithms—called Read Atomic Multi-Partition (RAMP) Transactions—for ensuring atomic visibility in partitioned (sharded) databases: either all of a transaction’s updates are observed, or none are. [...] How they work: RAMP transactions allow readers and writers to proceed concurrently. Operations race, but readers autonomously detect the races and repair any non-atomic reads. The write protocol ensures readers never stall waiting for writes to arrive. Why they scale: Clients can’t cause other clients to stall (via synchronization independence) and clients only have to contact the servers responsible for items in their transactions (via partition independence). As a consequence, there’s no mutual exclusion or synchronous coordination across servers. The end result: RAMP transactions outperform existing approaches across a variety of workloads, and, for a workload of 95% reads, RAMP transactions scale to over 7 million ops/second on 100 servers at less than 5% overhead.
(tags: scale synchronization databases distcomp distributed ramp transactions scalability peter-bailis protocols sharding concurrency atomic partitions)
-
wow, WFMU's "Beware of the Blog" did a really good job of assembling these classic 1960s Cambodian psych mp3s, featuring Sinn Sisamouth and Ros Sereysothea among others. I still love the start of Ma Pi Naok....
(tags: mp3 download music cambodia psychedelia wfmu 1960s sinn-sisamouth ros-sereysothea)
MICA: A Holistic Approach To Fast In-Memory Key-Value Storage [paper]
Very interesting new approach to building a scalable in-memory K/V store. As Rajiv Kurian notes on the mechanical-sympathy list: 'The basic idea is that each core is responsible for a portion of the key-space and requests are forwarded to the right core, avoiding multiple-writer scenarios. This is opposed to designs like memcache which uses locks and shared memory. Some of the things I found interesting: The single writer design is taken to an extreme. Clients assist the partitioning of requests, by calculating hashes before submitting GET requests. It uses Intel DPDK instead of sockets to forward packets to the right core, without processing the packet on any core. Each core is paired with a dedicated RX/TX queue. The design for a lossy cache is simple but interesting. It does things like replacing a hash slot (instead of chaining) etc. to take advantage of the lossy nature of caches. There is a lossless design too. A bunch of tricks to optimize for memory performance. This includes pre-allocation, design of the hash indexes, prefetching tricks etc. There are some other concurrency tricks that were interesting. Handling dangling pointers was one of them.' Source code here: https://github.com/efficient/mica
(tags: mica in-memory memory ram key-value-stores storage smp dpdk multicore memcached concurrency)
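To illustrate two of the ideas in that summary -- each key belongs to exactly one partition, and the lossy cache overwrites a hash slot instead of chaining -- here is a toy, single-threaded sketch. It has nothing of MICA's real machinery (no DPDK, no per-core queues), and the class names, hash choice and slot layout are my own assumptions.

```python
import zlib


class LossyPartition:
    """One core's slice of the key space: fixed slots, overwrite on collision."""

    def __init__(self, n_slots):
        self.slots = [None] * n_slots

    def put(self, slot, key, value):
        self.slots[slot] = (key, value)  # evicts whatever was there before

    def get(self, slot, key):
        entry = self.slots[slot]
        if entry is not None and entry[0] == key:
            return entry[1]
        return None  # empty slot, or a different key has replaced ours


class PartitionedCache:
    def __init__(self, n_partitions, slots_per_partition):
        self.n_slots = slots_per_partition
        self.partitions = [LossyPartition(slots_per_partition)
                           for _ in range(n_partitions)]

    def locate(self, key):
        """Client-side step: hash the key once, derive owner and slot."""
        h = zlib.crc32(key.encode("utf-8"))
        owner = h % len(self.partitions)
        slot = (h // len(self.partitions)) % self.n_slots
        return owner, slot

    def put(self, key, value):
        owner, slot = self.locate(key)
        self.partitions[owner].put(slot, key, value)

    def get(self, key):
        owner, slot = self.locate(key)
        return self.partitions[owner].get(slot, key)
```

Because only the owning partition ever touches its slots, there is a single writer per slot and no locking; the price is that a colliding `put` silently drops the older entry, which is acceptable for a cache.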
Google's Open Bidder stack moving from Jetty to Netty
Open Bidder traditionally used Jetty as an embedded webserver, for the critical tasks of accepting connections, processing HTTP requests, managing service threads, etc. Jetty is a robust, but traditional stack that carries the weight and tradeoffs of Servlet’s 15 years old design. For a maximum performance RTB agent that must combine very large request concurrency with very low latencies, and often benefit also from low-level control over the transport, memory management and other issue, a different webserver stack was required. Open Bidder now supports Netty, an asynchronous, event-driven, high-performance webserver stack. For existing code, the most important impact is that Netty is not compatible with the Servlet API. Its own internal APIs are often too low-level, not to mention proprietary to Netty; so Open Bidder v0.5 introduces some new, stack-neutral APIs for things like HTTP requests and responses, cookies, request handlers, and even simple HTML templating based on Mustache. These APIs will work with both Netty and Jetty. This means you don’t need to change any code to switch between Jetty and Netty; on the other hand, it also means that existing code written for Open Bidder 0.4 may need some changes even if you plan to keep using Jetty. [....] Netty's superior efficiency is very significant; it supports 50% more traffic in the same hardware, and it maintains a perfect latency distribution even at the peak of its supported load.
This doc is noteworthy on a couple of grounds: 1. the use of Netty in a public API/library, and the additional layer in place to add a friendlier API on top of that. I hope they might consider releasing that part as OSS at some point. 2. I also find it interesting that their API uses protobufs to marshal the message, and they plan in a future release to serialize those to JSON documents -- that makes a lot of sense.
(tags: apis google protobufs json documents interoperability netty jetty servlets performance java)
The University Times: TCD Provost Under Pressure To “Re-think” Identity Initiative
Students, staff and alumni put pressure on Provost to reconsider changes to Trinity College Dublin's name and coat of arms.
alumni scholars from 2004 and 1994 who had been invited back for the dinner shouted ‘Dublin’ after the Provost welcomed them back to “Trinity College”.
Daring Fireball: Rethinking What We Mean by 'Mobile Web'
We shouldn’t think of “the web” as only what renders in web browsers. We should think of the web as anything transmitted using HTTP and HTTPS. Apps and websites are peers, not competitors. They’re all just clients to the same services.
+1. Finally, a Daring Fireball post I agree with! ;)
(tags: daring-fireball apps web http https mobile apple android browsers)
The little ssh that (sometimes) couldn't - Mina Naguib
A good demonstration of what it looks like when network-level packet corruption occurs on a TCP connection
(tags: ssh sysadmin networking tcp bugs bit-flips cosmic-rays corruption packet)
DRI wins their case at the ECJ!
Great stuff!
The Court has found that data retention “entails a wide-ranging and particularly serious interference with the fundamental rights to respect for private life and to the protection of personal data” and that it “entails an interference with the fundamental rights of practically the entire European population”. TJ McIntyre, Chairman of Digital Rights Ireland, said that “This is the first assessment of mass surveillance by a supreme court since the Snowden revelations. The ECJ’s judgement finds that untargeted monitoring of the entire population is unacceptable in a democratic society.” [...] Though the Directive has now been struck down, the issue will remain live in all the countries who have passed domestic law to implement the data retention mass surveillance regime. Digital Rights Ireland’s challenge to the Irish data retention system will return to the High Court in Dublin for the next phase of litigation.
(tags: dri digital-rights ireland eu ecj surveillance snooping law data-retention)
-
a cron job monitoring tool that keeps an eye on your periodic processes and notifies you when something doesn't happen. Daily backups, monthly emails, or cron jobs you need to monitor? Dead Man's Snitch has you covered. Know immediately when one of these processes doesn't work.
via Marc.
VERY high resolution scans of original Apollo 11 and Apollo 14 charts
the Apollo 11 ALO and LM Descent Monitoring charts are tidied up and downloadable
(tags: apollo space history memorabilia images scans science nasa)
Garbage Collection Optimization for High-Throughput and Low-Latency Java Applications
LinkedIn talk about the GC opts they used to optimize the Feed. good detail
(tags: performance optimization linkedin java jvm gc tuning)
Great comment from the "banking uses eventual consistency" thread...
'In the grand scheme of things, insurance trumps consistency.' Love it.
(tags: banking banks insurance eventual-consistency databases consistency transactions nosql)
S3 as a single-web-page application engine
neat hack. Pity it returns a 403 error code due to the misuse of the ErrorDocument feature though
(tags: s3 javascript single-page web html markdown hacks)
-
Tracking Irish News Stories Over Time; Irish NewsDiffs archives changes in articles after publication. Currently, we track rte.ie and irishtimes.com.
(tags: rte irish-times diffing diffs changes tracking newspapers news ireland history)
Here’s Why You’re Not Hiring the Best and the Brightest
Jeff Atwood's persuasive argument that remote working needs to be the norm in tech work:
There’s an elephant in the room in the form of an implied clause: Always hire the best people… who are willing to live in San Francisco. Substitute Mountain View, New York, Boston, Chicago, or any other city. The problem is the same. We pay lip service to the idea of hiring the best people in the world — but in reality, we’re only hiring the best people who happen to be close by.
(tags: recruiting remote hiring business coding work remote-work telecommuting jobs silicon-valley jeff-atwood)
The First Few Milliseconds of an HTTPS Connection
in excruciating detail
(tags: https tls ssl security http protocols packets networking)
-
For EVERY youtube video, I always open the video and then immediately punch the slider bar to about 30 percent. For example, in this video, it should have just started at :40. Everything before :40 was a waste. This holds true for nearly every video in the universe.
Via Joe Drumgoole -- it's true!
(tags: via:joedrumgoole videos youtube skip wadsworth-constant fast-forward funny)
-
Advice from a developer who helped rebuild Walmart.ca with Scala and Play
This is really good advice.
(tags: walmart scala java languages coding relearning play akka)
-
open source, system-level exploration: capture system state and activity from a running Linux instance, then save, filter and analyze. Think of it as strace + tcpdump + lsof + awesome sauce. With a little Lua cherry on top.
This sounds excellent. Linux-based, GPLv2.
(tags: debugging tools linux ops tracing strace open-source sysdig cli tcpdump lsof)
Manhattan, our real-time, multi-tenant distributed database for Twitter scale | Twitter Blogs
Impressive, but a fierce whiff of "NIH" off of this
(tags: manhattan consistency database twitter eventual-consistency nosql voldemort cassandra riak time-series)
European Parliament passes strong net neutrality law, along with major roaming reforms
European fans of the open internet can breathe a sigh of relief: the European parliament has passed a major package of telecoms law reform, complete with amendments that properly define and protect net neutrality. The amendments were introduced by the Socialist, Liberal, Green and Left blocs in the European Parliament after the final committee to tweak the package – the industry committee – left in a bunch of loopholes that would have allowed telcos to start classifying web services of their choice as “specialized services” that they can treat differently. [...] Now the whole package gets passed through to the next Parliament (elections are coming up in May), then the representatives of European countries for final approval.
(tags: netneutrality eu ep europe neelie-kroes freedom isps telecom)
-
Finally got around to migrating this old CPAN module to github
(tags: cpan github ipc-dirqueue perl open-source hacks git svn)
-
an easily embeddable, decentralized, k-ordered unique ID generator. It can use the same encoded ID format as Twitter's Snowflake or Boundary's Flake implementations as well as any other customized encoding without too much effort. The fauxflake-core module has no external dependencies and is meant to be about as light as possible while still delivering useful functionality. Essentially, if you want to be able to generate a unique identifier across your infrastructure with reasonable assurances about collisions, then you might find this useful.
From the same guy as the excellent Guava Retrier library; Java, ASL2-licensed open source.
(tags: open-source java asl2 fauxflake tools libraries unique-ids ids unique snowflake distsys)
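A minimal sketch of the k-ordered ID idea, in Python rather than Java. This is not fauxflake's API: the class name, the 41/10/12-bit layout (the split usually quoted for Snowflake) and the epoch constant are all assumptions for illustration.

```python
import threading
import time

# Assumed layout: 41 bits of milliseconds, 10 bits of worker id, 12 bits of sequence.
WORKER_BITS = 10
SEQ_BITS = 12
SEQ_MASK = (1 << SEQ_BITS) - 1
EPOCH_MS = 1_300_000_000_000  # arbitrary custom epoch, chosen for this sketch only


class FlakeGenerator:
    """Hypothetical Snowflake-style generator; ids from one worker only ever increase."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & SEQ_MASK
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: spin until the next one.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp = now - EPOCH_MS
            return (timestamp << (WORKER_BITS + SEQ_BITS)) | (self.worker_id << SEQ_BITS) | self.sequence
```

Because the timestamp sits in the high bits, ids sort roughly by creation time across workers, which is all "k-ordered" promises; uniqueness relies on every worker having a distinct worker_id.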
-
Another cool library from Ray Holder: 'an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just about anything.' Similar to his Guava-Retrier Java lib, but using a decorator.
(tags: retrying python libraries tools backoff retry error-handling)
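To show what "using a decorator" buys you, here is a stdlib-only sketch of the pattern. It is not the retrying library's API; the decorator name and its parameters (max_attempts, wait_seconds, retry_on) are made up for this example.

```python
import functools
import time


def retry(max_attempts=3, wait_seconds=0.0, retry_on=(Exception,)):
    """Hypothetical decorator: re-run the wrapped call until it succeeds or attempts run out."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: let the last error propagate
                    time.sleep(wait_seconds)
        return wrapper
    return decorator


calls = {"n": 0}


@retry(max_attempts=3)
def flaky():
    # Fails on the first two calls, succeeds on the third.
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient failure")
    return "ok"
```

The appeal is that the retry policy lives in one line above the function, instead of a try/except loop copy-pasted around every call site.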
Redis adds support for HyperLogLog
good comment thread on HN, discussing hlld and bloomd as well
(tags: hll bloom-filters hyperloglog redis data-structures estimation cardinality probabilistic probability hashing random)
LastPass Sentry Warns You When Your Online Accounts Have Been Breached
This is a brilliant feature. It just sent a warning to a friend about an old account he was no longer using
7 Hidden Dublin Secrets You Probably Never Knew About
from Pól Ó Conghaile's "Secret Dublin". great stuff
(tags: dublin pol-o-conghaile history secrets ireland pranks)
-
must give this a try. performance will be the issue I suspect
(tags: metrics java annotations jersey hacking coding jetty)
How Gmail Happened: The Inside Story of Its Launch 10 Years Ago Today
the inside story of the great work done by Paul Buchheit, Kevin Fox, and Sanjeev Singh to reinvent email in 2004
(tags: history gmail email smtp mua paul-buchheit kevin-fox launches google web)
-
this HN thread on the age-old UDP vs TCP question is way better than the original post -- lots of salient comments
(tags: udp tcp games protocols networking latency internet gaming hackernews)
-
'a.k.a. Faster == Better'. Slides from a talk given at Facebook on 19th March 2014 by Norman Maurer
(tags: netty java performance optimization facebook slides presentations)
Daylight saving time linked to heart attacks, study finds
Switching over to daylight saving time, and losing one hour of sleep, raised the risk of having a heart attack the following Monday by 25 per cent, compared to other Mondays during the year, according to a new US study released today. [...] The study found that heart attack risk fell 21 per cent later in the year, on the Tuesday after the clock was returned to standard time, and people got an extra hour’s sleep.
One clear answer: we need 25-hour days. More details: http://www.sciencedaily.com/releases/2014/03/140329175108.htm --Researchers used Michigan's BMC2 database, which collects data from all non-federal hospitals across the state, to identify admissions for heart attacks requiring percutaneous coronary intervention from Jan. 1, 2010 through Sept. 15, 2013. A total of 42,060 hospital admissions occurring over 1,354 days were included in the analysis. Total daily admissions were adjusted for seasonal and weekday variation, as the rate of heart attacks peaks in the winter and is lowest in the summer and is also greater on Mondays and lower over the weekend. The hospitals included in this study admit an average of 32 patients having a heart attack on any given Monday. But on the Monday immediately after springing ahead there were on average an additional eight heart attacks. There was no difference in the total weekly number of percutaneous coronary interventions performed for either the fall or spring time changes compared to the weeks before and after the time change.
(tags: daylight dst daylight-savings time dates calendar science health heart-attacks michigan hospitals statistics)
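The headline 25 per cent figure can be sanity-checked against the numbers in the ScienceDaily summary quoted above. The variable names are mine; the two inputs are the averages given in the quote.

```python
# Figures quoted above: an average of 32 heart-attack admissions on a typical
# Monday, plus on average 8 additional ones on the Monday after springing ahead.
baseline_monday = 32
extra_after_dst = 8

relative_increase = extra_after_dst / baseline_monday
print(f"{relative_increase:.0%}")
```

Eight extra on a base of 32 is where the one-quarter increase comes from.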
Steve Jobs on the disease of believing that 90% of the work is having a great idea
Good quote
(tags: steve-jobs quotes entrepreneurs startups making products building ideas concepts apple history)
DNS results now being manipulated in Turkey
Deep-packet inspection and rewriting on DNS packets for Google and OpenDNS servers. VPNs and DNSSEC up next!
(tags: turkey twitter dpi dns opendns google networking filtering surveillance proxying packets udp)
Phusion Passenger now supports the new Ruby 2.1 Out-Of-Band GC
a reasonable workaround for Ruby's GC problems in web apps
(tags: ruby gc ops performance phusion passenger rails unicorn out-of-band web-services)
Duplicity + S3: easy, cheap, encrypted, automated full-disk backups for your servers
actually sounds quite nice
(tags: backups s3 aws servers duplicity ops duply unix linux)
-
This is a couple of years old, but I like this:
Turbo Boyer-Moore is disappointing, its name doesn’t do it justice. In academia constant overhead doesn’t matter, but here we see that it matters a lot in practice. Turbo Boyer-Moore’s inner loop is so complex that we think we’re better off using the original Boyer-Moore.
A good demo of how large values of O(n) can be slower than small values of O(mn).
(tags: algorithms search strings coding big-o string-search searching)
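To make the big-O remark concrete, here is a sketch of Boyer-Moore-Horspool, a simplified relative of Boyer-Moore that keeps only the bad-character shift. It is swapped in here because it is short, and it is not the code from the linked post. Worst case is O(mn), but the inner loop is tiny, which is exactly the trade-off being discussed.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For each character of the pattern except the last, how far the window
    # may slide when that character is aligned with the window's final position.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Characters that never occur in the pattern allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

Usage: horspool_find(haystack, needle) behaves like haystack.find(needle) for these inputs.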
-
in a field as critical and competitive as smartphones, Google’s R&D strategy was being dictated, not by the company’s board, or by its shareholders, but by a desire not to anger the CEO of a rival company.
This is utterly bananas and anti-competitive. (via Des Traynor)
(tags: via:destraynor wage-fixing apple google tech paris r-and-d steve-jobs jean-marie-hullot france competition poaching assholes)
[#1259] Add optimized queue for SCMP pattern and use it in NIO and nativ... · 6efac61 · netty/netty
Interesting -- Netty has imported an optimized ASL2-licensed MPSC queue implementation from Akka (presumably for performance raisins)
(tags: performance optimization open-source mpsc queues data-structures netty akka java)
Ruby Garbage Collection: Still Not Ready for Production
disastrous GC bugs in Ruby, requiring horrible kludgy workarounds
-
[via Boing Boing:] A new, exhaustive report from Human Rights Watch details the way the young state of modern Ethiopia has become a kind of pilot program for the abuse of "off-the-shelf" surveillance, availing itself of commercial products from the US, the UK, France, Italy and China in order to establish an abusive surveillance regime that violates human rights and suppresses legitimate political opposition under the guise of a anti-terrorism law that's so broadly interpreted as to be meaningless. The 137 page report [from Human Rights Watch] details the technologies the Ethiopian government has acquired from several countries and uses to facilitate surveillance of perceived political opponents inside the country and among the diaspora. The government’s surveillance practices violate the rights to freedom of expression, association, and access to information. The government’s monopoly over all mobile and Internet services through its sole, state-owned telecom operator, Ethio Telecom, facilitates abuse of surveillance powers.
(tags: human-rights surveillance ethiopia spying off-the-shelf spyware big-brother hrw human-rights-watch)
Chinese cops cuff 1,500 in fake base station spam raid
The street finds its own uses for things, in this case Stingray/IMSI-catcher-type fake mobile-phone base stations:
Fake base stations are becoming a particularly popular modus operandi. Often concealed in a van or car, they are driven through city streets to spread their messages. The professional spammer in question charged 1,000 yuan (£100) to spam thousands of users in a radius of a few hundred metres. The pseudo-base station used could send out around 6,000 messages in just half an hour, the report said. Often such spammers are hired by local businessmen to promote their wares.
(via Bernard Tyers)
(tags: stingers imsi-catcher mobile-phones mobile cellphones china spam via:bernard-tyers)
-
The most grave issue is that each recording likely amounted to a serious criminal offence. Under Irish law, the recording of a telephone conversation on a public network without the consent of at least one party to the call amounts to an "interception", a criminal offence carrying a possible term of imprisonment of up to five years. [...] Consequently, unless gardai were notified that their calls might be recorded then a large number of criminal offences are likely to have been committed by and within the Garda Siochana itself.
(tags: gubu surveillance gardai ags tjmcintyre bugging tapping phones ireland politics)
-
A cool-looking new debugging tool for C/C++ from Mozilla.
Many, many people have noticed that if we had a way to reliably record program execution and replay it later, with the ability to debug the replay, we could largely tame the nondeterminism problem. This would also allow us to deliberately introduce nondeterminism so tests can explore more of the possible execution space, without impacting debuggability. Many record and replay systems have been built in pursuit of this vision. (I built one myself.) For various reasons these systems have not seen wide adoption. So, a few years ago we at Mozilla started a project to create a new record-and-replay tool that would overcome the obstacles blocking adoption. We call this tool rr.
Low runtime overhead; easy deployability; targeted at 32-bit (?!) Linux; OSS. (via Bryan O'Sullivan)
(tags: via:bos mozilla debugging coding firefox rr record replay gdb c++ linux)
-
AIB now have a dedicated customer-support forum on Boards.ie. That is a *great* idea
Microservices and nanoservices
A great reaction to Martin Fowler's "microservices" coinage, from Arnon Rotem-Gal-Oz: 'I guess it is easier to use a new name (Microservices) rather than say that this is what SOA actually meant'; 'these are the very principles of SOA before vendors pushed the [ESB] in the middle.' Others have also chosen to define microservices slightly differently, as a service written in 10-100 LOC. Arnon's reaction: “Nanoservice is an antipattern where a service is too fine-grained. A nanoservice is a service whose overhead (communications, maintenance, and so on) outweighs its utility.” Having dealt with maintaining an over-fine-grained SOA stack in Amazon, I can only agree with this definition; it's easy to make things too fine-grained and create a raft of distributed-computing bugs and deployment/management complexity where there is no need to do so.
(tags: architecture antipatterns nanoservices microservices soa services design esb)
-
slightly ruined by the inclusion of some "deliberately Turing-complete" systems
(tags: turing computation software via:jwz turing-complete accidents automatons)
-
The next step in the Turkish twitter-block arms race.
Bridge relays (or "bridges" for short) are Tor relays that aren't listed in the main Tor directory. Since there is no complete public list of them, even if your ISP is filtering connections to all the known Tor relays, they probably won't be able to block all the bridges. If you suspect your access to the Tor network is being blocked, you may want to use the bridge feature of Tor. The addition of bridges to Tor is a step forward in the blocking resistance race. It is perfectly possible that even if your ISP filters the Internet, you do not require a bridge to use Tor. So you should try to use Tor without bridges first, since it might work.
(tags: tor privacy turkey bridging networking tor-bridges twitter filtering blocking censorship)
Adrian Cockcroft's Cloud Outage Reports Collection
The detailed summaries of outages from cloud vendors are comprehensive and the response to each highlights many lessons in how to build robust distributed systems. For outages that significantly affected Netflix, the Netflix techblog report gives insight into how to effectively build reliable services on top of AWS. [....] I plan to collect reports here over time, and welcome links to other write-ups of outages and how to survive them.
(tags: outages post-mortems documentation ops aws ec2 amazon google dropbox microsoft azure incident-response)
-
This looks like an excellent new feature for parents:
A supervised user is a special type of Chrome user who can browse the web with guidance. Under the supervision of the manager, a supervised user can browse the web and sign in to websites. Supervised users don't need a Google Account or an email address because the manager creates a profile for the supervised user through the manager's Google Account. As a manager of a supervised user, you can see the user’s browsing history, block specific sites, and approve which sites the user can see, all from the supervised users dashboard that is accessible from any browser.
(tags: users chrome supervision parental-control parents safety web browsing kids)
The Stony Brook Algorithm Repository
This WWW page is intended to serve as a comprehensive collection of algorithm implementations for over seventy of the most fundamental problems in combinatorial algorithms. The problem taxonomy, implementations, and supporting material are all drawn from my [ie. Steven Skiena's] book 'The Algorithm Design Manual'. Since the practical person is more often looking for a program than an algorithm, we provide pointers to solid implementations of useful algorithms, when they are available.
(tags: algorithms reference coding steven-skiena combinatorial cs)
The Overprotected Kid - The Atlantic
Great article.
There is a big difference between avoiding major hazards and making every decision with the primary goal of optimizing child safety (or enrichment, or happiness). We can no more create the perfect environment for our children than we can create perfect children. To believe otherwise is a delusion, and a harmful one; remind yourself of that every time the panic rises.
(tags: child-safety parenting safety kids education risk danger playgrounds the-land)
Issue 122 - android-query - HTTP 204 Response results in Network Error (-101)
an empty 204 response to an HTTP PUT will trigger this. See also https://code.google.com/p/android/issues/detail?id=24672, '"java.io.IOException: unexpected end of stream" on HttpURLConnection HEAD call'.
(tags: http urlconnection httpurlconnection java android dalvik bugs 204 head get exceptions)
-
'The European election will take place between 22 and 25 May 2014. Citizens, promise to vote for candidates that have signed a 10-point charter of digital rights! Show candidates that they need to earn your vote by signing our charter!'
(tags: europarl ep digital-rights rights ireland eu data-privacy data-protection privacy)
-
amazing 80's-VHS-feel animated GIFs from Gustavo Torres, a video artist from Argentina
(tags: anigifs animated gif retro cyberpunk gustavo-torres kidmograph fx video-art via:mlkshk)
-
'Microsoft went through a blogger’s private Hotmail account in order to trace the identity of a source who allegedly leaked trade secrets.' Bear in mind that the alleged violation which MS allege allows them to read their email was a breach of the terms of service, which also include distribution of content which 'incites, advocates, or expresses pornography, obscenity, vulgarity, [or] profanity'. So no dirty jokes on Hotmail!
(tags: hotmail fail scroogled microsoft stupid tos law privacy data-protection trade-secrets ip)
Theresa May warns Yahoo that its move to Dublin is a security worry
Y! is moving to Dublin to evade GCHQ spying on its users. And what is the UK response?
"There are concerns in the Home Office about how Ripa will apply to Yahoo once it has moved its headquarters to Dublin," said a Whitehall source. "The home secretary asked to see officials from Yahoo because in Dublin they don't have equivalent laws to Ripa. This could particularly affect investigations led by Scotland Yard and the national crime agency. They regard this as a very serious issue."
There's priorities for you!
(tags: ripa gchq guardian uk privacy data-protection ireland dublin london spying surveillance yahoo)
A Look At Airbnb’s Irish Pub-Inspired Office In Dublin - DesignTAXI.com
Very nice, Airbnb!
Internet Tolls And The Case For Strong Net Neutrality
Netflix CEO Reed Hastings blogs about the need for Net Neutrality:
Interestingly, there is one special case where no-fee interconnection is embraced by the big ISPs -- when they are connecting among themselves. They argue this is because roughly the same amount of data comes and goes between their networks. But when we ask them if we too would qualify for no-fee interconnect if we changed our service to upload as much data as we download** -- thus filling their upstream networks and nearly doubling our total traffic -- there is an uncomfortable silence. That's because the ISP argument isn't sensible. Big ISPs aren't paying money to services like online backup that generate more upstream than downstream traffic. Data direction, in other words, has nothing to do with costs. ISPs around the world are investing in high-speed Internet and most already practice strong net neutrality. With strong net neutrality, new services requiring high-speed Internet can emerge and become popular, spurring even more demand for the lucrative high-speed packages ISPs offer. With strong net neutrality, everyone avoids the kind of brinkmanship over blackouts that plague the cable industry and harms consumers. As the Wall Street Journal chart shows, we're already getting to the brownout stage. Consumers deserve better.
(tags: consumer net-neutrality comcast netflix protectionism cartels isps us congestion capacity)
Micro jitter, busy waiting and binding CPUs
pinning threads to CPUs to reduce jitter and latency. Lots of graphs and measurements from Peter Lawrey
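Peter Lawrey's post is about Java threads, but the underlying OS call is easy to poke at from Python on Linux. A sketch, assuming a Linux host; the helper name pin_to_one_cpu is mine, and the busy-waiting half of the technique is not shown.

```python
import os


def pin_to_one_cpu():
    """Pin the calling thread to one of its allowed CPUs; return that CPU, or None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # the affinity calls are not available on this platform
    # Pick from the CPUs we are currently allowed to run on, so this also
    # works inside containers with a restricted CPU set.
    cpu = min(os.sched_getaffinity(0))
    os.sched_setaffinity(0, {cpu})
    return cpu
```

Pinning stops the scheduler migrating the thread between cores. Whether that actually reduces jitter on a given box is something to measure, as the post does.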
The Day Today - Pool Supervisor - YouTube
"in 1979, no-one died. in 1980, some one died. in 1981, no-one died. in 1982, no-one died. ... I could go on"
(tags: the-day-today no-one-died safety pool supervisor tricky-word-puzzles funny humour classic video)
The colossal arrogance of Newsweek’s Bitcoin “scoop” | Ars Technica
Many aspects of the story already look like a caricature of journalism gone awry. The man Goodman fingered as being worth $400 million or more is just as modest as his house suggests. He’s had a stroke and struggles with other health issues. Unemployed since 2001, he strives to take care of basic needs for himself and his 93-year-old mother, according to a reddit post by his brother Arthur Nakamoto (whom Goodman quoted as calling his brother an “asshole”). If Goodman has mystery evidence supporting the Dorian Nakamoto theory, it should have been revealed days ago. Otherwise, Newsweek and Goodman are delaying an inevitable comeuppance and doubling down on past mistakes. Nakamoto’s multiple denials on the record have changed the dynamic of the story. Standing by the story, at this point, is an attack on him and his credibility. The Dorian Nakamoto story is a “Dewey beats Truman” moment for the Internet age, with all of the hubris and none of the humor. It shouldn’t be allowed to end in the mists of “he said, she said.” Whether or not a lawsuit gets filed, Nakamoto v. Newsweek faces an imminent verdict in the court of public opinion: either the man is lying or the magazine is wrong.
(tags: dorian-nakamoto newsweek journalism bitcoin privacy satoshi-nakamoto)
-
While going through her papa's old belongings, a young girl discovered something incredible - a mind-bogglingly intricate maze that her father had drawn by hand 30 years ago. While working as a school janitor it had taken him 7 years to produce the piece, only for it to be forgotten about... until now.
34" x 24" print, $40
Continuous Delivery with ETL Systems
Lonely Planet and Dr Foster Intelligence both make heavy use of ETL in their products, and both organisations have applied the principles of Continuous Delivery to their delivery process. Some of the Continuous Delivery norms need to be adapted in the context of ETL, and some interesting patterns emerge, such as running Continuous Integration against data, as well as code.
(tags: etl video presentations lonely-planet dr-foster-intelligence continuous-delivery deployment pipelines)
-
'On March 9th a group posted a data leak, which included the trading history of all MtGox users from April 2011 to November 2013. The graphs below explore the trade behaviors of the 500 highest volume MtGox users from the leaked data set. These are the Bitcoin barons, wealthy speculators, dueling algorithms, greater fools, and many more who took bitcoin to the moon.'
(tags: dataviz stamen bitcoin data leaks mtgox greater-fools)
What We Know 2/5/14: The Mt. Chiliad Mystery
hats off to Rockstar -- GTA V has a great mystery mural with clues dotted throughout the game, and it's as-yet unsolved
(tags: mysteries gaming via:hilary_w games gta gta-v rockstar mount-chiliad ufos)
Make Your Own 3-D Printer Filament From Old Milk Jugs
Creating your own 3-D printer filament from old used milk jugs is exponentially cheaper, and uses considerably less energy, than buying new filament, according to new research from Michigan Technological University. [...] The savings are really quite impressive — 99 cents on the dollar, in addition to the reduced use of energy. Interestingly (but again not surprisingly), the amount of energy used to ‘recycle’ the old milk jugs yourself is considerably less than that used in recycling such jugs conventionally.
-
This is a really good post on governmental computing, open data, and so on:
The fact that I can go months hearing about "open data" without a single mention of ETL is a problem. ETL is the pipes of your house: it's how you open data.
(tags: civic open-data government etl data-pipeline tech via:timoreilly)
-
as TJ McIntyre noted: '€100 fine for a repeat spammer. Data Protection Commissioner calls this "strong protection". With a straight face.' Next will doubtless fork over the 100 Euros out of the petty cash drawer, then carry on regardless. This isn't a useful fine. What a farce...
(tags: cheap farce dpc data-protection privacy anti-spam next spam convictions fines ireland)
-
The mass surveillance methods employed in [the UK, USA, and India], many of them exposed by NSA whistleblower Edward Snowden, are all the more intolerable because they will be used and indeed are already being used by authoritarian countries such as Iran, China, Turkmenistan, Saudi Arabia and Bahrain to justify their own violations of freedom of information. How will so-called democratic countries be able to press for the protection of journalists if they adopt the very practices they are criticizing authoritarian regimes for?
This is utterly jaw-dropping -- throughout the world, real-time mass-monitoring infrastructure is silently being dropped into place. Surveillance in France and India is particularly pervasive.
(tags: journalism censorship internet france india privacy data-protection surveillance spying law snowden authoritarianism)
The Microservice Declaration of Independence
"Microservices" seems to be yet another term for SOA; small, decoupled, independently-deployed services, with well-defined public HTTP APIs. Pretty much all the services I've worked on over the past few years have been built in this style. Still, let's keep an eye on this concept anyway. Another definition seems to be a more FP-style one: http://www.slideshare.net/michaelneale/microservices-and-functional-programming -- where the "microservice" does one narrowly-defined thing, and that alone.
(tags: microservices soa architecture handwaving http services web deployment)
No, Nate, brogrammers may not be macho, but that’s not all there is to it
Great essay on sexism in tech, "brogrammer" culture, "clubhouse chemistry", outsiders, weird nerds and exclusion:
Every group, including the excluded and disadvantaged, create cultural capital and behave in ways that simultaneously create a sense of belonging for them in their existing social circle while also potentially denying them entry into another one, often at the expense of economic capital. It’s easy to see that wearing baggy, sagging pants to a job interview, or having large and visible tattoos in a corporate setting, might limit someone’s access. These are some of the markers of belonging used in social groups that are often denied opportunities. By embracing these markers, members of the group create real barriers to acceptance outside their circle even as they deepen their peer relationships. The group chooses to adopt values that are rejected by the society that’s rejecting them. And that’s what happens to “weird nerd” men as well—they create ways of being that allow for internal bonding against a largely exclusionary backdrop.
(via Bryan O'Sullivan)
(tags: nerds outsiders exclusion society nate-silver brogrammers sexism racism tech culture silicon-valley essays via:bos31337)
Impact of large primitive arrays (BLOBS) on JVM Garbage Collection
some nice graphs and data on CMS performance, with/without -XX:ParGCCardsPerStrideChunk
(tags: cms java jvm performance optimization tuning off-heap-storage memory)
Anatomical Collages by Travis Bedel
these are fantastic
-
a utility to perform parallel, pipelined execution of a single HTTP GET. htcat is intended for the purpose of incantations like: htcat https://host.net/file.tar.gz | tar -zx It is tuned (and only really useful) for faster interconnects: [....] 109MB/s on a gigabit network, between an AWS EC2 instance and S3. This represents 91% use of the theoretical maximum of gigabit (119.2 MiB/s).
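The description above doesn't say how htcat carves up the download, so treat this as a guess at the general technique rather than its actual strategy: split the object into byte ranges and fetch each with its own HTTP Range request. The helper name and the fixed part size are assumptions.

```python
def byte_ranges(total_size, part_size):
    """Return HTTP Range header values covering [0, total_size) in part_size chunks."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    ranges = []
    start = 0
    while start < total_size:
        # Range end offsets are inclusive.
        end = min(start + part_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

For example, byte_ranges(10, 4) gives three ranges: bytes 0-3, 4-7 and 8-9. Each could be requested on a separate connection and the parts written to stdout in order.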
-
Abe Stanway crunches the stats on Citibike usage in NYC, compared to the weather data from Wunderground.
(tags: data correlation statistics citibike cycling nyc data-science weather)
NSA surveillance recording every single voice call in at least 1 country
Storing them in a 30-day rolling buffer, allowing retrospective targeting weeks after the call. 100% of all voice calls in that country, although it's unclear which country that is
(tags: nsa surveillance gchq telephones phone bugging)
-
a file system that stores all its data online using storage services like Google Storage, Amazon S3, or OpenStack. S3QL effectively provides a hard disk of dynamic, infinite capacity that can be accessed from any computer with internet access running Linux, FreeBSD or OS-X. S3QL is a standard conforming, full featured UNIX file system that is conceptually indistinguishable from any local file system. Furthermore, S3QL has additional features like compression, encryption, data de-duplication, immutable trees and snapshotting which make it especially suitable for online backup and archival.
(tags: s3 s3ql backup aws filesystems linux freebsd osx ops)
-
good explanation of all the new features -- I'm really looking forward to fixing up all the crappy over-verbose interface-as-lambdas we have scattered throughout our code
(tags: java java8 lambdas fp functional-programming currying joda-time)
-
a compressed full-text substring index based on the Burrows-Wheeler transform, with some similarities to the suffix array. It was created by Paolo Ferragina and Giovanni Manzini,[1] who describe it as an opportunistic data structure as it allows compression of the input text while still permitting fast substring queries. The name stands for 'Full-text index in Minute space'. It can be used to efficiently find the number of occurrences of a pattern within the compressed text, as well as locate the position of each occurrence. Both the query time and storage space requirements are sublinear with respect to the size of the input data.
kragen notes 'gene sequencing is using [them] in production'.
(tags: sequencing bioinformatics algorithms bowtie fm-index indexing compression search burrows-wheeler bwt full-text-search)
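A toy version of the counting query, to show how the Burrows-Wheeler transform supports substring search. It is deliberately naive: the transform sorts whole suffixes and the occurrence table is uncompressed, so it has none of the space savings of a real FM-index. The function names and the "$" sentinel (assumed absent from the text and smaller than every character in it) are choices made for this sketch.

```python
def bwt(text):
    """Burrows-Wheeler transform of text, using '$' as the end-of-text sentinel."""
    text += "$"
    suffix_order = sorted(range(len(text)), key=lambda i: text[i:])
    # The character preceding each sorted suffix (wrapping around for suffix 0).
    return "".join(text[i - 1] for i in suffix_order)


def count_occurrences(bwt_text, pattern):
    """Count occurrences of pattern in the original text by backward search."""
    chars = sorted(set(bwt_text))
    # first[c] = number of characters in the text strictly smaller than c.
    first, total = {}, 0
    for c in chars:
        first[c] = total
        total += bwt_text.count(c)
    # occ[c][i] = number of occurrences of c in bwt_text[:i].
    occ = {c: [0] * (len(bwt_text) + 1) for c in chars}
    for i, ch in enumerate(bwt_text):
        for c in chars:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    lo, hi = 0, len(bwt_text)  # current range of matching sorted suffixes
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

The search takes one step per pattern character, regardless of how long the text is.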
How to turn your smartphone photos from good to great
some good tips
(tags: phone pictures photos tips smartphone iphone android)
How the Irish helped weave the web
Nice Irish Times article on the first 3 web servers in Ireland -- including the one I set up at Iona Technologies. 21 years ago!
(tags: history ireland tech web internet www james-casey peter-flynn irish-times iona-technologies)
Health privacy: formal complaint to ICO
'Light Blue Touchpaper' notes:
Three NGOs have lodged a formal complaint to the Information Commissioner about the fact that PA Consulting uploaded over a decade of UK hospital records to a US-based cloud service. This appears to have involved serious breaches of the UK Data Protection Act 1998 and of multiple NHS regulations about the security of personal health information.
Let's see if ICO can ever do anything useful.... not holding my breath
(tags: ico privacy data-protection dpa nhs health data ross-anderson)
Why Google Flu Trends Can't Track the Flu (Yet)
It's admittedly hard for outsiders to analyze Google Flu Trends, because the company doesn't make public the specific search terms it uses as raw data, or the particular algorithm it uses to convert the frequency of these terms into flu assessments. But the researchers did their best to infer the terms by using Google Correlate, a service that allows you to look at the rates of particular search terms over time. When the researchers did this for a variety of flu-related queries over the past few years, they found that a couple key searches (those for flu treatments, and those asking how to differentiate the flu from the cold) tracked more closely with Google Flu Trends' estimates than with actual flu rates, especially when Google overestimated the prevalence of the ailment. These particular searches, it seems, could be a huge part of the inaccuracy problem. There's another good reason to suspect this might be the case. In 2011, as part of one of its regular search algorithm tweaks, Google began recommending related search terms for many queries (including listing a search for flu treatments after someone Googled many flu-related terms) and in 2012, the company began providing potential diagnoses in response to symptoms in searches (including listing both "flu" and "cold" after a search that included the phrase "sore throat," for instance, perhaps prompting a user to search for how to distinguish between the two). These tweaks, the researchers argue, likely artificially drove up the rates of the searches they identified as responsible for Google's overestimates.
via Boing Boing
(tags: google flu trends feedback side-effects colds health google-flu-trends)
Implementing a web server in a single printf() call
clever hack -- shellcode in a format string
(tags: printf hax coding web shellcode exploits assembly linux)
Ucas sells access to student data for phone and drinks firms' marketing | Technology | The Guardian
The UK government's failure to deal with spam law in a consumer-friendly way escalates further: UCAS, the university admissions service, is operating as a mass-mailer of direct marketing on behalf of Vodafone, O2, Microsoft, Red Bull and others, without even a way to later opt out from that spam without missing important admissions-related mail as a side effect. 'Teenagers using Ucas Progress must explicitly opt in to mailings from the organisation and advertisers, though the organisation's privacy statement says: "We do encourage you to tick the box as it helps us to help you."' Their website also carries advertising, and the details of parents are sold on to advertisers as well. Needless to say, the toothless ICO say they 'did not appear to breach marketing rules under the privacy and electronic communications regulations', as usual. Typical ICO fail.
(tags: ucas advertising privacy data-protection opt-in opt-out spam direct-marketing vodafone o2 microsoft red-bull uk universities grim-meathook-future ico)
Good explanation of exponential backoff
I've often had to explain this key feature verbosely, and it's hard to do without handwaving. Great to have a solid, well-explained URL to point to
(tags: exponential-backoff backoff retries reliability web-services http networking internet coding design)
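For the non-handwaving version, a small sketch of the delay schedule. The linked explanation may use different constants or a different jitter scheme; the function name, the defaults and the "full jitter" choice (a uniform draw between zero and the current ceiling) are assumptions.

```python
import random


def backoff_delays(base=0.1, factor=2.0, cap=10.0, attempts=6, jitter=True, rng=random):
    """Return the wait (in seconds) before each retry: base * factor**attempt, capped."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * factor ** attempt)
        # With jitter, draw uniformly below the ceiling so that many clients
        # failing at once do not all retry at the same instant.
        delays.append(rng.uniform(0, ceiling) if jitter else ceiling)
    return delays
```

With base=1, cap=8, attempts=5 and jitter off, the ceilings are 1, 2, 4, 8, 8 seconds: doubling until the cap, then flat.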
Sacked Google worker says staff ratings fixed to fit template
Allegations of fixing to fit the stack-ranking curve: 'someone at Google always had to get a low score “of 2.9”, so the unit could match the bell curve. She said senior staff “calibrated” the ratings supplied by line managers to ensure conformity with the template and these calibrations could reduce a line manager’s assessment of an employee, in effect giving them the poisoned score of less than three.'
(tags: stack-ranking google ireland employment work bell-curve statistics eric-schmidt)
Corporate Tax 2014: Irish Government's "flawed premise" on Apple's avoidance
According to our calculation about €40bn or over 40% of Irish services exports of €90bn in 2012 and related national output, resulted from global tax avoidance schemes. It is true that Ireland gains little from tax cheating but at some point, the US tax system will be reformed and a territorial system where companies are only liable in the US on US profits, would only be viable if there was a disincentive to shift profits to non-tax or low tax countries. The risk for Ireland is that a minimum foreign tax would be introduced that would be greater than the Irish headline rate of 12.5%. It's also likely that US investment in Ireland would not have been jeopardized if Irish politicians had not been so eager as supplicants to doff the cap. Nevertheless today it would be taboo to admit the reality of participation in massive tax avoidance and the Captain Renaults of Merrion Street will continue with their version of the Dance of the Seven Veils.
(tags: apple tax double-irish tax-avoidance google investment itax tax-evasion ireland)
An online Magna Carta: Berners-Lee calls for bill of rights for web
TimBL backing the "web we want" campaign -- https://webwewant.org/
(tags: freedom gchq nsa censorship internet privacy web-we-want human-rights timbl tim-berners-lee)
How the search for flight AF447 used Bayesian inference
Via jgc, the search for the downed Air France flight was optimized using this technique: 'Metron’s approach to this search planning problem is rooted in classical Bayesian inference, which allows organization of available data with associated uncertainties and computation of the Probability Distribution Function (PDF) for target location given these data. In following this approach, the first step was to gather the available information about the location of the impact site of the aircraft. This information was sometimes contradictory and filled with ambiguities and uncertainties. Using a Bayesian approach we organized this material into consistent scenarios, quantified the uncertainties with probability distributions, weighted the relative likelihood of each scenario, and performed a simulation to produce a prior PDF for the location of the wreck.'
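The core update step in this kind of Bayesian search can be sketched in a few lines of Python. This is a toy illustration, not Metron's model: the three cells, their prior weights and the 0.8 detection probability are all made-up assumptions.

```python
def normalise(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}


def update_after_failed_search(prior, searched_cell, p_detect):
    """Posterior over the wreck location, given that one cell was searched and nothing was found."""
    # Bayes' rule: P(cell | miss) is proportional to P(miss | cell) * P(cell).
    # If the wreck is in the searched cell, a miss has probability (1 - p_detect);
    # if it is anywhere else, a miss there was certain (probability 1).
    unnormalised = {
        cell: p * ((1.0 - p_detect) if cell == searched_cell else 1.0)
        for cell, p in prior.items()
    }
    return normalise(unnormalised)


# Hypothetical prior: cell A looks most likely before any searching is done.
prior = normalise({"A": 5, "B": 3, "C": 2})
# Search A with an assumed 80% chance of spotting the wreck if it is there, and find nothing.
posterior = update_after_failed_search(prior, "A", p_detect=0.8)
```

After the unsuccessful search of A, probability mass shifts to the unsearched cells, so B becomes the best place to look next; repeating the update after every search leg is what steers the search plan.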
(tags: metron bayes bayesian-inference machine-learning statistics via:jgc air-france disasters probability inference searching)
How the NSA Plans to Infect 'Millions' of Computers with Malware - The Intercept
The implants being deployed were once reserved for a few hundred hard-to-reach targets, whose communications could not be monitored through traditional wiretaps. But the documents analyzed by The Intercept show how the NSA has aggressively accelerated its hacking initiatives in the past decade by computerizing some processes previously handled by humans. The automated system – codenamed TURBINE – is designed to “allow the current implant network to scale to large size (millions of implants) by creating a system that does automated control implants by groups instead of individually.” In a top-secret presentation, dated August 2009, the NSA describes a pre-programmed part of the covert infrastructure called the “Expert System,” which is designed to operate “like the brain.”
Great. Automated malware deployment to millions of random victims. See also the "I hunt sysadmins" section further down...
(tags: malware gchq nsa oversight infection expert-systems turbine false-positives the-intercept surveillance)
-
Burrito Justice nerds out on 'Goodnight Moon'. 'Maybe the bunny and the old lady are actually in a space elevator, getting closer to the moon as he gets into bed? Or as suggested by @transitmaps, the bunny can bend space and time? I do not have a good answer to this conundrum, but that is what the comments are for.'
(tags: goodnight-moon moon space time space-elevators childrens-books books physics)
Inside the Mind of an anti-fluoridationist
An exceptionally well-researched and thorough disassembly of 'Public Health Investigation of Epidemiological data on Disease and Mortality in Ireland related to Water Fluoridation and Fluoride Exposure' by Declan Waugh, which appears to be going around currently
(tags: declan-waugh debunking flouride flouridation science mortality health ireland water)
David Robert Grimes on the fluoride kerfuffle
Hilariously, "The Girl Against Fluoride" and other anti-fluoridation campaigners now allege he's an undercover agent of Alcoa and/or Glaxo Smith Kline, rather than dealing with any awkwardly hostile realities
(tags: flouride flouridation david-robert-grimes conspiracy funny science ireland alcoa glaxo-smith-kline)
-
fantastic piece of C=64 history -- the "Ocean fast loader" by Paul Hughes, which allowed Commodore 64 games to load from tape at 4000 baud, far faster than the built-in system implementation, and with graphics and music at the same time
(tags: ocean-loader tapes c=64 commodore-64 history 1980s freeload paul-hughes)
IntelliJ IDEA 13.1 will support Chronon Debugger
This, IMO, would be a really good reason to upgrade to the payware version of IDEA - Chronon looks cool.
Chronon is a new revolutionary tool keeping track of running Java programs and recording their execution process for later analysis, which can be helpful when you need to thoroughly retrace your steps when dealing with complicated bugs.
(tags: chronon debugging java intellij idea ides coding time-warp time)
"Dapper, a Large-Scale Distributed Systems Tracing Infrastructure" [PDF]
Google paper describing the infrastructure they've built for cross-service request tracing (i.e. "tracer requests"). Features: few code changes required (since they've built it into the internal protobuf libs), low performance impact, sampling, deployment across ~the entire production fleet, and output visibility within minutes. It has been live in production for over 2 years. Excellent read
(tags: dapper tracing http services soa google papers request-tracing tracers protobuf devops)
-
'a Japanese term that means "mistake-proofing". A poka-yoke is any mechanism in a lean manufacturing process that helps an equipment operator avoid (yokeru) mistakes (poka). Its purpose is to eliminate product defects by preventing, correcting, or drawing attention to human errors as they occur.'
(tags: human-error errors mistakes poka-yoke failures prevention bugproofing manufacturing japan)
The Hands That Made The Moomins
lovely New Yorker writeup on Tove Jansson, author of those beautiful children's books
(tags: tove-jansson moomins books childrens-books reading literature via:etienneshrdlu)
James Casey writes about working at CERN
I am very heartened by Minister of State for Research and Innovation Sean Sherlock’s recent announcement of a review of the costs and benefits of Ireland’s membership of international research organisations including CERN. I disagreed with the conclusion of the last review which suggested that costs outweighed the benefits to Ireland. I think it was an extreme oversight not to be a part of the engineering phase of the Collider during the period 1998-2008 – but it’s not too late. CERN will celebrate its 60th anniversary in 2014. There is no public scientific institution its equal in terms of the scale and complexity of problems being analysed and solved. No longer excluding young Irish people from being a part of this, from learning and growing from it, can only help Ireland.
Also, spot my name in lights ;)
(tags: ireland cern science europe eu sean-sherlock james-casey www web history)
Digging for cryptocurrency: The newbie’s guide to mining altcoins
Mining Arscoins, dogecoins and litecoins -- CPU/GPU mining apps and how to run 'em
(tags: currency bitcoin altcoins dogecoin crypto mining ars-technica)
A cautionary tale about building large-scale polyglot systems
'a fucking nightmare':
Cascading requires a compilation step, yet since you're writing Ruby code, you get none of the benefits of static type checking. It was standard to discover a type issue only after kicking off a job on, oh, 10 EC2 machines, only to have it fail because of a type mismatch. And user code embedded in strings would regularly fail to compile – which you again wouldn't discover until after your job was running. Each of these were bad individually, together, they were a fucking nightmare. The interaction between the code in strings and the type system was the worst of all possible worlds. No type checking, yet incredibly brittle, finicky and incomprehensible type errors at run time. I will never forget when one of my friends at Etsy was learning Cascading.JRuby and he couldn't get a type cast to work. I happened to know what would work: a triple cast. You had to cast the value to the type you wanted, not once, not twice, but THREE times.
(tags: etsy scalding cascading adtuitive war-stories languages polyglot ruby java strong-typing jruby types hadoop)
-
Attempting to cash out of Bitcoins turns out to be absurdly difficult:
Trying to sell the coins in person, and basically saying he either wants Cash, or a Cashiers check (since it can be handed over right then and there), has apparently been a hilarious clusterfuck. Today he met some guy in front of his bank, and apparently as soon as he mentioned that he needs to get the cash checked to make sure it is not counterfeit, the guy freaked out and basically walked away. Stuff like this has been happening all week, and he apparently so far has only sold a single coin of several hundred.
(tags: bitcoin fail funny mtgox fraud cash fiat-currency via:rsynnott buttcoin)
Florida cops used IMSI catchers over 200 times without a warrant
Harris is the leading maker of [IMSI catchers aka "stingrays"] in the U.S., and the ACLU has long suspected that the company has been loaning the devices to police departments throughout the state for product testing and promotional purposes. As the court document notes in the 2008 case, “the Tallahassee Police Department is not the owner of the equipment.” The ACLU now suspects these police departments may have all signed non-disclosure agreements with the vendor and used the agreement to avoid disclosing their use of the equipment to courts. “The police seem to have interpreted the agreement to bar them even from revealing their use of Stingrays to judges, who we usually rely on to provide oversight of police investigations,” the ACLU writes.
(tags: aclu police stingrays imsi-catchers privacy cellphones mobile-phones security wired)
The Netflix Dynamic Scripting Platform
At the core of the redesign is a Dynamic Scripting Platform which provides us the ability to inject code into a running Java application at any time. This means we can alter the behavior of the application without a full scale deployment. As you can imagine, this powerful capability is useful in many scenarios. The API Server is one use case, and in this post, we describe how we use this platform to support a distributed development model at Netflix.
Holy crap.
(tags: scripting dynamic-languages groovy java server-side architecture netflix)
ZooKeeper Resilience at Pinterest
essentially decoupling the client services from ZK using a local daemon on each client host; very similar to Airbnb's Smartstack. This is a bit of an indictment of ZK's usability though
(tags: ops architecture clustering network partitions cap reliability smartstack airbnb pinterest zookeeper)
FOI is better than tea and biscuits
Good post on the 'FOI costs too much' talking point.
I realise if you’re a councillor, tea and biscuits sounds much more appealing than transparency and being held accountable and actually having to answer to voters, but those things are what you signed up to when you stood for election.
(tags: foi open-data politics government funding)
Answer to How many topics (queues) can be created in Apache Kafka? - Quora
Good to know:
'As far as I understand (this was true as of 2013, when I last looked into this issue) there's at least one Apache ZooKeeper znode per topic in Kafka. While there is no hard limitation in Kafka itself (Kafka is linearly scalable), it does mean that the maximum number of znodes comfortably supported by ZooKeeper (on the order of about ten thousand) is the upper limit of Kafka's scalability as far as the number of topics goes.'
(tags: kafka queues zookeeper znodes architecture)
Care.data is in chaos. It breaks my heart | Ben Goldacre
There are people in my profession who think they can ignore this problem. Some are murmuring that this mess is like MMR, a public misunderstanding to be corrected with better PR. They are wrong: it's like nuclear power. Medical data, rarefied and condensed, presents huge power to do good, but it also presents huge risks. When leaked, it cannot be unleaked; when lost, public trust will take decades to regain. This breaks my heart. I love big medical datasets, I work on them in my day job, and I can think of a hundred life-saving uses for better ones. But patients' medical records contain secrets, and we owe them our highest protection. Where we use them – and we have used them, as researchers, for decades without a leak – this must be done safely, accountably, and transparently. New primary legislation, governing who has access to what, must be written: but that's not enough. We also need vicious penalties for anyone leaking medical records; and HSCIC needs to regain trust, by releasing all documentation on all past releases, urgently. Care.data needs to work: in medicine, data saves lives.
(tags: hscic nhs care.data data privacy data-protection medicine hospitals pr)
-
bookmarking for future reference
(tags: timezones time world clock xkcd images midnight reference)
Only 0.15 percent of mobile gamers account for 50 percent of all in-game revenue
Nice bit of marketing from the day job:
The group of gamers responsible for half of all in-game revenue in mobile titles is frightening because it is so narrow, according to a survey by Swrve, an established analytics and app marketing firm. About 0.15 percent of mobile gamers contribute 50 percent of all of the in-app purchases generated in free-to-play games. This means it may be even more important than game companies realized in the past to find and retain the users that fall into the category of big spenders, or “whales.” The vast majority of users never spend any money, despite the clever tactics that game publishers have developed to incentivize people to spend money in their favorite games.
(tags: swrve whales gaming games iap money mobile analytics)
-
'EAT CELEBRITY MEAT! BiteLabs grows meat from celebrity tissue samples and uses it to make artisanal salami.' Genius. (via John Looney)
(tags: via:john-looney meat startups food funny salami tissue-samples celebrity jennifer-lawrence)
'Bobtail: Avoiding Long Tails in the Cloud' [pdf]
'A system that proactively detects and avoids bad neighbouring VMs without significantly penalizing node instantiation [in EC2]. With Bobtail, common [datacenter] communication patterns benefit from reductions of up to 40% in 99.9th percentile response times.' Excellent stuff -- another conclusion they come to is that it's not the network's fault, it's the Xen hosts themselves. The EC2 networking team will be happy about that ;)
(tags: networking ec2 bobtail latency long-tail xen performance)
-
Charlie Stross on GCHQ's 1984-esque webcam spying
(tags: webcams porn charlie-stross funny 1984 dystopian masturbation surveillance spying)
Big doubts on big data: Why I won't be sharing my medical data with anyone - yet
These problems can be circumvented, but they must be dealt with, publicly and soberly, if the NHS really does want to win public confidence. The NHS should approach selling the scheme to the public as if it was opt-in, not opt-out, then work to convince us to join it. Tell us how sharing our data can help, but tell us what risk too. Let us decide if that balance is worth it. If it's found wanting, the NHS must go back to the drawing board and retool the scheme until it is. It's just too important to get wrong.
(tags: nhs uk privacy data-protection data-privacy via:mynosql big-data healthcare insurance)
Welcome to Algorithmic Prison - Bill Davidow - The Atlantic
"Computer says no", taken to the next level.
Even if an algorithmic prisoner knows he is in a prison, he may not know who his jailer is. Is he unable to get a loan because of a corrupted file at Experian or Equifax? Or could it be TransUnion? His bank could even have its own algorithms to determine a consumer’s creditworthiness. Just think of the needle-in-a-haystack effort consumers must undertake if they are forced to investigate dozens of consumer-reporting companies, looking for the one that threw them behind algorithmic bars. Now imagine a future that contains hundreds of such companies. A prisoner might not have any idea as to what type of behavior got him sentenced to a jail term. Is he on an enhanced screening list at an airport because of a trip he made to an unstable country, a post on his Facebook page, or a phone call to a friend who has a suspected terrorist friend?
(tags: privacy data big-data algorithms machine-learning equifax experian consumer society bill-davidow)
RTE star Sharon Ni Bheolain stalked for six months - Independent.ie
as @Fergal says: '[this] case shows (a) the internet isn't anonymous, (b) we [ie. Ireland -jm] have laws to deal with threats and harassment'
(tags: law ireland harassment internet twitter email abuse cyberstalking)
ImperialViolet - Apple's SSL/TLS bug
as we all know by now, a misplaced "goto fail" caused a huge, critical security flaw in the SSL/TLS code in versions of iOS and OS X since late 2012. Lessons: 1. unit test the failure cases, particularly for critical security code! 2. use braces. 3. dead-code analysis would have caught this. I'm not buying the "goto considered harmful" line, though, since any kind of control flow structure would have had the same problem.
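Lesson 1 is the easiest to show in code. The sketch below is Python, not Apple's C, and has nothing to do with their actual handshake code: `verify_tag` is a hypothetical stand-in for "some security check", using an HMAC because the standard library has one. The point is that the tests which matter are the ones asserting that bad input gets rejected.

```python
import hashlib
import hmac


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Return True only if `tag` is the HMAC-SHA256 of `message` under `key`."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest does a constant-time comparison and returns False on any mismatch.
    return hmac.compare_digest(expected, tag)


def run_failure_case_tests():
    key, message = b"k" * 32, b"hello"
    good_tag = hmac.new(key, message, hashlib.sha256).digest()

    # The happy path: easy to remember, and the only case many test suites cover.
    assert verify_tag(key, message, good_tag)

    # The failure cases: each of these MUST be rejected. A check that is
    # accidentally skipped would return success here, and these asserts would fire.
    flipped = good_tag[:-1] + bytes([good_tag[-1] ^ 1])
    assert not verify_tag(key, b"hellp", good_tag)     # tampered message
    assert not verify_tag(key, message, flipped)       # one bit flipped in the tag
    assert not verify_tag(b"x" * 32, message, good_tag)  # wrong key
    assert not verify_tag(key, message, b"")           # empty tag
    return True
```

A verifier that always says "yes" passes the first assertion and fails every one after it, which is exactly the class of bug this lesson is about.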
(tags: coding apple osx ios crypto ssl security goto-fail goto fail unit-testing coding-standards)
Comcast’s deal with Netflix makes network neutrality obsolete
in a world where Netflix and Yahoo connect directly to residential ISPs, every Internet company will have its own separate pipe. And policing whether different pipes are equally good is a much harder problem than requiring that all of the traffic in a single pipe be treated the same. If it wanted to ensure a level playing field, the FCC would be forced to become intimately involved in interconnection disputes, overseeing who Verizon interconnects with, how fast the connections are and how much they can charge to do it.
(tags: verizon comcast internet peering networking netflix network-neutrality)
Data visualization: breaking down The Economist's classic chart style
nice piece of classic graph design
Netflix packets being dropped every day because Verizon wants more money | Ars Technica
With Cogent and Verizon fighting, [peering capacity] upgrades are happening at a glacial pace, according to Schaeffer. "Once a port hits about 85 percent throughput, you're going to begin to start to drop packets," he said. "Clearly when a port is at 120 or 130 percent [as the Cogent/Verizon ones are] the packet loss is material." The congestion isn't only happening at peak times, he said. "These ports are so over-congested that they're running in this packet dropping state 22, 24 hours a day. Maybe at four in the morning on Tuesday or something there might be a little bit of headroom," he said.
(tags: packet-loss networking internet cogent netflix verizon peering)
Hospital records of all NHS patients sold to insurers - Telegraph
The 274-page report describes the NHS Hospital Episode Statistics as a “valuable data source in developing pricing assumptions for 'critical illness’ cover.” It says that by combining hospital data with socio-economic profiles, experts were able to better calculate the likelihood of conditions, with “amazingly” clear forecasts possible for certain diseases, in particular lung cancer. Phil Booth, from privacy campaign group medConfidential, said: “The language in the document is extraordinary; this isn’t about patients, this is about exploiting a market. Of course any commercial organisation will focus on making a profit – the question is why is the NHS prepared to hand this data over?”
(tags: nhs privacy data insurance uk politics data-protection)
-
'A Monumental Land Art Installation in the Sahara Desert', by the D.A.S.T. Arteam in 1997. More correctly, near the Red Sea resort of El Gouna -- so possible to visit!
(tags: el-gouna sahara deserts land-art art via:colossal desert-breath spirals)
Harvard Research Computing Resources Misused for ‘Dogecoin’ Mining Operation
A member of the Harvard community was stripped of his or her access to the University’s research computing facilities last week after setting up a “dogecoin” mining operation using a Harvard research network, according to an internal email circulated by Faculty of Arts and Sciences Research Computing officials.
(tags: harvard dogecoin bitcoin mining misuse abuse supercomputers)
-
turn YouTube videos into animated GIFs (via Waxy)
(tags: via:waxy gifs youtube video animated-gifs images web)
'Scaling to Millions of Simultaneous Connections' [pdf]
Presentation by Rick Reed of WhatsApp on the large-scale Erlang cluster backing the WhatsApp API, delivered at Erlang Factory SF, March 30 2012. lots of juicy innards here
(tags: erlang scaling scalability performance whatsapp freebsd presentations)
Traffic Graph – Google Transparency Report
this is cool. Google are exposing an aggregated 'all services' hit count time-series graph, broken down by country, as part of their Transparency Report pages
(tags: transparency filtering web google http graphs monitoring syria)
-
I want to emphasize that if you use redis as intended (as a slightly-persistent, not-HA cache), it's great. Unfortunately, more and more shops seem to be thinking that Redis is a full-service database and, as someone who's had to spend an inordinate amount of time maintaining such a setup, it's not. If you're writing software and you're thinking "hey, it would be easy to just put a SET key value in this code and be done," please reconsider. There are lots of great products out there that are better for the overwhelming majority of use cases.
Ouch. (via Aphyr)
(tags: redis storage architecture memory caching ha databases)
-
I'm going to need this pretty soon -- lots of white spots showing up with the current BenQ :(
(tags: projectors video home hardware reviews)
Belkin managed to put their firmware update private key in the distribution
'The firmware updates are encrypted using GPG, which is intended to prevent this issue. Unfortunately, Belkin misuses the GPG asymmetric encryption functionality, forcing it to distribute the firmware-signing key within the WeMo firmware image. Most likely, Belkin intended to use the symmetric encryption with a signature and a shared public key ring. Attackers could leverage the current implementation to easily sign firmware images.' Using GPG to sign your firmware updates: yay. Accidentally leaving the private key in the distribution: sad trombone.
(tags: fail wemo belkin firmware embedded-systems security updates distribution gpg crypto public-key pki home-automation ioactive)
-
On-the-fly video transcoding during live streaming. They've done a great job of this!
At the beginning of the development of this feature, we entertained the idea to simply pre-transcode all the videos in Dropbox to all possible target devices. Soon enough we realized that this simple approach would be too expensive at our scale, so we decided to build a system that allows us to trigger a transcoding process only upon user request and cache the results for subsequent fetches. This on-demand approach: adapts to heterogeneous devices and network conditions, is relatively cheap (everything is relative at our scale), guarantees low latency startup time.
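The "transcode only on request, then cache" idea boils down to something like the following Python sketch. It is an illustration of the pattern only, not Dropbox's implementation: the class name, the `transcode` callable and the `(video_id, profile)` cache key are all hypothetical.

```python
class OnDemandTranscoder:
    """Transcode a (video, profile) pair on first request, then serve it from a cache."""

    def __init__(self, transcode):
        # Hypothetical callable taking (video_id, profile) and returning the encoded bytes.
        self._transcode = transcode
        # Plain unbounded dict for brevity; a real cache would need eviction.
        self._cache = {}

    def fetch(self, video_id, profile):
        key = (video_id, profile)
        if key not in self._cache:
            # Cache miss: pay the transcoding cost once, when a user actually asks.
            self._cache[key] = self._transcode(video_id, profile)
        return self._cache[key]
```

Compared with pre-transcoding everything for every device, only the combinations somebody actually requests ever get computed or stored.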
(tags: ffmpeg dropbox streaming video cdn ec2 hls http mp4 nginx haproxy aws h264)
GPLv2 being tested in US court
The case is still ongoing, so one to watch.
Plaintiff wrote an XML parser and made it available as open source software under the GPLv2. Defendant acquired from another vendor software that included the code, and allegedly distributed that software to parties outside the organization. According to plaintiff, defendant did not comply with the conditions of the GPL, so plaintiff sued for copyright infringement. Defendants moved to dismiss for failure to state a claim. The court denied the motion.
(tags: gpl open-source licensing software law legal via:fplogue)
Latest Snowden leak: GCHQ spying on Wikileaks users
“How could targeting an entire website’s user base be necessary or proportionate?” says Gus Hosein, executive director of the London-based human rights group Privacy International. “These are innocent people who are turned into suspects based on their reading habits. Surely becoming a target of a state’s intelligence and security apparatus should require more than a mere click on a link.” The agency’s covert targeting of WikiLeaks, Hosein adds, call into question the entire legal rationale underpinning the state’s system of surveillance. “We may be tempted to see GCHQ as a rogue agency, ungoverned in its use of unprecedented powers generated by new technologies,” he says. “But GCHQ’s actions are authorized by [government] ministers. The fact that ministers are ordering the monitoring of political interests of Internet users shows a systemic failure in the rule of law."
(tags: gchq wikileaks snowden privacy spying surveillance politics)
"Hackers" unsubscribed a former Mayor from concerned citizen's emails
"The dog ate my homework, er, I mean, hackers hacked my account."
Former Mayor of Kildare, Cllr. Michael Nolan, has denied a claim he asked a local campaigner to stop e-mailing him. Cllr. Michael Nolan from Newbridge said his site was hacked and wrong e-mails were sent out to a number of people, including Leixlip based campaigner, John Weigel. Mr. Weigel has been campaigning, along with others, about the danger of electromagnetic radiation to humans and the proximity of communications masts to homes and, in particular schools. He regularly updates local politicians on news items relating to the issue. Recently, he said that he had received an e-mail from Cllr. Nolan asking to be removed from Mr. Weigel’s e-mail list. The Leader asked Cllr. Nolan why he had done this. But the Fine Gael councillor said that “his e-mail account was hacked and on one particular day a number of mails were sent from my account pertaining to be from me.”
(tags: dog-ate-my-homework hackers funny kildare newbridge fine-gael michael-nolan email politics ireland excuses)
-
very good, workable tips on how to remote-work effectively (both in the comments of this thread and the original article)
(tags: tips productivity collaboration hn via:lhl remote-working telecommuting work)
Disgraced Scientist Granted U.S. Patent for Work Found to be Fraudulent - NYTimes.com
Korean researcher Hwang Woo-suk electrified the science world 10 years ago with his claim that he had created the world’s first cloned human embryos and had extracted stem cells from them. But the work was later found to be fraudulent, and Dr. Hwang was fired from his university and convicted of crimes. Despite all that, Dr. Hwang has just been awarded an American patent covering the disputed work, leaving some scientists dumbfounded and providing fodder to critics who say the Patent Office is too lax. “Shocked, that’s all I can say,” said Shoukhrat Mitalipov, a professor at Oregon Health and Science University who appears to have actually accomplished what Dr. Hwang claims to have done. “I thought somebody was kidding, but I guess they were not.” Jeanne F. Loring, a stem cell scientist at the Scripps Research Institute in San Diego, said her first reaction was “You can’t patent something that doesn’t exist.” But, she said, she later realized that “you can.”
(tags: patents absurd hwang-woo-suk cloning stem-cells science biology uspto)
-
'Testing applications under slow or flaky network conditions can be difficult and time consuming. Blockade aims to make that easier. A config file defines a number of docker containers and a command line tool makes introducing controlled network problems simple.' Open-source release from Dell's Cloud Manager team (ex-Enstratius), inspired by aphyr's Jepsen. Simulates packet loss using "tc netem", so no ability to e.g. drop packets on certain flows or certain ports. Still, looks very usable -- great stuff.
(tags: testing docker networking distributed distcomp enstratius jepsen network outages partitions cap via:lusis)
-
what US airports are causing the most misery? Looks like that old favourite, storms in ORD, right now.... (via Theo Schlossnagle)
(tags: via:postwait misery air-travel travel flying ord weather maps)
-
This sounds amazing. I hope it makes it to some kind of "semi-finished".
A semi-roguelike game inspired by Jorge Borges, Umberto Eco, Neal Stephenson, Shadow of the Colossus, Europa Universalis and Civilization. Although currently in its early stages, URR aims to explore several philosophical and sociological issues that both arose during the sixteenth and seventeenth century (when the game is approximately set), and in the present day, whilst also being a deep, complex and highly challenging roguelike. To do this the game seeks to generate realistic world histories, though ones containing a few unusual happenings and anomalous experiences. The traditional roguelike staple of combat will be rare and deadly – whilst these mechanics will be modeled in detail, exploration, trade and diplomacy factors will have just as much effort put into them.
(tags: games ultima-ratio-regum roguelikes borges umberto-eco worlds ascii-art)
-
It is interesting to note that the fake UK network was the only one detected by Verrimus. However, given that IMSI Catchers operate multiple fake towers simultaneously, it is highly likely that one or more Irish networks were also being intercepted. Very often a misconfiguration, such as an incorrect country code, is the only evidence available of an IMSI Catcher being deployed when forensic tools are not being used to look for one.
(tags: privacy imsi-catchers surveillance bugging spying gsocgate gsoc ireland mobile-phones)
-
An extremely congested local network segment causes the "TCP incast" throughput collapse problem -- packet loss occurs, and TCP throughput collapses as a side effect. So far, this is pretty unsurprising, and anyone designing a service needs to keep bandwidth requirements in mind. However it gets worse with Riak. Due to a bug, this becomes a serious issue for all clients: the Erlang network distribution port buffers fill up in turn, and the Riak KV vnode process (in its entirety) will be descheduled and 'cannot answer any more queries until the A-to-B network link becomes uncongested.' This is where EC2's fully-uncontended-1:1-network compute cluster instances come in handy, btw. ;)
(tags: incast tcp networking bandwidth riak architecture erlang buffering queueing)
Irish Law Society takes a stand for "brand owners IP rights"
The Law Society will attend a meeting of the Oireachtas Health Committee today to outline its strong opposition to the Government proposals to introduce legislation that will require tobacco products to use plain packaging. The society’s director general Ken Murphy will be its principal representative at the meeting today to discuss its submission on the legislation, and to discuss its concerns that a plain packaging regime will undermine registered trade mark, and design, systems and will amount to an “expropriation of brand owners intellectual property rights”. Speaking ahead of the meeting, Mr Murphy told The Irish Times the views contained in it represent those of the Law Society as a whole, and its 10,000 members, and have been endorsed by the society as a whole, rather than the committee. Mr Murphy also said the purpose of the Law Society submission was not to protect the tobacco industry, rather the wider effect and impact such a law would have on intellectual property rights, trade marks, in other areas. “There is a real concern also that plain packaging in the tobacco industry is just the beginning of a trend that will severely undermine intellectual property owners’ rights in other sectors such as alcohol, soft drinks and fast foods.”
Judging by some reactions on Twitter, "endorsed by the society as a whole" may be over-egging it a little.
(tags: law-society gubu law ireland ip packaging branding trademarks cigarettes health tobacco)
British American Tobacco - Plain packaging of tobacco products
Compare and contrast with the Law Society's comments:
We believe we are entitled to use our packs to distinguish our products from those of our competitors. Our brands are our intellectual property which we have created and invested in. Plain packaging would deny us the right to use brands. But also, a brand is also an important tool for consumers. As the British Brands Group has stated, plain packaging legislation "ignores the crucial role that branding plays in providing consumers with high quality, consistent products they can trust". The restriction of valuable corporate brands by any government would risk placing it in breach of legal obligations relating to intellectual property rights and, in most cases, international trade.
(tags: law-society branding ip ireland tobacco cigarettes law trademarks)
Why dispute resolution is hard
Good stuff (as usual) from Ross Anderson and Stephen Murdoch. 'Today we release a paper on security protocols and evidence which analyses why dispute resolution mechanisms in electronic systems often don’t work very well. On this blog we’ve noted many many problems with EMV (Chip and PIN), as well as other systems from curfew tags to digital tachographs. Time and again we find that electronic systems are truly awful for courts to deal with. Why? The main reason, we observed, is that their dispute resolution aspects were never properly designed, built and tested. The firms that delivered the main production systems assumed, or hoped, that because some audit data were available, lawyers would be able to use them somehow. As you’d expect, all sorts of things go wrong. We derive some principles, and show how these are also violated by new systems ranging from phone banking through overlay payments to Bitcoin. We also propose some enhancements to the EMV protocol which would make it easier to resolve disputes over Chip and PIN transactions.'
(tags: finance security ross-anderson emv bitcoin chip-and-pin banking architecture verification vvat logging)
CJEU in #Svensson says that in general it is OK to hyperlink to protected works without permission
IPKat says 'this morning the Court of Justice of the European Union issued its keenly awaited decision in Case C-466/12 Svensson [...]: The owner of a website may, without the authorisation of the copyright holders, redirect internet users, via hyperlinks, to protected works available on a freely accessible basis on another site. This is so even if the internet users who click on the link have the impression that the work is appearing on the site that contains the link.' This is potentially big news. Not so much for the torrent-site scenario, but for the NNI/NLI linking-to-newspaper-stories scenario.
(tags: ip svensson cjeu eu law pirate-bay internet web links http copyright linking hyperlinks)
Migrating from MongoDB to Cassandra
Interesting side-effect of using LUKS for full-disk encryption: 'For every disk read, we were pulling in 3MB of data (RA is sectors, SSZ is sector size, 6144*512=3145728 bytes) into cache. Oops. Not only were we doing tons of extra work, but we were trashing our page cache too. The default for the device-mapper used by LUKS under Ubuntu 12.04LTS is incredibly sub-optimal for database usage, especially our usage of Cassandra (more small random reads vs. large rows). We turned this down to 128 sectors — 64KB.'
(tags: cassandra luks raid linux tuning ops blockdev disks sdd)
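The readahead arithmetic in that quote is easy to sanity-check. A minimal sketch, assuming the 512-byte sector size given in the quote; the function name is made up for illustration:

```python
SECTOR_SIZE_BYTES = 512  # sector size as stated in the quoted post


def readahead_bytes(sectors, sector_size=SECTOR_SIZE_BYTES):
    """Bytes pulled into the page cache per read for a given readahead setting."""
    return sectors * sector_size


default_ra = readahead_bytes(6144)  # the default described in the quote
tuned_ra = readahead_bytes(128)     # the value they turned it down to

print(default_ra, default_ra // 2**20, "MiB")  # 3145728 bytes = 3 MiB
print(tuned_ra, tuned_ra // 2**10, "KiB")      # 65536 bytes = 64 KiB
```

On Linux the readahead value (in sectors) is usually inspected and changed with `blockdev --getra` and `blockdev --setra`; which device to point it at depends on the setup.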
-
Good to see the guys cracking on without me ;) '2014-02-11: SpamAssassin 3.4.0 has been released adding native support for IPv6, improved DNS Blocklist technology and support for massively-scalable Bayesian filtering using the Redis backend.'
(tags: antispam open-source spamassassin apache)
193_Cellxion_Brochure_UGX Series 330
The Cellxion UGX Series 330 is a 'transportable Dual GSM/Triple UMTS Firewall and Analysis Tool' -- ie. an IMSI catcher in a briefcase, capable of catching IMSI/IMEIs in 3G. It even supports configurable signal strength. Made in the UK
(tags: cellxion imsi-catchers imei surveillance gsocgate gsm 3g mobile-phones security spying)
-
'an interesting approach to a common problem, that of securely passing secrets around an infrastructure. It uses GPG signed files under the hood and nicely integrates with both version control systems and S3.' I like this as an approach to securely distributing secrets across a stack of services during deployment. Check in the file of keys, gpg keygen on the server, and add it to the keyfile's ACL during deployment. To simplify, shared or pre-generated GPG keys could also be used. (via the Devops Weekly newsletter)
(tags: gpg encryption crypto secrets key-distribution pki devops deployment)
java - Why not use Double or Float to represent currency?
A good canonical URL for this piece of coding guidance.
For example, suppose you have $1.03 and you spend 42c. How much money do you have left? System.out.println(1.03 - .42); => prints out 0.6100000000000001.
(tags: coding tips floating-point float java money currency bugs)
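The same trap exists in any language using IEEE 754 doubles, not just Java. A small Python sketch of the problem and two common workarounds (exact decimal types, or integer cents); this is an illustration, not code from the linked answer:

```python
from decimal import Decimal

# Binary floating point cannot represent 1.03 or 0.42 exactly,
# so the subtraction does not land on the double nearest to 0.61.
float_result = 1.03 - 0.42
print(float_result == 0.61)  # False

# Decimal values built from strings keep exact base-10 digits.
decimal_result = Decimal("1.03") - Decimal("0.42")
print(decimal_result)  # 0.61

# Alternative: keep money as an integer number of the smallest unit.
cents_left = 103 - 42
print(cents_left)  # 61
```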
"I'm Sorry for what I said when I was Hungry" tee-shirt
I can relate to this
(tags: tee-shirts apparel etsy hangry)
-
'One case involved Julian Assange's current home at the Ecuadorian Embassy in London, where visitors were surprised to receive welcome messages from a Ugandan telephone company. It turned out the messages were coming from a foreign base station device installed on the roof, masquerading as a cell tower for surveillance purposes. Appelbaum suspects the GCHQ simply forgot to reformat the device from an earlier Ugandan operation.'
via T.J. McIntyre.
(tags: surveillance nsa privacy imsi-catchers gchq london uganda mobile-phones julian-assange ecuador embassies)
The Spyware That Enables Mobile-Phone Snooping - Bloomberg
More background on IMSI catchers -- looking likely to have been the "government-level technology" used to snoop on the Garda Ombudsman's offices, particularly given the 'detection of an unexpected UK 3G network near the GSOC offices':
The technology involved is called cellular interception. The active variety of this, the “IMSI catcher,” is a portable device that masquerades as a mobile phone tower. Any phone within range (a mile for a low-grade IMSI catcher; as much as 100 miles for a passive interception device with a very large antenna, such as those used in India) automatically checks to see if the device is a tower operated by its carrier, and the false “tower” indicates that it is. It then logs the phone’s International Mobile Subscriber Identity number -- and begins listening in on its calls, texts and data communications. No assistance from any wireless carrier is needed; the phone has been tricked. [...] “network extender” devices -- personal mobile-phone towers -- sold by the carriers themselves, often called femtocells, can be turned into IMSI catchers.
Via T.J. McIntyre
(tags: via:tjmcintyre imsi-catchers surveillance privacy gsocgate mobile-phones spying imsi)
Git is not scalable with too many refs/*
Mailing list thread from 2011; git starts to keel over if you tag too much
Survey results of EU teens using the internet
A lot of unsupervised use:
Just under half of children said they access the internet from their own bedroom on a daily basis with 22pc saying they do so several times a day.
(tags: surveys eu ireland politics filtering internet social-media facebook children teens cyber-bullying)
-
a pretty thought-provoking article from Linux Journal on women in computing, and how we're doing it all wrong
(tags: feminism community programming coding women computing software society work linux-journal children teaching)
-
leading Bitcoin exchange "Magic The Gathering Online Exchange" turns out to suffer from crappy code, surprise:
why does Mt. Gox experience this issue? They run a custom Bitcoin daemon, with a custom implementation of the Bitcoin protocol. Their implementation, against all advice, does rely on the transaction ID, which makes this attack possible. They have actually been warned about it months ago by gmaxwell, and have apparently decided to ignore this warning. In other words, this is not a vulnerability in the Bitcoin protocol, but an implementation error in Mt. Gox' custom Bitcoin software.
The rest of the article is eyeopening, including the MySQL injection vulnerabilities and failure to correctly secure a Prolexic-defended server. https://news.ycombinator.com/item?id=7211286 has some other shocking reports of Bitcoin operators being incompetent, including 'Bitomat, the incompetent exchange that deleted their own [sole] amazon instance accidentally which contained all their keys, and thus customer funds'. wtfbbq
(tags: mtgox security bitcoin standards omgwtfbbq via:hn bitomat)
-
The side-effects of algorithmic false-positives get worse and worse.
What’s more, he adds, the NSA often locates drone targets by analyzing the activity of a SIM card, rather than the actual content of the calls. Based on his experience, he has come to believe that the drone program amounts to little more than death by unreliable metadata. “People get hung up that there’s a targeted list of people,” he says. “It’s really like we’re targeting a cell phone. We’re not going after people – we’re going after their phones, in the hopes that the person on the other end of that missile is the bad guy.”
(tags: false-positives glenn-greenwald drones nsa death-by-metadata us-politics terrorism sim-cards phones mobile-phones)
IBM's creepy AI cyberstalking plans
'let's say that you tweet that you've gotten a job offer to move to San Francisco. Using IBM's linguistic analysis technologies, your bank would analyze your Twitter feed and not only tailor services it could offer you ahead of the move--for example, helping you move your account to another branch, or offering you a loan for a new house -- but also judge your psychological profile based upon the tone of your messages about the move, giving advice to your bank's representatives about the best way to contact you.'
Ugh. Here's hoping they've patented this shit so we don't actually have to suffer through it. Creeeepy. (via Adam Shostack)
(tags: datamining ai ibm stupid-ideas creepy stalking twitter via:adamshostack)
-
This is bananas. Confirmation bias running amok.
Brandon Mayfield was a US Army veteran and an attorney in Portland, OR. After the 2004 Madrid train bombing, his fingerprint was partially matched to one belonging to one of the suspected bombers, but the match was a poor one. But by this point, the FBI was already convinced they had their man, so they rationalized away the non-matching elements of the print, and set in motion a train of events that led to Mayfield being jailed without charge; his home and office burgled by the FBI; his client-attorney privilege violated; his life upended.
(tags: confirmation-bias bias law brandon-mayfield terrorism fingerprints false-positives fbi scary)
A patent on 'Birth of a Child By Centrifugal Force'
On November 9 1965, the Blonskys were granted US Patent 3,216,423, for an Apparatus for Facilitating the Birth of a Child by Centrifugal Force. The drawings, as well as the text, are a revelation. The Patent Office has them online at http://tinyurl.com/jd4ra and I urge you - if you have any shred of curiosity in your body - to look them up. For conceiving what appears to be the greatest labour-saving device ever invented, George and Charlotte Blonsky won the 1999 Ig Nobel Prize in the field of Managed Health Care.
This is utterly bananas. (via christ)
(tags: via:christ crazy patents 1960s centrifuge birth medicine ignobels)
A Linguist Explains the Grammar of Doge. Wow.
In this sense, doge really is the next generation of LOLcat, in terms of a pet-based snapshot of a certain era in internet language. We’ve kept the idea that animals speak like an exaggerated version of an internet-savvy human, but as our definitions of what it means to be a human on the internet have changed, so too have the voices that we give our animals. Wow.
(tags: via:nelson language linguist doge memes internet english)
Big, Small, Hot or Cold - Your Data Needs a Robust Pipeline
'(Examples [of big-data B-I crunching pipelines] from Stripe, Tapad, Etsy & Square)'
(tags: stripe tapad etsy square big-data analytics kafka impala hadoop hdfs parquet thrift)
Realtime water level data across Ireland
Some very nice Dygraph-based time-series graphs in here, along with open CSV data. Good job!
(tags: open-data water-levels time-series data rivers ireland csv)
The Gardai haven't requested info on any Twitter accounts in the past 6 months
This seems to imply they haven't been investigating any allegations of cyber-bullying/harassment from "anonymous" Twitter handles, despite having the legal standing to do so. Enforcement is needed, not new laws
(tags: cyber-bullying twitter social-media enforcement gardai policing harassment online society law government)
QuakeNet IRC Network- Article - PRESS RELEASE: IRC NETWORKS UNDER SYSTEMATIC ATTACK FROM GOVERNMENTS
QuakeNet are not happy about GCHQ's DDoS attacks against them.
Yesterday we learned ... that GCHQ, the British intelligence agency, are performing persistent social and technological attacks against IRC networks. These attacks are performed without informing the networks and are targeted at users associated with politically motivated movements such as "Anonymous". While QuakeNet does not condone or endorse and actively forbids any illegal activity on its servers we encourage discussion on all topics including political and social commentary. It is apparent now that engaging in such topics with an opinion contrary to that of the intelligence agencies is sufficient to make people a target for monitoring, coercion and denial of access to communications platforms. The ... documents depict GCHQ operatives engaging in social engineering of IRC users to entrap themselves by encouraging the target to leak details about their location as well as wholesale attacks on the IRC servers hosting the network. These attacks bring down the IRC network entirely affecting every user on the network as well as the company hosting the server. The collateral damage and numbers of innocent people and companies affected by these forms of attack can be huge and it is highly illegal in many jurisdictions including the UK under the Computer Misuse Act.
-
Good to know; this generic anti-flap damping algorithm has a name.
A proportional-integral-derivative controller (PID controller) is a generic control loop feedback mechanism (controller) widely used in industrial control systems. A PID controller calculates an "error" value as the difference between a measured process variable and a desired setpoint. The controller attempts to minimize the error by adjusting the process control outputs.
(tags: control damping flapping pid-controller industrial error algorithms)
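A minimal sketch of the control loop described in that excerpt, in its textbook discrete form. The class name, gains and the toy "plant" in the usage loop are all assumptions for illustration, not taken from the linked article:

```python
class PIDController:
    """Textbook discrete PID: output = kp*error + ki*integral + kd*derivative."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None  # no derivative term until we have two samples

    def update(self, measurement, dt):
        # "error" is the difference between the setpoint and the measured value
        error = self.setpoint - measurement
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Toy usage: treat the controller output as the rate of change of the value.
pid = PIDController(kp=1.0, ki=0.1, kd=0.05, setpoint=10.0)
value, dt = 0.0, 0.1
for _ in range(50):
    value += pid.update(value, dt) * dt
print(round(value, 2))
```

The derivative term is what provides the damping: it pushes back when the error is changing quickly, which is the anti-flap behaviour in question.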
German IT Industry Looks for Boom from Snowden Revelations - SPIEGEL ONLINE
This is a great idea -- Neelie Kroes suggesting that there be a certification mark for EU companies who have top-of-the-line data protection practices.
(tags: data-protection privacy certification marks eu neelie-kroes)
GCHQ slide claiming that they DDoS'd anonymous' IRC servers
Mikko Hypponen: "This makes British Government the only Western government known to have launched DDoS attacks."
(tags: ddos history security gchq dos anonymous irc hacking)
RTE internal memo to unhappy staff re Pantigate
'I want to reassure you that RTÉ explored every option available to it, including right of reply. Legal advice was sought and all avenues were explored, including an offer to make a donation to a neutral charity.' And they folded. Notable lack of testicular fortitude by our national broadcaster.
(tags: fail rte leaks memos pantigate panti-bliss homophobia libel defamation ireland)
A looming breakthrough in indistinguishability obfuscation
'The team’s obfuscator works by transforming a computer program into what Sahai calls a “multilinear jigsaw puzzle.” Each piece of the program gets obfuscated by mixing in random elements that are carefully chosen so that if you run the garbled program in the intended way, the randomness cancels out and the pieces fit together to compute the correct output. But if you try to do anything else with the program, the randomness makes each individual puzzle piece look meaningless. This obfuscation scheme is unbreakable, the team showed, provided that a certain newfangled problem about lattices is as hard to solve as the team thinks it is. Time will tell if this assumption is warranted, but the scheme has already resisted several attempts to crack it, and Sahai, Barak and Garg, together with Yael Tauman Kalai of Microsoft Research New England and Omer Paneth of Boston University, have proved that the most natural types of attacks on the system are guaranteed to fail. And the hard lattice problem, though new, is closely related to a family of hard problems that have stood up to testing and are used in practical encryption schemes.' (via Tony Finch)
(tags: obfuscation cryptography via:fanf security hard-lattice-problem crypto science)
Little’s Law, Scalability and Fault Tolerance: The OS is your bottleneck. What you can do?
good blog post on Little's Law, plugging quasar, pulsar, and comsat, 3 new open-source libs offering Erlang-like lightweight threads on the JVM
(tags: jvm java quasar pulsar comsat littles-law scalability async erlang)
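For reference, Little's Law says the average number of requests in flight equals arrival rate times average time in the system (L = λW). A quick sketch of why that makes thread counts a bottleneck; the function names and figures are made up for illustration:

```python
def concurrency_needed(arrival_rate_per_sec, mean_latency_sec):
    """Little's Law: L = lambda * W, average requests in flight."""
    return arrival_rate_per_sec * mean_latency_sec


def max_throughput(max_concurrency, mean_latency_sec):
    """Rearranged: lambda = L / W, the rate a fixed concurrency cap can sustain."""
    return max_concurrency / mean_latency_sec


# 2000 requests/sec at 250 ms each means 500 requests in flight on average.
print(concurrency_needed(2000, 0.25))  # 500.0

# If the OS only copes comfortably with 200 threads, throughput is capped.
print(max_throughput(200, 0.25))  # 800.0
```

Raising the concurrency cap (e.g. with lightweight threads) or cutting latency are the only two levers the formula leaves.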
Target Hackers Broke in Via HVAC Company
Avivah Litan, a fraud analyst with Gartner Inc., said that although the current PCI standard does not require organizations to maintain separate networks for payment and non-payment operations (page 7), it does require merchants to incorporate two-factor authentication for remote network access originating from outside the network by personnel and all third parties.
Target shared the same network for outside contractor access and the critical POS devices. fail. (via Joe Feise)
(tags: via:joe-feise hvac contractors fraud malware 2fa security networking payment pci)
Yahoo! moving EMEA operations to Dublin
Like many companies, the structure of Yahoo's business is driven by the needs of the business. There are a number of factors which influence decisions about the locations in which the business operates. To encourage more collaboration and innovation, we’re increasing our headcount in Dublin, thus continuing to bring more Yahoos together in fewer locations. Dublin is already the European home to many of the world’s leading global technology brands and has been a home for Yahoo for over a decade already.
Via Conor O'Neill
-
zero-install, one-click video chat, using WebRTC. nifty
(tags: conference webrtc chat collaboration video google-chrome conferencing)
Opinion: How can we get over ‘Pantigate’?
The fact that RTÉ had agreed to pay damages (€80,000 in total, according to reports yesterday) to the ‘injured parties’, only came to light in an email from the [far-right Catholic lobby group Iona Institute] to its members last Tuesday. Given the ramifications of the decision to make any kind of payment – regardless of the amount – both for the TV licence payer and those who voice contrarian opinions, the lack of coverage in print media as soon as the Iona email came to light marked a low point for print journalism in Ireland. Aside from a lead story on the damages printed in this paper last Wednesday and ongoing debate online, the media has been glacially slow with commentary and even reportage of the affair. The debacle has untold ramifications for public life in this country. That many liberal commentators may now baulk at the opportunity to speak and write openly and honestly about homophobia is the most obvious issue here. Most worrying of all, however, is the question that with a referendum on the introduction of gay marriage on the horizon, how can we expect the national broadcaster to facilitate even-handed debate on the subject when they’ve already found themselves cowed before reaching the first hurdle?
(tags: homophobia politics ireland libel dissent lobbying defamation law gay-marriage iona-institute journalism newspapers)
-
Rest.li is a REST+JSON framework for building robust, scalable service architectures using dynamic discovery and simple asynchronous APIs. Rest.li fills a niche for building RESTful service architectures at scale, offering a developer workflow for defining data and REST APIs that promotes uniform interfaces, consistent data modeling, type-safety, and compatibility checked API evolution.
The new underlying comms layer for Voldemort, it seems.
(tags: voldemort d2 rest.li linkedin json rest http api frameworks java)
Hardened SSL Ciphers Using ELB and HAProxy
ELBs support the PROXY protocol
(tags: elb security proxying ssl tls https haproxy perfect-forward-secrecy aws ec2)
-
"A data scientist is a statistician who lives in San Francisco" - slide from Monkigras this year. lols
(tags: data-scientist statistics statistician funny jokes san-francisco tech monkigras)
The Million Dollar Deal - YouTube
My mate Luke's doc on the World Series of Poker -- now online in full. it's great.
A documentary about the World Series Of Poker in Las Vegas. Featuring Andrew Black, Donnacha O'Dea, Mike Magee, "Mad" Martyn Wilson, Mark Napolitano, Amarillo Slim, Scotty Nguyen, Dave "Devilfish" Ulliott & Matt Damon. Narrated by John Hurt. Directed by John Butler, Produced by Luke McManus
(tags: documentaries film poker world-series-of-poker mike-magee andrew-black donnacha-odea matt-damon)
How to invoke section 4 of the Data Protection Acts in Ireland
One weird trick to get your personal data (in any format) from any random organisation, for only EUR6.35 and up to 40 days wait! Good to know.
Hospitals and doctors’ offices in Ireland will give a person their medical records if they ask for them. Mostly. Eventually. When they get to it. And, sometimes, if you pay them over €100 (for a large file). But, like so much else in the legal world, there is a set of magic words you can incant to place a 40 day deadline on the delivery of your papers and limit the cost to €6.35 -- you invoke the Data Protection Acts data access request procedure.
(tags: data-protection privacy data-retention dpa-section-4 data ireland medical law dpa)
Save 10% on rymdkapsel on Steam
rymdkapsel is a game where you take command of a space station and its minions. You will have to plan your expansion and manage your resources to explore the galaxy.
recommended by JK.
(tags: steam games recommended space gaming)
Yammer Engineering - Resiliency at Yammer
Not content with adding Hystrix (circuit breakers, threadpooling, request time limiting, metrics, etc.) to their entire SOA stack, they've made it incredibly configurable by hooking in a web-based configuration UI, allowing dynamic on-the-fly reconfiguration by their ops guys of the circuit breakers and threadpools in production. Mad stuff
(tags: hystrix circuit-breakers resiliency yammer ops threadpools soa dynamic-configuration archaius netflix)
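For anyone who hasn't met the pattern: a circuit breaker fails fast once a dependency has failed too many times, then lets a trial call through after a cool-off. A bare-bones sketch of the idea follows. This is not Hystrix's API; every name here is hypothetical:

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without running."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: fall through and allow one trial call ("half-open").
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Since the threshold and timeout are plain attributes, they can be changed on a live instance, which is roughly the kind of on-the-fly tweaking the Yammer post describes doing through a UI.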
A network of ‘homes’, where children’s happiness was relentlessly destroyed
Stories of this sort will tumble out to the inquiry over the next 18 months, making it plain that the network of “homes” where children’s happiness had relentlessly, deliberately, systematically been destroyed, this archipelago of Catholic evil, had covered the entire island. These things should be kept in mind when next we hear it said that the social ills of today can be explained by reference to loss of faith in the traditional institutions of moral authority. This is the reverse of the truth and an insult to the victims of an unforgiveable sin.
(tags: horror care-homes politics catholicism religion ireland derry church abuse children)
Ukrainian police use cellphones to track protestors, court order shows
Protesters for weeks had suspected that the government was using location data from cellphones near the demonstration to pinpoint people for political profiling, and they received alarming confirmation when a court formally ordered a telephone company to hand over such data. [...] Three cellphone companies — Kyivstar, MTS and Life — denied that they had provided the location data to the government or had sent the text messages. Kyivstar suggested that it was instead the work of a “pirate” cellphone tower set up in the area. In a ruling made public on Wednesday, a city court ordered Kyivstar to disclose to the police which cellphones were turned on during an antigovernment protest outside the courthouse on Jan. 10.
(tags: tech location-tracking tracking privacy ukraine cellphones mobile-phones civil-liberties)
-
Netflix open-source library to make using ZooKeeper from Java less of a PITA. I really wish I'd used this now, having reimplemented some key parts of it after failures in prod ;)
(tags: zookeeper netflix apache curator java libraries open-source)
10 Things We Forgot to Monitor
a list of not-so-common outage causes which are easy to overlook; swap rate, NTP drift, SSL expiration, fork rate, etc.
Irish Company Locates Office in Ireland
Hot on the heels of Dropbox, AirBnB, Twitter, Facebook and many others, Irish online ticket sales company Tito are amongst the latest in a long series of companies choosing to locate their offices in Ireland. “It just seemed to make sense,” said founder Paul Campbell, talking about the decision making process that led him to set up shop in the capital, Dublin. “Dublin is great. There’s something really familiar about it that I can’t quite put my finger on.”
Har har!
(tags: ireland jokes funny tito hq tech-companies dublin via:oisin)
-
Sugru + neodymium magnets = WANT
(tags: sugru diy tools magnets want toget bike hacks fixing)
Capabilities of Movements and Affordances of Digital Media: Paradoxes of Empowerment | DMLcentral
Paradoxically, it’s possible that the widespread use of digital tools facilitates capabilities in some domains, such as organization, logistics, and publicity, while simultaneously engendering hindrances to [political] movement impacts on other domains, including those related to policy and electoral spheres.
(tags: society politics activism tech internet gezi-park tahrir-square euromaidan occupy)
-
Good description of the "hero coder" organisational antipattern.
Now imagine that most of the team is involved in fire-fighting. New recruits see the older recruits getting praised for their brave work in the line-of-fire and they want that kind of praise and reward too. Before long everyone is focused on putting out fires and it is in no one's interest to step back and take on the risks that long-term DevOps-focused goals entail.
(tags: coding ops admin hero-coder hero-culture firefighting organisations teams culture)
Open-Sourcing Ssync: An Out-of-the-Box Distributed Rsync
a script to perform divide-and-conquer recursive rsync over SSH
(tags: recursion scripts rsync ssync ssh divide-and-conquer)
Improving compaction in Cassandra with cardinality estimation
nice use of HyperLogLog
(tags: hyperloglog hll algorithms cassandra bloom-filters sstables cardinality)
-
Ad company InMobi are using graphite heavily (albeit not as heavily as $work are), ran into the usual scaling issues, and chose to fix it in code by switching from a filesystem full of whisper files to a LevelDB per carbon-cache:
The carbon server is now able to run without breaking a sweat even when 500K metrics per minute is being pumped into it. This has been in production since late August 2013 in every datacenter that we operate from.
Very nice. I hope this gets merged/supported.
(tags: graphite scalability metrics leveldb storage inmobi whisper carbon open-source)
BBC News - Pair jailed over abusive tweets to feminist campaigner
When a producer from BBC Two's Newsnight programme tracked Nimmo down after he had sent the abuse, the former call centre worker told him: "The police will do nothing, it's only Twitter."
(tags: bbc bullying social-media twitter society uk trolls trolling abuse feminism cyberbullying)
If You Used This Secure Webmail Site, the FBI Has Your Inbox
TorMail was a Tor-based webmail system, and apparently its drives have been imaged and seized by the FBI. More info on the Freedom Hosting seizure:
The connection, if any, between the FBI obtaining Freedom Hosting’s data and apparently launching the malware campaign through TorMail and the other sites isn’t spelled out in the new document. The bureau could have had the cooperation of the French hosting company that Marques leased his servers from. Or it might have set up its own Tor hidden services using the private keys obtained from the seizure, which would allow it to adopt the same .onion addresses used by the original sites. The French company also hasn’t been identified. But France’s largest hosting company, OVH, announced on July 29, in the middle of the FBI’s then-secret Freedom Hosting seizure, that it would no longer allow Tor software on its servers. A spokesman for the company says he can’t comment on specific cases, and declined to say whether Freedom Hosting was a customer. “Wherever the data center is located, we conduct our activities in conformity with applicable laws, and as a hosting company, we obey search warrants or disclosure orders,” OVH spokesman Benjamin Bongoat told WIRED. “This is all we can say as we usually don’t make any comments on hot topics.”
(tags: fbi freedom-hosting hosting tor tormail seizures ovh colo servers)
Sky parental controls break many JQuery-using websites
An 11 hour outage caused by a false positive in Sky's anti-phishing filter; all sites using the code.jquery.com CDN for JQuery would have seen errors.
Sky still appears to be blocking code.jquery.com and all files served via the site, and more worryingly is that if you try to report the incorrect category, once signing in on the Sky website you get an error page. We suspect the site was blocked due to being linked to by a properly malicious website, i.e. code.jquery.com and some javascript files were being used on a dodgy website and every domain mentioned was subsequently added to a block list.
(via Tony Finch)
(tags: via:fanf sky filtering internet uk anti-phishing phish jquery javascript http web fps false-positives)
Coders performing code reviews of scientific projects: pilot study
'PLOS and Mozilla conducted a month-long pilot study in which professional developers performed code reviews on software associated with papers published in PLOS Computational Biology. While the developers felt the reviews were limited by (a) lack of familiarity with the domain and (b) lack of two-way contact with authors, the scientists appreciated the reviews, and both sides were enthusiastic about repeating the experiment. ' Actually sounds like it was more successful than this summary implies.
(tags: plos mozilla code-reviews coding science computational-biology biology studies)
-
The views expressed by [the Iona Institute] – especially in relation to gay people – are very much at odds with the liberal secular society that Ireland has become. Indeed, Rory O’Neill suggested that the only time he experiences homophobia is online or at the hands of Iona and Waters. When they’re done with that, they can ask why Iona is given so much room in the media. In any other country in the world, an organisation as litigious as Iona would never be asked to participate in anything.
(tags: homophobia ireland john-waters iona-institute politics catholicism religion libel defamation rte the-irish-times)
-
mine's a Smoky/Spicy/Medicinal, thanks
Cassandra: tuning the JVM for read heavy workloads
The cluster we tuned is hosted on AWS and is comprised of 6 hi1.4xlarge EC2 instances, with 2 1TB SSDs raided together in a raid 0 configuration. The cluster’s dataset is growing steadily. At the time of this writing, our dataset is 341GB, up from less than 200GB a few months ago, and is growing by 2-3GB per day. The workload on this cluster is very read heavy, with quorum reads making up 99% of all operations.
Some careful GC tuning here. Probably not applicable to anyone else, but good approach in general.
(tags: java performance jvm scaling gc tuning cassandra ops)
Terms of Reference for the DCENR Internet Content Advisory Group
this is definitely one to send a consultation document response to
(tags: internet policing cyberbullying bullying antisocial free-speech governance children blocking filtering consultations dcenr)
Stupid Simple Things SF Techies Could Do To Stop Being Hated - Anil Dash
I've seen a lot of hand-wringing from techies in San Francisco and Silicon Valley saying "Why are we so hated?" now that there's been a more vocal contingent of people being critical of their lack of civic responsibility. Is it true that corruption and NIMBYism have kept affordable housing from being built? Sure. Is it true that members of the tech industry do contribute tax dollars to the city? Absolutely. But does that mean techies have done enough? Nope.
(tags: anil-dash politics society san-francisco gentrification helping tech community housing)
-
Some basic succinct data structures. [...] The main highlights are: a novel, broadword-based implementation of rank/select queries for up to 2^64 bits that is highly competitive with known 32-bit implementations on 64-bit architectures (additional space required is 25% for ranking and 12.5%-37.5% for selection); several Java structures using the Elias–Fano representation of monotone sequences for storing pointers, variable-length bit arrays, etc. Java code implementing minimal perfect hashing using around 2.68 bits per element (also using some broadword ideas); a few Java implementations of monotone minimal perfect hashing. Sux is free software distributed under the GNU Lesser General Public License.
(tags: sux succinct data-structures bits compression space coding)
Why sugar helped remove Victoria Line concrete flood
Sugar blocks concrete from setting. This I did not know
(tags: concrete london tube flooding sugar chemistry factoids)
Ukrainian government targeting protesters using threatening SMS messages
The government’s opponents said three recent actions had been intended to incite the more radical protesters and sow doubt in the minds of moderates: the passing of laws last week circumscribing the right of public assembly, the blocking of a protest march past the Parliament building on Sunday, and the sending of cellphone messages on Tuesday to people standing in the vicinity of the fighting that said, “Dear subscriber, you are registered as a participant in a mass disturbance.” [....] The phrasing of the message, about participating in a “mass disturbance,” echoed language in a new law making it a crime to participate in a protest deemed violent. The law took effect on Tuesday. And protesters were concerned that the government seemed to be using cutting-edge technology from the advertising industry to pinpoint people for political profiling. Three cellphone companies in Ukraine — Kyivstar, MTS and Life — denied that they had provided the location data to the government or had sent the text messages, the newspaper Ukrainskaya Pravda reported. Kyivstar suggested that it was instead the work of a “pirate” cellphone tower set up in the area.
(tags: targeting mobile-phones sms text-messaging via:tjmcintyre geotargeting protest ukraine privacy surveillance tech 1984)
UK porn filter blocks game update that contained 'sex' in URL
Staggeringly inept. The UK national porn filter blocks based on a regexp match of the URL against /.*sex.*/i -- the good old "Scunthorpe problem". Better, it returns a 404 response. This is also a good demonstration of how web filtering has unintended side effects, breaking third-party software updates with its false positives.
The update to online strategy game League of Legends was disrupted by the internet filter because the software attempted to access files that accidentally include the word “sex” in the middle of their file names. The block resulted in the update failing with “file not found” errors, which are usually created by missing files or broken updates on the part of the developers.
(tags: uk porn filtering guardian regular-expressions false-positives scunthorpe http web league-of-legends sex)
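To make the "Scunthorpe problem" concrete, here is a minimal sketch of that kind of rule. The pattern is the one quoted above; the function name and the URLs are made up for illustration, and nothing here claims to be the filter's actual implementation.

```python
import re

# The rule as described above: block any URL containing "sex" anywhere,
# case-insensitively. Everything else in this sketch is hypothetical.
BLOCK_PATTERN = re.compile(r".*sex.*", re.IGNORECASE)


def is_blocked(url: str) -> bool:
    """Return True if the naive substring rule would block this URL."""
    return BLOCK_PATTERN.match(url) is not None


# Made-up URLs, for illustration only.
urls = [
    "http://updates.example.com/files/VarusExpirationTimer.luaobj",  # "...sEx..."
    "http://www.essex.example.org/council/",                         # "es-sex"
    "http://example.com/news/today.html",                            # no match
]

for url in urls:
    print(is_blocked(url), url)
```

The first two URLs are entirely innocent, but both contain the three letters in sequence, so a substring rule cannot tell them apart from the content it is meant to catch.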
Register article on Amazon's attitude to open source
This article is frequently on target; this secrecy (both around open source and publishing papers) was one of the reasons I left Amazon.
Of the sources with whom we spoke, many indicated that Amazon's lack of participation was a key reason for why people left the company – or never joined at all. This is why Amazon's strategy of maintaining secrecy may derail the e-retailer's future if it struggles to hire the best talent. [...] "In many cases in the big companies and all the small startups, your Github profile is your resume," explained another former Amazonian. "When I look at developers that's what I'm looking for, [but] they go to Amazon and that resume stops ... It absolutely affects the quality of their hires." "You had no portfolio you could share with the world," said another insider on life after working at Amazon. "The argument this was necessary to attract talent and to retain talent completely fell on deaf ears."
(tags: amazon recruitment secrecy open-source hiring work research conferences)
Chinese Internet Traffic Redirected to Small Wyoming House
'That address — which is home to some 2,000 companies on paper — was the subject of a lengthy 2011 Reuters investigation that found that among the entities registered to the address were a shell company controlled by a jailed former Ukraine prime minister; the owner of a company charged with helping online poker operators evade an Internet gambling ban; and one entity that was banned from government contracts after selling counterfeit truck parts to the Pentagon.'
(tags: china internet great-firewall dns wyoming attacks security not-the-onion)
James Friend | PCE.js - Classic Mac OS in the Browser
This is a demo of PCE's classic Macintosh emulation, running System 7.0.1 with MacPaint, MacDraw, and Kid Pix. If you want to try out more apps and games see this demo.
Incredible. I remember using this version of MacPaint!
(tags: javascript browser emulation mac macos macpaint macdraw claris kid-pix history desktop pce)
-
'Lightweight performance tools'.
Likwid stands for 'Like I knew what I am doing'. This project contributes easy to use command line tools for Linux to support programmers in developing high performance multi-threaded programs. It contains the following tools: likwid-topology: Show the thread and cache topology likwid-perfctr: Measure hardware performance counters on Intel and AMD processors likwid-features: Show and Toggle hardware prefetch control bits on Intel Core 2 processors likwid-pin: Pin your threaded application without touching your code (supports pthreads, Intel OpenMP and gcc OpenMP) likwid-bench: Benchmarking framework allowing rapid prototyping of threaded assembly kernels likwid-mpirun: Script enabling simple and flexible pinning of MPI and MPI/threaded hybrid applications likwid-perfscope: Frontend for likwid-perfctr timeline mode. Allows live plotting of performance metrics. likwid-powermeter: Tool for accessing RAPL counters and query Turbo mode steps on Intel processor. likwid-memsweeper: Tool to cleanup ccNUMA memory domains.
No kernel patching required. (via kellabyte)
(tags: via:kellabyte linux performance testing perf likwid threading multithreading multicore mpi numa)
Backblaze Blog » What Hard Drive Should I Buy?
Because Backblaze has a history of openness, many readers expected more details in my previous posts. They asked what drive models work best and which last the longest. Given our experience with over 25,000 drives, they asked which ones are good enough that we would buy them again. In this post, I’ll answer those questions.
(tags: backblaze backup hardware hdds storage disks ops via:fanf)
Safe cross-thread publication of a non-final variable in the JVM
Scary, but potentially useful in future, so worth bookmarking. By carefully orchestrating memory accesses using volatile and non-volatile fields, one can ensure that a non-volatile, non-synchronized field's value is safely visible to all threads after that point due to JMM barrier semantics.
What you are looking to do is enforce a barrier between your initializing stores and your publishing store, without that publishing store being made to a volatile field. This can be done by using volatile access to other fields in the publication path, without using those variables in the later access paths to the published object.
(tags: volatile atomic java jvm gil-tene synchronization performance threading jmm memory-barriers)
Irish quango allegedly buys fake twitter followers
The Consumers Association of Ireland had a sudden jump from 300 to 3000 Twitter followers, mostly from Latin and South America -- with more followers in Brazil than Ireland. They are now blaming "hacking": http://www.independent.ie/irish-news/consumers-body-denies-buying-3000-twitter-fans-29931196.html
(tags: consumers quangos ireland politics twitter funny fake-followers latin-america south-america brazil social-media tech)
Big Red Kitchen on buying Irish honey
1. There is NO SUCH THING as "Organic Irish Honey" (due to EU directives making it impossible to certify); 2. In the absence of Organic the best thing you can look for is "Raw Irish honey" (which is of Irish origin, and not heated to very high temperatures, so it retains its antibacterial properties); 3. Blended honeys, or honeys which say EEC/Non EEC are NOT Irish, however they may be packed in Ireland; 4. Look for the NIHBS "Produced by Native Irish Honey Bees" or similar, for confirmation that the honey you are buying is indeed of Irish origin.
(tags: irish ireland honey buy-irish big-red-kitchen food organic-food)
More than 50% of Irish companies have "suffered a data breach" in 2013
The research, conducted among hundreds of Irish companies' IT managers by the Irish Computer Society, reveals that 51 per cent of Irish firms have suffered a data breach over the last year, a jump on 43 per cent recorded in 2012.
Wow, that's high.
(tags: hacking security ireland ics data-breaches)
Irish Internet Providers Roll Out KickassTorrents Blockade
The lucrative whack-a-mole business continues -- mostly in response to High Court actions, although Eircom are just helping out. I bet a google for "kickass proxy" doesn't return anything useful at all, of course....
(tags: kat kickasstorrents bittorrent piracy copyright high-court ireland eircom filtering blocking)
Internet Censors Came For TorrentFreak & Now I’m Really Mad
TF are not happy about Sky blocking their blog.
There can be little doubt that little by little, piece by piece, big corporations and governments are taking chunks out of the free Internet. Today they pretend that the control is in the hands of the people, but along the way they are prepared to mislead and misdirect, even when their errors are pointed out to them. I’m calling on Sky, Symantec, McAfee and other ISPs about to employ filtering to categorize this site correctly as a news site or blog and to please start listening to people’s legitimate complaints about other innocent sites. It serves nobody’s interests to wrongfully block legitimate information.
(tags: censorship isps uk sky torrentfreak piracy copyright filtering blocking symantec filesharing)
Harry - A Tool for Measuring String Similarity
a small tool for comparing strings and measuring their similarity. The tool supports several common distance and kernel functions for strings as well as some exotic similarity measures. The focus of Harry lies on implicit similarity measures, that is, comparison functions that do not give rise to an explicit vector space. Examples of such similarity measures are the Levenshtein distance and the Jaro-Winkler distance. For comparison Harry loads a set of strings from input, computes the specified similarity measure and writes a matrix of similarity values to output. The similarity measure can be computed based on the granularity of characters as well as words contained in the strings. The configuration of this process, such as the input format, the similarity measure and the output format, are specified in a configuration file and can be additionally refined using command-line options. Harry is implemented using OpenMP, such that the computation time for a set of strings scales linear with the number of available CPU cores. Moreover, efficient implementations of several similarity measures, effective caching of similarity values and low-overhead locking further speedup the computation.
via kragen.
(tags: via:kragen strings similarity levenshtein-distance algorithms openmp jaro-winkler edit-distance cli commandline hamming-distance compression)
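For a feel of what one of those measures computes, here is a small sketch of the Levenshtein distance and of the "matrix of similarity values" output shape. This is the textbook dynamic-programming algorithm, not Harry's own code (which is OpenMP-based); the function names are my own.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    # previous[j] is the distance between the prefix of `a` seen so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current
    return previous[-1]


def distance_matrix(strings):
    """Pairwise distances for a set of strings, as a list of rows."""
    return [[levenshtein(s, t) for t in strings] for s in strings]


words = ["kitten", "sitting", "mitten"]
for row in distance_matrix(words):
    print(row)
```

The matrix is symmetric with zeros on the diagonal, which is why a tool working at scale can get away with computing only half of it.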
-
A nice node.js app to perform continuous deployment from a GitHub repo via its webhook support, from Matt Sergeant
(tags: github node.js runit deployment git continuous-deployment devops ops)
-
yummy-looking recipe from Lily at amexicancook.ie
(tags: tacos mexican-food food recipes meat tacos-al-pastor)
Succinct Data Structures: Cramming 80,000 words into a Javascript file
a succinctly-encoded trie -- slow to encode, super-compact, but fast to look up
(tags: succinct-encoding tries coding performance compression data-structures algorithms)
Transport Minister planning to make hi-vis jackets mandatory for cyclists
The minister also spoke of a number of new transport initiatives, such as mandatory use of high visibility jackets by cyclists.
(tags: cycling safety law ireland leo-varadkar)
The Malware That Duped Target Has Been Found
a Windows 'RAM scraper' trojan known as Trojan.POSRAM, which was used to attack the Windows-based point-of-sales systems which the POS terminals are connected to. part of an operation called Kaptoxa. 'The code is based on a previous malicious tool known as BlackPOS that is believed to have been developed in 2013 in Russia, though the new variant was highly customized to prevent antivirus programs from detecting it' ... 'The tool monitors memory address spaces used by specific programs, such as payment application programs like pos.exe and PosW32.exe that process the data embossed in the magnetic strip of credit and debit cards data. The tool grabs the data from memory.' ... 'The siphoned data is stored on the system, and then every seven hours the malware checks the local time on the compromised system to see if it’s between the hours of 10 a.m. and 5 p.m. If so, it attempts to send the data over a temporary NetBIOS share to an internal host inside the compromised network so the attackers can then extract the data over an FTP ... connection.' http://www.pcworld.com/article/2088920/target-credit-card-data-was-sent-to-server-in-russia.html says the data was then transmitted to another US-based server, and from there relayed to Russia, and notes: 'At the time of its discovery, Trojan.POSRAM “had a zero percent antivirus detection rate, which means that fully updated antivirus engines on fully patched computers could not identify the software as malicious,” iSight said.' Massive AV fail.
(tags: kaptoxa trojans ram-scrapers trojan.posram posram point-of-sale security hacks target credit-cards pin ftp netbios smb)
Full iSight report on the Kaptoxa attack on Target
'POS malware is becoming increasingly available to cyber criminals' ... 'there is growing demand for [this kind of malware]'. Watch your credit cards...
(tags: debit-cards credit-cards security card-present attacks kaptoxa ram-scrapers trojans point-of-sale pos malware target)
-
Both Heartland Payment Systems and Hannaford Bros. were in fact certified PCI-compliant while the hackers were in their system. In August 2006, Wal-Mart was also certified PCI-compliant while unknown attackers were lurking on its network. [...] “This PCI standard just ain’t working,” says Litan, the Gartner analyst. “I wouldn’t say it’s completely pointless. Because you can’t say security is a bad thing. But they’re trying to patch a really weak [and] insecure payment system [with it].”
Basically, RAM scrapers have been in use in live attacks, sniffing credentials in the clear, since 2007. Ouch.
(tags: ram-scrapers trojans pins pci-dss compliance security gartner walmart target)
ISPAI responds to TD Patrick O'Donovan's bizarre comments regarding "open source browsers"
ISPAI is rather dismayed and somewhat confused by the recent press release issued by Deputy Patrick O’Donovan (FG). He appears to be asking the Oireachtas Communications Committee (of which he is a member) to investigate: “the matter of tougher controls on the use of open source internet browsers and payment systems” which he claims “allow users to remain anonymous for illegal trade of drugs weapons and pornography.” Deputy O’Donovan would do well to ask the advice of industry experts on these matters given that legislating to curtail the use of such legitimate software or services, which may be misused by some, is neither practical nor logical. Whether or not a browser is open source bears no relevance to its ability to be the subject of anonymous use. Indeed, Deputy O’Donovan must surely be confusing and conflating different technical concepts? In tracing illegal activities, Law Enforcement Agencies and co-operating parties will use IP addresses – users’ choice of browser has little relevance to an investigation of criminal activity. Equally, it may be that the Deputy is uncomfortable with the concept of electronic payment systems but these underpin the digital economy which is bringing enormous benefit to Ireland. Yes, these may be misused by criminals but so are cash and traditional banking services. Restricting the growth of innovative financial services is not the solution to tackling cyber criminals who might be operating what he describes as “online supermarkets for illegal goods.” Tackling international cybercrime requires more specialist Law Enforcement resources at national level and improved international police cooperation supported by revision of EU legislation relating to obtaining server log evidence existing in other jurisdictions.
(tags: ispai open-source patrick-o-donovan fine-gael press-releases tor darknet crime)
-
I use it to modify Time Machine’s backup behavior using weighted reservoir sampling. I built Time Warp to preserve important backup snapshots and prevent Time Machine from deleting them.
via Aman. Nifty!
(tags: backup python time-machine decay exponential-decay weighting algorithms snapshots ops)
Nominet now filtering .uk domain registrations for 'sex-crime content'
Amazing. Massive nanny-stateism of the 'something must be done' variety, with a 100% false-alarm hit rate, and it's now policy.
'Nominet have made a decision, based on a report by Lord Macdonald QC, that recommends that they check any domain registration that signals sex crime content or is in itself a sex crime. This is screening of domains within 48 hours of registration, and de-registration. The report says that such domains should be reported to the police.' [....] 'The report itself states [...] that in 2013 Nominet checked domains for key words used by the IWF, and as a result reported tens of thousands of domains to IWF for checking, all of which were false positives. Not one was, in fact, related to child sex abuse.'
(tags: filtering nominet false-positives nanny-state uk sex-crimes false-alarms domains iwf)
Tuning advice for HTTPS for nginx and HAProxy
from Ilya Grigorik. nginx version here: http://www.igvita.com/2013/12/16/optimizing-nginx-tls-time-to-first-byte/
A common error when using the Metrics library is to record Timer metrics on things like API calls, using the default settings, then to publish those to a time-series store like Graphite. Here's why this is a problem.
By default, a Timer uses an Exponentially Decaying Reservoir. The docs say:
'A histogram with an exponentially decaying reservoir produces quantiles which are representative of (roughly) the last five minutes of data. It does so by using a forward-decaying priority reservoir with an exponential weighting towards newer data. Unlike the uniform reservoir, an exponentially decaying reservoir represents recent data, allowing you to know very quickly if the distribution of the data has changed.'
This is more-or-less correct -- but the key phrase is 'roughly'. In reality, if the frequency of updates to such a timer drops off, it could take a lot longer, and if you stop updating a timer which uses this reservoir type, it'll never decay at all. The GraphiteReporter will dutifully capture the percentiles, min, max, etc. from that timer's reservoir every minute thereafter, and record those to Graphite using the current timestamp -- even though the data it was derived from is becoming more and more ancient.
Here's a demo. Note the long stretch of 800ms 99th-percentile latencies on the green line in the middle of this chart:
However, the blue line displays the number of events. As you can see, there were no calls to this API for that 8-hour period -- this one was a test system, and the user population was safely at home, in bed. So while Graphite is claiming that there's an 800ms latency at 7am, in reality the 800ms-latency event occurred 8 hours previously.
I observed the same thing in our production systems for various APIs which suffered variable invocation rates; if rates dropped off during normal operation, the high-percentile latencies hung around for far longer than they should have. This is quite misleading when you're looking at a graph for 10pm and seeing a high 99th-percentile latency, when the actual high-latency event occurred hours earlier. On several occasions, this caused lots of user confusion and FUD with our production monitoring, so we needed to fix it.
Here are some potential fixes.
Modify ExponentiallyDecayingReservoir to also call rescaleIfNeeded() inside getSnapshot() -- but based on this discussion, it appears the current behaviour is intended (at least for the mean measurement), so that may not be acceptable. Another risk of this is that it leaves us in a position where the percentiles displayed for time T may actually have occurred several minutes prior to that, which is still misleading (albeit less so).
Switch to sliding time window reservoirs, but those are unbounded in size -- so a timer on an unexpectedly-popular API could create GC pressure and out-of-memory scenarios. It's also the slowest reservoir type, according to the docs. That made it too risky for us to adopt in our production code as a general-purpose Timer implementation.
Update, Dec 2017: as of version 3.2.3 of Dropwizard Metrics, there is a new SlidingTimeWindowArrayReservoir reservoir implementation, which is a drop-in replacement for SlidingTimeWindowReservoir, with much more acceptable memory footprint and GC impact. It costs roughly 128 bits per stored measurement, and is therefore judged to be 'comparable with ExponentiallyDecayingReservoir in terms of GC overhead and performance'. (thanks to Bogdan Storozhuk for the tip)
What we eventually did in our code was to use this Reporter class instead of GraphiteReporter; it clears all Timer metrics' reservoirs after each write to Graphite. This is dumb and dirty, reaching across logical class boundaries, but at the same time it's simple and comprehensible behaviour: with this, we can guarantee that the percentile/min/max data recorded at timestamp T is measuring events in that timestamp's 1-minute window -- not any time before that. This is exactly what you want to see in a time-series graph like those in Graphite, so is a very valuable feature for our metrics, and one that others have noted to be important in comparable scenarios elsewhere.
Here's an example of what a graph like the above should look like (captured from our current staging stack):
Note that when there are no invocations, the reported 99th-percentile latency is 0, and each measurement doesn't stick around after its 1-minute slot.
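The difference between the two behaviours can be sketched with a toy model. This is not the Metrics API or our Reporter class: `SimpleReservoir`, `report` and `simulate` are hypothetical names, and the "sticky" case approximates a reservoir that has stopped being updated by simply never discarding anything.

```python
import math


class SimpleReservoir:
    """Toy stand-in for a timer reservoir: keeps every value until cleared."""

    def __init__(self):
        self.values = []

    def update(self, value_ms):
        self.values.append(value_ms)

    def percentile(self, q):
        """Nearest-rank percentile; 0 when there is no data."""
        if not self.values:
            return 0
        ordered = sorted(self.values)
        rank = max(1, math.ceil(q * len(ordered)))
        return ordered[rank - 1]

    def clear(self):
        self.values = []


def report(reservoir, clear_after_write):
    """Return the p99 a reporter would write for the current minute."""
    p99 = reservoir.percentile(0.99)
    if clear_after_write:
        reservoir.clear()
    return p99


def simulate(clear_after_write):
    reservoir = SimpleReservoir()
    written = []
    # Minute 0 has one slow 800ms call; minutes 1-3 have no traffic at all.
    per_minute_calls = [[20, 25, 800], [], [], []]
    for calls in per_minute_calls:
        for latency in calls:
            reservoir.update(latency)
        written.append(report(reservoir, clear_after_write))
    return written


sticky = simulate(clear_after_write=False)
cleared = simulate(clear_after_write=True)
print("never cleared:      ", sticky)
print("cleared after write:", cleared)
```

Without clearing, the single 800ms event is re-reported under every later timestamp; with clearing, it shows up only in the minute in which it happened.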
Another potential fix, for a related issue, would be to add support to Metrics so that it can use Gil Tene's LatencyUtils package, and its HdrHistogram class, as a reservoir. (Update: however, I don't think this would address the "old data leaking into newer datapoints" problem as fully.) This would address some other bugs in the Exponentially Decaying Reservoir, as Gil describes:
'In your example of a system logging 10K operations/sec with the histogram being sampled every second, you'll be missing 9 out of each 10 actual outliers. You can have an outlier every second and think you have one roughly every 10. You can have a huge business affecting outlier happening every hour, and think that they are only occurring once a day.'
Eek.
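A back-of-the-envelope check of the "9 out of each 10" figure, under assumptions that are mine rather than anything stated above: that the reservoir holds 1028 samples (which I believe is the library's default size), and that over one second it behaves roughly like a uniform sample of the operations seen, ignoring the exponential weighting.

```python
# Assumptions, not taken from the quote: reservoir size and uniform sampling.
RESERVOIR_SIZE = 1028      # assumed default sample count
OPS_PER_SECOND = 10_000    # rate from Gil's example

# Chance that any one operation (e.g. a lone outlier) is still in the
# reservoir when it is sampled at the end of the second.
retained_fraction = RESERVOIR_SIZE / OPS_PER_SECOND
missed_fraction = 1 - retained_fraction

print(f"retained: {retained_fraction:.1%}, missed: {missed_fraction:.1%}")
```

Roughly one in ten operations survives into the snapshot, which is consistent with missing about nine of every ten outliers.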
Branchless hex-to-decimal conversion hack
via @simonebordet, on the mechanical-sympathy list: ((c & 0x1F) + ((c >> 6) * 0x19) - 0x10)
(tags: hacks one-liners coding performance optimization hex conversion numbers ascii)
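A quick sketch to check the expression against every hex digit. The function name is my own; the arithmetic is exactly the one-liner above, transcribed to Python.

```python
def hex_digit_value(ch: str) -> int:
    """Value of one ASCII hex digit, computed with no branches.

    Digits '0'-'9' have (c >> 6) == 0 and (c & 0x1F) in 0x10..0x19, so
    subtracting 0x10 gives 0..9. Letters in either case have (c >> 6) == 1
    and (c & 0x1F) in 1..6, so adding 0x19 and subtracting 0x10 gives 10..15.
    """
    c = ord(ch)
    return (c & 0x1F) + ((c >> 6) * 0x19) - 0x10


for ch in "0123456789abcdefABCDEF":
    print(ch, hex_digit_value(ch))
```

Note that it does no validation: feed it a character that is not a hex digit and it will happily return a meaningless number.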
A sampling profiler for your daily browsing - Google Groups
via Ilya Grigorik: Chrome Canary now has a built-in, always-on, zero-overhead code profiler. I want this in my server-side JVMs!
(tags: chrome tracing debugging performance profiling google sampling-profiler javascript blink v8)
-
from tonx. Good advice
-
'The web's only open collection of legal contracts and the best way to negotiate and sign documents online'. (via Kowalshki)
(tags: via:kowalshki business documents legal law contracts)
How an emulator-fueled robot reprogrammed Super Mario World on the fly
Suffice it to say that the first minute-and-a-half or so of this [speedrun] is merely an effort to spawn a specific set of sprites into the game's Object Attribute Memory (OAM) buffer in a specific order. The TAS runner then uses a stun glitch to spawn an unused sprite into the game, which in turn causes the system to treat the sprites in that OAM buffer as raw executable code. In this case, that code has been arranged to jump to the memory location for controller data, in essence letting the user insert whatever executable program he or she wants into memory by converting the binary data for precisely ordered button presses into assembly code (interestingly, this data is entered more quickly by simulating the inputs of eight controllers plugged in through simulated multitaps on each controller port).
oh. my. god. This is utterly bananas.
(tags: games hacking omgwtfbbq hacks buffer-overrun super-mario snes security)
Nassim Taleb: retire Standard Deviation
'Use the mean absolute deviation [...] it corresponds to "real life" much better than the first—and to reality. In fact, whenever people make decisions after being supplied with the standard deviation number, they act as if it were the expected mean deviation.' Graydon Hoare in turn recommends the median absolute deviation. I prefer percentiles, anyway ;)
(tags: statistics standard-deviation stddev maths nassim-taleb deviation volatility rmse distributions)
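To see how the three measures differ, here is a small sketch using only the standard library. The sample data and helper names are made up; the second data set swaps one value for an outlier to show how each measure reacts.

```python
import statistics


def mean_absolute_deviation(xs):
    """Average distance of each value from the mean."""
    m = statistics.fmean(xs)
    return statistics.fmean([abs(x - m) for x in xs])


def median_absolute_deviation(xs):
    """Median distance of each value from the median."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


# Illustrative data only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = [2, 4, 4, 4, 5, 5, 7, 90]

for name, xs in [("data", data), ("with_outlier", with_outlier)]:
    print(
        name,
        "stddev:", round(statistics.pstdev(xs), 2),
        "mean abs dev:", round(mean_absolute_deviation(xs), 2),
        "median abs dev:", round(median_absolute_deviation(xs), 2),
    )
```

Squaring makes the standard deviation the most sensitive of the three to the single outlier, while the median absolute deviation does not move at all.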
Mathematical Purity in Distributed Systems: CRDTs Without Fear
Via Tony Finch. Funnily enough, the example describes Swrve: mobile game analytics, backed by a CRDT-based eventually consistent data store ;)
(tags: storage crdts semilattice idempotency commutativity data-structures distcomp eventual-consistency)
-
some good data (and graphs) on baby names (via Ruth)
(tags: via:ruth babies naming graphs dataviz data usa names)
-
Crowdsourcing transcription of some WWI artifacts: 'The story of the British Army on the Western Front during the First World War is waiting to be discovered in 1.5 million pages of unit war diaries. We need your help to reveal the stories of those who fought in the global conflict that shaped the world we live in today.' (via Luke)
Map of Steamship Routes of the World, 1914
massive image. very cool (via burritojustice)
(tags: maps desktop images steamships shipping history 1914 travel world)
Google Fonts recently switched to using Zopfli
Google Fonts recently switched to using new Zopfli compression algorithm: the fonts are ~6% smaller on average, and in some cases up to 15% smaller! [...] What's Zopfli? It's an algorithm that was developed by the compression team at Google that delivers ~3-8% bytesize improvement when compared to gzip with maximum compression. This byte savings comes at a cost of much higher encoding cost, but the good news is, fonts are static files and decompression speed is exactly the same. Google Fonts pays the compression cost once and every client gets the benefit of smaller download. If you’re curious to learn more about Zopfli: http://bit.ly/Y8DEL4
(tags: zopfli compression gzip fonts google speed optimization)
"Understanding the Robustness of SSDs under Power Fault", FAST '13 [paper]
Horrific. SSDs (including "enterprise-class storage") storing sync'd writes in volatile RAM while claiming they were synced; one device losing 72.6GB, 30% of its data, after 8 injected power faults; and all SSDs tested displayed serious errors including random bit errors, metadata corruption, serialization errors and shorn writes. Don't trust lone unreplicated, unbacked-up SSDs!
(tags: pdf papers ssd storage reliability safety hardware ops usenix serialization shorn-writes bit-errors corruption fsync)
Irish politician calls for ban on "open source browsers"
'Fine Gael TD for Limerick, Patrick O'Donovan has called for tougher controls on the use of open source internet browsers and payment systems which allow users to remain anonymous in the illegal trade of drugs, weapons and pornography.' Amazing. Yes, this is real.
(tags: open-source clueless omgwtfbbq fine-gael ireland fail funny tor inept)
Little-known Apollo 10 incident
'Apollo 10 had a little known incident in flight as evidenced by this transcript.' http://pic.twitter.com/NCZy7OdxDU
(tags: poo turds space spaceflight funny history apollo-10 apollo accidents)
-
As can be guessed, the higher the compression ratio, the more efficient FSE becomes compared to Huffman, since Huffman can't break the "1 bit per symbol" limit. FSE speed is also very stable, under all probabilities. I'm quite pleased with the result, especially considering that, since the invention of arithmetic coding in the 70's, nothing really new has been brought to this field. This is still beta stuff, so please consider this first release for testing purposes mostly.
Looking forward to this making it into a production release of some form.
(tags: compression algorithms via:kragen fse finite-state-entropy-coding huffman arithmetic-coding)
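The "1 bit per symbol" limit is easy to see with a little arithmetic. This sketch does not implement FSE; it only compares the Shannon entropy of a made-up, highly skewed two-symbol source with the whole-bit cost a Huffman code is stuck with.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits per symbol for a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical, highly compressible source: one symbol appears 95% of the time.
skewed = [0.95, 0.05]

ideal_bits = entropy_bits(skewed)
# A Huffman code gives every symbol a whole number of bits, so with two
# symbols each one costs exactly 1 bit, however skewed the source is.
huffman_bits = 1.0

print(f"entropy: {ideal_bits:.3f} bits/symbol, Huffman: {huffman_bits:.3f} bits/symbol")
```

Coders that can spend fractional bits per symbol, such as arithmetic coding, can approach the entropy figure, which is why the gap widens as the data gets more compressible.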
-
A bug in a scheduled OS upgrade script caused production DB servers to be upgraded while live. Fixes include fixing that script by verifying non-liveness on the host itself, and a faster parallel MySQL binary-log recovery command.
(tags: dropbox outage postmortems upgrades mysql)
Creative Commons event in Dublin this Friday
'Maximising Digital Creativity, Sharing and Innovation', Event organised by Creative Commons Ireland and Faculty of Law, University College Cork, Lecture Theatre, National Gallery of Ireland, Clare Street entrance, Dublin 2, Friday 17 January 2014, 9.45 a.m. to 1 p.m. (via Darius Whelan)
(tags: creative-commons ireland dublin events talks law copyright)
Growing up unvaccinated: A healthy lifestyle couldn’t prevent many childhood illnesses.
I understand, to a point, where the anti-vaccine parents are coming from. Back in the ’90s, when I was a concerned, 19-year-old mother, frightened by the world I was bringing my child into, I was studying homeopathy, herbalism, and aromatherapy; I believed in angels, witchcraft, clairvoyants, crop circles, aliens at Nazca, giant ginger mariners spreading their knowledge to the Aztecs, the Incas, and the Egyptians, and that I was somehow personally blessed by the Holy Spirit with healing abilities. I was having my aura read at a hefty price and filtering the fluoride out of my water. I was choosing to have past life regressions instead of taking antidepressants. I was taking my daily advice from tarot cards. I grew all my own veg and made my own herbal remedies. I was so freaking crunchy that I literally crumbled. It was only when I took control of those paranoid thoughts and fears about the world around me and became an objective critical thinker that I got well. It was when I stopped taking sugar pills for everything and started seeing medical professionals that I began to thrive physically and mentally.
Life on Mars: Irish man signs up for colony mission
Last week, a private space exploration company called Mars One announced that it has shortlisted 1,058 people from 200,000 applicants who wanted to travel to Mars. Roche is the only Irishman on the list. The catch? If he goes, he can never come back.
Mad stuff. Works at the Science Gallery, so a co-worker of a friend, to boot.
(tags: science-gallery dublin ireland mars-one mars one-way-trips exploration future space science joseph-roche)
UK NHS will soon require GPs pass confidential medical data to third parties
Specifically, unanonymised, confidential, patient-identifying data, for purposes of "admin, healthcare planning, and research", to be held indefinitely, via the HSCIC. Opt-outs may be requested, however.
(tags: opt-out privacy medical data healthcare nhs uk data-privacy data-protection)
-
'why the fuck does my fridge need Twitter?'
(tags: twitter funny tech home fridges internet web appliances consume)
Visualisation of the Raft distributed consensus protocol
Very pretty
(tags: consensus raft visualization distributed distcomp algorithms)
Directv DCA2SR0 01 Deca II Connected Home Adapter
a John-Looney-recommended MoCA adapter, allowing legacy coax home wiring to be used to carry Ethernet
(tags: ethernet coax legacy wiring home-networking moca directv)
Bruce Schneier and Matt Blaze on TAO's Methods
An important point:
As scarily impressive as [NSA's TAO] implant catalog is, it's targeted. We can argue about how it should be targeted -- who counts as a "bad guy" and who doesn't -- but it's much better than the NSA's collecting cell phone location data on everyone on the planet. The more we can deny the NSA the ability to do broad wholesale surveillance on everyone, and force them to do targeted surveillance in individuals and organizations, the safer we all are.
(tags: nsa tao security matt-blaze bruce-schneier surveillance tempest)
How the NSA (may have) put a backdoor in RSA’s cryptography: A technical primer
An excellent description of how the Dual_EC_DRBG backdoor works
(tags: surveillance tech dual_ec_drbg nsa rsa security backdoors via:jgc elliptic-curves)
Who Made That Nigerian Scam? - NYTimes.com
The history behind the 419 advance-fee fraud scam.
According to Robert Whitaker, a historian at the University of Texas, an earlier version of the con, known as the Spanish Swindle or the Spanish Prisoner trick, plagued Britain throughout the 19th century.
True facts about Ocean Radiation and the Fukushima Disaster
solid science
(tags: fukushima japan radiation risk ocean disasters sieverts contamination sea fish science)
Packet Flight: Facebook News Feed @8X
good dataviz of an HTTP page load: 'this is a visualization of a Facebook News Feed load from the perspective of the client, over a 3G wireless connection. Different packet types have different shapes and colors.' (via John Harrington)
(tags: via:johnharrington visualization facebook dataviz networking tcp 3g)
URGENT: Input needed on EU copyright consultation - Boing Boing
The EC is looking for feedback -- but not much, and pretty sharpish.
Go to www.copywrongs.eu and answer the questions which are important to you. You do not have to answer all the questions, only the ones that matter to you. [...] The deadline is 5 February 2014. Until then, we should provide the European Commission with as many responses as possible!
Peter Norvig writes a program to play regex golf with arbitrary lists
In response to XKCD 1313. This is excellent. It's reminiscent of my SpamAssassin SOUGHT-ruleset regexp-discovery algorithm, described in http://taint.org/2007/03/05/134447a.html , albeit without the BLAST step intended to maximise pattern length and minimise false positives.
(tags: python regex xkcd blast rule-discovery spamassassin rules regexps regular-expressions algorithms peter-norvig)
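For a feel of the regex-golf problem, here is a minimal sketch of one greedy approach. It is not Norvig's actual program. It collects candidate sub-patterns from the winners, throws away any that match a loser, then repeatedly picks the candidate covering the most remaining winners and ORs the picks together. The names `candidates` and `regex_golf`, the `max_len` limit and the tie-breaking rule are all assumptions of this sketch.

```python
import re

def candidates(winners, losers, max_len=4):
    """Substrings of winners (plus anchored whole words) that match no loser."""
    parts = {"^" + re.escape(w) + "$" for w in winners}
    for w in winners:
        for n in range(1, max_len + 1):
            for i in range(len(w) - n + 1):
                parts.add(re.escape(w[i:i + n]))
    return {p for p in parts if not any(re.search(p, l) for l in losers)}

def regex_golf(winners, losers):
    """Greedy set cover over candidate patterns; assumes at least one winner."""
    remaining = set(winners)
    pool = candidates(remaining, losers)
    chosen = []
    while remaining:
        def score(p):
            # Prefer wide coverage, then short patterns, then a stable order.
            covered = sum(1 for w in remaining if re.search(p, w))
            return (covered, -len(p), p)
        best = max(pool, key=score, default=None)
        if best is None or score(best)[0] == 0:
            raise ValueError("a winner is also a loser; nothing can separate them")
        chosen.append(best)
        remaining = {w for w in remaining if not re.search(best, w)}
    return "|".join(chosen)
```

Calling `regex_golf(["washington", "adams", "jefferson"], ["clinton", "kerry"])` returns a pattern that matches every winner and no loser. Greedy selection gives a short pattern, not necessarily the shortest one.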
-
Beautiful d3.js dataviz of wind patterns and forecasts, projected against a vector Earth map
(tags: earth map visualization weather javascript d3.js dataviz wind forecasts maps)
-
Good description, from Rafe Colburn, of Etsy's take on continuous deployment: committing directly to trunk, with unfinished work hidden behind feature flags
(tags: continuous-deployment coding agile deployment devops etsy rafe-colburn)
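The feature-flag idea can be sketched roughly as follows: unfinished code ships on trunk but stays dark until its flag is switched on. This is a generic illustration, not Etsy's configuration system. The `FLAGS` table, the flag name `new_checkout` and both function names are hypothetical.

```python
# Hypothetical flag table; in practice this would live in deployed config.
FLAGS = {
    "new_checkout": False,   # code is merged to trunk but still hidden
    "new_search": True,      # finished and switched on
}

def is_enabled(flag, flags=FLAGS):
    """Unknown flags count as off, so a half-deployed feature stays dark."""
    return bool(flags.get(flag, False))

def checkout(flags=FLAGS):
    if is_enabled("new_checkout", flags):
        return "new checkout flow"
    return "old checkout flow"
```

Turning the feature on then becomes a config change, separate from the deploy that shipped the code.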
Dogs like to excrete in alignment with the Earth's magnetic field
Dogs preferred to excrete with the body being aligned along the North-south axis under calm magnetic field conditions.
(tags: dogs poo excrement shit magnetic-field earth zoology papers)
Paul Graham and the Manic Pixie Dream Hacker
Under Graham’s influence, Mark [Zuckerberg], like many in Silicon Valley, subscribes to the Manic Pixie Dream Hacker ideal, making self-started teenage hackers Facebook’s most desired recruiting targets, not even so much for their coding ability as their ability to serve as the faces of hacking culture. “Culture fit”, in this sense, is one’s ability to conform to the Valley’s boyish hacker fantasy, which is easier, obviously, the closer you are to a teenage boy. Like the Manic Pixie Dream Girl’s role of existing to serve the male film protagonist’s personal growth, the Manic Pixie Dream Hacker’s job is to embody the dream hacker role while growing the VC’s portfolio. This is why the dream hacker never ages, never visibly develops interests beyond hardware and code, and doesn’t question why nearly all the other people receiving funding look like him. Like the actress playing the pixie dream girl, the pixie dream boy isn’t being paid to question the role for which he has been cast. In this way, for all his supposed “disruptiveness”, the hacker pixie actually does exactly what he is told: to embody, while he can, the ideal hacker, until he is no longer young, mono-focused, and boyish-seeming enough to qualify for the role (at that point, vested equity may allow him to retire). And like in Hollywood, VCs will have already recruited newer, younger ones to play him.
(tags: hackers manic-pixie-dream-girl culture-fit silicon-valley mark-zuckerberg paul-graham y-combinator vc work investment technology recruitment facebook ageism equality sexism)
-
Flapjack aims to be a flexible notification system that handles: Alert routing (determining who should receive alerts based on interest, time of day, scheduled maintenance, etc); Alert summarisation (with per-user, per media summary thresholds); Your standard operational tasks (setting scheduled maintenance, acknowledgements, etc). Flapjack sits downstream of your check execution engine (like Nagios, Sensu, Icinga, or cron), processing events to determine if a problem has been detected, who should know about the problem, and how they should be told.
(tags: flapjack notification alerts ops nagios paging sensu)
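A rough sketch of the alert-routing step described above: given a failing check event, decide who gets told, based on interest, time of day and scheduled maintenance. This is not Flapjack's API or data model. `Contact`, `recipients`, the event dict shape and the maintenance tuples are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, time

@dataclass
class Contact:
    name: str
    interests: set                  # check names this contact cares about
    start: time = time(0, 0)        # notify only between start and end
    end: time = time(23, 59, 59)

def recipients(event, contacts, maintenance, now):
    """Return the names of contacts to notify for one check event."""
    if event["state"] == "ok":
        return []
    # Suppress alerts for checks inside a scheduled maintenance window.
    for check, begin, finish in maintenance:
        if check == event["check"] and begin <= now < finish:
            return []
    return [
        c.name
        for c in contacts
        if event["check"] in c.interests and c.start <= now.time() <= c.end
    ]
```

Summarisation and acknowledgements are left out; the sketch covers routing only.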
We need your help to keep working for European digital rights in 2014
Grim. DRI are facing a 5-figure legal bill from the music industry - they need your donations to avoid shutdown
(tags: donations dri funding amicus-curiae law ireland digital-rights-ireland emi irma)
Replicant: Replicated State Machines Made Easy
The next time you reach for ZooKeeper, ask yourself whether it provides the primitive you really need. If ZooKeeper's filesystem and znode abstractions truly meet your needs, great. But the odds are, you'll be better off writing your application as a replicated state machine.
(tags: zookeeper paxos replicant replication consensus state-machines distcomp)
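To make the "replicated state machine" idea concrete, here is a minimal sketch: every replica applies the same ordered log of commands to a deterministic state machine, so all replicas end up in the same state. It is not Replicant's API. Agreeing on the log order (the consensus part, e.g. Paxos) is assumed to happen elsewhere, and the `Counter` and `Replica` classes are hypothetical.

```python
class Counter:
    """Deterministic state machine: same commands, same order, same state."""

    def __init__(self):
        self.values = {}

    def apply(self, command):
        op, key, amount = command
        if op == "incr":
            self.values[key] = self.values.get(key, 0) + amount
        elif op == "set":
            self.values[key] = amount
        else:
            raise ValueError("unknown op: %r" % (op,))
        return self.values[key]

class Replica:
    def __init__(self):
        self.machine = Counter()
        self.applied = 0    # index of the next log entry to apply

    def catch_up(self, log):
        """Apply any log entries this replica has not seen yet, in order."""
        while self.applied < len(log):
            self.machine.apply(log[self.applied])
            self.applied += 1
```

A replica that falls behind only needs the missing log entries to converge again, which is what makes the model attractive.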