-
'an elegant and simple HTTP library for Python, built for human beings.' 'Requests is an Apache2 Licensed HTTP library, written in Python, for human beings. Python’s standard urllib2 module provides most of the HTTP capabilities you need, but the API is thoroughly broken. It was built for a different time — and a different web. It requires an enormous amount of work (even method overrides) to perform the simplest of tasks. Requests takes all of the work out of Python HTTP/1.1 — making your integration with web services seamless. There’s no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, powered by urllib3, which is embedded within Requests.'
Surprisingly Good Evidence That Real Name Policies Fail To Improve Comments
'Enough theorizing, there’s actually good evidence to inform the debate. For 4 years, Koreans enacted increasingly stiff real-name commenting laws, first for political websites in 2003, then for all websites receiving more than 300,000 viewers in 2007, and was finally tightened to 100,000 viewers a year later after online slander was cited in the suicide of a national figure. The policy, however, was ditched shortly after a Korean Communications Commission study found that it only decreased malicious comments by 0.9%. Korean sites were also inundated by hackers, presumably after valuable identities. Further analysis by Carnegie Mellon’s Daegon Cho and Alessandro Acquisti, found that the policy actually increased the frequency of expletives in comments for some user demographics. While the policy reduced swearing and “anti-normative” behavior at the aggregate level by as much as 30%, individual users were not dismayed. “Light users”, who posted 1 or 2 comments, were most affected by the law, but “heavy” ones (11-16+ comments) didn’t seem to mind. Given that the Commission estimates that only 13% of comments are malicious, a mere 30% reduction only seems to clean up the muddied waters of comment systems a depressingly negligent amount. The finding isn’t surprising: social science researchers have long known that participants eventually begin to ignore cameras video taping their behavior. In other words, the presence of some phantom judgmental audience doesn’t seem to make us better versions of ourselves.' (via Ronan Lyons)
(tags: anonymity identity policy comments privacy politics new-media via:ronanlyons)
Category: Uncategorized
HAT-trie: A Cache-conscious Trie-based Data Structure for Strings [PDF]
'Tries are the fastest tree-based data structures for managing strings in-memory, but are space-intensive. The burst-trie is almost as fast but reduces space by collapsing trie-chains into buckets. This is not however, a cache-conscious approach and can lead to poor performance on current processors. In this paper, we introduce the HAT-trie, a cache-conscious trie-based data structure that is formed by carefully combining existing components. We evaluate performance using several real-world datasets and against other high-performance data structures. We show strong improvements in both time and space; in most cases approaching that of the cache-conscious hash table. Our HAT-trie is shown to be the most efficient trie-based data structure for managing variable-length strings in-memory while maintaining sort order.' (via Tony Finch)
(tags: via:fanf data-structures tries cache-aware trees)
The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases [PDF]
'Main memory capacities have grown up to a point where most databases fit into RAM. For main-memory database systems, index structure performance is a critical bottleneck. Traditional in-memory data structures like balanced binary search trees are not efficient on modern hardware, because they do not optimally utilize on-CPU caches. Hash tables, also often used for main-memory indexes, are fast but only support point queries. To overcome these shortcomings, we present ART, an adaptive radix tree (trie) for efficient indexing in main memory. Its lookup performance surpasses highly tuned, read-only search trees, while supporting very efficient insertions and deletions as well. At the same time, ART is very space efficient and solves the problem of excessive worst-case space consumption, which plagues most radix trees, by adaptively choosing compact and efficient data structures for internal nodes. Even though ART’s performance is comparable to hash tables, it maintains the data in sorted order, which enables additional operations like range scan and prefix lookup.' (via Tony Finch)
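To make the "sorted order" point concrete, here is a minimal sketch of a plain, non-adaptive trie in Python. It is not ART (there are no adaptive node sizes and no path compression), and the class and method names are made up for illustration. It only shows why a trie can answer a prefix lookup in key order, which a hash table cannot do without scanning and sorting everything.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # one character -> child TrieNode
        self.terminal = False  # True if a stored key ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True

    def _find(self, prefix):
        # Walk down one character at a time; None if the path does not exist.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, key):
        node = self._find(key)
        return node is not None and node.terminal

    def with_prefix(self, prefix):
        """Yield every stored key that starts with prefix, in sorted order."""
        start = self._find(prefix)
        if start is None:
            return
        stack = [(prefix, start)]
        while stack:
            key, node = stack.pop()
            if node.terminal:
                yield key
            # Push children in reverse order so the smallest is popped first.
            for ch in sorted(node.children, reverse=True):
                stack.append((key + ch, node.children[ch]))
```

Usage: after inserting "trie", "tree", "art", "arts" and "radix", `list(t.with_prefix("tr"))` walks only the "tr" subtree, and `t.with_prefix("")` is a full in-order scan.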
(tags: via:fanf data-structures trees indexing cache-aware tries)
Efficient In-Memory Indexing with Generalized Prefix Trees [PDF]
'Efficient data structures for in-memory indexing gain in importance due to (1) the exponentially increasing amount of data, (2) the growing main-memory capacity, and (3) the gap between main-memory and CPU speed. In consequence, there are high performance demands for in-memory data structures. Such index structures are used—with minor changes—as primary or secondary indices in almost every DBMS. Typically, tree-based or hash-based structures are used, while structures based on prefix-trees (tries) are neglected in this context. For tree-based and hash-based structures, the major disadvantages are inherently caused by the need for reorganization and key comparisons. In contrast, the major disadvantage of trie-based structures in terms of high memory consumption (created and accessed nodes) could be improved. In this paper, we argue for reconsidering prefix trees as in-memory index structures and we present the generalized trie, which is a prefix tree with variable prefix length for indexing arbitrary data types of fixed or variable length. The variable prefix length enables the adjustment of the trie height and its memory consumption. Further, we introduce concepts for reducing the number of created and accessed trie levels. This trie is order-preserving and has deterministic trie paths for keys, and hence, it does not require any dynamic reorganization or key comparisons. Finally, the generalized trie yields improvements compared to existing in-memory index structures, especially for skewed data. In conclusion, the generalized trie is applicable as general-purpose in-memory index structure in many different OLTP or hybrid (OLTP and OLAP) data management systems that require balanced read/write performance.' (via Tony Finch)
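A rough sketch of the "variable prefix length" idea, assuming fixed-width unsigned integer keys: each level of the trie consumes `prefix_bits` bits of the key, so the height is `key_bits / prefix_bits` and the path for a key is fully determined by its bits. The class and parameter names here are hypothetical, and none of the paper's techniques for reducing the number of created and accessed levels are included.

```python
class PrefixTrie:
    """Trie over fixed-width unsigned integer keys, prefix_bits bits per level."""

    def __init__(self, key_bits=16, prefix_bits=4):
        if key_bits % prefix_bits != 0:
            raise ValueError("key_bits must be a multiple of prefix_bits")
        self.key_bits = key_bits
        self.prefix_bits = prefix_bits
        self.height = key_bits // prefix_bits  # levels touched per lookup
        self.fanout = 1 << prefix_bits         # child slots per node
        self.root = [None] * self.fanout

    def _chunks(self, key):
        # Split the key into prefix_bits-sized pieces, most significant first.
        if not 0 <= key < (1 << self.key_bits):
            raise ValueError("key out of range")
        mask = self.fanout - 1
        for level in range(self.height - 1, -1, -1):
            yield (key >> (level * self.prefix_bits)) & mask

    def put(self, key, value):
        node = self.root
        chunks = list(self._chunks(key))
        for c in chunks[:-1]:
            if node[c] is None:
                node[c] = [None] * self.fanout
            node = node[c]
        node[chunks[-1]] = (value,)  # 1-tuple, so a stored None is distinguishable

    def get(self, key, default=None):
        node = self.root
        chunks = list(self._chunks(key))
        for c in chunks[:-1]:
            node = node[c]
            if node is None:
                return default
        slot = node[chunks[-1]]
        return default if slot is None else slot[0]

    def items(self):
        """Yield (key, value) pairs in ascending key order."""
        def walk(node, depth, prefix):
            for c, child in enumerate(node):
                if child is None:
                    continue
                key = (prefix << self.prefix_bits) | c
                if depth == self.height - 1:
                    yield key, child[0]
                else:
                    yield from walk(child, depth + 1, key)
        yield from walk(self.root, 0, 0)
```

Note that `put` and `get` never compare keys and never rebalance. A larger `prefix_bits` gives a shallower trie with wider (and, for sparse data, emptier) nodes, which is the height versus memory trade-off the abstract describes.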
(tags: via:fanf prefix-trees tries data-structures)
A Non-Blocking HashTable by Dr. Cliff Click : programming
Proggit discovers the NonBlockingHashMap. This comment from Boundary's cscotta is particularly interesting: "The code is intricate and curiously-formatted, but NBHM is quite excellent. The majority of our analytics platform is backed by NBHMs updated rapidly in parallel. Cliff's a great, friendly, approachable guy; if you have any specific questions about the approaches or implementation, he may be happy to answer."
(tags: data-structures algorithms non-blocking concurrency threading multicore cliff-click azul maps java boundary)
English Letter Frequency Counts: Mayzner Revisited or ETAOIN SRHLDCU
Amazing how consistent the n-gram counts are between Peter Norvig's analysis of the 20120701 Google Books corpus and Mark Mayzner's 20,000-word corpus from the early 1960s
(tags: english statistics n-grams words etaoin-shrdlu peter-norvig mark-mayzner)
Solving monitoring state storage problems using Redis
a nice basic Redis-in-practice post
-
'a HTTP client mock library for Python, 100% inspired on ruby's FakeWeb [ https://github.com/chrisk/fakeweb ].' 'HTTPretty monkey patches Python's socket core module, reimplementing the HTTP protocol by mocking requests and responses.'
(tags: mocking testing http python ruby unit-tests tests monkey-patching)
Why did infinite scroll fail at Etsy?
'A/B testing must be done in a modularized fashion. The “fail” case he gave was when Etsy spent months developing and testing infinite scroll to their search listings, only to find that it had a negative impact on engagement.' [...] 'instead of having the goal of “test infinite scroll,” Etsy realized it needed to test each assumption separately, and this going forward is their game plan.'
(tags: usability testing design etsy ab-testing test modularization via:hn)
-
Annotations-based git-like CLI helper for Java
"Matters Computational - Ideas, Algorithms, Source Code"
A hefty tome (in PDF format) containing lots of interesting algorithms and computational tricks; code is GPLv3 licensed
(tags: algorithms computation via:cliffc pdf books coding)
Dan McKinley :: Effective Web Experimentation as a Homo Narrans
Good demo from Etsy's A/B testing, of how the human brain can retrofit a story onto statistically-insignificant results. To fix: 'avoid building tooling that enables fishing expeditions; limit our post-hoc rationalization by explicitly constraining it before the experiment. Whenever we test a feature on Etsy, we begin the process by identifying metrics that we believe will change if we 1) understand what is happening and 2) get the effect we desire.'
(tags: testing etsy statistics a-b-testing fishing ulysses-contract brain experiments)
Lesser known crimes: do you own that copyright?
A very interesting crime on the Irish statute books:
Section 141 of the Copyright and Related Rights Act 2000 provides: A person who, for financial gain, makes a claim to enjoy a right under this Part [ie. copyright] which is, and which he or she knows or has reason to believe is, false, shall be guilty of an offence and shall be liable on conviction on indictment to a fine not exceeding £100,000, or to imprisonment for a term not exceeding 5 years, or both.
(tags: ireland copyright ip false-claims law)
-
Makers of "Feck It Sure It's Grand" merchandise are flogging stuff for 50 cent (+ shipping) in their "out with the old" sale
Patent trolls want $1,000 for using scanners
We are truly living in the future -- a dystopian future, but one nonetheless. A patent troll manages to obtain "gobbledigook" patents on using a scanner to scan to PDF, then attempts to shake down a bunch of small companies before eventually running into resistance, at which point it "forks" into a bunch of algorithmically-named shell companies, spammer-style, sending the same demands. Those demands in turn contain this beauty of Stockholm-syndrome-inducing prose:
'You should know also that we have had a positive response from the business community to our licensing program. As you can imagine, most businesses, upon being informed that they are infringing someone’s patent rights, are interested in operating lawfully and taking a license promptly. Many companies have responded to this licensing program in such a manner. Their doing so has allowed us to determine that a fair price for a license negotiated in good faith and without the need for court action is a payment of $900 per employee. We trust that your organization will agree to conform your behavior to respect our patent rights by negotiating a license rather than continuing to accept the benefits of our patented technology without a license. Assuming this is the case, we are prepared to make this pricing available to you.'
And here's an interesting bottom line: The best strategy for target companies? It may be to ignore the letters, at least for now. “Ignorance, surprisingly, works,” noted Prof. Chien in an e-mail exchange with Ars. Her study of startups targeted by patent trolls found that when confronted with a patent demand, 22 percent ignored it entirely. Compare that with the 35 percent that decided to fight back and 18 percent that folded. Ignoring the demand was the cheapest option ($3,000 on average) versus fighting in court, which was the most expensive ($870,000 on average). Another tactic that clearly has an effect: speaking out, even when done anonymously. It hardly seems a coincidence that the Project Paperless patents were handed off to a web of generic-sounding LLCs, with demand letters signed only by “The Licensing Team,” shortly after the “Stop Project Paperless” website went up. It suggests those behind such low-level licensing campaigns aren’t proud of their behavior. And rightly so.
(tags: patents via:fanf networks printing printers scanning patent-trolls project-paperless adzpro gosnel faslan)
Keep predicting and you’ll be right eventually?
debunking Ken Ring, the kiwi “long term weather prediction” “scientist” who gets trundled out every year around this time
(tags: ken-ring weather predictions ireland rain)
29c3 HashDOS presentation slides (PDF)
Summary: MurmurHash still vulnerable, likewise Cityhash and Python's hash -- use SipHash
(tags: via:fanf cityhash siphash hash dos security hashdos murmurhash)
Scaling Crashlytics: Building Analytics on Redis 2.6
How one analytics/metrics co is using Redis on the backend
(tags: analytics redis presentation metrics)
Systemd, systemd-nspawn, and namespaces for Linux service compartmentalization
"Using ReadOnlyDirectories= and InaccessibleDirectories= you may setup a file system namespace jail for your service. Initially, it will be identical to your host OS' file system namespace. By listing directories in these directives you may then mark certain directories or mount points of the host OS as read-only or even completely inaccessible to the daemon."
(tags: security systemd jails namespaces linux compartmentalisation)
GNUTLS project is leaving/attempting to leave the GNU project/FSF
seems there's trouble around governance and rights of the project's developers. GNU sed and grep's maintainer, too: http://article.gmane.org/gmane.comp.lang.smalltalk.gnu.general/7873
(tags: gnu software free free-software governance copyright management richard-stallman)
Raspberry Pi XBMC Shootout: Raspbmc vs OpenELEC vs XBian
summary: OpenELEC wins! (via Davie Norrie)
(tags: via:dnorrie openelec xbmc raspberry-pi tv htpc linux)
-
the enStratus “solo installer”; what they use for one-box testing, staging, and customer stack deployment, using chef-solo and Vagrant
(tags: chef virtualization vagrant chef-solo deployment enstratus cluster stack)
-
'thin software layers don’t add much value, especially when you have many such layers piled on each other. Each layer has to be pushed onto your mental stack as you dive into the code. Furthermore, the layers of phyllo dough are permeable, allowing the honey to soak through. But software abstractions are best when they don’t leak. When you pile layer on top of layer in software, the layers are bound to leak.'
(tags: code design terminology food antipatterns)
The innards of Evernote's new business analytics data warehouse
replacing a giant MySQL star-schema reporting server with a Hadoop/Hive/ParAccel cluster
(tags: horizontal-scaling scalability bi analytics reporting evernote via:highscalability hive hadoop paraccel)
HBase Real-time Analytics & Rollbacks via Append-based Updates
Interesting concept for scaling up the write rate on massive key-value counter stores:
'Replace update (Get+Put) operations at write time with simple append-only writes and defer processing of updates to periodic jobs or perform aggregations on the fly if user asks for data earlier than individual additions are processed. The idea is simple and not necessarily novel, but given the specific qualities of HBase, namely fast range scans and high write throughput, this approach works very well.'
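A toy in-memory sketch of that write path, to show the shape of the idea. This is not HBase client code, and all names are hypothetical. Increments are blind appends, a periodic job folds them into compacted totals, and a read that arrives before compaction aggregates the pending appends on the fly.

```python
from collections import defaultdict


class AppendOnlyCounters:
    def __init__(self):
        self.base = defaultdict(int)  # compacted totals
        self.log = []                 # pending (key, delta) appends

    def increment(self, key, delta=1):
        # Write path: a blind append, with no read of the current value.
        self.log.append((key, delta))

    def compact(self):
        # Periodic job: fold the pending deltas into the compacted totals.
        pending, self.log = self.log, []
        for key, delta in pending:
            self.base[key] += delta

    def read(self, key):
        # Read path: compacted total plus any deltas not yet compacted.
        pending = sum(d for k, d in self.log if k == key)
        return self.base.get(key, 0) + pending
```

The linear scan in `read` stands in for what the quote describes as a fast range scan over the appended records; in this sketch a read simply gets slower the longer compaction is deferred.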
(tags: counters analytics hbase append sematext aggregation big-data)
Cliff Click in "A JVM Does What?"
interesting YouTubed presentation from Azul's Cliff Click on some java/JVM innards
(tags: presentation concurrency jvm video java youtube cliff-click)
-
An attempt to catalogue some Emergency-era (ie. WWII) ground markings, used to notify US pilots that they were overflying the neutral Republic of Ireland
(tags: ireland eire history wwii the-emergency war geography mapping)
Bunnie Huang is building a once-off custom laptop design
As one commenter says, "it's like watching a Jedi construct his own light-saber." Quad-core ARM chips, on-board FPGA (!), and lots of other amazing hacker-friendly features; sounds like a one-of-a-kind device
Authentication is machine learning
This may be the most insightful writing about authentication in years:
(via Tony Finch.) From my brief time at Google, my internship at Yahoo!, and conversations with other companies doing web authentication at scale, I’ve observed that as authentication systems develop they gradually merge with other abuse-fighting systems dealing with various forms of spam (email, account creation, link, etc.) and phishing. Authentication eventually loses its binary nature and becomes a fuzzy classification problem.
This is not a new observation. It’s generally accepted for banking authentication and some researchers like Dinei Florêncio and Cormac Herley have made it for web passwords. Still, much of the security research community thinks of password authentication in a binary way [..]. Spam and phishing provide insightful examples: technical solutions (like Hashcash, DKIM signing, or EV certificates), have generally failed but in practice machine learning has greatly reduced these problems. The theory has largely held up that with enough data we can train reasonably effective classifiers to solve seemingly intractable problems.
(tags: passwords authentication big-data machine-learning google abuse antispam dkim via:fanf)
Hotels to pay royalties on music - The Irish Times - Fri, Dec 14, 2012
'The operators of hotels, guesthouses and bed & breakfasts will have to pay royalties for any copyright music played in guest bedrooms [in Ireland]. [...] Under the agreement, the music charges will be set by Phonographic Performance Ireland Ltd (PPI). [...] When it initiated its case in 2010, the PPI said it was seeking payment of about €1 per bedroom per week or about 14 cent a night.' I don't understand this. Most hotels do not play music in the rooms themselves. Does this apply if there is no music playing in the bedroom? Does it apply if the customer brings their own music? Are Dublin Bus to be next?
-
'The trouble with the Lisp-hacker tradition is that it is overly focused on the problem of programming -- compilers, abstraction, editors, and so forth -- rather than the problems outside the programmer's cubicle. I conjecture that the Lisp-school essayists -- Raymond, Graham, and Yegge -- have not “needed mathematics” because they spend their time worrying about how to make code more abstract. This kind of thinking may lead to compact, powerful code bases, but in the language of economics, there is an opportunity cost.'
The Aggregate Magic Algorithms
Obscure, low-level bit-twiddling tricks -- specifically:
Absolute Value of a Float, Alignment of Pointers, Average of Integers, Bit Reversal, Comparison of Float Values, Comparison to Mask Conversion, Divide Rounding, Dual-Linked List with One Pointer Field, GPU Any, GPU SyncBlocks, Gray Code Conversion, Integer Constant Multiply, Integer Minimum or Maximum, Integer Power, Integer Selection, Is Power of 2, Leading Zero Count, Least Significant 1 Bit, Log2 of an Integer, Next Largest Power of 2, Most Significant 1 Bit, Natural Data Type Precision Conversions, Polynomials, Population Count (Ones Count), Shift-and-Add Optimization, Sign Extension, Swap Values Without a Temporary, SIMD Within A Register (SWAR) Operations, Trailing Zero Count.
Many of these would be insane to use in anything other than the hottest of hot-spots, but good to have on file. (via Toby diPasquale)
(tags: hot-spots optimisation bit-twiddling algorithms via:codeslinger snippets)
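For flavour, here are a few of the listed tricks sketched in Python. These are the well-known textbook forms, written from memory rather than copied from the linked page, and they assume non-negative values that fit in 32 bits. Python integers are unbounded, so the "overflow-safe" remark only matters when porting to a fixed-width language.

```python
MASK32 = 0xFFFFFFFF


def is_power_of_two(x):
    # A power of two has exactly one 1 bit, which x - 1 clears.
    return x > 0 and (x & (x - 1)) == 0


def least_significant_one(x):
    # Two's-complement negation flips every bit above the lowest 1 bit.
    return x & -x


def next_power_of_two(x):
    """Smallest power of two >= x, for 1 <= x <= 2**31."""
    x -= 1
    # Smear the highest 1 bit into every lower position, then add one.
    x |= x >> 1
    x |= x >> 2
    x |= x >> 4
    x |= x >> 8
    x |= x >> 16
    return (x + 1) & MASK32


def average_floor(a, b):
    """floor((a + b) / 2) computed without ever forming a + b."""
    # Shared bits count in full; differing bits contribute half each.
    return (a & b) + ((a ^ b) >> 1)
```

For example, `next_power_of_two(5)` is 8 and `least_significant_one(12)` is 4.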
Shell Scripts Are Like Gremlins
Shell Scripts are like Gremlins. You start out with one adorably cute shell script. You commented it and it does one thing really well. It’s easy to read, everyone can use it. It’s awesome! Then you accidentally spill some water on it, or feed it late one night and omgwtf is happening!?
+1. I have to wean myself off the habit of automating with shell scripts where a clean, well-unit-tested piece of code would work better.
(tags: shell-scripts scripting coding automation sysadmin devops chef deployment)
-
"there's a nostalgic cost: your old original 386 DX33 system from early 1991 won't be able to boot modern Linux kernels anymore. Sniff." Now *THAT* is backwards compatibility.
(tags: linux backwards-compatibility 386 history linus-torvalds)
-
'The results are startlingly good. This 3D printed skull [see pic] looks almost real. This is the print quality everyone will be able to access when Mcor’s deal with Staples enables 3D printing from copy centers.'
BBC News - The hum that helps to fight crime
'Dr Harrison said: "If we have we can extract [the hum of the mains AC power's 50Hz wave] and compare it with the database, if it is a continuous recording, it will all match up nicely. "If we've got some breaks in the recording, if it's been stopped and started, the profiles won't match or there will be a section missing. Or if it has come from two different recordings looking as if it is one, we'll have two different profiles within that one recording." In the UK, because one national grid supplies the country with electricity, the fluctuations in frequency are the same the country over. So it does not matter if the recording has been made in Aberdeen or Southampton, the comparison will work.'
Two Sides For Salvation « Code as Craft
Etsy's MySQL master-master pair configuration, and how it allows no-downtime schema changes
(tags: database etsy mysql replication schema availability downtime)
GMail partial outage - Dec 10 2012 incident report [PDF]
TL;DR: a bad load balancer change was deployed globally, causing the impact. 21 minute time to detection. Single-location rollout is now on the cards
-
lovely signed and editioned prints by Dublin's best illustrators at good prices. Turns out this was in connection with a show a few days ago, so the best ones are now sold out -- I love the Chris Judge Liberty Hall print -- but there's still a few good ones left. Brian Gallagher's Georgian doorway is a beauty.
(tags: illustration dublin prints art chris-judge)
-
via Come Here To Me -- 'The whole population of the county at the time was under 60,000. Ringsend, Merrion, Monkstown, Bullock and Dalkey on the Southside and Ballybough, Clontarf, Sutton and Hoath/Howth on the Northside are marked. Taken from the book Dublin: through space and time (2001).'
Massive tracts of land were reclaimed since then, clearly -- the North bay comes all the way in to Ballybough!
Back-up Tut and other decoy spatial antiquities
I like this idea -- a complete facsimile of King Tut's burial chamber. Bldgblog comments:
'On the 90th anniversary of the discovery of King Tut’s tomb, an “authorized facsimile of the burial chamber” has been created, complete “with sarcophagus, sarcophagus lid and the missing fragment from the south wall.” The resulting duplicate, created with the help of high-res cameras and lasers, is “an exact facsimile of the burial chamber,” one that is now “being sent to Cairo by The Ministry of Tourism of Egypt.”' [...]
'Interestingly, we read that this was "done under a licence to the University of Basel," which implies the very real possibility that unlicensed duplicate rooms might also someday be produced—that is, pirate interiors ripped or printed from the original data set, like building-scale "physibles," a kind of infringed architecture of object torrents taking shape as inhabitable rooms.' [...]
'In their book Anachronic Renaissance, for instance, Alexander Nagel and Christopher Wood write of what they call a long "chain of effective substitutions" or "effective surrogates for lost originals" that nonetheless reached the value and status of an icon in medieval Europe. "[O]ne might know that [these objects] were fabricated in the present or in the recent past," Nagel and Wood write, "but at the same time value them and use them as if they were very old things." They call this seeing in "substitutional terms".'
(tags: via:new-aesthetic bldgblog archaeology facsimiles copying king-tut egypt history 3d-printing physibles)
-
"This project aims at creating a simple efficient building block for "Big Data" libraries, applications and frameworks; thing that can be used as an in-memory, bounded queue with opaque values (sequence of JDK primitive values): insertions at tail, removal from head, single entry peeks), and that has minimal garbage collection overhead. Insertions and removals are as individual entries, which are sub-sequences of the full buffer. GC overhead minimization is achieved by use of direct ByteBuffers (memory allocated outside of GC-prone heap); and bounded nature by only supporting storage of simple primitive value (byte, `long') sequences where size is explicitly known. Conceptually memory buffers are just simple circular buffers (ring buffers) that hold a sequence of primitive values, bit like arrays, but in a way that allows dynamic automatic resizings of the underlying storage. Library supports efficient reusing and sharing of underlying segments for sets of buffers, although for many use cases a single buffer suffices."
(tags: gc java jvm bytebuffer)
The "MIG-in-the-middle" attack
or, a very effective demonstration of a man-in-the-middle interception and replay attack, from a 1980s Namibia-Angola war, via Ross Anderson
Scoop! The inside story of the news website that saved the BBC
The Register's take on the early days of www.bbc.co.uk. Lots of politics, unsurprisingly.
Fifteen years ago this month the BBC launched its News Online website. Developed internally with a skeleton team, the web service rapidly became the face of the BBC on the internet, and its biggest success story – winning four successive BAFTA awards. Remarkably, it operated at a third of the cost of rival commercial online news operations – unheard of in public-sector IT projects. Devised before there were really any content management systems, the technical architecture became a template for all major news systems, and one that’s still in use today. The team endured some furious internal politicking and sabotage to survive.
Irish mobile phone companies: still spammy
'Pro tip: if you're going to spam, try not to spam the DPC's Director of Investigations.' -- lolz
(tags: funny oh-dear three hutchinson ireland mobile spam dpc law)
-
Wikipedia page.
The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used. It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the most typical case, a string of bits, this is the number of 1's in the string. In this binary case, it is also called the population count, popcount or sideways sum. It is the digit sum of the binary representation of a given number.
Contains an efficient algorithm to compute this for a given long value, by 'adding counts in a tree pattern.'
(tags: algorithms hamming-distance bits hamming weight binary)
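The "tree pattern" can be sketched in Python as below. This is the straightforward, unoptimised form for a 64-bit value: each step adds neighbouring fields of 1, 2, 4, 8, 16 and then 32 bits, so six steps reduce 64 one-bit counts to a single total. The function name is my own.

```python
def popcount64(x):
    """Count the 1 bits in a 64-bit value by summing adjacent fields pairwise."""
    x &= 0xFFFFFFFFFFFFFFFF  # treat the input as an unsigned 64-bit value
    x = (x & 0x5555555555555555) + ((x >> 1) & 0x5555555555555555)   # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)   # 4-bit sums
    x = (x & 0x0F0F0F0F0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F0F0F0F0F)   # 8-bit sums
    x = (x & 0x00FF00FF00FF00FF) + ((x >> 8) & 0x00FF00FF00FF00FF)   # 16-bit sums
    x = (x & 0x0000FFFF0000FFFF) + ((x >> 16) & 0x0000FFFF0000FFFF)  # 32-bit sums
    x = (x & 0x00000000FFFFFFFF) + ((x >> 32) & 0x00000000FFFFFFFF)  # 64-bit sum
    return x
```

A quick sanity check is to compare against `bin(v).count("1")` for a handful of values.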
Efficient concurrent long set and map
An ordered set and map data structure and algorithm for long keys and values, supporting concurrent reads by multiple threads and updates by a single thread.
Some good stuff in the linked blog posts about Clojure's PersistentHashMap and PersistentVector data structures, too.
(tags: arrays java tries data-structures persistent clojure concurrent set map)
James Hamilton - Failures at Scale & How to Ride Through Them - AWS re:Invent 2012 - Cpn208
mostly an update of his classic USENIX paper, but pretty cool to come across a mention of a network monitoring system we've built on page 21 ;)
(tags: amazon james-hamilton reliabilty slides aws)
_The Pauseless GC Algorithm_ [pdf]
Paper from USENIX VEE '05, by Cliff Click, Gil Tene, and Michael Wolf of Azul Systems, describing some details of the Azul secret sauce (via b6n)
Everything I Ever Learned About JVM Performance Tuning @Twitter
presentation by Attila Szegedi of Twitter from last year. Some good tips here, well-presented
The Rise And Fall Of The Obscure Music Download Blog: A Roundtable
One internet music "sharing" trend largely unnoticed by the powers that sue was the niche explosion of obscure music download blogs, lasting roughly from 2004-2008. Using free filesharing services like Rapidshare and Mediafire, and setting up sites on Blogspot and similar providers, these internet hubs stayed hidden in the open by catering to more discerning kleptomaniac audiophiles. Their specialty: parceling out ripped recordings — many of them copyrighted — from the more collectible and unknown corners of music's oddball, anomalous past. While the RIAA was suing dead people for downloading Michael Jackson songs (and Madonna was using Soulseek to curse at teenagers), obscure music blogs racked up millions of hits, ripping and sharing 80s Japanese noise, 70s German prog, 60s San Francisco hippie freak-outs, 50s John Cage bootlegs, 30s gramophone oddities, Norwegian death metal, cold wave cassettes made by kids in their garages, and the like. It was the mid aughts, and the advent of digitization had inadvertently put the value of the music industry's "Top Ten" commercial product in peril. That same process transformed the value of old, collectible music as well. If one smart record collector was able to share the entire contents—music, artwork and all—of one vinyl LP on his blog, for free, and upload another item from his 1,000+ collection the next day, for weeks and years, and others like him did the same, competing with each other about who could upload the rarest and most sought-after record, and anyone who downloaded it could then share it again and again… Suddenly everyone in the world had the coolest record collection in the world; and soon, nobody in the world had the coolest record collection in the world. Obscure music download blogs weren't shut down like Napster or Megaupload were (though they were indirectly affected by that crackdown); they just, mysteriously, seemed to burn out on their own sometime around 2008. 
While some are still around, their number represents only a fraction of that mid-00s heyday. Was this because obscure music blogs had overshared the underexposed and blown the whole thing into oblivion? Is the fact that a guy in Japan will no longer pay $500 on eBay for a first pressing of the No New York compilation because he can find it for free on the internet good for the world? Was the commodity-lost but the knowledge-gained an even exchange? To explore what was going on then, I assembled this email roundtable discussion between creators of some of the most popular blogs of the time: Eric Lumbleau of Mutant Sounds, Liam Elms of 8 Days in April, Frank of Systems of Romance and Brian Turner, Music Director of WFMU.
(via Loreana Rushe)
(tags: music mp3 blogs obscure via-loreana-rushe history 2000s)
Conor’s 2012 Raspberry Pi Christmas Gift Guide
Ah, memories! Wish my kiddies were old enough for one of these...
I really think this Christmas could be a lovely replay of 1982 for a lot of people, like me, who got their first home computer that year. You could have so much fun on Christmas Day messing with the RPi rather than falling asleep in front of the fire. Just don’t fight over who gets the telly when Doctor Who is on. Whilst the bare-bones nature of the Raspberry Pi is wonderful, it is unusable out of the box unless you are a house with smartphones, digital cameras and existing PCs already that you can raid for components. What you want to avoid is a repeat of me that December in 1982 with my brand-new 16K ZX Spectrum which didn’t work on our Nordmende TV until two weeks later when the RTV Rentals guy came and replaced the TV Tuner. Two weeks typing Beep 1,2 to make sure it wasn’t broken.
(tags: raspberry-pi gifts computers kids hacking education gadgets christmas)
Nintendo's work on Miiverse Penis Drawing Detection
'The unique feature of the Miiverse is being able to send drawings, not just text. But since the advent of the internet, there have always been those who have used it for unsavory purposes.'
See also the "time-to-penis" metric in MMO games: http://www.joystiq.com/2009/03/24/overheard-gdc09-ttp-time-to-penis/
'Motoyama: we never had such a problem with our Hatena services. But, when we brought Hatena Flipnote to the West, we were caught off-guard by the amount of penises drawn by people.
Kurisu: So the team and I had to come up with a way to create a system that auto-detects those types of pictures. [...]
'Motoyama: After a week, we made very good progress on the system. Then we tested the system with Nintendo of America and told them to start drawing. It went horribly.
Kurisu: What we learned is that people enjoy drawing penises. Multiple ones. (laughs) The system was not prepared to handle that.'
(tags: nintendo image-detection ttp metrics games gaming mmo miiverse drawing)
The trench talk that is now entrenched in the English language
'From cushy to crummy and blind spot to binge drink, a new study reveals the impact the First World War had on the English language and the words it introduced.' Incredible comments, too...
(tags: english etymology history wwi great-war via:sinead-gleeson words language)
Special encoding of small aggregate data types in Redis
Nice performance trick in Redis on hash storage: 'In theory in order to guarantee that we perform lookups in constant time (also known as O(1) in big O notation) there is the need to use a data structure with a constant time complexity in the average case, like an hash table. But many times hashes contain just a few fields. When hashes are small we can instead just encode them in an O(N) data structure, like a linear array with length-prefixed key value pairs. Since we do this only when N is small, the amortized time for HGET and HSET commands is still O(1): the hash will be converted into a real hash table as soon as the number of elements it contains will grow too much (you can configure the limit in redis.conf). This does not work well just from the point of view of time complexity, but also from the point of view of constant times, since a linear array of key value pairs happens to play very well with the CPU cache (it has a better cache locality than an hash table).'
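To make the trick concrete, here is a toy sketch in Python. It is not Redis's actual encoding: a plain flat list stands in for the length-prefixed array, and the `SmallHash` class and its `max_entries` threshold are names invented for illustration.

```python
class SmallHash:
    """Toy hash: a flat list while small, a real dict once it grows."""

    def __init__(self, max_entries=8):
        self.max_entries = max_entries  # hypothetical conversion threshold
        self._flat = []      # [k0, v0, k1, v1, ...], scanned linearly: O(N)
        self._table = None   # real hash table, only used after conversion

    def hset(self, key, value):
        if self._table is not None:
            self._table[key] = value
            return
        # Linear scan over the keys (even indexes) to update in place.
        for i in range(0, len(self._flat), 2):
            if self._flat[i] == key:
                self._flat[i + 1] = value
                return
        self._flat.extend((key, value))
        # Outgrew the compact encoding: convert to a dict, once.
        if len(self._flat) // 2 > self.max_entries:
            self._table = dict(zip(self._flat[0::2], self._flat[1::2]))
            self._flat = []

    def hget(self, key):
        if self._table is not None:
            return self._table.get(key)
        for i in range(0, len(self._flat), 2):
            if self._flat[i] == key:
                return self._flat[i + 1]
        return None

    def encoding(self):
        return "hashtable" if self._table is not None else "flat"
```

Because the scan only ever runs while the entry count is at or below the threshold, each lookup is bounded by a small constant, which is the point the quote makes about HGET and HSET staying effectively O(1).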
(tags: memory redis performance big-o hash-tables storage coding cache arrays)
HTTP Error 403: The service you requested is restricted - Vodafone Community
Looks like Vodafone Ireland are failing to scale their censorware; clients on their network are reporting "HTTP Error 403: The service you requested is restricted". According to a third-party site, this error is produced by the censorship software they use when it's insufficiently scaled for demand:
"When you try to use HTTP Vodafone route a request to their authentication server to see if your account is allow to connect to the site. By default they block a list of adult/premium web sites (this is service you have switched on or off with your account). The problem is at busy times this validation service is overloaded and so their systems get no response as to whether the site is allowed, so assume the site you asked for is restricted and gives the 403 error. Once this happens you seem to have to make new 3G data connection (reset the phone, move cell or let the connection time out) to get it to try again."
Sample: http://pic.twitter.com/N1lAwBjW
(tags: scaling ireland vodafone fail censorware scalability customer-service)
Does it run Minecraft? Well, since you ask…
Going by the number of Minecraft fans among my friends' sons and daughters in the 8-12 age group, this is a great idea:
We sent a bunch of [Raspberry Pi] boards out to Notch and the guys at Mojang in Stockholm a little while back, and they’ve produced a port of Minecraft: Pocket Edition which they’re calling Minecraft: Pi Edition. It’ll carry a revised feature set and support for several programming languages, so you can code direct into Minecraft before you start playing. (Or you can just – you know – play.)
(tags: minecraft gaming programming coding raspberry-pi kids learning education)
-
Martin Thompson with a good description of the x86 memory barrier model and how it interacts with Java's JSR-133 memory model
(tags: architecture hardware programming java concurrency volatile jsr-133)
IBM insider: How I caught my wife while bug-hunting on OS/2 • The Register
Wow, working for IBM in the 80s was truly shitty.
'IBM HR came up with a plan that summed up the department's view of tech staff: a dinner dance. In Southsea. For our non-British readers this is not a glamorous location. As a scumbag contractor I wasn’t invited, but since I was dating one of the seven women on the project, I went anyway and was impressed by the way IBM had tried so very hard to make the inside of a municipal leisure centre look like Hawaii. This is so crap that the integrity checks I’ve installed to watch myself for incipient senility keep flagging it as a false memory. The only way I can force myself to believe the idea that the richest corporation on the planet behaved that way is that the girl who took me is now a reassuringly expensive lawyer who was kind enough to marry me and so we have photographic evidence. (I wish to make it clear that I’m not saying IBM had the worst HR of any firm in the world, merely that my 28 years in technology and banking have never exposed a worse one to me.)'
And indeed, so were MS:
'We, on the other hand, were regarded as hopelessly bureaucratic. After Microsoft lost the source code for the actual build of OS/2 we shipped, I reported a bug triggered when you double-clicked on Chkdsk twice: the program would fire up twice and both would try to fix the disk at the same time, causing corruption. I noted that this “may not be consistent with the user's goals as he sees them at this time”. This was labelled a user error, and some guy called Ballmer questioned why I had this “obsession” with perfect code.'
(thanks, Conor!)
(tags: via:conor-delaney os2 ibm microsoft work 1980s pc uk steve-ballmer)
How Team Obama’s tech efficiency left Romney IT in dust | Ars Technica
The web-app dev and ops best practices used by the Obama campaign's tech team. Some key tools: Puppet, EC2, Asgard, Cacti, Opsview, StatsD, Graphite, Seyren, Route53, Loggly, etc.
Tumblr Architecture - 15 Billion Page Views A Month And Harder To Scale Than Twitter
Buckets of details on Tumblr's innards. They're fans of Finagle and Kafka, notably.
John Carmack's .plan update from 10/14/98
John Carmack presciently defines the benefits of an event sourcing architecture in 1998, as a key part of Quake 3's design:
"The key point: Journaling of time along with other inputs turns a realtime application into a batch process, with all the attendant benefits for quality control and debugging. These problems, and many more, just go away. With a full input trace, you can accurately restart the session and play back to any point (conditional breakpoint on a frame number), or let a session play back at an arbitrarily degraded speed, but cover exactly the same code paths."
(This was the first time I'd heard of the concept, at least.)
(tags: john-carmack design software coding event-sourcing events quake-3)
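A minimal Python sketch of the journaling idea, under the assumption that the game logic is a pure function of its inputs. The `step`, `run_live` and `replay` names are made up here, and a seeded random generator stands in for the wall clock and the player; nothing below is taken from Quake 3 itself.

```python
import random

def step(state, frame_time, user_input):
    """One logic tick: everything it depends on is passed in explicitly."""
    x, v = state
    v += user_input * frame_time
    x += v * frame_time
    return (x, v)

def run_live(num_frames, seed=1):
    """'Live' session: time and input are unpredictable, so journal them."""
    rng = random.Random(seed)  # stand-in for wall clock + player input
    journal, state = [], (0.0, 0.0)
    for _ in range(num_frames):
        frame_time = rng.uniform(0.01, 0.05)
        user_input = rng.choice([-1, 0, 1])
        journal.append((frame_time, user_input))  # time is just another input
        state = step(state, frame_time, user_input)
    return journal, state

def replay(journal, stop_at_frame=None):
    """Batch replay: same inputs, same code paths, at whatever speed you like."""
    state = (0.0, 0.0)
    for frame, (frame_time, user_input) in enumerate(journal):
        if frame == stop_at_frame:  # the "conditional breakpoint on a frame number"
            break
        state = step(state, frame_time, user_input)
    return state
```

Replaying the full journal reproduces the live session's final state exactly, and `stop_at_frame` lets you halt just before any frame you want to inspect.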
-
Unlike other tools intended to solve the JVM startup problem (e.g. Nailgun, Cake), Drip does not use a persistent JVM. There are many pitfalls to using a persistent JVM, which we discovered while working on the Cake build tool for Clojure. The main problem is that the state of the persistent JVM gets dirty over time, producing strange errors and requiring liberal use of cake kill whenever any error is encountered, just in case dirty state is the cause. Instead of going down this road, Drip uses a different strategy. It keeps a fresh JVM spun up in reserve with the correct classpath and other JVM options so you can quickly connect and use it when needed, then throw it away. Drip hashes the JVM options and stores information about how to connect to the JVM in a directory with the hash value as its name.
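The last step, naming a directory after a hash of the JVM options, might look roughly like this in Python. This is a guess at the shape of the idea only: the choice of SHA-1, the NUL separator and the `reserve_dir` helper are all assumptions, not Drip's real code or on-disk layout.

```python
import hashlib
from pathlib import Path

def reserve_dir(base_dir, jvm_options):
    """Directory holding connection info for a spare JVM with these options."""
    # Join with NUL so ["-a", "b"] and ["-a b"] cannot produce the same hash.
    joined = "\0".join(jvm_options).encode("utf-8")
    digest = hashlib.sha1(joined).hexdigest()  # hash choice is an assumption
    return Path(base_dir) / digest
```

Two invocations with identical options land on the same directory and can share the pre-warmed JVM; any change to the classpath or flags yields a different directory, so a stale JVM is never reused.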
(via HN)
(tags: java command-line tools startup speed)
Australian VCE exam question accidentally includes photoshopped Battletech mech
File under New Aesthetic:
Exams for the popular History: Revolution subject were original supposed to include the artwork Storming the Winter palace on 25th October 1917 by Nikolai Kochergin, which depicts events during the October Revolution, which was instrumental in the larger Russian Revolution of 1917. When students opened their exam this morning they found an altered version of the work with what appear to be a large "BattleTech Marauder" robot aiding the rising revolutionaries in the background.
(tags: new-aesthetic funny photoshop russia 1917 battletech mechs vcaa)
Building an Impenetrable ZooKeeper (PDF)
great presentation on operational tips for a reliable ZK cluster (via Bill deHora)
(tags: via:bill-dehora zookeeper ops syadmin)
WebTechStacks by martharotter - Kippt
A good set of infrastructure/devops tech blogs, collected by Martha Rotter
(tags: via:martharotter blogs infrastructure devops ops web links)
What can data scientists learn from DevOps?
Interesting. 'Rather than continuing to pretend analysis is a one-time, ad hoc action, automate it. [...] you need to maintain the automation machinery, but a cost-benefit analysis will show that the effort rapidly pays off — particularly for complex actions such as analysis that are nontrivial to get right.' (via @fintanr)
(tags: via:fintanr data-science data automation devops analytics analysis)
AnandTech - The Intel SSD DC S3700: Intel's 3rd Generation Controller Analyzed
Interesting trend; Intel moved from a btree to an array-based data structure for their logical-block address indirection map, in order to reduce worst-case latencies (via Martin Thompson)
(tags: latency intel via:martin-thompson optimization speed p99 data-structures arrays btrees ssd hardware)
Java tip: How to get CPU, system, and user time for benchmarking
a neat MXBean trick to get per-thread CPU usage in a running JVM (via Tatu Saloranta)
-
'I'd really prefer not to fork the language; I'd much rather collectively help carry the banner of Markdown forward into the future, with the blessing of John Gruber and in collaboration with other popular sites that use Markdown. So... who's with me?'
SipHash: a fast short-input PRF
a family of pseudorandom functions optimized for short inputs. Target applications include network traffic authentication and hash-table lookups protected against hash-flooding denials-of-service attacks. SipHash is simpler than MACs based on universal hashing, and faster on short inputs. Compared to dedicated designs for hash-table lookup, SipHash has well-defined security goals and competitive performance. For example, SipHash processes a 16-byte input with a fresh key in 140 cycles on an AMD FX-8150 processor, which is much faster than state-of-the-art MACs.
(tags: hashing siphash djb security algorithms)
#AltDevBlogADay » Functional Programming in C++
John Carmack makes a case for writing C++ in an FP style, with wide use of const and pure functions. Something similar can be achieved in pure Java using Guava's Immutable types, to a certain extent. I love his other posts on this site -- he argues persuasively for static code analysis and keeping multiple alternative subsystem implementations, too.
(tags: c++ programming functional-programming fp coding john-carmack const immutability)
Fuchsia MacAree — A-Z of Untranslatable Words
Lovely poster by fantastic Irish illustrator Fuchsia MacAree, who's launching her first exhibition of art and drawings at the Bernard Shaw tonight. See also "Learn To Swear With Captain Haddock": http://fuchsiamacaree.bigcartel.com/product/captain-haddock-print
(tags: want art prints fuchsia-macaree words etymology home)
-
Encyclopedic post from John Allspaw (of Etsy) on the topic, with an "Obligatory [List Of] Pithy Characteristics"
How to make a security geek feel very old: #Factorisation, #DKIM and @DrZacharyHarris
“A 384-bit key I can factor on my laptop in 24 hours. The 512-bit keys I can factor in about 72 hours using Amazon Web Services for $75. And I did do a number of those. Then there are the 768-bit keys. Those are not factorable by a normal person like me with my resources alone. But the government of Iran probably could, or a large group with sufficient computing resources could pull it off.” Remember when we thought 512-bit keys would be enough? How time flies! Of course, John Aycock raised this problem back in 2007, although he assumed it'd take a 100,000-host botnet to crack them (in 153 minutes).
(tags: factorisation moores-law cpu speed dkim domain-keys 512-bit cracking security via:alec-muffet)
Data distribution in the cloud with Node.js
Very interesting presentation from ex-IONAian Darach Ennis of Push Technology on eep.js, embedded event processing in JavaScript for node.js stream processing. Handles tumbling, monotonic, periodic and sliding windows at 8-40 million events per second; no multi-dimensional, infinite or predicate event-processing windows. (via Sergio Bossa)
(tags: via:sbtourist events event-processing streaming data ex-iona darach-ennis push-technology cep javascript node.js streams)
Raspberry Pi gets open-source video drivers
'As of right now, all of the VideoCore driver code which runs on the ARM is available under a FOSS license (3-Clause BSD to be precise). If you’re not familiar with the status of open source drivers on ARM SoCs this announcement may not seem like such a big deal, but it does actually mean that the BCM2835 used in the Raspberry Pi is the first ARM-based multimedia SoC with fully-functional, vendor-provided (as opposed to partial, reverse engineered) fully open-source drivers, and that Broadcom is the first vendor to open their mobile GPU drivers up in this way.' This is a great result -- congrats to the Raspberry Pi team for getting this to happen.
(tags: raspberry-pi open-source hardware drivers gpu graphics embedded-linux linux broadcom bsd bcm2835)
experimental CPU-cache-aware hash table implementations in Cloudera's Impala
via Todd Lipcon -- https://twitter.com/tlipcon/status/261113382642532352 'another cool piece of cloudera impala source: cpu-cache-aware hash table implementations by @jackowayed'. 'L1-sized hash table that hopes to use cache well. Each bucket is a chunk list of tuples. Each chunk is a cache line.'
(tags: hashing hash-tables data-structures performance c++ l1 cache cpu)
-
"The cost per element in major data structures offered by Java and Guava (r11)." A very useful reference!
Ever wondered what's the cost of adding each entry to a HashMap? Or one new element in a TreeSet? Here are the answers: the cost per-entry for each well-known structure in Java and Guava. You can use this to estimate the cost of a structure, like this: if the per-entry cost of a structure is 32 bytes, and your structure contains 1024 elements, the structure's footprint will be around 32 kilobytes. Note that non-tree mutable structures are amortized (adding an element might trigger a resize, and be expensive, otherwise it would be cheap), making the measurement of the "average per element cost" measurement hard, but you can expect that the real answers are close to what is reported below.
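The estimate in the quote is plain multiplication. A throwaway Python version, using the quote's own example numbers (the helper name is made up):

```python
def estimated_footprint_bytes(per_entry_bytes, entries):
    """Rough footprint: per-entry cost times number of entries."""
    return per_entry_bytes * entries

footprint = estimated_footprint_bytes(32, 1024)
print(footprint, "bytes =", footprint / 1024, "KiB")  # 32768 bytes = 32.0 KiB
```

For the amortized structures the quote warns about, treat the result as a ballpark figure rather than an exact size.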
(tags: java coding guava reference memory cost performance data-structures)
-
'A continuous version of Conway's Life, using floating point values instead of integers'. 'SmoothLifeL supports many interesting phenomena such as gliders that can travel in any direction, rotating pairs of gliders, 'wickstretchers' and the appearance of elastic tension in the 'cords' that join the blobs.' paper: http://arxiv.org/abs/1111.1567 , and slides: http://www.youtube.com/watch?v=iyTIXRhjXII (via jwz)
(tags: life games emergent-behaviour algorithms graphics via:jwz cool eye-candy conways-life floating-point continuous gliders)
Trident: a high-level abstraction for realtime computation
built on Storm:
Trident is a new high-level abstraction for doing realtime computing on top of Twitter Storm, available in Storm 0.8.0. It allows you to seamlessly mix high throughput (millions of messages per second), stateful stream processing with low latency distributed querying. If you're familiar with high level batch processing tools like Pig or Cascading, the concepts of Trident will be very familiar - Trident has joins, aggregations, grouping, functions, and filters. In addition to these, Trident adds primitives for doing stateful, incremental processing on top of any database or persistence store. Trident has consistent, exactly-once semantics, so it is easy to reason about Trident topologies.
(tags: distributed realtime twitter storm trident distcomp stream-processing low-latency nathan-marz)
-
glitch art, colour separations, etc. (via mlkshk)
Cliff Click's 2008 JavaOne talk about the NonBlockingHashTable
I'm a bit late to this data structure -- highly scalable, nearly lock-free, benchmarks very well (except with the G1 GC): http://edwwang.com/blog/2012/02/10/concurrent-hashmap-benchmark/ . Having said that, it doesn't cope well with frequently-changing unique keys: http://sourceforge.net/tracker/?func=detail&aid=3563980&group_id=194172&atid=948362 . More background at: http://www.azulsystems.com/blog/cliff/2007-03-26-non-blocking-hashtable and http://www.azulsystems.com/blog/cliff/2007-04-01-non-blocking-hashtable-part-2 This was used in Cassandra for a while, although I think the above bug may have caused its removal?
(tags: nonblockinghashtable data-structures hashmap concurrency scaling java jvm)
-
Excellent stuff, by Mary Mulvihill:
'Where in Dublin can you see a Victorian diving bell? What about the skeleton of Tommy, the prince’s elephant? The site of the world’s first earthquake experiment? Or the world’s sports pirate radio broadcast? Our new e-book Ingenious Dublin has all these fascinating stories and more. It is packed with information, places to visit, and lots of illustrations, and covers the city and county, from Skerries windmills to Ballybetagh’s fossil deer.'
EUR 4.99 for the Kindle e-book. I'll buy that!
(tags: kindle reading books mary-mulvihill science facts dublin ireland history)
Weathering the Unexpected - ACM Queue
Failures happen, and resilience drills help organizations prepare for them.
Good write-up on Google's DiRT (Disaster Recovery Test) procedures, clearly based on Amazon's Gameday exercises. ;) See also http://queue.acm.org/detail.cfm?id=2371297 for a moderated discussion including Jesse Robbins and John Allspaw.
(tags: game-day tests disaster-recovery dirt exercises history amazon google etsy resilience acm)
-
This is a must-read. One journalist's experience of constant online harassment by an antisemitic internet troll, and their eventual unmasking.
(tags: internet trolling harassment trolls antisemitism stories twitter)
Your Approach to Saving British Newspapers Will Not Work
the bad-anti-spam-idea checklist, repurposed
(tags: checklists via:trish-byrne funny media news newspapers uk ireland)
Facebook monitoring cache with Claspin
reasonably nice heatmap viz for large-scale instance monitoring. I like the "snake" pattern for racks
(tags: facebook monitoring dataviz heatmaps claspin cache memcached ui)
The Oireachtas great leap backwards: it’s not just about KildareStreet.com
'it appears that the Oireachtas has decided to save time and money by eliminating entirely the stage in their workflow that parsed raw debates records into XML. This stage has been replaced with a (presumably automated) process that generates web pages from Lotus Notes. It’s easy to see how somebody with little appreciation of the value of providing open public data in a structured format could have viewed this stage as a costly luxury, and its elimination as a simple and obvious “efficiency”. It’s particularly disappointing, however, that nobody in the decision-making process seemed to be aware of how much of a backward step this “efficiency” would represent. As John Handelaar of KildareStreet.com told The Irish Times, “We are replacing 2012 with 1995 overnight”.'
(tags: kildare-street open-data opengov ireland data oireachtas)
-
Excellent stuff. Using "sljit", a stackless platform-independent JIT compiler, this compiles Perl-compatible regular expressions to machine code on ARM, x86, MIPS and PowerPC platforms, resulting in 'similar matching speed to DFA based engines (like re2) on common patterns' with Perl compatibility. 'This work has been released as part of PCRE 8.20 and above. Now (PCRE 8.31), nearly all PCRE features are supported including UTF-8/16 and partial matching.'
(tags: pcre regexps regex performance optimization jit compilation dfa re2 via:akohli)
Spanner: Google's Globally-Distributed Database [PDF]
Abstract: Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: non-blocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner. To appear in: OSDI'12: Tenth Symposium on Operating System Design and Implementation, Hollywood, CA, October, 2012.
(tags: database distributed google papers toread pdf scalability distcomp transactions cap consistency)
NCBI ROFL: Probably the most horrifying scientific lecture ever
In 1983, at the Urodynamics Society meeting in Las Vegas, Professor G.S. Brindley first announced to the world his experiments on self-injection with papaverine to induce a penile erection. This was the first time that an effective medical therapy for erectile dysfunction (ED) was described, and was a historic development in the management of ED. The way in which this information was first reported was completely unique and memorable, and provides an interesting context for the development of therapies for ED. I was present at this extraordinary lecture, and the details are worth sharing. Although this lecture was given more than 20 years ago, the details have remained fresh in my mind, for reasons which will become obvious.
Go on, guess.
(tags: medicine science funny erectile-dysfunction omgwtf conferences)
Yuri Suzuki: London Underground circuit map radio
Japanese designer yuri suzuki has sent designboom images of his 'london underground circuit maps' project developed as part of the designers in residence program at the london design museum, on show until january 13th, 2013. responding to 'thrift' as a theme, suzuki's work explores communication systems in consumer electronics. a printed circuit board (PCB) is used as a precedent for developing a electrical circuit influenced by harry beck's iconic london underground map diagrams. by strategically positioning certain speaker, resistor and battery components throughout the map, users can visually understand the complex networks associated with electricity and how power is generated within a radio.
Beautifully done (via jwz).
(tags: electronics london art design underground travel yuri-suzuki circuitry)
The meanings and origins of ‘feck’
It's a "minced oath", apparently:
'Feck is a popular minced oath in Ireland, occupying ground between the ultra-mild expletive flip and the often taboo (but also popular) fuck. It’s strongly associated with Irish speech, and serves a broad range of linguistic purposes that I’ll address briefly in this post.'
It doesn't derive from the obvious source:
So where does the curse, the not-quite-rude word, come from? It’s commonly assumed to stem from its coarser cousin fuck, the simple vowel change undercutting its power and making it more suitable for public expression. But Julian Walker, an educator at the British Library, offers a more roundabout route: “In faith” becomes the improbable “in faith’s kin” shortened to “i’fackins”, which gradually shrinks to “fac” and “feck”.
(tags: feck swearing ireland irish hiberno-english father-ted etymology cursing)
-
nice concurrent Map data structure for the JVM; beats out ConcurrentHashMap, ConcurrentLinkedHashMap from guava, ConcurrentSkipListMap under both CMS and G1 garbage collectors.
(tags: concurrency benchmarks hashmap map data-structures java jvm snaptree)
Chip and Skim: cloning EMV cards with the pre-play attack
Worrying stuff from the LBT team. ATM RNGs are predictable, and can be spoofed by intermediate parties:
'So far we have performed more than 1000 transactions at more than 20 ATMs and a number of POS terminals, and are collating a data set for statistical analysis. We have developed a passive transaction logger which can be integrated into the substrate of a real bank card, which records up to 100 unpredictable numbers in its EEPROM. Our analysis is ongoing but so far we have established non-uniformity of unpredictable numbers in half of the ATMs we have looked at. First, there is an easier attack than predicting the RNG. Since the unpredictable number is generated by the terminal but the relying party is the issuing bank, any intermediate party – from POS terminal software, to payment switches, or a middleman on the phone line – can intercept and superimpose their own choice of UN. Attacks such as those of Nohl and Roth, and MWR Labs show that POS terminals can be remotely hacked simply by inserting a sabotaged smartcard into the terminal.'
(tags: atm banking security attack prngs spoofing banks chip-and-pin emv smartcards)
New UK Conservative Party Co-Chair Grant Shapps Founded Google Spamming Business
Wow. Scummy stuff.
Shapps founded HowToCorp in 2005, a site that, among other products, pitches the TrafficPaymaster software. The software apparently “scrapes” or copies content from all over the web, from RSS feeds to even sets of search results, to automatically generate pages that probably make little sense to the human visitor but which may pick up some traffic from Google and, in turn, generate clicks on Google AdSense or other ads.
Google are not happy:
On Sunday sources at Google confirmed TrafficPaymaster was in “violation” of its policies and that its search engine’s algorithms had been equipped to drop the ranking of any webpages created using HowToCorp’s software. Officially, Google said it does not comment on individual cases. “We have strict policies in place to ensure web users are presented with useful ads when browsing sites in our content network and to ensure our advertisers reach an engaged audience. If we are alerted to a site which breaks our AdSense policies, we will review it and can remove it from our network.”
(tags: grant-shapps uk politics tories spammers spamming spinning adsense google spam trafficpaymaster)
NunatsiaqOnline 2012-09-06: The First Non-Inuk on the Moon
No, I am not a conspiracy theorist who believes that Armstrong’s moon landing was faked at some mysterious location in the Nevada desert. Armstrong reached the moon. But his accolades are undeserved because he was not first. All right-thinking Nunavummiut know this, because we know that Inuit regularly visited the moon for centuries. David Iqaqrialu said as much in a heated exchange in the Nunavut legislature on May 6, 2002. We know it was heated because he prefaced his remarks by telling the Speaker, “I am starting to get hot under the collar...” He then went on to say, as reported in Hansard, “...it is not really related to the question that I posed, but this is background material. Inuit had reached the moon quite some time ago during the shamanistic ages, prior to the Americans reaching it with their machines and finding out it wasn’t what they thought it was.”
(via Dave Walsh)
(tags: inuit via:daev shaman nunavut neil-armstrong moon space exploration)
Dublin City contact numbers for potholes, dangerous drivers, illegal parking etc.
I'm sure these are about as useful as a chocolate teapot, but what the hey
(tags: dublin parking cycling roads safety potholes reporting)
Knots on Mars! (and a few thoughts on NASA's knots)
amazing post from the International Guild of Knot Tyers Forum:
While a few of the folks here are no doubt aware, it might surprise most people to learn that knots tied in cords and thin ribbons have probably traveled on every interplanetary mission ever flown. If human civilization ends tomorrow, interplanetary landers, orbiters, and deep space probes will preserve evidence of both the oldest and newest of human technologies for millions of years. Knots are still used in this high-tech arena because cable lacing has long been the preferred cable management technique in aerospace applications. That it remains so to this day is a testament to the effectiveness of properly chosen knots tied by skilled craftspeople. It also no doubt has a bit to do with the conservative nature of aerospace design and engineering practices. Proven technologies are rarely cast aside unless they no longer fulfill requirements or there is something substantially better available. While the knots used for cable lacing in general can be quite varied -- in some cases even a bit idiosyncratic -- NASA has in-house standards for the knots and methods used on their spacecraft. These are specified in NASA Technical Standard NASA-STD-8739.4 -- Crimping, Interconnecting Cables, Harnesses, and Wiring. As far as I've been able to identify in the rover images below, all of the lacings shown are one of two of the several patterns specified in the standard. The above illustration shows the so-called "Spot Tie". It is a clove hitch topped by two half-knots in the form of a reef (square) knot. In addition to its pure binding role, it is also used to affix cable bundles to tie-down point.
Some amazing scholarship on knot technology in this post -- lots to learn! (via Tony Finch, iirc)
(tags: via:fanf mars nasa science knots tying rope cables cabling geek aerospace standards)
Estonia introduces coding classes to 8-year-olds
'ProgreTiiger education will start with students in the first grade, which starts around the age of 7 or 8 for Estonians. The compsci education will continue through a student’s final years of public school, around age 16. Teachers are being trained on the new skills, and private sector IT companies are also getting involved, which makes sense, given that these entities will likely end up being the long-term beneficiaries of a technologically literate populace. The ProgreTiiger program is launching at a few pilot schools and will soon be rolling out to all general education schools in Estonia.'
(tags: estonia education coding programming kids children students learning school)
Avoiding Hash Lookups in a Ruby Implementation
'If I were to sum up the past 6 years I've spent optimizing JRuby it would be with the following phrase: Get Rid Of Hash Lookups.' This has been a particular theme of some recent optimization hacks I've been working on. Hashes may be O(1) to read, on average, but that doesn't necessarily mean they're the right tool for performance... (via Declan McGrath)
(tags: via:declanmcgrath hash optimization ruby performance jruby hashing data-structures big-o optimisation)
River Poddle underneath the city of Dublin's streets
Rarely-seen pictures of Dublin's underground river, which runs beneath Dublin Castle. I wonder if this is what those blokes spotted entering the drains were up to.
(tags: hidden-dublin ireland dublin history poddle rivers waterways subterrainean)
Striped (Guava: Google Core Libraries for Java 13.0.1 API)
Nice piece of Guava concurrency infrastructure in the latest release:
A striped Lock/Semaphore/ReadWriteLock. This offers the underlying lock striping similar to that of ConcurrentHashMap in a reusable form, and extends it for semaphores and read-write locks. Conceptually, lock striping is the technique of dividing a lock into many stripes, increasing the granularity of a single lock and allowing independent operations to lock different stripes and proceed concurrently, instead of creating contention for a single lock.
The guarantee provided by this class is that equal keys lead to the same lock (or semaphore), i.e. if (key1.equals(key2)) then striped.get(key1) == striped.get(key2) (assuming Object.hashCode() is correctly implemented for the keys). Note that if key1 is not equal to key2, it is not guaranteed that striped.get(key1) != striped.get(key2); the elements might nevertheless be mapped to the same lock. The lower the number of stripes, the higher the probability of this happening.
Prior to this class, one might be tempted to use Map<K, Lock>, where K represents the task. This maximizes concurrency by having each unique key mapped to a unique lock, but also maximizes memory footprint. On the other extreme, one could use a single lock for all tasks, which minimizes memory footprint but also minimizes concurrency. Instead of choosing either of these extremes, Striped allows the user to trade between required concurrency and memory footprint. For example, if a set of tasks are CPU-bound, one could easily create a very compact Striped<Lock> of availableProcessors() * 4 stripes, instead of possibly thousands of locks which could be created in a Map<K, Lock> structure.
(tags: locking concurrency java guava semaphores coding via:twitter)
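To make the striping trade-off concrete, here is a minimal Python sketch of the idea described in the Striped entry. It is not Guava's implementation: the class name StripedLocks, the helper run_locked and the stripe count of 8 are made up for illustration, and it only covers plain locks.

```python
import threading


class StripedLocks:
    """Hypothetical Python analogue of lock striping (not Guava's code)."""

    def __init__(self, stripes):
        if stripes <= 0:
            raise ValueError("stripes must be positive")
        # A fixed pool of locks, allocated up front: memory use is bounded
        # by the stripe count, not by the number of distinct keys.
        self._locks = [threading.Lock() for _ in range(stripes)]

    def get(self, key):
        # Equal keys hash equally, so they always map to the same lock.
        # Unequal keys may still collide on the same stripe.
        return self._locks[hash(key) % len(self._locks)]


striped = StripedLocks(stripes=8)


def run_locked(key, action):
    # Only callers whose keys land on the same stripe contend with each other.
    with striped.get(key):
        return action()


result = run_locked("user:42", lambda: "done")
```

Fewer stripes means less memory but more accidental sharing between unrelated keys, which is the same trade-off the quoted Javadoc describes.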
HotSpot JVM garbage collection options cheat sheet (v2)
'In this article I have collected a list of options related to GC tuning in JVM. This is not a comprehensive list, I have only collected options which I use in practice (or at least understand why I may want to use them). Compared to previous version a few useful diagnostic options was added. Additionally section for G1 specific options was introduced.'
Martin "Disruptor" Thompson's Single Writer Principle
Contains these millisecond estimates for highly-contended inter-thread signalling when incrementing a 64-bit counter in Java:
One Thread: 300
One Thread with Memory Barrier: 4,700
One Thread with CAS: 5,700
Two Threads with CAS: 18,000
One Thread with Lock: 10,000
Two Threads with Lock: 118,000
Undoubtedly not realistic for a lot of cases, but it's still useful for order-of-magnitude estimates of locking cost. Bottom line: don't lock if you can avoid it, even with 'volatile' or AtomicFoo types.
(tags: java jvm performance coding concurrency threading cas locking)
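For a quick feel for those orders of magnitude, this small Python sketch turns the quoted figures into slowdown factors relative to the uncontended single thread. The numbers are copied from the entry above; the variable names are only illustrative.

```python
# Millisecond figures as quoted in the entry above.
timings_ms = {
    "One Thread": 300,
    "One Thread with Memory Barrier": 4700,
    "One Thread with CAS": 5700,
    "Two Threads with CAS": 18000,
    "One Thread with Lock": 10000,
    "Two Threads with Lock": 118000,
}

baseline = timings_ms["One Thread"]

# Slowdown of each approach relative to the plain single-threaded loop.
slowdown = {name: ms / baseline for name, ms in timings_ms.items()}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.1f}x")
```

On these figures the contended lock comes out a few hundred times slower than the single writer, which is the gap the "don't lock if you can avoid it" advice is pointing at.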
Locks & Condition Variables - Latency Impact
Firstly, this is 3 orders of magnitude greater latency than what I illustrated in the previous article using just memory barriers to signal between threads. This cost comes about because the kernel needs to get involved to arbitrate between the threads for the lock, and then manage the scheduling for the threads to awaken when the condition is signalled. The one-way latency to signal a change is pretty much the same as what is considered current state of the art for network hops between nodes via a switch. It is possible to get ~1µs latency with InfiniBand and less than 5µs with 10GigE and user-space IP stacks. Secondly, the impact is clear when letting the OS choose what CPUs the threads get scheduled on rather than pinning them manually. I've observed this same issue across many use cases whereby Linux, in default configuration for its scheduler, will greatly impact the performance of a low-latency system by scheduling threads on different cores resulting in cache pollution. Windows by default seems to make a better job of this.
(tags: locking concurrency java jvm signalling locks linux threading)
Evolution of SoundCloud's Architecture
nice write-up. nginx, Rails, RabbitMQ, MySQL, Cassandra, Elastic Search, HAProxy
(tags: soundcloud webdev architecture scaling scalability)
What Happens to Stolen Bicycles?
'Bike thievery is essentially a risk-free crime. If you were a criminal, that might just strike your fancy. If Goldman Sachs didn’t have more profitable market inefficiencies to exploit, they might be out there arbitraging stolen bikes.' Good summary, and I suspect a lot applies in Dublin too -- flea markets and vanloads of stolen bikes being sent to other cities for reselling.
-
Great (Dublin-focused) writeup on cargo bikes
(tags: cargo-bikes cycling commute kids dutch-bikes bikes)
-
Some good algorithms and notes by Dmitry Vyukov on 'lockfree, waitfree, obstruction-free synchronization algorithms and data structures, scalability-oriented architecture, multicore/multiprocessor design patterns, high-performance computing, threading technologies and libraries (OpenMP, TBB, PPL), message-passing systems and related topics.' The catalog of lock-free queue implementations is particularly extensive (via Sergio Bossa)
(tags: algorithms concurrency articles dmitry-vyukov go c++ coding via:sergio-bossa)
Sting op exposes Andrews over FF Twitter rants - National News - Independent.ie
Incredible sting op uncovers the real identity of an anonymous Twitter account posting Fianna Fail gossip:
He discovered that each tweet had originated from the Twitter web interface, meaning it had been posted from a web browser on a computer, rather than sent from a mobile phone or other portable device. Based on the times that tweets were posted by @brianformerff, he deduced that the Tweets were being posted while the user was on a work break, using a company computer or an internet cafe. The next stage in the hunt was uncovering the IP address of the computer where the tweets originated. "I created my own web redirection service which would allow me to take links to articles of interest, for example in the Irish Times, and then transform them into short links that would pass through a redirection server I controlled. In this way, if someone read the tweets and clicked on the link, I would be able to establish the IP address of the computer that was being used at the time." The author created a new twitter account, @john_cant_type, based on the persona of a politics student based in Kildare. He started sending several messages and tweets to "brian" and other users to establish himself as a genuine twitter user. Eventually @brianformerff responded to a post from @john_cant_type to a link to an article at Silicon Republic. The bait was taken and the IP address was tracked to an internet cafe, Amazon cyber/net Rathmines which offers web access "at the very reasonable rate of €1/hour". What happened next descended almost into the realms of farce. The author waited for tweets from @brianformerff and then rushed to the internet cafe to try and catch Chris Andrews. Eventually the plan worked and the author used photography and video surveillance, even taking covert photographs of tweets as they were being posted in the internet cafe by Chris Andrews and analysing if the word count and structure matched the tweets appearing in cyberspace under the tag @brianformerff.
(tags: chris-andrews twitter surveillance privacy anonymity politics ireland fianna-fail)
-
The Rootbeer GPU Compiler makes it easy to use Graphics Processing Units from within Java. Rootbeer is more advanced than CUDA or OpenCL Java Language Bindings. With bindings the developer must serialize complex graphs of objects into arrays of primitive types. With Rootbeer this is done automatically. Also with language bindings, the developer must write the GPU kernel in CUDA or OpenCL. With Rootbeer a static analysis of the Java Bytecode is done (using Soot) and CUDA code is automatically generated. [...] All of the familiar Java code you have been writing can be executed on the GPU.
"In Which The Irish Invent Twitter in 1984"
A fascinating story of 1980s tech history -- 'The initial Text Tell PX-1000 was developed by Text Lite Ltd. in Ireland in the early 1980s, probably in 1983. It allowed people to create simple text messages and send them by phone anywhere in the world. It had a built-in memory that could hold up to 7400 characters. The firmware inside the PX-1000 was written by West-Tec Ltd. in Ireland, who were probably also the hardware manufacturers. [... A later version was] the Philips version of the PX-1000Cr, as it features advanced cryptographic capabilities. It was intended for small companies and journalists, and was also used by the Dutch Government. [...] it played an important role in the fight for Nelson Mandela's release from prison.'
(tags: nelson-mandela ireland history crypto texting text-lite 1980s philips)
French illegal downloads agency Hadopi may be abolished
According to recent statistics, Hadopi has sent 1 million warning emails, 99,000 "strike two" letters and identified 314 people for referral to the courts for possible disconnection. No one has actually been disconnected. According to Aurelie Filipetti, culture minister in the new French Government, Hadopi has been nothing but a waste of money. "€12 million per year and 60 officials; that's an expensive way to send 1 million emails," Filipetti said. "Hadopi has not fulfilled its mission of developing legal downloads. I prefer to reduce the funding of things that have not been proven to be useful."
0 disconnections. Not one.
NASA's Mars Rover Crashed Into a DMCA Takedown
An hour or so after Curiosity’s 1.31 a.m. EST landing in Gale Crater, I noticed that the space agency’s main YouTube channel had posted a 13-minute excerpt of the stream. Its title was in an uncharacteristic but completely justified all caps: “NASA LANDS CAR-SIZE ROVER BESIDE MARTIAN MOUNTAIN.” When I returned to the page ten minutes later, [...] the video was gone, replaced with an alien message: “This video contains content from Scripps Local News, who has blocked it on copyright grounds. Sorry about that.” That is to say, a NASA-made public domain video posted on NASA’s official YouTube channel, documenting the landing of a $2.5 billion Mars rover mission paid for with public taxpayer money, was blocked by YouTube because of a copyright claim by a private news service.
(tags: dmca google fail nasa copyright false-positives scripps youtube video mars)
High-frequency trading: The fast and the furious | The Economist
"The NYMEX panel found that Infinium had finished writing the algorithm only the day before it introduced it to the market, and had tested it for only a couple of hours in a simulated trading environment to see how it would perform. The firm's normal testing processes take six to eight weeks. When the algorithm started its frenetic buying spree, the measures designed to shut it down automatically did not work. One was supposed to turn the system off if a maximum order size was breached, but because the machine was placing lots of small orders rather than a single big one the shutdown was not triggered. The other measure was meant to prevent Infinium from selling or buying more than a certain number of contracts, but because of an error in the way the rogue algorithm had been written, this, too, failed to spot a problem."
(tags: hft automation trading markets stocks nymex bugs software)
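The first failed safeguard in that quote is easy to reproduce in miniature. Below is a hedged Python sketch, with entirely made-up order sizes and limits, showing why a per-order size cap never fires on a stream of small orders while a running-total cap does. The function names are hypothetical and have nothing to do with Infinium's actual systems.

```python
def per_order_limit_tripped(order_sizes, max_order_size):
    # Fires only if some single order is too big.
    return any(size > max_order_size for size in order_sizes)


def cumulative_limit_tripped_at(order_sizes, max_total):
    # Returns the index of the order that pushes the running total
    # over the limit, or None if the limit is never exceeded.
    total = 0
    for index, size in enumerate(order_sizes):
        total += size
        if total > max_total:
            return index
    return None


# Hypothetical runaway algorithm: many small orders, none individually large.
orders = [5] * 1000

tripped_per_order = per_order_limit_tripped(orders, max_order_size=100)
tripped_at = cumulative_limit_tripped_at(orders, max_total=500)
```

With these illustrative numbers the per-order check stays silent for all 1,000 orders, while the cumulative check trips once the running total passes 500.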
Lessons in website security anti-patterns by Tesco : Troy Hunt, an Aussie software architect working on a .Net security product called ASafaWeb, does a great job extensively deconstructing Tesco's appalling website security on their shopping site. In the process, he gets this wonderful tweet from their customer-care account: "@troyhunt Let me assure you that all customer passwords are stored securely & in line with industry standards across online retailers." As he says, this is a clear demonstration that Tesco is in the first stage of the four stages of competence -- "unconscious incompetence": "The individual does not understand or know how to do something and does not necessarily recognise the deficit." ( http://en.wikipedia.org/wiki/Four_stages_of_competence )
(tags: tesco security passwords web http https ssl funny dot-net shopping uk customer-care)
Accident: Ryanair B738 and American B763 at Barcelona on Apr 14th 2011 : An accident report concerning a Ryanair flight.
An American Airlines Boeing 767-300, registration N366AA performing flight AA-67 from Barcelona,SP (Spain) to New York JFK, NY (USA), had taxied to the holding point runway 25L and was holding short of the runway. A Ryanair Boeing 737-800, registration EI-EKB performing flight FR-8136 from Barcelona,SP (Spain) to Ibiza,SP (Spain) with 169 passengers and 6 crew, was taxiing along Barcelona's taxiway K for departure from runway 25L and was maneouvering to pass behind the Boeing 767-300. A number of passengers on board of the Boeing 737-800 observed the right hand wing of the aircraft contact the tailplane of the Boeing 767-300 and rose out of their seats attracting the attention of a flight attendant. A passenger told the flight attendant, that their aircraft had hit the aircraft besides them. The flight attendant contacted the purser, who instructed her to contact the flight deck, she contacted the flight deck and informed the captain that passengers had seen their aircraft had hit another aircraft. The captain responded however everything was fine and she continued with the takeoff about 2 minutes after the Boeing 767. Immediately after departure the passengers insisted the flight was not safe and they had collided with another aircraft, one of the passengers identified himself as an engineer. The flight attendant told the engineer that the captain had been informed and had told everything was fine. No further information was forwarded to the flight deck. After landing in Ibiza, while disembarking, the passengers again spoke up claiming the flight had been unsafe. During the turnaround the flight attendant informed the purser that one of the passengers observing the collision was an engineer. Neither approached the flight crew however. Following the return flight FR-8137 the purser talked to the captain and informed her that one of the passengers observing the collision was an engineer. 
In the following it was identified that the right hand winglet of the Boeing 737-800 had received damage, the Boeing 767-300 was found with damage to the left hand stabilizer following landing in New York.
According to the story, it appears the AA flight crew were not informed of the potential damage to their plane before or during their transatlantic flight to JFK. (via Juan Flynn)
(tags: via:juanflynn flight travel safety ryanair collisions)
CIAIAC report : The official report on that Ryanair/AA collision in Barcelona in April 2011, on pages 211-255.
(tags: collisions safety travel air ryanair)
Practical machine learning tricks from the KDD 2011 best industry paper : Wow, this is a fantastic paper. It's a Google paper on detecting scam/spam ads using machine learning -- but not just that, it's how to build out such a classifier to production scale, and make it operationally resilient, and, indeed, operable. I've come across a few of these ideas before, and I'm happy to say I might have reinvented a few (particularly around the feature space), but all of them together make extremely good sense. If I wind up working on large-scale classification again, this is the first paper I'll go back to. Great info! (via Toby diPasquale.)
(tags: classification via:codeslinger training machine-learning google ops kdd best-practices anti-spam classifiers ensemble map-reduce)
The world’s first 3D-printed gun : I wasn't expecting to see this for a few years. The future is ahead of schedule!
A .22-caliber pistol, formed from a 3D-printed AR-15 (M16) lower receiver, and a normal, commercial upper. In other words, the main body of the gun is plastic, while the chamber — where the bullets are actually struck — is solid metal. [...] While this pistol obviously wasn’t created from scratch using a 3D printer, the interesting thing is that the lower receiver — in a legal sense at least — is what actually constitutes a firearm. Without a lower receiver, the gun would not work; thus, the receiver is the actual legally-controlled part. In short, this means that people without gun licenses — or people who have had their licenses revoked — could print their own lower receiver and build a complete, off-the-books gun. What a chilling thought.
(tags: via:peakscale guns scary future grim-meathook-future 3d-printing thingiverse weapons)
"Are You Human?" urban intervention, 2009 : turn CAPTCHAs into cut-outs, mount them in the urban environment, and they blend into the tag landscape. This came up after contemplating "artisanal integers", and the concept of taking something digital and ephemeral and making a hand-made, long-lived physical artifact from it. (via ted byfield)
(tags: art sculpture captchas physical artifacts tags graffiti human urban)
Brooklyn Integers | Integers as a service : Integers artisanally hand-crafted for you. See also the sister site, missionintegers.com: "Each of our bespoke numbers is created just for you in San Francisco’s historic Mission District. What will you use it for? A letter-pressed receipt. A special touch of latte art. A globally-unique user ID. A woolen hat. The possibilities are as infinite as the space of 64-bit unsigned ints." (via John Allspaw)
(tags: via:allspaw humour funny integers artisan satire hand-crafted)
This park's life - The Irish Times - Thu, Jul 26, 2012 : Great article about Dublin's Phoenix Park, Europe's largest enclosed urban park (more than twice the size of New York's Central Park, in fact). Now that I have two little kids, I've been spending a good portion of my weekends there -- it's a wonderful thing to have on our doorstep. Also:
The park even breeds celebrities. “The lion that roars at the start of the MGM movies. He’s a Dub. He was born in Dublin Zoo.”
(tags: phoenix-park dublin history parks deer lion kids)
Universal properties of mythological networks - Abstract - EPL (Europhysics Letters) - IOPscience : Abstract:
As in statistical physics, the concept of universality plays an important, albeit qualitative, role in the field of comparative mythology. Here we apply statistical mechanical tools to analyse the networks underlying three iconic mythological narratives with a view to identifying common and distinguishing quantitative features. Of the three narratives, an Anglo-Saxon and a Greek text are mostly believed by antiquarians to be partly historically based while the third, an Irish epic [jm: "An Táin Bó Cúailnge", The Tain, to be specific], is often considered to be fictional. Here we use network analysis in an attempt to discriminate real from imaginary social networks and place mythological narratives on the spectrum between them. This suggests that the perceived artificiality of the Irish narrative can be traced back to anomalous features associated with six characters. Speculating that these are amalgams of several entities or proxies, renders the plausibility of the Irish text comparable to the others from a network-theoretic point of view.
Here's what the Irish Times said: The society in the 1st century story of the Táin Bó Cúailnge looked artificial at first analysis of the networks between 404 characters in the story. However, the researchers found the society reflected real rather than fictional networks when the weakest links to six of the characters are removed. These six characters included Medb, Queen of Connacht; Conchobor, King of Ulster and Cúchulainn. They were "similar to superheroes of the Marvel universe" and are "too superhuman" or too well-connected to be real, researchers said. The researchers suggest that each of these superhuman characters may be an amalgam of many which became fused and exaggerated as the story was passed down orally through generations.
(tags: networks society the-tain epics history mythology ireland statistics network-analysis papers)
Irish campsite recommendations : the conclusion of a Twitter/Facebook recommendations-gathering exercise; winners seem to be Lough Key Forest Park, Renvyle Beach, Fintra, Eagle Point, and Hidden Valley
(tags: camping ireland tips recommendations caravan holidays vacation)
CloudBurst : 'Highly Sensitive Short Read Mapping with MapReduce'. current state of the art in DNA sequence read-mapping algorithms.
CloudBurst uses well-known seed-and-extend algorithms to map reads to a reference genome. It can map reads with any number of differences or mismatches. [..] Given an exact seed, CloudBurst attempts to extend the alignment into an end-to-end alignment with at most k mismatches or differences by either counting mismatches of the two sequences, or with a dynamic programming algorithm to allow for gaps. CloudBurst uses [Hadoop] to catalog and extend the seeds. In the map phase, the map function emits all length-s k-mers from the reference sequences, and all non-overlapping length-s kmers from the reads. In the shuffle phase, read and reference kmers are brought together. In the reduce phase, the seeds are extended into end-to-end alignments. The power of MapReduce and CloudBurst is the map and reduce functions run in parallel over dozens or hundreds of processors.
JM_SOUGHT -- the next generation ;)
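The map/shuffle/reduce flow in that description can be sketched in a few lines of single-process Python. This is a toy under stated assumptions, not CloudBurst's code: it handles mismatches only (no gaps), all function names and the sample sequences are invented, and the seed length is chosen by hand so that a read with at most k mismatches must contain at least one exact seed.

```python
from collections import defaultdict


def map_reference(ref, s):
    # Emit every length-s k-mer of the reference with its position.
    for pos in range(len(ref) - s + 1):
        yield ref[pos:pos + s], ("ref", pos)


def map_read(read_id, read, s):
    # Emit the non-overlapping length-s k-mers of a read with their offsets.
    for offset in range(0, len(read) - s + 1, s):
        yield read[offset:offset + s], ("read", read_id, offset)


def shuffle(pairs):
    # Bring reference and read k-mers with the same sequence together.
    groups = defaultdict(list)
    for kmer, value in pairs:
        groups[kmer].append(value)
    return groups


def reduce_seeds(groups, ref, reads, k):
    # Extend each shared exact seed into an end-to-end alignment and keep
    # it if it has at most k mismatches.
    hits = set()
    for values in groups.values():
        ref_positions = [v[1] for v in values if v[0] == "ref"]
        read_seeds = [(v[1], v[2]) for v in values if v[0] == "read"]
        for pos in ref_positions:
            for read_id, offset in read_seeds:
                read = reads[read_id]
                start = pos - offset
                if start < 0 or start + len(read) > len(ref):
                    continue
                window = ref[start:start + len(read)]
                mismatches = sum(a != b for a, b in zip(read, window))
                if mismatches <= k:
                    hits.add((read_id, start, mismatches))
    return sorted(hits)


def align(ref, reads, s, k):
    pairs = list(map_reference(ref, s))
    for read_id, read in reads.items():
        pairs.extend(map_read(read_id, read, s))
    return reduce_seeds(shuffle(pairs), ref, reads, k)


# Made-up example: reads of length 6, k = 1 mismatch, seed length s = 3.
reference = "ACGTACGTTTGACCA"
sample_reads = {"r1": "TTGACC", "r2": "ACGAAC"}
alignments = align(reference, sample_reads, s=3, k=1)
```

Each hit is a (read id, reference start, mismatch count) tuple. In the real system the map and reduce steps would run in parallel across a Hadoop cluster rather than in one process.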
(tags: bioinformatics mapreduce hadoop read-alignment dna sequencing sought antispam algorithms)
Expensive lessons in Python performance tuning : some good advice for large-scale Python performance: prun and guppy for profiling, namedtuples for memory efficiency, and picloud for trivial EC2-based scale-out. (via Nelson)
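As a rough illustration of the namedtuple point, this sketch compares the shallow per-instance size of a namedtuple with that of an ordinary class instance plus its attribute dict. The type names and the helper shallow_size are hypothetical, and exact byte counts vary by Python version and platform, so treat the comparison as indicative only.

```python
import sys
from collections import namedtuple

# Two hypothetical record types holding the same three fields.
PointTuple = namedtuple("PointTuple", ["x", "y", "z"])


class PointObject:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z


def shallow_size(obj):
    # Size of the instance itself plus its attribute dict, if it has one.
    # The field values are not counted, since both types share them.
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


tuple_size = shallow_size(PointTuple(1, 2, 3))
object_size = shallow_size(PointObject(1, 2, 3))
print(tuple_size, object_size)
```

The namedtuple stores its fields in a plain tuple with no per-instance dict, which is where the saving comes from when millions of small records are held in memory.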
(tags: picloud prun guppy namedtuples python optimization performance tuning profiling)
On Patents : Notch comes up with a perfect analogy for software patents.
I am mostly fine with the concept of “selling stuff you made”, so I’m also against copyright infringement. I don’t think it’s quite as bad as theft, and I’m not sure it’s good for society that some professions can get paid over and over long after they did the work (say, in the case of a game developer), whereas others need to perform the job over and over to get paid (say, in the case of a hairdresser or a lawyer). But yeah, “selling stuff you made” is good. But there is no way in hell you can convince me that it’s beneficial for society to not share ideas. Ideas are free. They improve on old things, make them better, and this results in all of society being better. Sharing ideas is how we improve. A common argument for patents is that inventors won’t invent unless they can protect their ideas. The problem with this argument is that patents apply even if the infringer came up with the idea independently. If the idea is that easy to think of, why do we need to reward the person who happened to be first?
Of course, in reality it's even worse, since you don't actually have to be first to invent -- just first to file without sufficient people noticing, and people are actively dissuaded from noticing (since it makes their lives riskier if they know about the existence of patents)...
(tags: business legal ip copyright patents notch minecraft patent-trolls)
Marsh's Library : Dublin museum of antiquarian books, open to the public -- well worth a visit, apparently (I will definitely be making my way there soon I suspect), to check out their new "Marvels of Science" exhibit. Not only that though, but they have a beautiful website with some great photos -- exemplary
(tags: museum dublin ireland libraries books science)
'Poisoning Attacks against Support Vector Machines', Battista Biggio, Blaine Nelson, Pavel Laskov : The perils of auto-training SVMs on unvetted input.
We investigate a family of poisoning attacks against Support Vector Machines (SVM). Such attacks inject specially crafted training data that increases the SVM's test error. Central to the motivation for these attacks is the fact that most learning algorithms assume that their training data comes from a natural or well-behaved distribution. However, this assumption does not generally hold in security-sensitive settings. As we demonstrate, an intelligent adversary can, to some extent, predict the change of the SVM's decision function due to malicious input and use this ability to construct malicious data. The proposed attack uses a gradient ascent strategy in which the gradient is computed based on properties of the SVM's optimal solution. This method can be kernelized and enables the attack to be constructed in the input space even for non-linear kernels. We experimentally demonstrate that our gradient ascent procedure reliably identifies good local maxima of the non-convex validation error surface, which significantly increases the classifier's test error.
Via Alexandre Dulaunoy
(tags: papers svm machine-learning poisoning auto-learning security via:adulau)
C500k in Action at Urban Airship : I missed this back in 2010; 500k active TCP connections to a single EC2 large instance using Java and NIO
(tags: c10k java linux ec2 scaling nio netty urban-airship)
GraphChi : "big data, small machine" -- perform computation on very large graphs using an algorithm they're calling Parallel Sliding Windows. similar to Google's Pregel, apparently
(tags: graphs graphchi big-data algorithms parallel)
High performance network programming on the JVM, OSCON 2012 : by Erik Onnen of Urban Airship. very good presentation on the current state of the art in large-scale low-latency service operation using the JVM on Linux. Lots of good details on async vs sync, HTTPS/TLS/TCP tuning, etc.
(tags: http https scaling jvm async sync oscon presentations tcp)
Science funding doesn't add up - The Irish Times : '[Science Foundation Ireland] said it was continuing to support basic research, but there are a number of leading scientists here who were refused funding despite having qualified for it in the past. Dr Mike Peardon of the School of Mathematics was recently been turned down, having been “administratively withdrawn”. This means the application for funding was rejected at the first post during initial consideration and before it had a chance to be assessed by external experts. Several others in his department suffered a similar fate. “The school of mathematics at Trinity is ranked the 15th best maths department in the world and now we are not fundable by Science Foundation Ireland,” he said. “The cases I heard of have all been in pure maths,” said Prof Lorraine Hanlon in UCD’s school of physics. “All reported that the people in pure maths were returned unreviewed.” She believes other areas may also come under pressure. “Pure maths is the thin end of the wedge. The Government says mathematics is fundamental, but on the other side says we dont really care enough to support it. That is a schizophrenic approach,” she said.'
(tags: mathematics ireland science research academia funding tcd ucd sfi)
Microsoft's ill-chosen magic constants : 'Paolo Bonzini noticed something a little awkward in the Linux kernel support code for Microsoft's HyperV virtualisation environment - specifically, that the magic constant passed through to the hypervisor was "0xB16B00B5", or, in English, "BIG BOOBS". It turns out that this isn't an exception - when the code was originally submitted it also contained "0x0B00B135".' me, I prefer my magic constants less offensive and more Subgenius-oriented: "0xB0BD0BB5"
(tags: constants via:kevin-lyda oh-dear microsoft fail magic-numbers boobs linux kernel)
Scaling lessons learned at Dropbox : website-scaling tips and suggestions, "particularly for a resource-constrained, fast-growing environment that can’t always afford to do things “the right way” (i.e., any real-world engineering project)". I really like the "run with fake load" trick; add additional queries/load which you can quickly turn off if the service starts browning out, giving you a few days breathing room to find a real fix before customers start being affected. Neat
(tags: dropbox scalability webdev load scaling-up)
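Here is one way the "run with fake load" trick might look in code, as a minimal Python sketch. The wrapper class FakeLoadProxy, its parameters and the stand-in backend are all assumptions for illustration; the Dropbox article does not specify an implementation.

```python
class FakeLoadProxy:
    """Hypothetical wrapper that pads real traffic with throwaway queries."""

    def __init__(self, run_query, extra_copies=1):
        self._run_query = run_query
        self.extra_copies = extra_copies
        # Flip this off when the service starts browning out to
        # reclaim the headroom immediately.
        self.enabled = True

    def query(self, q):
        result = self._run_query(q)
        if self.enabled:
            for _ in range(self.extra_copies):
                self._run_query(q)  # Result deliberately discarded.
        return result


# Stand-in backend that just counts how many queries it served.
calls = []


def backend(q):
    calls.append(q)
    return q.upper()


proxy = FakeLoadProxy(backend, extra_copies=2)
first = proxy.query("select 1")    # 1 real + 2 fake queries
proxy.enabled = False              # Emergency switch: drop the fake load.
second = proxy.query("select 2")   # 1 real query only
```

The design choice is that the fake load is pure overhead by construction, so switching it off is always safe and buys back capacity without touching real traffic.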
Ansible : 'SSH-Based Configuration Management & Deployment'. deploy via SSH; no target-side daemons required. GPLv3 licensed, unfortunately :(
(tags: ansible devops configuration deployment sysadmin python ssh)
Don’t waste your time in crappy startup jobs : 7 reasons why working for a startup sucks. Been there, done that -- I wish I'd read this years ago. It should be permalinked at the top of Hacker News. "In 1995, a lot of talented young people went into large corporations because they saw no other option in the private sector– when, in fact, there were credible alternatives, startups being a great option. In 2012, a lot of young talent is going into startups for the same reason: a belief that it’s the only legitimate work opportunity for top talent, and that their careers are likely to stagnate if they work in more established businesses. They’re wrong, I think, and this mistaken belief allows them to be taken advantage of. The typical equity offer for a software engineer is dismally short of what he’s giving up in terms of reduced salary, and the career path offered by startups is not always what it’s made out to be. For all this, I don’t intend to argue that people shouldn’t join startups. If the offer’s good, and the job looks interesting, it’s worth trying out. I just don’t think that the current, unconditional “startups are awesome!” mentality serves us well. It’s not good for any of us, because there’s no tyrant worse than a peer selling himself short, and right now there are a lot of great people selling themselves very short for a shot at the “startup experience” -- whatever that is."
(tags: startups work job life career tech vc companies pay stock share-options)
Sean Sherlock to science researchers: "see ya! don't let the door hit you on the way out" : "In relation to the possibility of losing skilled people overseas, any vibrant research ecosystem will see an ebb and flow of capable people in the scientific fields – in some ways this is a good thing, as experience gained abroad has the potential to benefit Ireland in the future. The latest SFI data shows that SFI supports approximately 3,000 researchers, including some 2,000 postgraduate students and post-doctorals -- a figure that has remained relatively stable for some time." NICE
(tags: sean-sherlock jobs ireland science research)
'You are shrunk to the height of a nickel and thrown into a blender. Your mass is reduced so that your density is the same as usual. The blades start moving in 60 seconds. What do you do?' : Brilliant responses to this stereotypically-annoying Google interview question: "Since being shrunk down like this is impossible, I can only assume this is happening inside a dream or nightmare of some kind. I sit down and meditate, summoning up my Siddartha/Neo like mental powers and realise that there is no blender, and that this terrible dream was created by the ego of a sadistic Google employee. As the kundalini fire races up my spine, and my spirit is liberated, I open my third eye and bathe said Google employee in the light of love. I forgive him, for he knows not what he does."
(tags: funny interviewing google blenders reddit)
Pretty Penny : Importing the good old Dutch Omafiets lady's bike to Dublin, EUR269 each. that's a great price for a solid bike!
(tags: bikes cycling dublin fashion omafiets dutch)
SSTable and Log Structured Storage: LevelDB : good writeup of LevelDB's native storage formats; the Sorted String Table (SSTable), Log Structured Merge Trees, and Snappy compression
(tags: leveldb nosql data storage disk persistence google)
Joyent Services Back After 8 Day Outage : Lest we forget. I think it was 10 days in total once everything was resolved
(tags: joyent outages bingodisk strongspace cloud solaris zfs)
A Periodic Table of Visualization Methods : interesting categorisation, and some crazy visualisations I've not encountered before (via Aileen)
(tags: dataviz visualization information design ui via:aileen)
Microsoft’s Downfall: Inside the Executive E-mails and Cannibalistic Culture That Felled a Tech Giant : "They had a great lead, they were years ahead. And they completely blew it. And they completely blew it because of the bureaucracy."
(tags: microsoft bureaucracy stack-ranking hr culture)
Facts still sacred despite Ireland's spectrum of conflicting views on abortion - The Irish Times - Fri, Jun 29, 2012 : Very good data-driven analysis. "“Pro-life” groups claim abortion is a serious mental health risk for women. Youth Defence claims women who opt for an abortion rather than carrying to term or giving the baby up for adoption suffer mental maladies such as depression, suicide and other problems. But this is at heart a scientific claim, and can thus be tested. [...] Psychologist Dr Brenda Majors studied this in depth and found no evidence that ["post-abortion syndrome"] exists. As long as a woman was not depressive before an abortion, “elective abortion of an unintended pregnancy does not pose a risk to mental health”. The same results were found in several other studies [...] Essentially these studies found there was no difference in mental health between those who opted for abortion and those who carried to term. Curiously, there was a markedly increased risk to mental health for women who gave a child up for adoption. A corollary of the research was that while women did not suffer long-term mental health effects due to abortion, short-term guilt and sadness was far more likely if the women had a background where abortion was viewed negatively or their decisions were decried -- the kind of attitude fostered by “pro-life” activists."
(tags: pro-choice pro-life abortion data facts via:irish-times research science pregnancy depression pas)
"Machine Learning That Matters" [paper, PDF] : Great paper. This point particularly resonates: "It is easy to sit in your o?ce and run a Weka algorithm on a data set you downloaded from the web. It is very hard to identify a problem for which machine learning may o?er a solution, determine what data should be collected, select or extract relevant features, choose an appropriate learning method, select an evaluation method, interpret the results, involve domain experts, publicize the results to the relevant scienti?c community, persuade users to adopt the technique, and (only then) to truly have made a di?erence (see Figure 1). An ML researcher might well feel fatigued or daunted just contemplating this list of activities. However, each one is a necessary component of any research program that seeks to have a real impact on the world outside of machine learning."
(tags: machine-learning ml software data real-world algorithms)
Massive identity-theft breach in South Korea results in calls for national ID system to be abandoned : In South Korea, web users are required to provide their national ID number for "virtually every type of Internet activity, not only for encrypted communications like e-commerce, online banking and e-government services but also casual tasks like e-mail and blogging", apparently in an attempt to "curb cyber-bullying". The result is obvious -- those ID numbers being collected in giant databases at companies like "SK Communications, which runs top social networking service Cyworld and search site Nate", and those giant databases being tasty targets for black-hats. Now: "In Korea’s biggest-ever case of data theft the recent hacking attack at SK Communications, which runs top social networking service Cyworld and search site Nate, breached 35 million accounts, a mind-boggling total for a country that has about 50 million people and an economically-active population of 25 million. The compromised information includes names, passwords, phone numbers, e-mail addresses, and most alarmingly, resident registration numbers, the country’s equivalent to social security numbers." This is an identity-fraudster's dream: "In the hands of criminals, resident registration numbers could become master keys that open every door, allowing them to construct an entire identity based on the quality and breadth of data involved."
(tags: south-korea identity fraud identity-theft web bullying authentication hacking)
WeatherSpark : Beautiful dataviz of weather data from met.no, NOAA.gov, World Weather Online and Weather Central. The main graph includes: mean and percentiles of historical temperature data for time of year, the temperature and precipitation forecast over the chosen period, wind direction and speed, with hourly data. Very nicely done! (via Una Mullally)
(tags: via:unamullally dataviz temperature forecasts weather graphing percentiles wind rain)
the recruiter honeypot : wow, I thought it was hard hiring in Dublin. Sounds like Silicon Valley is insane. "Unfortunately, it’s not all about the numbers. Though external recruiters perform well for start-ups, there’s another side to this story. It pains me to write this but I think it’s important to share. Meebo employed lots of external recruiters when we were getting off the ground. We had standard 18-month no-poach restrictions with all of our contractors that specified that those recruiters were not allowed to contact Meebo employees within 18 months of our contract expiring. Most of those contracts expired in 2008-2009. However, every recruiter and firm we’d worked with who was still in the recruiting business tried to poach [the 'honeypot' employee] Pete London." (Another lesson: don't build a product in javascript, since it's impossible to hire engineers ;)
(tags: honeypots hiring silicon-valley recruiting coding experts meebo)
CEO Of Internet Provider Sonic.net: We Delete User Logs After Two Weeks. Your Internet Provider Should, Too. - Forbes : "what we saw was a shift towards customers being made part of a business model that involved–I don’t know if extortion is the right word–but embarrassment for gain. An individual would download a movie, using bittorrent, and infringe copyright. And that might be our customer, like Bob Smith who owns a Sonic.net account, or it might be their spouse, or it might be their child. Or it might be one of his three roommates in a loft in San Francisco, who Bob is not responsible for, and who rent out their loft on AirBnB and have couch surfers and buddies from college and so on and open Wifi. When lawyers asked us for these users’ information, some of our customers I spoke with said “Oh yeah, crap, they caught me,” and were willing to admit they engaged in piracy and pay a settlement. But in other cases, it turned out the roommate did it, or no one would admit to doing it. But they would pay the settlement anyway. Because no one wants to be named in the public record in a case from So-And-So Productions vs. 1,600 names including Bob Smith for downloading a film called “Don’t Tell My Wife I B—F—— The Babysitter.” AG: Is that a real title? DJ: Yes. I’ve read about cases where a lawyer was doing this for the movie “The Expendables,” and 5% of people settled. So then he switched to representing someone with an embarrassing porn title, and like 30% of people paid. It seemed like half the time, the customer wasn’t the one right one, but they rolled over because it would be very embarrassing. And I think that’s an abuse of process. I was unwilling to become part of that business model. In many cases the lawyers never pursued the case, and it was all bluster. But under that threat, you pay."
(tags: interview isps freedom copyright internet shakedown lawyers sonic.net data-retention via:oisin)
an ex-RBSG engineer on the NatWest/RBS/UlsterBank IT fiasco : 'Turning over your systems support staff in a wave of redundancies is not the best way to manage the transfer of knowledge. Not everyone who worked the batch at [Royal Bank of Scotland Group] even knew what it is they knew; how, then, could they explain it to people who didn’t know there was knowledge to acquire? Outsourcing the work from Edinburgh to Aberdeen and sacking the staff would have exposed them to the same risks. [...] I Y2K tested one of the batch feeder systems at RBS from 1997 - 1998, and managed acceptance testing in payments processing systems from 1999 - 2001. I was one of the people who watched over the first batch of the millennium instead of going to a party. I was part of the project that moved the National Westminster batch onto the RBS software without a single failure. I haven’t worked for the bank for five years, and I am surprised at how personally affronted I am that they let that batch fail. But I shouldn’t be. Protectiveness of the batch was the defining characteristic of our community. We were proud of how well that complex structure of disparate components hummed along. It was a thing of beauty, of art and craft, and they dropped it all over the floor.'
(tags: systems ops support maintainance legacy ca-7 banking rbs natwest ulster-bank fail outsourcing)
Some Facts & Insights Into The Whole Discussion Of 'Ethics' And Music Business Models | Techdirt : David "Camper Van Beethoven" Lowery's blogpost about music sales, ethics, piracy etc. looks like it was pretty much riddled with errors regarding the viability of the music business, then and now. Empirical figures from Jeff Price from Tunecore, and others, to debunk it: "'Well here’s some truth about the old industry that David somehow misses. Previously, artists were not rolling in money. Most were not allowed into the system by the gatekeepers. Of those that were allowed on the major labels, over 98% of them failed. Yes, 98%. Of the 2% that succeeded, less than a half percent of those ever got paid a band royalty from the sale of recorded music. How in the world is an artist making at least something, no matter how small, worse than 99% of the world’s unsigned artists making nothing and of the 1% signed, less than a half a percent of them ever making a single band royalty ever?'" [...] "Another example of Lowery being wrong that Price responds to is the claim that recorded music revenue to artists has been going down. Price has data: 'This is empirically false. Revenue to labels has collapsed. Revenue to artists has gone up with more artists making more money now than at any time in history, off of the sale of pre-recorded music. Taken a step further, a $17.98 list price CD earned a band $1.40 as a band royalty that they only got if they were recouped (over 99% of bands never recouped). If an artist sells just two songs for $0.99 on iTunes via TuneCore, they gross $1.40. If they sell an album for $9.99 on iTunes via TuneCore, they gross $7.00. This is an INCREASE of over 700% in revenue to artists for recorded music sales.'"
(tags: music mp3 music-business piracy techdirt david-lowery tunecore)
Eight Real Tales of Learning Computer Science as a High School Girl : 'All students at Stuyvesant High School are required to take a year of computer science. As it turns out, the advanced computer science classes skew mostly male anyway. But for a year, boys and girls get exposed to computer programming together. We asked Mike Zamansky, the head of the computer science program, to share some stories from his female students. They did us one better. Eight students sent in first-hand accounts of what it’s like to learn computer programming as a teenage girl.' Some interesting comments here. This topic is weighing on my mind now that I have two girls...
(tags: schools learning education computer-science technology nyc girls teenage)
RBS collapse details revealed - The Register : as noted in the gossip last week. 'The main batch scheduling software used by RBS is CA-7, said one source, a former RBS employee who left the company recently.' 'RBS do use CA-7 and do update all accounts overnight on a mainframe via thousands of batch jobs scheduled by CA-7 ... Backing out of a failed update to CA-7 really ought to have been a trivial matter for experienced operations and systems programming staff, especially if they knew that an update had been made. That this was not the case tends to imply that the criticisms of the policy to "offshore" also hold some water.'
(tags: outsourcing failure software rbs natwest ulster-bank ulster-blank offshoring downsizing ca-7 upgrades)
Natwest, RBS: When will bank glitch be fixed? Probably not today • The Register Forums : Some amazing insider-info posts on the Reg forum for the gigantic RBS/NatWest/Ulster Bank multi-day outage. Fingers pointing at their outsourcing/downsizing practices -- in a word, they've sacked the experienced staff, replaced them with noobs thousands of miles away, and not paid down any technical debt on the legacy code they're maintaining. Classic legacy IT fail. "I worked for RBS during and after the merger with Natwest, I left their Global Financial Markets Department in 2004 after a 5 year stint. They had already moved some IT functions to India at that point and have continued to do so year on year since. The numbers some people are quoting 1600/800 are possibly the more recent figures, the total is way way beyond this. The comments on documentation are comical, as if a document is the thing you turn to at a time of crisis. The fact is, when you work closely with systems and the business users, you understand not only the quirks of the systems, but the risks and consequences of failure. You work with those users on the work around solutions that will get the banking day complete. They haven't just outsourced the IT staff, but the very experienced and valuable back office / operations staff that would work with IT staff to solve the serious issues. I believe these guys are mostly posted out in Singapore, who probably have never met the IT staff in India. The unseen cost of outsourcing is a compounding loss of shared experience and commitment, which becomes acutely apparent when the sh!t hits the ... cash machines. The chaps I trained out in India were nice enough, but they simply lacked the knowledge and experience of Financial Markets trading, trade and settlement processing, Swift messaging blah blah and the risks involved. I'll be drinking with a bunch of ex RBS/Natwesties soon enough, where we'll all be saying..... "WE TOLD YOU SO!!!!!!!"
Another poster says: "I understand that your description of the RBS Mainframe based batch update process is fairly accurate. The source of the problem was a software update to Batch scheduling suite CA7. The upgrade went so well that now there is no schedule to run all of those thousands of batch jobs to receive and make BACS payments, update balance, schedule printouts, etc. I am sure the problem with the CA7 upgrade and the unfortunate misplacing of the Batch schedule has absolutely nothing to do with the last UK based technicians leaving recently. The guys in India of course are perfectly able to cope and fix their mistake. I'm sure they understand how the thousands of jobs in the schedule need to be ordered to make sure there is no data corruption or loss. After all the problem happened on Tuesday and it's only Friday. I wonder how many ex-RBS staff have received very lucrative short term contracts in the last few days......"
(tags: natwest it rbs the-register outsourcing fail organisations ulster-bank ulster-blank)
the VIM clutch : 'VIM Clutch is a hardware pedal for improved text editing speed for users of the magnificent VIM text editor. When the pedal is pressed down, the pedal types "i" causing VIM to go into Insert Mode. When released, it types <Esc> and you are back in Normal Mode.' (via Andrew Delaney)
(tags: via:delaney vim programming ui pedals vi modal foot-switch)
Irish "Millennials" post more negative reviews than anyone else : 'Millennials are more negative when it comes to product sentiment. They give more 1-star reviews than Gen X or Boomers; the most negative Millennials in our analysis hail from Ireland, where 12% of them give products 1- or 2-star ratings.' Previously, we tended not to complain -- not any more, it seems
(tags: ireland complaints whinging generations ratings reviews via:jim-carroll studies behaviour online opinions)
VIDEO: Drone ‘spies’ on Mountjoy, and the Facebook and Google offices · The Daily Edge : TheJournal coverage of that drones-over-the-Aras video. some truly demented comments fearing for "intellectual property", somehow
(tags: the-journal drones video youtube ireland privacy)
"Finding the k shortest paths", D. Eppstein, 1994 [paper] : 35th IEEE Symp. Foundations of Comp. Sci., Santa Fe, 1994, pp. 154-165. This paper presents an algorithm that finds multiple short paths connecting two terminals in a graph (allowing repeated vertices and edges in the paths) in constant time per path after a preprocessing stage dominated by a single-source shortest path computation. The paths it finds are the k shortest in the graph, where k is a parameter given as input to the algorithm. Time complexity: O(E*log V + L*k*log k) (L is path length)
(tags: k-shortest-paths graph algorithms sparse-graphs)
"K* : A Directed On-The-Fly Algorithm for Finding the k Shortest Paths", Husain Aljazzar and Stefan Leue, 2008 : "We present a new algorithm, called K*, for finding the k shortest paths between a designated pair of vertices in a given directed weighted graph. Compared to Eppstein’s algorithm, which is the most prominent algorithm for solving this problem, K* has two advantages. First, K* performs on-the-fly, which means that it does not require the graph to be explicitly available and stored in main memory. Portions of the graph will be generated as needed. Second, K* is a directed algorithm which enables the use of heuristic functions to guide the search. This leads to significant improvements in the memory and runtime demands for many practical problem instances. We prove the correctness of K* and show that it maintains a worst-case runtime complexity of O(m+k n log(k n)) and a space complexity of O(k n + m), where n is the number of vertices and m is the number of edges of the graph. We provide experimental results which illustrate the scalability of the algorithm."
(tags: graphs k-shortest-paths algorithms papers)
Floyd–Warshall algorithm - Wikipedia, the free encyclopedia : "a graph analysis algorithm for finding shortest paths in a weighted graph (with positive or negative edge weights)".
(tags: graphs algorithms k-shortest-paths)
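For reference, a minimal Python sketch of the textbook Floyd–Warshall recurrence described in that Wikipedia entry. The function name, the 0..n-1 vertex labelling and the edge-list input are my own choices for illustration, and it assumes the graph has no negative cycles. Note this computes all-pairs shortest distances only; it is not Eppstein's algorithm or K*, which solve the k-shortest-paths problem.

```python
from math import inf


def floyd_warshall(n, edges):
    """Return the matrix of shortest distances between all vertex pairs.

    n     -- number of vertices, labelled 0..n-1 (an assumption of this sketch)
    edges -- iterable of (u, v, weight) for directed edges; weights may be
             negative, but the graph must not contain a negative cycle
    """
    # dist[i][j] starts as the direct edge weight, or inf if there is no edge.
    dist = [[inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the cheapest parallel edge

    # Allow paths to route through intermediate vertex k, one k at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist
```

The triple loop is O(V^3) time and O(V^2) space, so this only suits small or dense graphs; unreachable pairs are left as `inf`.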
Scram : noun: an emergency shutdown of a nuclear reactor. It has been defined as an acronym for "Safety Control Rod Axe Man", due to this story from Norman Hilberry: "When I showed up on the balcony on that December 2, 1942 afternoon [at the Chicago Pile, the world's first self-sustaining nuclear reactor], I was ushered to the balcony rail, handed a well sharpened fireman's ax and told, "if the safety rods fail to operate, cut that manila rope." The safety rods, needless to say, worked, the rope was not cut... I don't believe I have ever felt quite as foolish as I did then. ...I did not get the SCRAM [Safety Control Rod Axe Man] story until many years after the fact. Then one day one of my fellows who had been on Zinn's construction crew called me Mr. Scram."
(tags: scram nuclear reactor history etymology words shutdown emergency wikipedia 1942 science acronyms)
South Lake Union Eats : holy moly, that's a lot of food trucks (SLU is the district hosting Amazon's Seattle campus)
(tags: food food-trucks seattle restaurants slu amazon)
A Closer Look: Email-Based Malware Attacks : 'The average detection rate for these samples was 24.47 percent, while the median detection rate was just 19 percent.' That is *atrocious*. (via Tony Finch)
(tags: via:fanf fail malware filtering av smtp email viruses)
LOITERING THEATRE - YouTube : 'An excerpt from LOITERING THEATRE by Caroline Campbell and Nina McGowan - a film exploring forbidden and inaccessible space in Dublin through flying drones equipped with cameras.' Mountjoy Prison and Áras an Uachtaráin, notably
(tags: dublin forbidden exploration urban-exploration loitering youtube video film ireland drones hack-the-city science-gallery)
The Hydra Bay : "How to set up a Pirate Bay proxy". Step-by-step instructions for MacOS and Linux on how to run a fully-functional reverse proxy for The Pirate Bay -- in other words, provide a duplicate URL for users to circumvent ISP blocks of TPB. http://about.piratereverse.info/proxy/list.html contains about a hundred others. See also http://unblockedpiratebay.com/ for a standalone PHP script which does the same (albeit a little less efficiently). A good demonstration of how futile filtering techniques like IP or domain name blocks are, when applied to a popular website like TPB.
(tags: piratebay filtering censorship copyright php proxies reverse-proxies ip-blocking dns-blocking)
how to restore from iCloud backup : the trick: don't try and do it through iTunes, it won't give you the option, apparently. I have a carrier unlock, and apparently need to wipe the phone for it to take place; this scares the crap out of me
(tags: backup iphone restore sysadmin phones icloud apple howto)
Jim FitzPatrick's Pinterest account : *the* Jim FitzPatrick -- he of the legendary iconic 1968 Che Guevara image. "All is free to share, I only go after those who use/misuse/exploit my artwork for profit. Have fun." I particularly like http://pinterest.com/pin/46302702389391106/ ;)
(tags: jim-fitzpatrick open-source sopa acta ireland copyright)
Issue of web access raises hackles at conference - The Irish Times - Tue, Jun 19, 2012 : 'Prof Michael O’Flaherty, the vice-chairman of the UN Human Rights Committee, told the Organisation for Security and Co-operation in Europe (OSCE) conference on internet freedom that the rights of copyright holders to make a living had to be balanced with the right to freedom of expression.' 'THE PUNISHMENT for breakers of the “three strikes” illegal download rule was “exceptionally disproportionate” [...] The internet was a vehicle for a wide range of human rights so excluding someone from it was an “extraordinary penalty”.'
(tags: osce coverage unhrc conferences dublin copyright freedom internet censorship filtering)
The story of St. Columba: A modern copyright battle in sixth century Ireland : a good summary of the roots of copyright, the Columcille "To every cow belongs its calf; to every book its copy" story (via TJ McIntyre)
(tags: columcille copyright history ireland columbanus books)
PGP founder, Navy SEALs uncloak encrypted comms biz • The Register : 'The company, called Silent Circle, will launch later this year, when $20 a month will buy you encrypted email, text messages, phone calls, and videoconferencing in a package that looks to be strong enough to have the NSA seriously worried. Zimmermann says that surveillance by the state and others has increased vastly over the last few years, and privacy improvements are again needed. "At the very least I want people, as part of their right in a free society to be able to communicate securely," he said in a promotional video. "I should be able to whisper in your ear, even if your ear is a thousand miles away." [...] While software can handle most of the work, there still needs to be a small backend of servers to handle traffic. The company surveyed the state of privacy laws around the world and found that the top three choices were Switzerland, Iceland, and Canada, so they went for the one within driving distance.'
(tags: pgp phil-zimmermann privacy crypto silent-circle apps vc security)
The Silencing of Maya : software patent shakedown threatens to remove a 4-year-old's only means of verbal expression: 'Maya can speak to us, clearly, for the first time in her life. We are hanging on her every word. We’ve learned that she loves talking about the days of the week, is weirdly interested in the weather, and likes to pretend that her toy princesses are driving the bus to school (sometimes) and to work (other times). This app has not only allowed her to communicate her needs, but her thoughts as well. It’s given us the gift of getting to know our child on a totally different level. I’ve been so busy embracing this new reality and celebrating, that I kind of forgot that there was an ongoing lawsuit, until last Monday. When Speak for Yourself was removed from the iTunes store.'
(tags: speak-for-yourself children law swpats patenting stories ipad apps)
_Building High-level Features Using Large Scale Unsupervised Learning_ [paper, PDF] : "We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art."
(tags: algorithms machine-learning neural-networks sgd labelling training unlabelled-learning google research papers pdf)
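A quick back-of-envelope reading of that last figure, assuming "70% relative improvement" means the new accuracy is 1.7 times the old one (my interpretation of the abstract's wording, not something the quote spells out):

\[
a_{\text{prev}} \approx \frac{a_{\text{new}}}{1 + 0.70} = \frac{15.8\%}{1.7} \approx 9.3\%
\]

So on that reading the previous best was a little over 9% on the 20,000-category task, and the absolute gain is roughly 6.5 percentage points.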
I've been playing the same game of Civilization II for almost 10 years. This is the result. : Epic Reddit post. "Parallels to '1984' off the top of my head: 3 superpowers, a "communist" leadership in which technology has reached as far as it needs to go (end of technology tree), barbarian (resistance) uprisings constantly being stomped out by the totalitarian government, nuclear war rendering most farmland useless, constant breaking and reassembling of treaties between the 3 superpowers, seemingly infinite war (due to the previous point), an ever present and all knowing leader making the decisions of the nation..." (via oceanclub)
(tags: via:oceanclub gaming games civ sid-meier 1984 politics war future strategy)
Wrapping parties around a tight budget - The Irish Times - Tue, Jun 12, 2012 : "On the birthday present front, Crowley was shocked when she totted up her family’s spending for six months and realised she was spending more on gifts than on clothes for her entire family. A few 40th birthdays aside, she says the majority of the outlay is on gifts for her children’s friends." A lot of familiar names in this story ;)
(tags: gifts money prices dublin friends irish-times)
Many Niches » Blog Archive » On Working At Amazon : (catching up on old posts) good article from a recent hire, discussing some unusual aspects of the corporate culture
(tags: amazon culture work)
Analyzing Flame's MD5 Collision Attack [slides, PDF] : really detailed slide deck by Alex Sotirov, Co-Founder and Chief Scientist, Trail of Bits, Inc. (via Tony Finch) Plenty of security fail by MS, and also: PKI is clearly too hard
(tags: via:fanf flame security malware md5 collisions hashing pki tls ssl microsoft)
Copyfraud - Wikipedia, the free encyclopedia : 'a term coined by Jason Mazzone (Associate Professor of Law at Brooklyn Law School) to describe situations where individuals and institutions illegally claim copyright ownership of the public domain and other breaches of copyright law with little or no oversight by authorities or legal consequence for their actions.' Good term (via Nelson)
(tags: copyright rights ip fraud copyfraud wikipedia words terminology neologisms dmca infringement)
Here's a letter to the editor of The Times, dated 1st June 1864:
TO THE EDITOR OF THE TIMES.
Sir, --- On my arrival home late yesterday evening a "telegram," by "London District Telegraph," addressed in full to me, was put into my hands. It was as follows :--
"Messrs. Gabriel, dentists, 27, Harley-street, Cavendish-square. Until October Messrs. Gabriel's professional attendance at 27, Harley-street, will be 10 till 5."
I have never had any dealings with Messrs. Gabriel, and beg to ask by what right do they disturb me by a telegram which is evidently simply the medium of advertisement? A word from you would, I feel sure, put a stop to this intolerable nuisance. I enclose the telegram, and am,
Your faithful servant,
M.P.
Upper Grosvenor-street, May 30.
(thanks to Tony Finch for the forward)