Smarter testing Java code with Spock Framework
hmm, looks quite nice as a potential next-gen JUnit replacement for unit tests
(tags: java testing bdd tests junit unit-tests spock via:trishagee)
-
‘Baby Friendly Holidays | Child, Toddler & Family Villas | France | Spain | Portugal | Italy’. Joe swears by it, will give it a go next year
(tags: holidays vacation travel europe kids children via:joe)
How the NSA Converts Spoken Words Into Searchable Text – The Intercept
This hits the nail on the head, IMO:
To Phillip Rogaway, a professor of computer science at the University of California, Davis, keyword-search is probably the “least of our problems.” In an email to The Intercept, Rogaway warned that “When the NSA identifies someone as ‘interesting’ based on contemporary NLP methods, it might be that there is no human-understandable explanation as to why beyond: ‘his corpus of discourse resembles those of others whom we thought interesting’; or the conceptual opposite: ‘his discourse looks or sounds different from most people’s.’ If the algorithms NSA computers use to identify threats are too complex for humans to understand, it will be impossible to understand the contours of the surveillance apparatus by which one is judged. All that people will be able to do is to try your best to behave just like everyone else.”
(tags: privacy security gchq nsa surveillance machine-learning liberty future speech nlp pattern-analysis cs)
awslabs/aws-lambda-redshift-loader
Load data into Redshift from S3 buckets using a pre-canned Lambda function. Looks like it may be a good example of production-quality Lambda
-
‘Aerospike offers phenomenal latencies and throughput — but in terms of data safety, its strongest guarantees are similar to Cassandra or Riak in Last-Write-Wins mode. It may be a safe store for immutable data, but updates to a record can be silently discarded in the event of network disruption. Because Aerospike’s timeouts are so aggressive–on the order of milliseconds — even small network hiccups are sufficient to trigger data loss. If you are an Aerospike user, you should not expect “immediate”, “read-committed”, or “ACID consistency”; their marketing material quietly assumes you have a magical network, and I assure you this is not the case. It’s certainly not true in cloud environments, and even well-managed physical datacenters can experience horrible network failures.’
(tags: aerospike outages cap testing jepsen aphyr databases storage reliability)
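To make the Last-Write-Wins failure mode in the quote above concrete, here is a minimal sketch of timestamp-based reconciliation. It is not Aerospike's, Cassandra's or Riak's actual code; `Versioned`, `lww_merge` and the sample values are made-up names for illustration. Two replicas each acknowledge a different update during a partition, and the merge quietly keeps only one of them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    """A value tagged with the wall-clock time of its write (hypothetical type)."""
    value: str
    timestamp_ms: int


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Last-Write-Wins: keep whichever write carries the later timestamp."""
    return a if a.timestamp_ms >= b.timestamp_ms else b


# During a partition, two clients update the same record on different replicas.
# Both writes are acknowledged to their clients.
replica_1 = Versioned("balance=90", timestamp_ms=1_000)
replica_2 = Versioned("balance=75", timestamp_ms=1_002)

# When the partition heals the replicas reconcile, and the earlier
# (acknowledged) write disappears without any error being raised.
merged = lww_merge(replica_1, replica_2)
print(merged.value)
```

Nothing in the merge step can tell the losing client that its update was dropped, which is why this is arguably fine for immutable data and risky for read-modify-write updates.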
Justin's Linklog Posts
Emojineering Part 1: Machine Learning for Emoji Trends – Instagram Engineering
Instagram figuring out meanings from Emoji usage contexts using ML.
(tags: instagram emoji cool language text internet web speech communication trends machine-learning analysis)
Call me maybe: Elasticsearch 1.5.0
tl;dr: Elasticsearch still hoses data integrity on partition, badly
(tags: elasticsearch reliability data storage safety jepsen testing aphyr partition network-partitions cap)
In the privacy of your own home
I didn’t know about this:
Last spring, as 41,000 runners made their way through the streets of Dublin in the city’s Women’s Mini Marathon, an unassuming redheaded man by the name of Candid Wueest stood on the sidelines with a scanner. He had built it in a couple of hours with $75 worth of parts, and he was using it to surreptitiously pick up data from activity trackers worn on the runners’ wrists. During the race, Wueest managed to collect personal info from 563 racers, including their names, addresses, and passwords, as well as the unique IDs of the devices they were carrying.
(tags: dublin candid-wueest privacy data marathon running iot activity-trackers)
David P. Reed on the history of UDP
‘UDP was actually “designed” in 30 minutes on a blackboard when we decided pull the original TCP protocol apart into TCP and IP, and created UDP on top of IP as an alternative for multiplexing and demultiplexing IP datagrams inside a host among the various host processes or tasks. But it was a placeholder that enabled all the non-virtual-circuit protocols since then to be invented, including encapsulation, RTP, DNS, …, without having to negotiate for permission either to define a new protocol or to extend TCP by adding “features”.’
(tags: udp ip tcp networking internet dpr history protocols)
Oops: Instagram forgot to renew its SSL certificate
hooray for cert renewal pain
(tags: certs ssl renewal expiry instagram outages lifecycle web https)
-
Seth Vargo is correct. It’s not the bit length of the key which is at issue, it’s the signature algorithm. The entire keychain for the s3.amazonaws.com key is signed with SHA1withRSA: https://www.ssllabs.com/ssltest/analyze.html?d=s3.amazonaws.com&s=54.231.244.0&hideResults=on At issue is that the root Verisign key has been marked as weak because of SHA1 and taken out of the curl bundle which is widely popular, and this issue will continue to cause more and more issues going forwards as that bundle makes its way into shipping o/s distributions and aws certification verification breaks.
‘This is still happening and curl is now failing on my machine causing all sorts of fun issues (including breaking CocoaPods that are using S3 for storage).’ (@jmhodges) This may be a contributory factor to the issue @nelson saw: https://nelsonslog.wordpress.com/2015/04/28/cyberduck-is-responsible-for-my-bad-ssl-certificate/ Curl’s ca-certs bundle is also used by Node: https://github.com/joyent/node/issues/8894 and doubtless many other apps and packages. Here’s a mailing list thread discussing the issue: http://curl.haxx.se/mail/archive-2014-10/0066.html ; it looks like the curl team aren’t too bothered about it.
(tags: curl s3 amazon aws ssl tls certs sha1 rsa key-length security cacerts)
Cassandra moving to using G1 as the default recommended GC implementation
This is a big indicator that G1 is ready for primetime. CMS has long been the go-to GC for production usage, but requires careful, complex hand-tuning — if G1 is getting to a stage where it’s just a case of giving it enough RAM, that’d be great. Also, looks like it’ll be the JDK9 default: https://twitter.com/shipilev/status/593175793255219200
(tags: cassandra tuning ops g1gc cms gc java jvm production performance memory)
-
ThisIsColossal now have a shop! bookmarking for some lovely gifts
Eight lessons learned hacking on GitHub Pages for six months
Pages is actually pretty solid — nice one GitHub
-
Static code analysis for shell scripts (via Tony Finch)
-
presentation from last week’s Craft Conference in Budapest; Tammer Saleh of Pivotal with a few antipatterns observed in dealing with microservices.
(tags: microservices soa architecture design coding software presentations slides tammer-saleh pivotal craft)
-
‘a command line tool that (hopefully) makes it easier to deploy, update, and test functions for AWS Lambda.’ much needed IMO — Lambda is too closed
-
HashiCorp’s take on the secrets-storage system. looks good
(tags: hashicorp deployment security secrets authentication vault storage keys key-rotation)
Everything Science Knows Right Now About Standing Desks | Co.Design
“Overall, current evidence suggests that both standing and treadmill desks may be effective in improving overall health considering both physiological and mental health components.”
(tags: standing-desks treadmill-desks desks exercise health work workplace back sitting standing)
Race conditions on Facebook, DigitalOcean and others
good trick — exploit eventual consistency and a lack of distributed transactions by launching race-condition-based attacks
(tags: attacks exploits race-conditions bugs eventual-consistency distributed-transactions http facebook digitalocean via:aphyr)
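A minimal sketch of the check-then-act pattern such attacks rely on, and one way to close the gap. The `Coupon` class and its credit values are hypothetical, and a local lock stands in for whatever transaction or conditional write a real distributed backend would need:

```python
import threading


class Coupon:
    """A single-use coupon worth 10 credits (hypothetical example object)."""

    def __init__(self) -> None:
        self.redeemed = False
        self.credits_granted = 0
        self._lock = threading.Lock()

    def is_valid(self) -> bool:
        return not self.redeemed

    def grant(self) -> None:
        self.redeemed = True
        self.credits_granted += 10

    def redeem_atomically(self) -> bool:
        """Check and update under one lock, so only one request can win."""
        with self._lock:
            if self.redeemed:
                return False
            self.grant()
            return True


# Vulnerable interleaving, written out by hand: both requests pass the
# validity check before either one records the redemption.
racy = Coupon()
request_a_ok = racy.is_valid()
request_b_ok = racy.is_valid()
if request_a_ok:
    racy.grant()
if request_b_ok:
    racy.grant()

# Guarded version: 20 concurrent requests, only one of them succeeds.
safe = Coupon()
results = []
threads = [
    threading.Thread(target=lambda: results.append(safe.redeem_atomically()))
    for _ in range(20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In the hand-interleaved case the single-use coupon pays out twice. An attacker gets the same effect by firing many requests at once and hoping some of them land between the check and the write.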
-
‘Discover and discuss the best dev tools and cloud infrastructure services’ — fun!
(tags: stackshare architecture stack ops software ranking open-source)
-
a web-based SSH console that centrally manages administrative access to systems. Web-based administration is combined with management and distribution of users’ public SSH keys. Key management and administration is based on profiles assigned to defined users. Administrators can log in using two-factor authentication with FreeOTP or Google Authenticator. From there they can create and manage public SSH keys or connect to their assigned systems through a web-shell. Commands can be shared across shells to make patching easier and eliminate redundant command execution.
32-bit overflow in BitGo js code caused an accidental 85 BTC transaction fee
Yes, this is a fucking 32-bit integer overflow. Whatever software was used, it calculated the sum of all inputs using 32-bit variables, which overflow at about 20 BTC if signed or 40 BTC if not. The fee was supposed to be 0xC350 = 50,000 satoshis, but it turned out to be 0x2,0000,C350 = 8,589,984,592 satoshis. Captains of the industry. If they were captains of any other industry, like say for example automotive, we’d have people dying in car crashes between two stationary vehicles.
(tags: bitcoin fail bitgo javascript bugs 32-bit overflow btc)
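The arithmetic in the quote can be reproduced in a few lines. This is only a sketch of the failure mode as described (summing inputs in a 32-bit variable), not BitGo's actual code. The input values and send amount are made up; only the intended fee of 0xC350 satoshis and the resulting 8,589,984,592 figure come from the quote.

```python
SATOSHIS_PER_BTC = 100_000_000
MASK32 = 0xFFFFFFFF


def wrap_u32(n: int) -> int:
    """Keep only the low 32 bits, as an unsigned 32-bit variable would."""
    return n & MASK32


def actual_fee(inputs, send_amount, intended_fee):
    """Fee really paid when the input total is accumulated in 32 bits."""
    true_total = sum(inputs)
    wrapped_total = 0
    for value in inputs:
        wrapped_total = wrap_u32(wrapped_total + value)
    # Change is computed from the (wrong) wrapped total...
    change = wrapped_total - send_amount - intended_fee
    # ...so everything that wrapped away ends up as miner fee.
    return true_total - send_amount - change


# Where the overflow thresholds in the quote come from.
signed_limit_btc = 2**31 / SATOSHIS_PER_BTC    # about 21.47 BTC
unsigned_limit_btc = 2**32 / SATOSHIS_PER_BTC  # about 42.95 BTC

# Hypothetical transaction: three 30 BTC inputs, sending 4 BTC.
fee = actual_fee([3_000_000_000] * 3, 400_000_000, 0xC350)
print(fee, fee / SATOSHIS_PER_BTC)
```

With these made-up inputs the 32-bit total wraps twice, so the fee paid is the intended 50,000 satoshis plus 2 × 2^32 satoshis, which is the 0x2,0000,C350 value quoted above (roughly 85.9 BTC).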
Eight Docker Development Patterns
good Docker tips
(tags: tips docker ops deployment)
-
We hope this report helps to round out the overall facts known about this attack. It also demonstrates that collectively there is a lot of visibility into what happens on the web. At the HTTP level seen by Safe Browsing, we cannot confidently attribute this attack to anyone. However, it makes it clear that hiding such attacks from detailed analysis after the fact is difficult. Had the entire web already moved to encrypted traffic via TLS, such an injection attack would not have been possible. This provides further motivation for transitioning the web to encrypted and integrity-protected communication. Unfortunately, defending against such an attack is not easy for website operators. In this case, the attack Javascript requests web resources sequentially and slowing down responses might have helped with reducing the overall attack traffic. Another hope is that the external visibility of this attack will serve as a deterrent in the future.
Via Nelson.
(tags: google security via:nelson ddos javascript tls ssl safe-browsing networking china greatfire)
Amazon EC2 Container Service team AmA
a few answers here. Mostly people pointing out shortcomings and the team asking them to start a thread on their forum though :(
Cluster-Based Architectures Using Docker and Amazon EC2 Container Service
In this post, we’re going to take a deeper dive into the architectural concepts underlying cluster computing using container management frameworks such as ECS. We will show how these frameworks effectively abstract the low-level resources such as CPU, memory, and storage, allowing for highly efficient usage of the nodes in a compute cluster. Building on some of the concepts detailed in the earlier posts, we will discover why containers are such a good fit for this type of abstraction, and how the Amazon EC2 Container Service fits into the larger ecosystem of cluster management frameworks.
(tags: docker aws ecs ec2 ops hosting containers mesos clusters)
-
‘Here are four Kubernetes features that came from our experiences with Borg.’
(tags: google ops kubernetes borg containers docker networking)
attacks using U+202E – RIGHT-TO-LEFT OVERRIDE
Security implications of in-band signalling strikes again, 43 years after the “Blue Box” hit the mainstream. Jamie McCarthy on Twitter: “.@cmdrtaco – Remember when we had to block the U+202E code point in Slashdot comments to stop siht ekil stnemmoc? https://t.co/TcHxKkx9Oo” See also http://krebsonsecurity.com/2011/09/right-to-left-override-aids-email-attacks/ — GMail was vulnerable too; and http://en.wikipedia.org/wiki/Unicode_control_characters for more inline control chars. http://unicode.org/reports/tr36/#Bidirectional_Text_Spoofing has some official recommendations from the Unicode consortium on dealing with bidi override chars.
(tags: security attacks rlo unicode control-characters codepoints bidi text gmail slashdot sanitization input)
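As a rough illustration of blocking these code points on input, here is a minimal sketch that detects and strips the explicit bidirectional formatting characters. The function names and the sample filename are made up, and this is not a claim about what Slashdot or GMail actually did. Whether to strip, escape or reject such input is a policy choice, since stripping can also affect legitimate right-to-left text.

```python
# Explicit bidirectional formatting characters: embeddings, overrides,
# isolates and their terminators.
BIDI_CONTROLS = {
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c",  # POP DIRECTIONAL FORMATTING
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def contains_bidi_controls(text: str) -> bool:
    """True if the text carries any explicit bidi formatting character."""
    return any(ch in BIDI_CONTROLS for ch in text)


def strip_bidi_controls(text: str) -> str:
    """Remove explicit bidi formatting characters, leaving the rest intact."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


# A filename that is really an .exe, but whose tail is displayed reversed
# by bidi-aware renderers because of the embedded U+202E.
filename = "invoice\u202efdp.exe"
cleaned = strip_bidi_controls(filename)
print(contains_bidi_controls(filename), cleaned)
```

Once the override is stripped, the name shows its true character order, so the `.exe` suffix is visible again.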
Meet the e-voting machine so easy to hack, it will take your breath away | Ars Technica
The AVS WinVote system — mind-bogglingly shitty security.
If an election was held using the AVS WinVote, and it wasn’t hacked, it was only because no one tried. The vulnerabilities were so severe, and so trivial to exploit, that anyone with even a modicum of training could have succeeded. They didn’t need to be in the polling place—within a few hundred feet (e.g., in the parking lot) is easy, and within a half mile with a rudimentary antenna built using a Pringles can. Further, there are no logs or other records that would indicate if such a thing ever happened, so if an election was hacked any time in the past, we will never know. I’ve been in the security field for 30 years, and it takes a lot to surprise me. But the VITA report really shocked me—as bad as I thought the problems were likely to be, VITA’s five-page report showed that they were far worse. And the WinVote system was so fragile that it hardly took any effort. While the report does not state how much effort went into the investigation, my estimation based on the description is that it was less than a person week.
(tags: security voting via:johnke winvote avs shoup wep wifi windows)
‘Continuous Deployment: The Dirty Details’
Good slide deck from Etsy’s Mike Brittain regarding their CD setup. Some interesting little-known details: Slide 41: database schema changes are not CD’d — they go out on “Schema change Thursdays”. Slide 44: only the webapp is CD’d — PHP, Apache, memcache components (Etsy.com, support and back-office tools, developer API, gearman async worker queues). The external “services” are not — databases, Solr/JVM search (rolling restarts), photo storage (filters, proxy cache, S3), payments (PCI-DSS, controlled access). They avoid schema changes and breaking changes using an approach they call “non-breaking expansions” — expose new version in a service interface; support multiple versions in the consumer. Example from slides 50-63, based around a database schema migration. Slide 66: “dev flags” (rollout oriented) are promoted to “feature flags” (long lived degradation control). Slide 71: some architectural philosophies: deploying is cheap; releasing is cheap; gathering data should be cheap too; treat first iterations as experiments. Slide 102: “Canary pools”. They have multiple pools of users for testing in production — the staff pool, users who have opted in to see prototypes/beta stuff, 0-100% gradual phased rollout.
(tags: cd deploy etsy slides migrations database schema ops ci version-control feature-flags)
Etsy’s Release Management process
Good info on how Etsy use their Deployinator tool, end-to-end. Slide 11: git SHA is visible for each env, allowing easy verification of what code is deployed. Slide 14: Code is deployed to “princess” staging env while CI tests are running; no need to wait for unit/CI tests to complete. Slide 23: smoke tests of pre-prod “princess” (complete after 8 mins elapsed). Slide 31: dashboard link for deployed code is posted during deploy; post-release prod smoke tests are run by Jenkins. (short ones! they complete in 42 seconds)
(tags: deployment etsy deploy deployinator princess staging ops testing devops smoke-tests production jenkins)
Makerbot’s Saddest Hour | TechCrunch
I’ve been speaking to a few people [at Makerbot] who prefer to remain anonymous and most of my contacts there are gone (the head of PR was apparently fired) and don’t want to talk. But the news from inside is troubling. The mass-layoffs are blamed on low revenue and one former employee wrote “Company was failing. Couldn’t pay vendors, had to downsize.” Do I think Makerbot will sink? At this point I don’t know.
(tags: makerbot 3d-printing startups downsizing layoffs ouch)
-
‘CredStash is a very simple, easy to use credential management and distribution system that uses AWS Key Management System (KMS) for key wrapping and master-key storage, and DynamoDB for credential storage and sharing.’
(tags: aws credstash python security keys key-management secrets kms)
ferd.ca -> Lessons Learned while Working on Large-Scale Server Software
Good advice
(tags: distributed scalability systems coding server-side erlang devops networking reliability)
Internet Scale Services Checklist
good aspirational checklist, inspired heavily by James Hamilton’s seminal 2007 paper, “On Designing And Deploying Internet-Scale Services”
(tags: james-hamilton checklists ops internet-scale architecture operability monitoring reliability availability uptime aspirations)
FBI admits flaws in hair analysis over decades
Wow, this is staggering.
The Justice Department and FBI have formally acknowledged that nearly every examiner in an elite FBI forensic unit gave flawed testimony in almost all trials in which they offered evidence against criminal defendants over more than a two-decade period before 2000. [….] The review confirmed that FBI experts systematically testified to the near-certainty of “matches” of crime-scene hairs to defendants, backing their claims by citing incomplete or misleading statistics drawn from their case work. In reality, there is no accepted research on how often hair from different people may appear the same. Since 2000, the lab has used visual hair comparison to rule out someone as a possible source of hair or in combination with more accurate DNA testing. Warnings about the problem have been mounting. In 2002, the FBI reported that its own DNA testing found that examiners reported false hair matches more than 11 percent of the time.
(tags: fbi false-positives hair dna biometrics trials justice experts crime forensics inaccuracy csi)
-
Most or all of the missing bitcoins were stolen straight out of the MtGox hot wallet over time, beginning in late 2011. As a result, MtGox operated at fractional reserve for years (knowingly or not), and was practically depleted of bitcoins by 2013. A significant number of stolen bitcoins were deposited onto various exchanges, including MtGox itself, and probably sold for cash (which at the bitcoin prices of the day would have been substantially less than the hundreds of millions of dollars they were worth at the time of MtGox’s collapse). MtGox’ bitcoins continuously went missing over time, but at a decreasing pace. Again by the middle of 2013, the curve goes more or less flat, matching the hypothesis that by that time there may not have been any more bitcoins left to lose. The rate of loss otherwise seems unusually smooth and at the same time not strictly relative to any readily available factors such as remaining BTC holdings, transaction volumes or the BTC price. Worth pointing out is that, thanks to having matched up most of the deposit/withdrawal log earlier, we can at this point at least rule out the possibility of any large-scale fake deposits — the bitcoins going into MtGox were real, meaning the discrepancy was likely rather caused by bitcoins leaving MtGox without going through valid withdrawals.
(tags: mtgox bitcoin security fail currency theft crime btc)
Bank of the Underworld – The Atlantic
Prosecutors analyzed approximately 500 of Liberty Reserve’s biggest accounts, which constituted 44 percent of its business. The government contends that 32 of these accounts were connected to the sale of stolen credit cards and 117 were used by Ponzi-scheme operators. All of this activity flourished, prosecutors said, because Liberty Reserve made no real effort to monitor its users for criminal behavior. What’s more, records showed that one of the company’s top tech experts, Mark Marmilev, who was also arrested, appeared to have promoted Liberty Reserve in chat rooms devoted to Ponzi schemes.
(via Nelson)
(tags: scams fraud crime currency the-atlantic liberty-reserve ponzi-schemes costa-rica arthur-budovsky banking anonymity cryptocurrency money-laundering carding)
I was a Lampedusa refugee. Here’s my story of fleeing Libya – and surviving
‘The boy next to me fell to the floor and for a moment I didn’t know if he had fainted or was dead – then I saw that he was covering his eyes so he didn’t have to see the waves any more. A pregnant woman vomited and started screaming. Below deck, people were shouting that they couldn’t breathe, so the men in charge of the boat went down and started beating them. By the time we saw a rescue helicopter, two days after our boat had left Libya with 250 passengers on board, some people were already dead – flung into the sea by the waves, or suffocated downstairs in the dark.’
(tags: lampedusa migration asylum europe fortress-europe italy politics immigration libya refugees)
Run your own high-end cloud gaming service on EC2
Using Steam streaming and EC2 g2.2xlarge spot instances — ‘comes out to around $0.52/hr’. That’s pretty compelling IMO
(tags: aws ec2 gaming games graphics spot-instances hacks windows steam)
Running Arbitrary Executables in AWS Lambda
actually an officially-supported mode. huh
(tags: lambda aws architecture ops node.js javascript unix linux)
Exclusive: Chopra says ECB’s threats to Ireland were ‘outrageous’ – Independent.ie
The letters urged the then-government to commit to structural reforms and restructuring of the financial sector. “That is not their job,” Mr Chopra said. “Their mandate is to meet inflation. And if you lecture the ECB as to how they might go about that, they talk about their independence. “But when it comes to lecturing others about fiscal policy or structural policy, they’re not at all hesitant. I’m not surprised that the people in Ireland were very upset about these letters from [Jean-Claude] Trichet.”
(tags: trichet banking ireland politics ajai-chopra ecb history)
Writing Minecraft Plugins – The Book
wow, Walter Higgins’ book (from Peachpit Press) is looking great
(tags: books reading minecraft walter-higgins javascript)
-
Pinterest’s Hadoop workflow manager; ‘scalable, reliable, simple, extensible’ apparently. Hopefully it allows upgrades of a workflow component without breaking an existing run in progress, which is what happens with LinkedIn’s Azkaban :(
(tags: python pinterest hadoop workflows ops pinball big-data scheduling)
HACKERS COULD COMMANDEER NEW PLANES THROUGH PASSENGER WI-FI
Boeing 787 Dreamliner jets, as well as Airbus A350 and A380 aircraft, have Wi-Fi passenger networks that use the same network as the avionics systems of the planes
What the fucking fuck. Air-gap or gtfo
(tags: air-gap security planes boeing a380 a350 dreamliner networking firewalls avionics)
Tips for debugging EC2 Container Service
some basic ECS tips from Gilt
_Blade: a Data Center Garbage Collector_
Essentially, add a central GC scheduler to improve tail latencies in a cluster, by taking instances out of the pool to perform slow GC activity instead of letting them impact live operations. I’ve been toying with this idea for a while, nice to see a solid paper about it
(tags: gc latency tail-latencies papers blade go java scheduling clustering load-balancing low-latency performance)
SCADA systems online, and a horror story about a non-airgapped Boeing 747 engine management system
747’s are big flying Unix hosts. At the time, the engine management system on this particular airline was Solaris based. The patching was well behind and they used telnet as SSH broke the menus and the budget did not extend to fixing this. The engineers could actually access the engine management system of a 747 in route. If issues are noted, they can re-tune the engine in air. The issue here is that all that separated the engine control systems and the open network was NAT based filters. There were (and as far as I know this is true today), no extrusion controls. They filter incoming traffic, but all outgoing traffic is allowed.
(via Paddy Benson)
-
Nice, simple “build a website” platform. Keeping this one bookmarked for the next time someone non-techie asks me for the simplest way to do just that (thanks for the tip, Oisin)
(tags: via:oisin blog cms design hosting web-design web websites)
Extracting Structured Data From Recipes Using Conditional Random Fields
nice probabilistic/ML approach to recipe parsing
(tags: nytimes recipes parsing text nlp machine-learning probabilistic crf++ algorithms feature-extraction)
Large-scale cluster management at Google with Borg
Google’s Borg system is a cluster manager that runs hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters each with up to tens of thousands of machines. It achieves high utilization by combining admission control, efficient task-packing, over-commitment, and machine sharing with process-level performance isolation. It supports high-availability applications with runtime features that minimize fault-recovery time, and scheduling policies that reduce the probability of correlated failures. Borg simplifies life for its users by offering a declarative job specification language, name service integration, real-time job monitoring, and tools to analyze and simulate system behavior. We present a summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it.
(via Conall)
(tags: via:conall clustering google papers scale to-read borg cluster-management deployment packing reliability redundancy)
Keeping Your Car Safe From Electronic Thieves – NYTimes.com
In a normal scenario, when you walk up to a car with a keyless entry and try the door handle, the car wirelessly calls out for your key so you don’t have to press any buttons to get inside. If the key calls back, the door unlocks. But the keyless system is capable of searching for a key only within a couple of feet. Mr. Danev said that when the teenage girl turned on her device, it amplified the distance that the car can search, which then allowed my car to talk to my key, which happened to be sitting about 50 feet away, on the kitchen counter. And just like that, open sesame.
What the hell. Who designed a system that would auto-unlock based on signal strength alone?!!
(tags: security fail cars keys signal proximity keyless-entry prius toyota crime amplification power-amplifiers 3db keyless)
Closed access means people die
‘We’ve paid 100 BILLION USD over the last 10 years to “publish” science and medicine. Ebola is a massive systems failure.’ See also https://www.techdirt.com/articles/20150409/17514230608/dont-think-open-access-is-important-it-might-have-prevented-much-ebola-outbreak.shtml : ‘The conventional wisdom among public health authorities is that the Ebola virus, which killed at least 10,000 people in Liberia, Sierra Leone and Guinea, was a new phenomenon, not seen in West Africa before 2013. […] But, as the team discovered, that “conventional wisdom” was wrong. In fact, they found a bunch of studies, buried behind research paywalls, that revealed that there was significant evidence of antibodies to the Ebola virus in Liberia and in other nearby nations. There was one from 1982 that noted: “medical personnel in Liberian health centers should be aware of the possibility that they may come across active cases and thus be prepared to avoid nosocomial epidemics.”’
(tags: deaths liberia ebola open-access papers elsevier science medicine reprints)
Making Pinterest — Learn to stop using shiny new things and love MySQL
‘The third reason people go for shiny is because older tech isn’t advertised as aggressively as newer tech. The younger companies needs to differentiate from the old guard and be bolder, more passionate and promise to fulfill your wildest dreams. But most new tech sales pitches aren’t generally forthright about their many failure modes. In our early days, we fell into this third trap. We had a lot of growing pains as we scaled the architecture. The most vocal and excited database companies kept coming to us saying they’d solve all of our scalability problems. But nobody told us of the virtues of MySQL, probably because MySQL just works, and people know about it.’ It’s true! — I’m still a happy MySQL user for some use cases, particularly read-mostly relational configuration data…
(tags: mysql storage databases reliability pinterest architecture)
Microservices and elastic resource pools with Amazon EC2 Container Service
interesting approach to working around ECS’ shortcomings — bit specific to Hailo’s microservices arch and IPC mechanism though. aside: I like their version numbering scheme: ISO-8601, YYYYMMDDHHMMSS. keep it simple!
(tags: versioning microservices hailo aws ec2 ecs docker containers scheduling allocation deployment provisioning qos)
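A timestamp version number of that shape is easy to generate. A minimal sketch follows; the function name is made up, and the use of UTC is an assumption, since the post does not say which timezone Hailo use:

```python
from datetime import datetime, timezone


def version_stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYYMMDDHHMMSS (UTC assumed)."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")


older = version_stamp(datetime(2015, 4, 9, 17, 51, 42, tzinfo=timezone.utc))
newer = version_stamp(datetime(2015, 4, 28, 8, 0, 0, tzinfo=timezone.utc))

# Fixed-width, most-significant-field-first: plain string comparison
# orders versions chronologically.
print(older, newer, older < newer)
```

Because every field is zero-padded and ordered from year down to second, sorting the strings also sorts the builds by time. That is a large part of the appeal over hand-maintained version numbers.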
Please Kill Me (Eventually) | Motherboard
There is much that the wise application of technology can do to help us ease off this mortal coil, instead of tormenting ourselves at the natural end of life in a futile, undignified and excruciating attempt to keep it somehow duct-taped on. Train more people in geriatrics, for example. Learn new ways to make life safe, healthy, fun and interesting for the old. Think like a community, a brotherhood, not like atomized competing individuals a few of whom can somehow “beat the system” of the universe. Maybe it is better to examine clearly what we are with a view to understanding and acceptance than it is to try to escape what perhaps should be our inevitable ending.
(tags: death mortality cryogenics alcor geriatrics life singularity mind-uploading ray-kurzweil)
CGA in 1024 Colors – a New Mode: the Illustrated Guide
awesome hackery. brings me back to my C=64 demo days
-
‘a secret management and distribution service [from Square] that is now available for everyone. Keywhiz helps us with infrastructure secrets, including TLS certificates and keys, GPG keyrings, symmetric keys, database credentials, API tokens, and SSH keys for external services — and even some non-secrets like TLS trust stores. Automation with Keywhiz allows us to seamlessly distribute and generate the necessary secrets for our services, which provides a consistent and secure environment, and ultimately helps us ship faster. […] Keywhiz has been extremely useful to Square. It’s supported both widespread internal use of cryptography and a dynamic microservice architecture. Initially, Keywhiz use decoupled many amalgamations of configuration from secret content, which made secrets more secure and configuration more accessible. Over time, improvements have led to engineers not even realizing Keywhiz is there. It just works. Please check it out.’
(tags: square security ops keys pki key-distribution key-rotation fuse linux deployment secrets keywhiz)
Bigcommerce Status Page blasts IBM Softlayer Object Storage service
This is pretty heavy stuff:
Bigcommerce engineers have been very pro-active in working with our storage provider, IBM Softlayer, in finding solutions. Unfortunately, it takes two parties to come to a solution. In this case, IBM Softlayer intentionally let their Object Storage cluster fall into disrepair and chose not to scale it. This has impacted Bigcommerce, IBM and many other Softlayer customers. Our engineers placed too much trust in IBM Softlayer and that’s on us. However, the catastrophic failures to see metrics and rapidly scale capacity, the decisions to let hard drives sit at 90% utilization for weeks and months, the cascading failures of an undersized cluster of 52 nodes for the busiest data center in their business speaks to IBM Softlayer’s lack of concern for their customers. We found this out 3 days ago.
(via Oisin)
(tags: softlayer bigcommerce outages shambles ibm fail object-storage storage iaas cloud)
Subscribing AWS Lambda Function To SNS Topic With aws-cli
how to use the AWS command line tools to do this
Yelp Product & Engineering Blog | True Zero Downtime HAProxy Reloads
Using tc and qdisc to delay SYNs while haproxy restarts. Definitely feels like on-host NAT between 2 haproxy processes would be cleaner and easier though!
(tags: linux networking hacks yelp haproxy uptime reliability tcp tc qdisc ops)
-
Upsides of this new AWS service: * great UI and visualisations. * solid choice of metric to evaluate the results. Maybe things have moved on since, but the use of AUC, false positives and false negatives was pretty new when I was working on this (er, 10 years ago!). Downsides: * it could do with more support for unsupervised learning algorithms. Supervised learning means you need to provide training data, which in itself can be hard work. My experience with logistic regression in the past is that it requires very accurate training data, too: its tolerance for misclassified training examples is poor. * Also, in my experience, 80% of the hard work of using ML algorithms is writing good tokenisation and feature extraction algorithms. I don’t see any help for that here unfortunately. (probably not that surprising as it requires really detailed knowledge of the input data to know what classes can be abbreviated into a single class, etc.)
(tags: amazon aws ml machine-learning auc data-science)
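For reference, the AUC metric mentioned above can be computed directly from classifier scores: it is the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. A minimal Python sketch follows; the function name and input format are made up for illustration, and this is not a description of how the AWS service computes it.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: iterable of 0/1 ground-truth classes (hypothetical input format)
    scores: classifier scores, higher meaning "more likely positive"
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

A perfect ranking gives 1.0 and a coin-flip classifier hovers around 0.5, which is what makes it a handy single-number summary. The quadratic loop is fine for small samples; a sort-based version would be the obvious next step for large ones.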
Rob Pike’s 5 rules of optimization
these are great. I’ve run into rule #3 (“fancy algorithms are slow when n is small, and n is usually small”) several times…
(tags: twitter rob-pike via:igrigorik coding rules laws optimization performance algorithms data-structures aphorisms)
AWS Lambda Event-Driven Architecture With Amazon SNS
Any message posted to an SNS topic can trigger the execution of custom code you have written, but you don’t have to maintain any infrastructure to keep that code available to listen for those events and you don’t have to pay for any infrastructure when the code is not being run. This is, in my opinion, the first time that Amazon can truly say that AWS Lambda is event-driven, as we now have a central, independent, event management system (SNS) where any authorized entity can trigger the event (post a message to a topic) and any authorized AWS Lambda function can listen for the event, and neither has to know about the other.
(tags: aws ec2 lambda sns events cep event-processing coding cloud hacks eric-hammond)
Texting at the wheel kills more US teenagers every year than drink-driving
Texting while behind the wheel has overtaken drink driving as the biggest cause of death among teenagers in America. More than 3,000 teenagers are killed every year in car crashes caused by texting while driving compared to 2,700 from drink driving. The study by Cohen Children’s Medical Center also discovered that 50 per cent of students admit to texting while driving.
(tags: texting sms us driving car-safety safety drink-driving)
-
Conducting such a widespread attack clearly demonstrates the weaponization of the Chinese Internet to co-opt arbitrary computers across the web and outside of China to achieve China’s policy ends. The repurposing of the devices of unwitting users in foreign jurisdictions for covert attacks in the interests of one country’s national priorities is a dangerous precedent — contrary to international norms and in violation of widespread domestic laws prohibiting the unauthorized use of computing and networked systems.
(tags: censorship ddos internet security china great-cannon citizen-lab reports web)
-
How to build an Intelligent Personal Assistant: ‘Sirius is an open end-to-end standalone speech and vision based intelligent personal assistant (IPA) similar to Apple’s Siri, Google’s Google Now, Microsoft’s Cortana, and Amazon’s Echo. Sirius implements the core functionalities of an IPA including speech recognition, image matching, natural language processing and a question-and-answer system. Sirius is developed by Clarity Lab at the University of Michigan. Sirius is published at the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2015.’
(tags: sirius siri cortana google-now echo ok-google ipa assistants search video audio speech papers clarity nlp wikipedia)
Why We Will Not Be Registering easyDNS.SUCKS – blog.easydns.org
If you’re not immersed in the naming business you may find the jargon in it hard to understand. The basic upshot is this: the IPC believes that the mechanisms that were enacted to protect trademark holders during the deluge of new TLD rollouts are being gamed by the .SUCKS TLD operator to extort inflated fees from trademark holders.
(via Nelson)(tags: shakedown business internet domains dns easydns dot-sucks scams tlds trademarks ip)
Data privacy is as important as tax, Google exec warns Noonan – Independent.ie
Yep, that would be Google requesting more regulation ;)
(tags: google regulation ireland privacy data-protection)
Russia just made a ton of Internet memes illegal – The Washington Post
In post-Soviet Russia, you don’t make memes. Memes make (or unmake?) you. That is, at least, the only conclusion we can draw from an announcement made this week by Russia’s three-year-old media agency/Internet censor Roskomnadzor, which made it illegal to publish any Internet meme that depicts a public figure in a way that has nothing to do with his “personality.”
(tags: memes photoshop russia freedom web internet funny humour roskomnadzor censorship sad-keanu)
-
‘Utilities that help bridge the gap between Java 8 and Google Guava. Guava has the {@link FluentIterable} concept which is similar to streams. In many ways, fluent iterable is nicer, because it directly binds to the immutable collection classes. However, on balance it seems wise to use the stream API rather than {@code FluentIterable} in Java 8.’
(tags: guava java-8 java fluentiterable streams fluent coding)
-
I like the sound of this — automated Java CMS GC tuning, kind of like a free version of JClarity’s Censum (via Miguel Ángel Pastor)
J. G. Ballard predicted social media in a 1977 essay for Vogue
‘In the intro essay to High Rise it says that J G Ballard predicted social media in a 1977 essay for Vogue. Here it is’
(tags: j-g-ballard social-media twitter instagram youtube future society vogue 1977 facebook media)
Hacked French network exposed its own passwords during TV interview
lols
(tags: passwords post-its fail tv5monde authentication security tv funny)
RADStack – an open source Lambda Architecture built on Druid, Kafka and Samza
‘In this paper we presented the RADStack, a collection of complementary technologies that can be used together to power interactive analytic applications. The key pieces of the stack are Kafka, Samza, Hadoop, and Druid. Druid is designed for exploratory analytics and is optimized for low latency data exploration, aggregation, and ingestion, and is well suited for OLAP workflows. Samza and Hadoop complement Druid and add data processing functionality, and Kafka enables high throughput event delivery.’
(tags: druid samza kafka streaming cep lambda-architecture architecture hadoop big-data olap)
-
an asynchronous Netty based graphite proxy. It protects Graphite from the herds of clients by minimizing context switches and interrupts; by batching and aggregating metrics. Gruffalo also allows you to replicate metrics between Graphite installations for DR scenarios, for example. Gruffalo can easily handle a massive amount of traffic, and thus increase your metrics delivery system availability. At Outbrain, we currently handle over 1700 concurrent connections, and over 2M metrics per minute per instance.
(tags: graphite backpressure metrics outbrain netty proxies gruffalo ops)
Privacy Security Talk in TOG – 22nd April @ 7pm – FREE
Dublin is lucky enough to have great speakers pass through town on occasion and on Wednesday the 22nd April 2015, Runa A. Sandvik (@runasand) and Per Thorsheim (@thorsheim) have kindly offered to speak in TOG from 7pm. The format for the evening is a general meet and greet, but both speakers have offered to give a presentation on a topic of their choice. Anyone interested in privacy, security, journalism, Tor and/or has previously attended a CryptoParty would be wise to attend. Doors are from 7pm and bring any projects with you you would like to share with other attendees. This is a free event, open to the public and no need to book. See you Wednesday. Runa A. Sandvik is an independent privacy and security researcher, working at the intersection of technology, law and policy. She contributes to The Tor Project, writes for Forbes, and is a technical advisor to both the Freedom of the Press Foundation and the TrueCrypt Audit project. Per Thorsheim as founder/organizer of PasswordsCon.org, his topic of choice is of course passwords, but in a much bigger context than most people imagine. Passwords, pins, biometrics, 2-factor authentication, security/usability and all the way into surveillance and protecting your health, kids and life itself.
(tags: privacy security runa-sandvik per-thorsheim passwords tor truecrypt tog via:oisin events dublin)
-
‘NSW officials seemed more interested in protecting their reputations than the integrity of elections. They sharply criticized Halderman and Teague, rather than commending them, for their discovery of the FREAK attack vulnerability. The Chief Information Officer of the Electoral Commission, Ian Brightwell, claimed Halderman and Teague’s discovery was part of efforts by “well-funded, well-managed anti-internet voting lobby groups,” an apparent reference to our friends at VerifiedVoting.org, where Halderman and Teague are voluntary Advisory Board members.1 Yet at the same time, Brightwell concluded that it was indeed possible that votes were manipulated.’
(tags: freak security vulnerabilities exploits nsw australia internet-voting vvat voting online-voting eff)
Sheets of Glass Cut into Layered Ocean Waves by Ben Young
I particularly love “Rough Waters” — amazing stuff from this kiwi artist
Working Time, Knowledge Work and Post-Industrial Society: Unpredictable Work – Aileen O’Carroll
my friend Aileen has written a book — looks interesting:
I will argue that a key feature of working time within high-tech industries is unpredictability, which alters the way time is experienced and perceived. It affects all aspects of time, from working hours to work organisation, to career, to the distinction between work and life. Although many desire variety in work and the ability to control working hours, unpredictability causes dissatisfaction.
On Amazon.co.uk at: http://www.amazon.co.uk/Working-Time-Knowledge-Post-Industrial-Society-ebook/dp/B00VILIN4U
(tags: books reading time work society tech working-hours job life sociology)
Introducing Vector: Netflix’s On-Host Performance Monitoring Tool
It gives pinpoint real-time performance metric visibility to engineers working on specific hosts — basically sending back system-level performance data to their browser, where a client-side renderer turns it into a usable dashboard. Essentially the idea is to replace having to ssh onto instances, run “top”, systat, iostat, and so on.
(tags: vector netflix performance monitoring sysstat top iostat netstat metrics ops dashboards real-time linux)
When S3’s eventual consistency is REALLY eventual
a consistency outage in S3 last year, resulting in about 40 objects failing read-after-write consistency for a duration of about 23 hours
(tags: s3 eventual-consistency aws consistency read-after-writes bugs outages stackdriver)
What is maximum Amazon S3 replication time on file upload? – Stack Overflow
Netflix note a 7 hour consistency delay
(tags: netflix aws s3 consistency eventual-consistency bugs outages)
S3’s “s3-external-1.amazonaws.com” endpoint
public documentation of how to work around the legacy S3 multi-region replication behaviour in North America
(tags: aws s3 eventual-consistency consistency us-east replication workarounds legacy)
A collection of links for streaming algorithms and data structures
Good link-list from Debasish Ghosh
(tags: algorithms streaming big-data streams hll probabilistic data-structures frequency counting sketches cuckoo-filters bloom-filters minhash count-min)
(SEC307) Building a DDoS-Resilient Architecture with AWS
good slides on a “web application firewall” proxy service, deployable as an auto-scaling EC2 unit
(tags: ec2 aws ddos security resilience slides reinvent firewalls http elb)
Germanwings flight 4U9525: what’s it like to listen to a black box recording?
After every air disaster, finding the black box recorder becomes the first priority – but for the crash investigators who have to listen to the tapes of people’s final moments, the experience can be incredibly harrowing.
(tags: flight disasters metrics recording germanwings air-travel black-box-recorder flight-data-recorder death)
Small claims triumph as aerial photographer routs flagrant infringers
This is great news. Flagrant copyright infringement of an aerial photograph penalised to the order of UKP 2,716
(tags: copyright infringement small-claims law uk webb-aviation photography images)
Bad data PR: how the NSPCC sunk to a new low in data churnalism
when the NSPCC sent out a press release saying that one in ten 12-13 year olds [in the UK] are worried that they are addicted to porn and 12% have participated in sexually explicit videos, dozens of journalists appear to have simply played along – despite there being no report and little explanation of where the figures came from. [….] “It turns out the study was conducted by a “creative market research” [ie. pay-per-survey] group called OnePoll. “Generate content and news angles with a OnePoll PR survey, and secure exposure for your brand,” reads the company’s blurb. “Our PR survey team can help draft questions, find news angles, design infographics, write and distribute your story.” “The OnePoll survey included just 11 multiple-choice questions, which could be filled in online. Children were recruited via their parents, who were already signed up to OnePoll.”
The NSPCC spends 25 million UKP per year on “child protection advice and awareness”, so they have the money to do this right. Disappointing.
(tags: nspcc bad-science bad-data methodology surveys porn uk kids addiction onepoll pr market-research)
Stack Overflow Developer Survey 2015
wow, 52.5% of developers prefer a dark IDE theme?!
(tags: coding jobs work careers software stack-overflow surveys)
Gil Tene’s “usual suspects” to reduce system-level hiccups/latency jitters in a Linux system
Based on empirical evidence (across many tens of sites thus far) and note-comparing with others, I use a list of “usual suspects” that I blame whenever they are not set to my liking and system-level hiccups are detected. Getting these settings right from the start often saves a bunch of playing around (and no, there is no “priority” to this – you should set them all right before looking for more advice…).
(tags: performance latency hiccups gil-tene tuning mechanical-sympathy hyperthreading linux ops)
-
I think that materiality means what it says, and if people or algorithms do dumb things with trivial information that’s their problem. But markets are a lot faster and more literal than they were when the materiality standard was created, and I wonder whether regulators or courts will one day decide that materiality is too reasonable a standard for modern markets. The materiality standard depends on the reasonable investor, and in many important contexts the reasonable investor has been replaced by a computer.
(tags: algorithms trading stock stock-market sec materiality april-fools-day tesla investing jokes)
Time Series Metrics with Cassandra
slides from Chris Maxwell of Ubiquiti Networks describing what he had to do to get cyanite on Cassandra handling 30k metrics per second; an experimental “Date-tiered compaction” mode from Spotify was essential from the sounds of it. Very complex :(
(tags: cassandra spotify date-tiered-compaction metrics graphite cyanite chris-maxwell time-series-data)
-
you can use 2-liter carbonated drink bottles to build an inexpensive, reusable water rocket. The thrill factor is surprisingly high, and you can fly them all day long for the cost of a little air and water. It’s the perfect thing for those times when you just want to head down to the local soccer field and shoot off some rockets!
Outages, PostMortems, and Human Error 101
Good basic pres from John Allspaw, covering the basics of tier-one tech incident response — defining the 5 severity levels; root cause analysis techniques (to Five-Whys or not); and the importance of service metrics
(tags: devops monitoring ops five-whys allspaw slides etsy codeascraft incident-response incidents severity root-cause postmortems outages reliability techops tier-one-support)
Twitter’s new anti-harassment filter
Twitter is calling it a “quality filter,” and it’s been rolling out to verified users running Twitter’s iOS app since last week. It appears to work much like a spam filter, except instead of hiding bots and copy-paste marketers, it screens “threats, offensive language, [and] duplicate content” out of your notifications feed.
via Nelson
(tags: via:nelson harassment spam twitter gamergame abuse ml)
5% of Google visitors have ad-injecting malware installed
Ad injectors were detected on all operating systems (Mac and Windows), and web browsers (Chrome, Firefox, IE) that were included in our test. More than 5% of people visiting Google sites have at least one ad injector installed. Within that group, half have at least two injectors installed and nearly one-third have at least four installed.
via Nelson.
(tags: via:nelson ads google chrome ad-injectors malware scummy)
-
The horrors of monkey-patching:
I call out the Honeybadger gem specifically because was the most recent time I’d been bit by a seemingly good thing promoted in the community: monkey patching third party code. Now I don’t fault Honeybadger for making their product this way. It provides their customers with direct business value: “just require ‘honeybadger’ and you’re done!” I don’t agree with this sort of practice. [….] I distrust everything [in Ruby] but a small set of libraries I’ve personally vetted or are authored by people I respect. Why is this important? Without a certain level of scrutiny you will introduce odd and hard to reproduce bugs. This is especially important because Ruby offers you absolutely zero guarantee whatever the state your program is when a given method is dispatched. Constants are not constants. Methods can be redefined at run time. Someone could have written a time sensitive monkey patch to randomly undefined methods from anything in ObjectSpace because they can. This example is so horribly bad that no one should every do, but the programming language allows this. Much worse, this code be arbitrarily inject by some transitive dependency (do you even know what yours are?).
(tags: ruby monkey-patching coding reliability bugs dependencies libraries honeybadger sinatra)
Science is in crisis and scientists have lost confidence in Government policy
Excellent op-ed from Dr David McConnell, fellow emeritus of TCD’s Smurfit Institute of Genetics: ‘Ireland should once again foster, by competition, a good number of experienced, reputable people, of all ages, who have ideas about solving major scientific questions. These people are an essential part of the foundation of our science-based economy and society. Too many of them are no longer eligible for funding by SFI; too few are being appointed by the universities; and fewer PhDs are being awarded. The writing is on the wall.’
Salutin’ Putin: inside a Russian troll house | World news | The Guardian
file under grim meathook future
(tags: grim-meathook-future guardian russia trolls social-media media censorship livejournal ideology social-control)
-
As a result of a joint investigation of the events surrounding this incident by Google and CNNIC, we have decided that the CNNIC Root and EV CAs will no longer be recognized in Google products.
(tags: cnnic certs ssl tls security certificates pki chrome google)
Llamasoft 8-bit game images now available for download
legal! go Jeff Minter
(tags: jeff-minter llamasoft yaks games history c=64 commodore vic-20 emulation via:shane)
Cassandra remote code execution hole (CVE-2015-0225)
Ah now lads.
Under its default configuration, Cassandra binds an unauthenticated JMX/RMI interface to all network interfaces. As RMI is an API for the transport and remote execution of serialized Java, anyone with access to this interface can execute arbitrary code as the running user.
The Definitive Guide to the Music of The Big Lebowski | LA Weekly
definitive! (via Shero)
(tags: via:shero music the-big-lebowski la-weekly the-dude movies soundtracks)
Reactive Programming for a demanding world
“building event-driven and responsive applications with RxJava”, slides by Mario Fusco. Good info on practical Rx usage in Java
(tags: rxjava rx reactive coding backpressure streams observables)
Chinese authorities compromise millions in cyberattacks
“[The] Great Firewall [of China] has switched from being a passive, inbound filter to being an active and aggressive outbound one.”
(tags: china great-firewall censorship cyberwarfare github ddos baidu future)
Avro, mail # dev – bytes and fixed handling in Python implementation – 2014-09-04, 22:54
More Avro trouble with “bytes” fields! Avoid using “bytes” fields in Avro if you plan to interoperate with either of the Python implementations; they both fail to marshal them into JSON format correctly. This is the official “avro” library, which produces UTF-8 errors when a non-UTF-8 byte is encountered
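For context: as far as I can tell from the Avro spec, the JSON encoding represents "bytes" as a JSON string in which each byte becomes the Unicode code point of the same value (0-255), which amounts to ISO-8859-1 rather than UTF-8. A small stdlib-only Python sketch of the difference follows. It involves no Avro library, and the function names are purely illustrative.

```python
import json

def bytes_to_avro_json(data: bytes) -> str:
    # latin-1 maps each byte to the code point with the same value,
    # so this never fails, whatever the byte values are.
    return json.dumps(data.decode("latin-1"))

def avro_json_to_bytes(text: str) -> bytes:
    # Reverse mapping: code points 0-255 back to single bytes.
    return json.loads(text).encode("latin-1")

raw = b"\xff\x00abc"          # 0xff is not valid UTF-8 on its own

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False           # this is the failure mode described above

encoded = bytes_to_avro_json(raw)
```

Treating the field as UTF-8 text blows up on the first non-UTF-8 byte, whereas the byte-to-code-point mapping round-trips arbitrary binary data.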
tebeka / fastavro / issues / #11 – fastavro breaks dumping binary fixed [4] — Bitbucket
The Python “fastavro” library cannot correctly render “bytes” fields. This is a bug, and the maintainer is acting in a really crappy manner in this thread. Avoid this library
(tags: fastavro fail bugs utf-8 bytes encoding asshats open-source python)
A Team of Biohackers Has Figured Out How to Inject Your Eyeballs With Night Vision
Did it work? Yes. It started with shapes, hung about 10 meters away. “I’m talking like the size of my hand,” Licina says. Before long, they were able to do longer distances, recognizing symbols and identifying moving subjects against different backgrounds. “The other test, we had people go stand in the woods,” he says. “At 50 meters, we could figure out where they were, even if they were standing up against a tree.” Each time, Licina had a 100% success rate. The control group, without being dosed with Ce6, only got them right a third of the time.
Well, that’s some risky biohacking. Wow.
(tags: biohacking scary night-vision eyes chlorin-e6 infravision sfm)
Tim Bray on one year as an xoogler
Seems pretty insightful; particularly “I do think the Internet economy would be better and more humane if it didn’t have a single white-hot highly-overprivileged center. Also, sooner or later that’ll stop scaling. Can’t happen too soon.”
(tags: google tim-bray via:nelson xoogler funding tech privacy ads internet)
How I doubled my Internet speed with OpenWRT
File under “silly network hacks”:
Comcast has an initiative called Xfinity WiFi. When you rent a cable modem/router combo from Comcast (as one of my nearby neighbors apparently does), in addition to broadcasting your own WiFi network, it is kind enough to also broadcast “xfinitywifi,” a second “hotspot” network metered separately from your own.
By using his Buffalo WZR-HP-AG300H router’s extra radio, he can load-balance across both his own paid-for connection, and the XFinity WiFi free one. ;)
(tags: comcast diy networking openwrt routing home-network hacks xfinity-wifi buffalo)
Unlocking the Power of Stable Teams with Twitter’s SVP of Engineering – First Round Review
Huh. we do this in Swrve — we call them “feature teams”
(tags: feature-team culture development teams coding twitter work teamwork)
How We Scale VividCortex’s Backend Systems – High Scalability
Excellent post from Baron Schwartz about their large-scale, 1-second-granularity time series database storage system
(tags: time-series tsd storage mysql sql baron-schwartz ops performance scalability scaling go)
-
if (creation && object of art && algorithm && one's own algorithm) {
    include * an algorist *
} elseif (!creation || !object of art || !algorithm || !one's own algorithm) {
    exclude * not an algorist *
}
(tags: algorism algorithm art algorists via:belongio)
Nelson’s advice on basic stock option questions
Good advice, and short
(tags: stock share-options shares stock-options via:nelson employment jobs compensation)
-
Race conditions, and errors at startup, seem to be particularly problematic
(tags: race-conditions startup bugs failure fault-tolerance hbase redis reliability ops papers concurrency exception-handling cassandra hdfs mapreduce)
You Cannot Have Exactly-Once Delivery
Cut out and keep:
Within the context of a distributed system, you cannot have exactly-once message delivery. Web browser and server? Distributed. Server and database? Distributed. Server and message queue? Distributed. You cannot have exactly-once delivery semantics in any of these situations.
(tags: distributed distcomp exactly-once-delivery networking outages network-partitions byzantine-generals reference)
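The usual practical response to the point above is at-least-once delivery plus idempotent processing: the sender retries until it sees an ack, and the receiver discards duplicates by message ID. A toy in-memory sketch in Python follows. All class, function and parameter names are hypothetical, and a real system would need to persist the seen IDs atomically with the side effect.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed on a message ID."""

    def __init__(self):
        self.seen_ids = set()
        self.balance = 0

    def handle(self, message_id, amount):
        """Process a message and return True as the ack."""
        if message_id in self.seen_ids:
            return True            # duplicate: ack again, no side effect
        self.balance += amount     # the side effect we must not repeat
        self.seen_ids.add(message_id)
        return True


def deliver_with_retries(consumer, message_id, amount, lost_acks):
    """Simulate a sender whose first `lost_acks` acks are lost in transit.

    Returns the number of delivery attempts it took to see an ack.
    """
    attempts = 0
    while True:
        attempts += 1
        acked = consumer.handle(message_id, amount)
        if acked and attempts > lost_acks:
            return attempts
```

With `lost_acks=2` the message is delivered three times but applied once, so the observable effect is "exactly once" even though the delivery is not.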
What’s confusing about Kafka: a list
At a recent call, Neha said “The most confusing behavior we have is how producing to a topic can return errors for few seconds after the topic was already created”. As she said that, I remembered that indeed, this was once very confusing, but then I got used to it. Which got us thinking: What other things that Kafka does are very confusing to new users, but we got so used to them that we no longer even see the issue?
-
This is the second part of our guide on streaming data and Apache Kafka. In part one I talked about the uses for real-time data streams and explained our idea of a stream data platform. The remainder of this guide will contain specific advice on how to go about building a stream data platform in your organization.
tl;dr: limit the number of Kafka clusters; use Avro.
(tags: architecture kafka storage streaming event-processing avro schema confluent best-practices tips)
The Four Month Bug: JVM statistics cause garbage collection pauses (evanjones.ca)
Ugh, tying GC safepoints to disk I/O? bad idea:
The JVM by default exports statistics by mmap-ing a file in /tmp (hsperfdata). On Linux, modifying a mmap-ed file can block until disk I/O completes, which can be hundreds of milliseconds. Since the JVM modifies these statistics during garbage collection and safepoints, this causes pauses that are hundreds of milliseconds long. To reduce worst-case pause latencies, add the -XX:+PerfDisableSharedMem JVM flag to disable this feature. This will break tools that read this file, like jstat.
Gradle Team Perspective on Bazel
interesting.
(tags: gradle bazel build dependencies compilation coding java)
(SDD401) Amazon Elastic MapReduce Deep Dive and Best Practices
good slides for EMR tuning from re:Invent 2014
-
LOL. grepping commit logs for /bug|fix/ does the job, apparently:
In the literature, Rahman et al. found that a very cheap algorithm actually performs almost as well as some very expensive bug-prediction algorithms. They found that simply ranking files by the number of times they’ve been changed with a bug-fixing commit (i.e. a commit which fixes a bug) will find the hot spots in a code base. Simple! This matches our intuition: if a file keeps requiring bug-fixes, it must be a hot spot because developers are clearly struggling with it.
(tags: bugs rahman-algorithm heuristics source-code-analysis coding algorithms google static-code-analysis version-control)
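The simple ranking described in the quote above can be sketched in a few lines of Python. This is the plain count-the-bug-fix-commits version only, with a hypothetical input format of (commit message, changed paths) pairs and the crude /bug|fix/ match; it makes no claim to be what Google actually runs.

```python
import re
from collections import Counter

# Deliberately crude, as in the commentary above: it will also match
# words like "prefix", so expect some false positives.
BUG_FIX_PATTERN = re.compile(r"bug|fix", re.IGNORECASE)

def rank_hot_spots(commits):
    """Rank files by how many bug-fixing commits touched them.

    commits: iterable of (message, [paths]) pairs (hypothetical format,
             e.g. parsed from version-control log output).
    Returns a list of (path, count), most-fixed first.
    """
    counts = Counter()
    for message, paths in commits:
        if BUG_FIX_PATTERN.search(message):
            counts.update(set(paths))   # count each file once per commit
    # Sort by descending count, then by path for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

example_commits = [
    ("Fix NPE in parser", ["parser.py", "lexer.py"]),
    ("Add feature X", ["parser.py"]),
    ("bug: off-by-one in loop", ["parser.py"]),
    ("Refactor helpers", ["util.py"]),
]
hot_spots = rank_hot_spots(example_commits)
```

Files that never appear in a bug-fixing commit simply drop out of the ranking.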
Build in the Cloud: Accessing Source Code
Google reinvented ClearCase
Cross-Region Replication for Amazon S3
Amazing it took so long
(tags: s3 replication cross-region inter-region aws storage)
ECJ case debates EU citizens’ right to privacy
The US wields secretive and indiscriminate powers to collect data, he said, and had never offered Brussels any commitments to guarantee EU privacy standards for its citizens’ data. On the contrary, said [Max Schrems’ counsel] Mr Hoffmann, “Safe Harbour” provisions could be overruled by US domestic law at any time. Thus he asked the court for a full judicial review of the “illegal” Safe Harbour principles which, he said, violated the essence of privacy and left EU citizens “effectively stripped of any protection”. [Irish] DPC counsel Paul Anthony McDermott SC suggested that Mr Schrems had not been harmed in any way by the status quo. “This is not surprising, given that the NSA isn’t currently interested in the essays of law students in Austria,” he said. Mr Travers for Mr Schrems disagreed, saying “the breach of the right to privacy is itself the harm”.
(tags: ireland dpc data-protection privacy eu ec ecj law rights safe-harbour)
EU-US data pact skewered in court hearing
A lawyer for the European Commission told an EU judge on Tuesday (24 March) he should close his Facebook page if he wants to stop the US snooping on him, in what amounts to an admission that Safe Harbour, an EU-US data protection pact, doesn’t work.
(tags: safe-harbour privacy data-protection ecj eu ec surveillance facebook nsa gchq)
devbook/README.md at master · barsoom/devbook
How to avoid the shitty behaviour of ActiveRecord wrt migration safety, particularly around removing/renaming columns. ugh, ActiveRecord
(tags: activerecord fail rails mysql sql migrations databases schemas releasing)
Papa’s Maze 2.0: a father’s beautifully intricate puzzle for his daughter
Working in a similar fashion – drawing small portions each day – it took Mr. Nomura about 2 months to complete his new maze. And in our humble opinion, we think it’s actually just as beautiful, if not more. It’s not quite as dense and the crisper lines make it easier to perceive the interesting patterns that the maze forms. It’s stunning in graphic quality but it’s also a functioning solvable maze, just like its predecessor. Say hello to Papa’s Maze 2.0. It’s available as a print for $30.
The official REST Proxy for Kafka
The REST Proxy is an open source HTTP-based proxy for your Kafka cluster. The API supports many interactions with your cluster, including producing and consuming messages and accessing cluster metadata such as the set of topics and mapping of partitions to brokers. Just as with Kafka, it can work with arbitrary binary data, but also includes first-class support for Avro and integrates well with Confluent’s Schema Registry. And it is scalable, designed to be deployed in clusters and work with a variety of load balancing solutions. We built the REST Proxy first and foremost to meet the growing demands of many organizations that want to use Kafka, but also want more freedom to select languages beyond those for which stable native clients exist today. However, it also includes functionality beyond traditional clients, making it useful for building tools for managing your Kafka cluster. See the documentation for a more detailed description of the included features.
(tags: kafka rest proxies http confluent queues messaging streams architecture)
-
‘Caffeine is a Java 8 based concurrency library that provides specialized data structures, such as a high performance cache.’
(tags: cache java8 java guava caching concurrency data-structures coding)
Combining static model checking with dynamic enforcement using the Statecall Policy Language
This looks quite nice — a model-checker “for regular programmers”. Example model for ping(1):
automaton ping (int max_count, int count, bool can_timeout) {
  Initialize;
  during {
    count = 0;
    do {
      Transmit_Ping;
      either {
        Receive_Ping;
      } or (can_timeout) {
        Timeout_Ping;
      };
      count = count + 1;
    } until (count >= max_count);
  } handle {
    SIGINFO;
    Print_Summary;
  };
}
(tags: ping model-checking models formal-methods verification static dynamic coding debugging testing distcomp papers)
-
good review
(tags: cdt replication distcomp voldemort dynamo riak storage papers)
-
Google open sources a key part of their internal build system (apparently known internally as “Blaze” for some time). Very nice indeed!
(tags: blaze bazel build-tools building open-source google coding packaging)
-
a Nix-based continuous build system, released under the terms of the GNU GPLv3 or (at your option) any later version. It continuously checks out sources of software projects from version management systems to build, test and release them. The build tasks are described using Nix expressions. This allows a Hydra build task to specify all the dependencies needed to build or test a project. It supports a number of operating systems, such as various GNU/Linux flavours, Mac OS X, and Windows.
-
“tees” all TCP traffic from one server to another. “widely used by companies in China”!
(tags: testing benchmarking performance tcp ip tcpcopy tee china regression-testing stress-testing ops)
Managing private Nix packages outside the Nixpkgs tree
Useful for private-repo Nix usage
Top 10 AWS Security Best Practices: #6 – Rotate all the Keys Regularly
Good doc on how to perform key rotation in AWS
[Nix-dev] Pulling a programs source code from a git repo
Nix supports building from a git SHA. Excellent
Transparent huge pages implicated in Redis OOM
A nasty real-world prod error scenario worsened by THPs:
jemalloc(3) extensively uses madvise(2) to notify the operating system that it’s done with a range of memory which it had previously malloc’ed. The page size on this machine is 2MB because transparent huge pages are in use. As such, a lot of the memory which is being marked with madvise(…, MADV_DONTNEED) is within substantially smaller ranges than 2MB. This means that the operating system never was able to evict pages which had ranges marked as MADV_DONTNEED because the entire page has to be unneeded to allow a page to be reused. Despite initially looking like a leak, the operating system itself was unable to free memory because of madvise(2) and transparent huge pages. This led to sustained memory pressure on the machine and redis-server eventually getting OOM killed.
(tags: oom-killer oom linux ops thp jemalloc huge-pages madvise redis memory)
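The arithmetic behind that quote can be sketched with a toy model: a page can only be handed back if every byte of it falls inside a range marked MADV_DONTNEED. This is a deliberate simplification of the behaviour the quote describes, not a model of the actual kernel, and `reclaimable_pages` is a hypothetical helper.

```python
HUGE_PAGE = 2 * 1024 * 1024   # 2MB page size with THP, as in the quote
SMALL_PAGE = 4096             # ordinary 4KB page, for comparison


def reclaimable_pages(freed_ranges, page_size):
    """Return indices of pages entirely covered by freed byte ranges.

    freed_ranges is a list of (start, length) pairs standing in for
    madvise(..., MADV_DONTNEED) calls.
    """
    # Merge overlapping or adjacent ranges so neighbouring frees can add up.
    merged = []
    for start, length in sorted(freed_ranges):
        end = start + length
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    pages = []
    for start, end in merged:
        first = -(-start // page_size)  # first page boundary at or after start
        last = end // page_size         # exclusive upper bound of whole pages
        pages.extend(range(first, last))
    return pages


# Four 1MB frees, each landing inside a different 2MB huge page.
freed = [(i * HUGE_PAGE + SMALL_PAGE, 1024 * 1024) for i in range(4)]
with_thp = reclaimable_pages(freed, HUGE_PAGE)
without_thp = reclaimable_pages(freed, SMALL_PAGE)
```

In this model the same 4MB of frees releases no huge pages at all, while with 4KB pages every freed page is reclaimable, which is the "looks like a leak" effect described above.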
AllCrypt hacked, via PHP, WordPress, and the marketing director’s email
critical flaw: gaining access to the MySQL db let the attacker manipulate account balances. oh dear
-
‘inspires kids to explore and learn about science, engineering, and technology—and have fun doing it. Every month, a new crate to help kids develop a tinkering mindset and creative problem solving skills.’ aimed at ages 9-14+
(tags: kids gifts tinkering stem education fun engineering science toys)
-
Some nice performance tricks; I particularly like the use of sljit:
Ag uses Pthreads to take advantage of multiple CPU cores and search files in parallel. Files are mmap()ed instead of read into a buffer. Literal string searching uses Boyer-Moore strstr. Regex searching uses PCRE’s JIT compiler (if Ag is built with PCRE >=8.21). Ag calls pcre_study() before executing the same regex on every file. Instead of calling fnmatch() on every pattern in your ignore files, non-regex patterns are loaded into arrays and binary searched.
(tags: jit cli grep search ack ag unix pcre sljit boyer-moore tools)
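The last trick in that list is easy to sketch: patterns with no glob metacharacters are kept in a sorted array and binary searched, and only the remaining patterns go through fnmatch. This is a Python approximation of the idea rather than Ag's actual C code; `IgnoreList` and the choice of `*?[` as the metacharacter set are my own assumptions.

```python
import bisect
import fnmatch

# Assumption: a pattern containing any of these needs real glob matching.
GLOB_CHARS = set("*?[")


class IgnoreList:
    """Ignore patterns split into binary-searchable literals and globs."""

    def __init__(self, patterns):
        # Literal names are sorted once so each lookup is O(log n).
        self.literals = sorted(p for p in patterns if not GLOB_CHARS & set(p))
        self.globs = [p for p in patterns if GLOB_CHARS & set(p)]

    def matches(self, name):
        # Fast path: exact-name lookup by binary search.
        i = bisect.bisect_left(self.literals, name)
        if i < len(self.literals) and self.literals[i] == name:
            return True
        # Slow path: only genuine glob patterns pay for fnmatch.
        return any(fnmatch.fnmatchcase(name, g) for g in self.globs)


ignores = IgnoreList(["node_modules", "*.min.js", ".git", "build"])
```

With a typical ignore file made up mostly of plain directory and file names, most lookups never reach the fnmatch loop.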
Richard Stallman’s GNU Manifesto Turns Thirty
nice New Yorker profile of rms
-
Thought-provoking article looking back to John Perry Barlow’s “A Declaration of the Independence of Cyberspace”, published in 1996:
Barlow once wrote that “trusting the government with your privacy is like having a Peeping Tom install your window blinds.” But the Barlovian focus on government overreach leaves its author and other libertarians blind to the same encroachments on our autonomy from the private sector. The bold and romantic techno-utopian ideals of “A Declaration” no longer need to be fought for, because they’re already gone.
(tags: john-perry-barlow 1990s history cyberspace internet surveillance privacy data-protection libertarianism utopian manifestos)
The Terrible Technical Interview
TechCrunch, very down on the traditional big-O-and-whiteboard tech interview. See also https://news.ycombinator.com/item?id=9243169 for some good comments at HN. To be honest, though, I think a good comprehension of data structures and big-O is pretty vital.
(tags: interviewing jobs management hr hiring techcrunch)