Justin's Linklog Posts

Links for 2014-06-27

  • Sandymount Repair Cafe

    ‘A repair café brings together people with things that need fixin’ with people who have the skills to fix them in a social cafe style environment. It is an effort to move away from the throwaway culture that prevailed at the end of the twentieth century and move towards a more sustainable and enlightened approach to our relationship with consumer goods. Repair cafes are self organising events at a community level run by local volunteers with the support of local community groups, local agencies and other interested organisations. They are not-for-profit but not anti-profit and an important part of their goal is to promote local repair businesses and initiatives. www.repaircafe.ie is the online hub of a network of repair cafés across Ireland.’ Sounds interesting: https://twitter.com/DubCityCouncil/status/481777655445204992 says they’ll be doing it tomorrow from 2-5pm in Sandymount in Dublin.

    (tags: dublin sandymount repair fixing diy frugality repaircafe hardware)

  • Chef Vault

    A way to securely store secrets (auth details, API keys, etc.) in Chef

    (tags: chef storage knife authorisation api-keys security encryption)

  • Amazon EC2 Service Limits Report Now Available

    ‘designed to make it easier for you to view and manage your limits for Amazon EC2 by providing the latest information on service limits and links to quickly request limit increases. EC2 Service Limits Report displays all your service limit information in one place to help you avoid encountering limits on future EC2, EBS, Auto Scaling, and VPC usage.’

    (tags: aws ec2 vpc ebs autoscaling limits ops)

  • Delivery Notifications for Simple Email Service

    Today we are enhancing SES with the addition of delivery notifications. You can now elect to receive an Amazon SNS notification each time SES successfully delivers a message to a recipient’s email server. These notifications give you increased visibility into the mail delivery process. With today’s release, you can now track deliveries, bounces, and complaints, all via notification to the SNS topic or topics of your choice.

    (tags: delivery email smtp ses aws sns notifications ops)

  • How Emoji Get Lost In Translation

    I recently texted a friend to say how I was excited to meet her new boyfriend, and, because “excited” doesn’t look so exciting on an iPhone screen, I editorialized with what seemed then like an innocent “[dancer]”. (Translation: Can’t wait for the fun night out!) On an Android phone, I realized later, that panache would have been a put-down: The dancers become “[playboy bunny].” (Translation: You’re a Playboy bunny who gets around!)

    (tags: emoji icons graphics text speech phones)

Links for 2014-06-26

Links for 2014-06-25

Links for 2014-06-24

Links for 2014-06-23

  • Startup equity gotcha

    ‘Two months ago, an early Uber employee thought that he had found a buyer for his vested stock, at $200 per share. But when his agent tried to seal the deal, Uber refused to sign off on the transfer. Instead, it offered to buy back the shares for around $135 a piece, which is within the same price range that Google Ventures and TPG Capital had paid to invest in Uber the previous July. Take it or hold it.’ As rbranson on Twitter put it: ‘reminder that startup equity is basically worthless unless you’re a founder or investor, OR the company goes public.’

    (tags: startups uber stock stock-options shares share-option equity via:rbranson work)

Links for 2014-06-20

Links for 2014-06-19

Links for 2014-06-18

Links for 2014-06-17

  • FlatBuffers: Main Page

    A new serialization format from Google’s Android gaming team, supporting C++ and Java, open source under the ASL v2. Reasons to use it:

    * Access to serialized data without parsing/unpacking – What sets FlatBuffers apart is that it represents hierarchical data in a flat binary buffer in such a way that it can still be accessed directly without parsing/unpacking, while also still supporting data structure evolution (forwards/backwards compatibility).
    * Memory efficiency and speed – The only memory needed to access your data is that of the buffer. It requires 0 additional allocations. FlatBuffers is also very suitable for use with mmap (or streaming), requiring only part of the buffer to be in memory. Access is close to the speed of raw struct access with only one extra indirection (a kind of vtable) to allow for format evolution and optional fields. It is aimed at projects where spending time and space (many memory allocations) to be able to access or construct serialized data is undesirable, such as in games or any other performance sensitive applications. See the benchmarks for details.
    * Flexible – Optional fields means not only do you get great forwards and backwards compatibility (increasingly important for long-lived games: don’t have to update all data with each new version!). It also means you have a lot of choice in what data you write and what data you don’t, and how you design data structures.
    * Tiny code footprint – Small amounts of generated code, and just a single small header as the minimum dependency, which is very easy to integrate. Again, see the benchmark section for details.
    * Strongly typed – Errors happen at compile time rather than manually having to write repetitive and error prone run-time checks. Useful code can be generated for you.
    * Convenient to use – Generated C++ code allows for terse access & construction code. Then there’s optional functionality for parsing schemas and JSON-like text representations at runtime efficiently if needed (faster and more memory efficient than other JSON parsers).
    Looks nice, but it lacks the language coverage of protobuf. Definitely more practical than capnproto.

    (tags: c++ google java serialization json formats protobuf capnproto storage flatbuffers)

  • AWS SDK for Java Client Configuration

    turns out the AWS SDK has lots of tuning knobs: region selection, socket buffer sizes, and debug logging (including wire logging).

    (tags: aws sdk java logging ec2 s3 dynamodb sockets tuning)

  • Behind the loom band

    The simple woven multicoloured bracelet has made Cheong Choon Ng, a Malaysian immigrant to the US, a dollar millionaire. He invented the “Rainbow Loom” after watching his daughters making bracelets with rubber bands.
    So, really, it’s his daughters that invented it. ;) My kids are massive fans. This is a 100% legit, Rubik’s-Cube-style craze. (via Conor O’Neill)

    (tags: via:conoro loom-bands rubber-bands toys crazes)

  • lookout/ngx_borderpatrol

    BorderPatrol is an nginx module to perform authentication and session management at the border of your network. BorderPatrol makes the assumption that you have some set of services that require authentication and a service that hands out tokens to clients to access that service. You may not want those tokens to be sent across the internet, even over SSL, for a variety of reasons. To this end, BorderPatrol maintains a lookup table of session-id to auth token in memcached.

    (tags: borderpatrol nginx modules authentication session-management web-services http web authorization)

  • Use of Formal Methods at Amazon Web Services

    Chris Newcombe, Marc Brooker, et al. writing about their experience using formal specification and model-checking languages (TLA+) in production in AWS:

    The success with DynamoDB gave us enough evidence to present TLA+ to the broader engineering community at Amazon. This raised a challenge; how to convey the purpose and benefits of formal methods to an audience of software engineers? Engineers think in terms of debugging rather than ‘verification’, so we called the presentation “Debugging Designs”. Continuing that metaphor, we have found that software engineers more readily grasp the concept and practical value of TLA+ if we dub it ‘Exhaustively-testable pseudo-code’. We initially avoid the words ‘formal’, ‘verification’, and ‘proof’, due to the widespread view that formal methods are impractical. We also initially avoid mentioning what the acronym ‘TLA’ stands for, as doing so would give an incorrect impression of complexity.
    More slides at http://tla2012.loria.fr/contributed/newcombe-slides.pdf ; proggit discussion at http://www.reddit.com/r/programming/comments/277fbh/use_of_formal_methods_at_amazon_web_services/

    (tags: formal-methods model-checking tla tla+ programming distsys distcomp ebs s3 dynamodb aws ec2 marc-brooker chris-newcombe)

  • Call me maybe: RabbitMQ

    We used Knossos and Jepsen to prove the obvious: RabbitMQ is not a lock service. That investigation led to a discovery hinted at by the documentation: in the presence of partitions, RabbitMQ clustering will not only deliver duplicate messages, but will also drop huge volumes of acknowledged messages on the floor. This is not a new result, but it may be surprising if you haven’t read the docs closely–especially if you interpreted the phrase “chooses Consistency and Partition Tolerance” to mean, well, either of those things.

    (tags: rabbitmq network partitions failure cap-theorem consistency ops reliability distcomp jepsen)

  • Jump Consistent Hash: A Fast, Minimal Memory, Consistent Hash Algorithm

    ‘a fast, minimal memory, consistent hash algorithm that can be expressed in about 5 lines of code. In comparison to the algorithm of Karger et al., jump consistent hash requires no storage, is faster, and does a better job of evenly dividing the key space among the buckets and of evenly dividing the workload when the number of buckets changes. Its main limitation is that the buckets must be numbered sequentially, which makes it more suitable for data storage applications than for distributed web caching.’ Implemented in Guava. This is also noteworthy: ‘Google has not applied for patent protection for this algorithm, and, as of this writing, has no plans to. Rather, it wishes to contribute this algorithm to the community.’

    (tags: hashing consistent-hashing google guava memory algorithms sharding)
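
    The "about 5 lines of code" claim for jump consistent hash holds up. Here is a minimal Python sketch, transcribed from the reference code in the paper as I understand it; the function name and the 64-bit masking (needed because Python integers are unbounded) are my own, and this is not the Guava implementation.

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket number in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    mask = (1 << 64) - 1  # emulate unsigned 64-bit wraparound
    key &= mask
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # One step of a 64-bit linear congruential generator seeded by the key.
        key = (key * 2862933555777941757 + 1) & mask
        # Jump forward to the next bucket count at which this key would move.
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b
```

    The useful property: growing from n to n+1 buckets either leaves a key where it was or moves it to the new bucket n, so only about 1/(n+1) of the keys get reshuffled.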

  • Bike Wheel Spoke ABS Safety Reflective Tube Reflector

    Available in blue, orange, and grey for $2.84 from the insanely-cheap China-based DealExtreme.com. Also available: rim-based reflective stickers

    (tags: bikes cycling reflective safety dealextreme tat)

Links for 2014-06-16

Links for 2014-05-29

  • Tracedump

    a single application IP packet sniffer that captures all TCP and UDP packets of a single Linux process. It consists of the following elements:

    * ptrace monitor – tracks bind(), connect() and sendto() syscalls and extracts local port numbers that the traced application uses;
    * pcap sniffer – using information from the previous module, it captures IP packets on an AF_PACKET socket (with an appropriate BPF filter attached);
    * garbage collector – periodically reads /proc/net/{tcp,udp} files in order to detect the sockets that the application no longer uses.

    As the output, tracedump generates a PCAP file with SLL-encapsulated IP packets – readable by eg. Wireshark. This file can be later used for detailed analysis of the networking operations made by the application. For instance, it might be useful for IP traffic classification systems.

    (tags: debugging networking linux strace ptrace tracedump tracing tcp udp sniffer ip tcpdump)
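
    To give a feel for the "garbage collector" step in Tracedump, here is a rough Python sketch of pulling local port numbers out of /proc/net/tcp-style text. This is not Tracedump's code; the function name and the sample data are made up for illustration, and it relies only on the usual layout where the second column is a hex ADDRESS:PORT pair.

```python
def local_ports(proc_net_text: str) -> set:
    """Return the set of local ports listed in /proc/net/{tcp,udp}-style text."""
    ports = set()
    for line in proc_net_text.splitlines()[1:]:  # skip the column-header line
        fields = line.split()
        if len(fields) < 2:
            continue
        # fields[1] looks like "0100007F:1F90": hex address, colon, hex port.
        _addr, _sep, port_hex = fields[1].rpartition(":")
        ports.add(int(port_hex, 16))
    return ports


# Hypothetical sample in the same shape as /proc/net/tcp (extra columns trimmed).
SAMPLE = """  sl  local_address rem_address   st
   0: 0100007F:1F90 00000000:0000 0A
   1: 00000000:0016 00000000:0000 0A
"""
print(sorted(local_ports(SAMPLE)))
```

    Comparing successive snapshots of that set against the ports seen by the ptrace monitor would show which sockets have gone away.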

  • You Are Not a Digital Native: Privacy in the Age of the Internet

    an open letter from Cory Doctorow to teen readers re privacy. ‘The problem with being a “digital native” is that it transforms all of your screw-ups into revealed deep truths about how humans are supposed to use the Internet. So if you make mistakes with your Internet privacy, not only do the companies who set the stage for those mistakes (and profited from them) get off Scot-free, but everyone else who raises privacy concerns is dismissed out of hand. After all, if the “digital natives” supposedly don’t care about their privacy, then anyone who does is a laughable, dinosauric idiot, who isn’t Down With the Kids.’

    (tags: children privacy kids teens digital-natives surveillance cory-doctorow danah-boyd)

  • Shutterbits replacing hardware load balancers with local BGP daemons and anycast

    Interesting approach. Potentially risky, though — heavy use of anycast on a large-scale datacenter network could increase the scale of the OSPF graph, which scales exponentially. This can have major side effects on OSPF reconvergence time, which creates an interesting class of network outage in the event of OSPF flapping. Having said that, an active/passive failover LB pair will already announce a single anycast virtual IP anyway, so, assuming there are a similar number of anycast IPs in the end, it may not have any negative side effects. There’s also the inherent limitation noted in the second-to-last paragraph; ‘It comes down to what your hardware router can handle for ECMP. I know a Juniper MX240 can handle 16 next-hops, and have heard rumors that a software update will bump this to 64, but again this is something to keep in mind’. Taking a leaf from the LB design, and using BGP to load-balance across a smaller set of haproxy instances, would seem like a good approach to scale up.

    (tags: scalability networking performance load-balancing bgp exabgp ospf anycast routing datacenters scaling vips juniper haproxy shutterstock)

  • Tron: Legacy Encom Boardroom Visualization

    this is great. lovely, silly, HTML5 dataviz, with lots of spinning globes and wobbling sines on a black background

    (tags: demo github wikipedia dataviz visualisation mapping globes rob-scanlan graphics html5 animation tron-legacy tron movies)

  • CockroachDB

    a distributed key/value datastore which supports ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. Cockroach aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. Cockroach nodes are symmetric; a design goal is one binary with minimal configuration and no required auxiliary services. Cockroach implements a single, monolithic sorted map from key to value where both keys and values are byte strings (not unicode). Cockroach scales linearly (theoretically up to 4 exabytes (4E) of logical data). The map is composed of one or more ranges and each range is backed by data stored in RocksDB (a variant of LevelDB), and is replicated to a total of three or more cockroach servers. Ranges are defined by start and end keys. Ranges are merged and split to maintain total byte size within a globally configurable min/max size interval. Range sizes default to target 64M in order to facilitate quick splits and merges and to distribute load at hotspots within a key range. Range replicas are intended to be located in disparate datacenters for survivability (e.g. { US-East, US-West, Japan }, { Ireland, US-East, US-West}, { Ireland, US-East, US-West, Japan, Australia }). Single mutations to ranges are mediated via an instance of a distributed consensus algorithm to ensure consistency. We’ve chosen to use the Raft consensus algorithm. All consensus state is stored in RocksDB. A single logical mutation may affect multiple key/value pairs. Logical mutations have ACID transactional semantics. If all keys affected by a logical mutation fall within the same range, atomicity and consistency are guaranteed by Raft; this is the fast commit path. Otherwise, a non-locking distributed commit protocol is employed between affected ranges. 
    Cockroach provides snapshot isolation (SI) and serializable snapshot isolation (SSI) semantics, allowing externally consistent, lock-free reads and writes–both from an historical snapshot timestamp and from the current wall clock time. SI provides lock-free reads and writes but still allows write skew. SSI eliminates write skew, but introduces a performance hit in the case of a contentious system. SSI is the default isolation; clients must consciously decide to trade correctness for performance. Cockroach implements a limited form of linearizability, providing ordering for any observer or chain of observers.
    This looks nifty. One to watch.

    (tags: cockroachdb databases storage georeplication raft consensus acid go key-value-stores rocksdb)

  • Tuning LevelDB

    good docs from Riak

    (tags: leveldb tuning performance ops riak)

  • Proof of burn – Bitcoin

    method for bootstrapping one cryptocurrency off of another. The idea is that miners should show proof that they burned some coins – that is, sent them to a verifiably unspendable address. This is expensive from their individual point of view, just like proof of work; but it consumes no resources other than the burned underlying asset. To date, all proof of burn cryptocurrencies work by burning proof-of-work-mined cryptocurrencies, so the ultimate source of scarcity remains the proof-of-work-mined “fuel”.

    (tags: bitcoin proof money mining cryptocurrency)

  • The programming error that cost Mt Gox 2609 bitcoins

    Digging into broken Bitcoin scripts in the blockchain. Fascinating:

    While analyzing coinbase transactions, I came across another interesting bug that lost bitcoins. Some transactions have the meaningless and unredeemable script: OP_IFDUP OP_IF OP_2SWAP OP_VERIFY OP_2OVER OP_DEPTH That script turns out to be the ASCII text script. Instead of putting the redemption script into the transaction, the P2Pool miners accidentally put in the literal word “script”. The associated bitcoins are lost forever due to this error.
    (via Nelson)

    (tags: programming script coding bitcoin mtgox via:nelson scripting dsls)
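
    The "script" bug in the Mt Gox entry is easy to check by hand: each of those six opcodes has a one-byte value that happens to be a lowercase ASCII letter. A tiny Python sketch, using the opcode values from the standard Bitcoin script opcode table (the dictionary and helper names are mine):

```python
# One-byte values of the opcodes quoted in the post.
OPCODES = {
    "OP_IF": 0x63,
    "OP_VERIFY": 0x69,
    "OP_2OVER": 0x70,
    "OP_2SWAP": 0x72,
    "OP_IFDUP": 0x73,
    "OP_DEPTH": 0x74,
}


def script_bytes(names):
    """Serialize a list of opcode names into raw script bytes."""
    return bytes(OPCODES[name] for name in names)


broken = ["OP_IFDUP", "OP_IF", "OP_2SWAP", "OP_VERIFY", "OP_2OVER", "OP_DEPTH"]
decoded = script_bytes(broken).decode("ascii")
print(decoded)
```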

  • Moquette MQTT

    a Java implementation of an MQTT 3.1 broker. Its code base is small. At its core, Moquette is an events processor; this lets the code base be simple, avoiding thread sharing issues. The Moquette broker is lightweight and easy to understand so it could be embedded in other projects.

    (tags: mqtt moquette netty messaging queueing push-notifications iot internet push eclipse)

  • “Taking the hotdog”

    aka. lock acquisition. ex-Amazon-Dublin lingo, observed in the wild ;)

    (tags: language hotdog archie-mcphee amazon dublin intercom coding locks synchronization)

Links for 2014-05-27

Links for 2014-05-26

Links for 2014-05-23

  • BPF – the forgotten bytecode

    ‘In essence Tcpdump asks the kernel to execute a BPF program within the kernel context. This might sound risky, but actually isn’t. Before executing the BPF bytecode the kernel ensures that it’s safe:

    * All the jumps are only forward, which guarantees that there aren’t any loops in the BPF program. Therefore it must terminate.
    * All instructions, especially memory reads are valid and within range.
    * The single BPF program has less than 4096 instructions.

    All this guarantees that the BPF programs executed within kernel context will run fast and will never infinitely loop. That means the BPF programs are not Turing complete, but in practice they are expressive enough for the job and deal with packet filtering very well.’

    Good example of a carefully-designed DSL allowing safe “programs” to be written and executed in a privileged context without security risk, or risk of running out of control.

    (tags: coding dsl security via:oisin linux tcpdump bpf bsd kernel turing-complete configuration languages)
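
    The three BPF safety checks quoted above are simple enough to sketch. The Python below is a toy checker over a made-up instruction format of (op, arg) tuples. It is not the kernel's verifier: the opcode names, the 16-word scratch memory and the exact handling of the 4096 limit are all assumptions for illustration.

```python
MAX_INSNS = 4096  # limit taken from the quote; exact boundary is an assumption
MEM_WORDS = 16    # hypothetical scratch-memory size


def is_safe(program) -> bool:
    """Accept only bounded, loop-free programs with in-range memory access."""
    if not program or len(program) > MAX_INSNS:
        return False
    for pc, (op, arg) in enumerate(program):
        if op == "jmp":
            # Offsets are relative to the next instruction and must go forward,
            # so every path strictly advances and the program must terminate.
            target = pc + 1 + arg
            if arg < 0 or target >= len(program):
                return False
        elif op in ("load", "store"):
            if not 0 <= arg < MEM_WORDS:
                return False
        elif op != "ret":
            return False  # unknown instruction
    return True


print(is_safe([("load", 0), ("jmp", 0), ("ret", 0)]))
print(is_safe([("load", 0), ("jmp", -2), ("ret", 0)]))
```

    Because the check is a single linear pass with no execution, it costs almost nothing to run before a filter is attached.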

  • Handmade Kitchen Goods from Makers & Brothers – Cool Hunting

    lovely kitchen-gear design from local-boys-made-good Makers & Brothers

    (tags: makers-and-brothers design crafts kitchen nyc terrazo chopping-boards)

Links for 2014-05-22

  • ‘Monitoring and detecting causes of failures of network paths’, US patent 8,661,295 (B1)

    The first software patent in my name — couldn’t avoid it forever :(

    Systems and methods are provided for monitoring and detecting causes of failures of network paths. The system collects performance information from a plurality of nodes and links in a network, aggregates the collected performance information across paths in the network, processes the aggregated performance information for detecting failures on the paths, analyzes each of the detected failures to determine at least one root cause, and initiates a remedial workflow for the at least one root cause determined. In some aspects, processing the aggregated information may include performing a statistical regression analysis or otherwise solving a set of equations for the performance indications on each of a plurality of paths. In another aspect, the system may also include an interface which makes available for display one or more of the network topology, the collected and aggregated performance information, and indications of the detected failures in the topology.
    The patent describes an early version of Pimms, the network failure detection and remediation system we built for Amazon.

    (tags: amazon pimms swpats patents networking ospf autoremediation outage-detection)

Links for 2014-05-16

Links for 2014-05-14

Links for 2014-05-13

Links for 2014-05-12

Links for 2014-05-09

Links for 2014-05-08

Links for 2014-05-07

Links for 2014-05-06

  • Minimum Viable Block Chain

    Ilya Grigorik describes the design of the Bitcoin/altcoin block chain algorithm. Illuminating writeup

    (tags: algorithms bitcoin security crypto blockchain ilya-grigorik)

  • Docker Plugin for Jenkins

    The aim of the docker plugin is to be able to use a docker host to dynamically provision a slave, run a single build, then tear-down that slave. Optionally, the container can be committed, so that (for example) manual QA could be performed by the container being imported into a local docker provider, and run from there.
    The holy grail of Jenkins/Docker integration. How cool is that…

    (tags: jenkins docker ops testing ec2 hosting scaling elastic-scaling system-testing)

  • Simple Binary Encoding

    an OSI layer 6 presentation for encoding/decoding messages in binary format to support low-latency applications. […] SBE follows a number of design principles to achieve this goal. Adhering to these design principles sometimes means features available in other codecs will not be offered. For example, many codecs allow strings to be encoded at any field position in a message; SBE only allows variable length fields, such as strings, as fields grouped at the end of a message. The SBE reference implementation consists of a compiler that takes a message schema as input and then generates language specific stubs. The stubs are used to directly encode and decode messages from buffers. The SBE tool can also generate a binary representation of the schema that can be used for the on-the-fly decoding of messages in a dynamic environment, such as for a log viewer or network sniffer. The design principles drive the implementation of a codec that ensures messages are streamed through memory without backtracking, copying, or unnecessary allocation. Memory access patterns should not be underestimated in the design of a high-performance application. Low-latency systems in any language especially need to consider all allocation to avoid the resulting issues in reclamation. This applies for both managed runtime and native languages. SBE is totally allocation free in all three language implementations. The end result of applying these design principles is a codec that has ~25X greater throughput than Google Protocol Buffers (GPB) with very low and predictable latency. This has been observed in micro-benchmarks and real-world application use. A typical market data message can be encoded, or decoded, in ~25ns compared to ~1000ns for the same message with GPB on the same hardware. XML and FIX tag value messages are orders of magnitude slower again.
    The sweet spot for SBE is as a codec for structured data that is mostly fixed size fields which are numbers, bitsets, enums, and arrays. While it does work for strings and blobs, many may find some of the restrictions a usability issue. These users would be better off with another codec more suited to string encoding.

    (tags: sbe encoding protobuf protocol-buffers json messages messaging binary formats low-latency martin-thompson xml)
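
    The "fixed-size fields first, variable-length fields at the end" rule from the SBE description is what lets a decoder read fields at known offsets in a single forward pass. A rough Python illustration with the standard struct module follows. This is not the SBE wire format or its generated stubs; the message layout and field names are invented for the example.

```python
import struct

# Hypothetical market-data message: little-endian, no padding.
# instrument_id (u64), price (i64), quantity (u32), symbol_len (u16)
FIXED = struct.Struct("<QqIH")


def encode(instrument_id: int, price: int, quantity: int, symbol: str) -> bytes:
    """Write the fixed-size block, then the variable-length symbol last."""
    raw = symbol.encode("utf-8")
    return FIXED.pack(instrument_id, price, quantity, len(raw)) + raw


def decode(buf: bytes):
    """Read fixed fields at known offsets, then the trailing string."""
    instrument_id, price, quantity, n = FIXED.unpack_from(buf, 0)
    symbol = bytes(buf[FIXED.size:FIXED.size + n]).decode("utf-8")
    return instrument_id, price, quantity, symbol


message = encode(42, 1234500, 100, "ACME")
print(decode(message))
```

    Every numeric field sits at a fixed offset regardless of the string's length, which is the property the design principles above are protecting.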

  • Observations of an Internet Middleman

    That leaves the remaining six [consumer ISPs peering with Level3] with congestion on almost all of the interconnect ports between us. Congestion that is permanent, has been in place for well over a year and where our peer refuses to augment capacity. They are deliberately harming the service they deliver to their paying customers. They are not allowing us to fulfil the requests their customers make for content. Five of those congested peers are in the United States and one is in Europe. There are none in any other part of the world. All six are large Broadband consumer networks with a dominant or exclusive market share in their local market. In countries or markets where consumers have multiple Broadband choices (like the UK) there are no congested peers.
    Amazing that L3 are happy to publish this — that’s where big monopoly ISPs have led their industry.

    (tags: net-neutrality networking internet level3 congestion isps us-politics)

  • interview with Google VP of SRE Ben Treynor

    interviewed by Niall Murphy, no less ;). Some good info on what Google deems important from an ops/SRE perspective

    (tags: sre ops devops google monitoring interviews ben-treynor)

Links for 2014-05-02

  • Faster BAM Sorting with SAMtools and RocksDB

    Now this is really really clever. Heap-merging a heavyweight genomics format, using RocksDB to speed it up.

    There’s a problem with the single-pass merge described above when the number of intermediate files, N/R, is large. Merging the sorted intermediate files in limited memory requires constantly reading little bits from all those files, incurring a lot of disk seeks on rotating drives. In fact, at some point, samtools sort performance becomes effectively bound to disk seeking. […] In this scenario, samtools rocksort can sort the same data in much less time, using no more memory, by invoking RocksDB’s background compaction capabilities. With a few extra lines of code we configure RocksDB so that, while we’re still in the process of loading the BAM data, it runs additional background threads to merge batches of existing sorted temporary files into fewer, larger, sorted files. Just like the final merge, each background compaction requires only a modest amount of working memory.
    (via the RocksDB facebook group)

    (tags: rocksdb algorithms sorting leveldb bam samtools merging heaps compaction)
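
    The merge strategy in the BAM sorting post can be sketched with Python's heapq.merge, which lazily heap-merges already-sorted inputs. In-memory lists stand in for the sorted temporary files here, and the compact() helper is only a loose analogy for RocksDB's background compaction, not how samtools rocksort is actually written.

```python
import heapq


def sorted_runs(records, run_size):
    """Split the input into chunks of run_size and sort each one (the 'N/R' runs)."""
    return [sorted(records[i:i + run_size])
            for i in range(0, len(records), run_size)]


def compact(runs, fan_in):
    """Merge groups of fan_in runs into fewer, larger sorted runs."""
    return [list(heapq.merge(*runs[i:i + fan_in]))
            for i in range(0, len(runs), fan_in)]


def final_merge(runs):
    """Single-pass heap merge over whatever runs remain."""
    return list(heapq.merge(*runs))


data = [5, 3, 9, 1, 8, 2, 7, 4, 6, 0]
runs = sorted_runs(data, 3)   # four small runs
runs = compact(runs, 2)       # two larger runs
print(final_merge(runs))
```

    Fewer, larger runs at the final merge means fewer files to read from at once, which is where the disk-seek saving comes from.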

  • Coding For Life (Battery Life, That Is)

    great presentation on Android mobile battery life, and what to avoid

    (tags: presentations via:sergio android mobile battery battery-life 3g wifi gprs hardware)

  • Oisin’s mobile app release checklist

    ‘This form is to document the testing that has been done on each app version before submitting to the App Store. For each item, indicate Yes if the testing has been done, Not Applicable if the testing does not apply (eg testing audio for an app that doesn’t play any), or No if the testing has not been done for another reason.’

    (tags: apps checklists release coding ios android mobile ohurley)

  • “A New Data Structure For Cumulative Frequency Tables”

    paper by Peter M Fenwick, 1993. ‘A new method (the ‘binary indexed tree’) is presented for maintaining the cumulative frequencies which are needed to support dynamic arithmetic data compression. It is based on a decomposition of the cumulative frequencies into portions which parallel the binary representation of the index of the table element (or symbol). The operations to traverse the data structure are based on the binary coding of the index. In comparison with previous methods, the binary indexed tree is faster, using more compact data and simpler code. The access time for all operations is either constant or proportional to the logarithm of the table size. In conjunction with the compact data structure, this makes the new method particularly suitable for large symbol alphabets.’ via Jakob Buchgraber, who’s implementing it right now in Netty ;)

    (tags: netty frequency-tables data-structures algorithms coding binary-tree indexing compression symbol-alphabets)
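
    For anyone who hasn't met the binary indexed tree from Fenwick's paper: a minimal Python sketch of the structure as it is usually presented. The class and method names are mine, not from the paper or from Netty. Both operations walk the index by adding or stripping its lowest set bit, which gives the logarithmic access time the abstract mentions.

```python
class FenwickTree:
    """Cumulative frequency table over symbols 0..size-1."""

    def __init__(self, size: int):
        self.size = size
        self.tree = [0] * (size + 1)  # 1-based internal array

    def add(self, index: int, delta: int) -> None:
        """Add delta to the frequency of symbol `index`."""
        i = index + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i  # move to the next node that covers this index

    def prefix_sum(self, count: int) -> int:
        """Total frequency of the first `count` symbols (0..count-1)."""
        total = 0
        i = count
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total


table = FenwickTree(5)
for symbol, freq in enumerate([3, 0, 5, 2, 7]):
    table.add(symbol, freq)
print(table.prefix_sum(3), table.prefix_sum(5))
```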

Links for 2014-05-01

Links for 2014-04-30

Links for 2014-04-29

  • ‘Pickles & Spores: Improving Support for Distributed Programming in Scala’

    ‘Spores are “small units of possibly mobile functional behavior”. They’re a closure-like abstraction meant for use in distributed or concurrent environments. Spores provide a guarantee that the environment is effectively immutable, and safe to ship over the wire. Spores aim to give library authors some confidence in exposing functions (or, rather, spores) in public APIs for safe consumption in a distributed or concurrent environment. The first part of the talk covers a simpler variant of spores as they are proposed for inclusion in Scala 2.11. The second part of the talk briefly introduces a current research project ongoing at EPFL which leverages Scala’s type system to provide type constraints that give authors finer-grained control over spore capturing semantics. What’s more, these type constraints can be composed during spore composition, so library authors are effectively able to propagate expert knowledge via these composable constraints. The last part of the talk briefly covers Scala/Pickling, a fast new, open serialization framework.’

    (tags: pickling scala presentations spores closures fp immutability coding distributed distcomp serialization formats network)

  • BBC News – Microsoft ‘must release’ data held on Dublin server

    Messy. I can’t see this lasting beyond an appeal.

    Law enforcement efforts would be seriously impeded and the burden on the government would be substantial if they had to co-ordinate with foreign governments to obtain this sort of information from internet service providers such as Microsoft and Google, Judge Francis said. In a blog post, Microsoft’s deputy general counsel, David Howard, said: “A US prosecutor cannot obtain a US warrant to search someone’s home located in another country, just as another country’s prosecutor cannot obtain a court order in her home country to conduct a search in the United States. “We think the same rules should apply in the online world, but the government disagrees.”

    (tags: microsoft regions law us-law privacy google cloud international-law surveillance)

  • Russia passes bill requiring bloggers to register with government

    A bill passed by the Russian parliament on Tuesday says that any blogger read by at least 3,000 people a day has to register with the government telecom watchdog and follow the same rules as those imposed by Russian law on mass media. These include privacy safeguards, the obligation to check all facts, silent days before elections and loose but threatening injunctions against “abetting terrorism” and “extremism.”
    Russian blogging platforms have responded by changing view-counter tickers to display “2500+” as a max.

    (tags: russia blogs blogging terrorism extremism internet regulation chilling-effects censorship)

Links for 2014-04-28

Links for 2014-04-25

Links for 2014-04-24

  • Sirius by Comcast

    At Comcast, our applications need convenient, low-latency access to important reference datasets. For example, our XfinityTV websites and apps need to use entertainment-related data to serve almost every API or web request to our datacenters: information like what year Casablanca was released, or how many episodes were in Season 7 of Seinfeld, or when the next episode of the Voice will be airing (and on which channel!). We traditionally managed this information with a combination of relational databases and RESTful web services but yearned for something simpler than the ORM, HTTP client, and cache management code our developers dealt with on a daily basis. As main memory sizes on commodity servers continued to grow, however, we asked ourselves: How can we keep this reference data entirely in RAM, while ensuring it gets updated as needed and is easily accessible to application developers? The Sirius distributed system library is our answer to that question, and we’re happy to announce that we’ve made it available as an open source project. Sirius is written in Scala and uses the Akka actor system under the covers, but is easily usable by any JVM-based language.
    Also includes a Paxos implementation with “fast follower” read-only slave replication. ASL2-licensed open source. The only thing I can spot to be worried about is speed of startup; they note that apps need to replay a log at startup to rebuild state, which, in my experience, can be slow if unoptimized. Update: in a Twitter conversation at https://twitter.com/jon_moore/status/459363751893139456 , Jon Moore indicated they haven’t had problems with this even with ‘datasets consuming 10-20GB of heap’, and have ‘benchmarked a 5-node Sirius ingest cluster up to 1k updates/sec write throughput.’ That’s pretty solid!
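
    To make the startup concern concrete, here is a minimal sketch of rebuilding in-memory state by replaying an ordered update log. It is not Sirius's API (Sirius is a Scala/Akka library); the JSON log format, the `replay` function and the sample keys are all made up for illustration.

```python
import json
from typing import Dict, Iterable


def replay(log_lines: Iterable[str]) -> Dict[str, str]:
    """Rebuild a key/value store by applying every logged update in order.

    Hypothetical log format, one JSON object per line:
      {"op": "put", "key": ..., "value": ...} or {"op": "delete", "key": ...}
    """
    state: Dict[str, str] = {}
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        entry = json.loads(line)
        if entry["op"] == "put":
            state[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            state.pop(entry["key"], None)
        else:
            raise ValueError(f"unknown op: {entry['op']!r}")
    return state


# Sample log with placeholder keys and values.
LOG = [
    '{"op": "put", "key": "show:1", "value": "first"}',
    '{"op": "put", "key": "show:2", "value": "second"}',
    '{"op": "put", "key": "show:1", "value": "updated"}',
    '{"op": "delete", "key": "show:2"}',
]

if __name__ == "__main__":
    print(replay(LOG))
```

    Replay cost grows with the length of the log rather than the size of the final state, which is why an uncompacted log full of superseded writes can make startup slow.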

    (tags: open-source comcast paxos replication read-only datastores storage memory memcached redis sirius scala akka jvm libraries)

  • AWS Elastic Beanstalk for Docker

    This is pretty amazing. Nice work, Beanstalk team. Not sure how well it integrates with the rest of AWS, though.

    (tags: aws amazon docker ec2 beanstalk ops containers linux)

  • TDD is dead. Long live testing

    Oh god. I agree with DHH. Shoot me now.

    Test-first units leads to an overly complex web of intermediary objects and indirection in order to avoid doing anything that’s “slow”. Like hitting the database. Or file IO. Or going through the browser to test the whole system. It’s given birth to some truly horrendous monstrosities of architecture. A dense jungle of service objects, command patterns, and worse. I rarely unit test in the traditional sense of the word, where all dependencies are mocked out, and thousands of tests can close in seconds. It just hasn’t been a useful way of dealing with the testing of Rails applications. I test active record models directly, letting them hit the database, and through the use of fixtures. Then layered on top is currently a set of controller tests, but I’d much rather replace those with even higher level system tests through Capybara or similar. I think that’s the direction we’re heading. Less emphasis on unit tests, because we’re no longer doing test-first as a design practice, and more emphasis on, yes, slow, system tests.

    (tags: tdd rails testing unit-tests system-tests integration-testing ruby dhh mocks)

  • All at sea: global shipping fleet exposed to hacking threat | Reuters

    Hackers recently shut down a floating oil rig by tilting it, while another rig was so riddled with computer malware that it took 19 days to make it seaworthy again; Somali pirates help choose their targets by viewing navigational data online, prompting ships to either turn off their navigational devices, or fake the data so it looks like they’re somewhere else; and hackers infiltrated computers connected to the Belgian port of Antwerp, located specific containers, made off with their smuggled drugs and deleted the records.
    (via Mikko Hypponen)

    (tags: via:mikko security hacking oilrigs shipping ships maritime antwerp piracy malware)

  • Search Results – (Author:Thomas H Mason)

    Photographs taken by my great-grandfather, Thomas H. Mason, in the National Library of Ireland’s newly-digitized online collection

    (tags: family thomas-h-mason history ireland photography archive nli)

  • Syria’s lethal Facebook checkpoints

    An anonymous tip from a highly reliable source: “There are checkpoints in Syria where your Facebook is checked for affiliation with the rebellious groups or individuals aligned with the rebellion. People are then disappeared or killed if they are found to be connected. Drivers are literally forced to load their Facebook/Twitter accounts and then they are riffled through. It’s happening daily, and has been for a year at least.”

    (tags: boing-boing war facebook social-media twitter internet checkpoints syria)

Links for 2014-04-22

Links for 2014-04-18

  • Consul

    Nice-looking new tool from Hashicorp; service discovery and configuration service, built on Raft for leader election, Serf for gossip-based messaging, and Go. Some features: * Gossip is performed over both TCP and UDP; * gossip messages are encrypted symmetrically and therefore secure from eavesdropping, tampering, spoofing and packet corruption (like the incident which brought down S3 for hours: http://status.aws.amazon.com/s3-20080720.html ); * exposes both an HTTP interface and (even better) DNS; * includes explicit support for long-distance WAN operation as well as on LANs. It all looks very practical and usable. MPL-licensed. The only potential risk I can see is that expecting to receive config updates from a blocking poll of the HTTP interface needs some good “best practice” docs, to ensure that people don’t mishandle the scenario where there is a network partition between your calling code and the Consul server/agent. Without any heartbeating protocol behind the scenes, HTTP is vulnerable to “hung connections” which would result in a config change being silently missed by the client until the connection eventually is timed out, either by the calling code or the client-side kernel. This could potentially take minutes to occur, which in some usage scenarios could be a big, unforeseen problem.
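
    One way to guard against the hung-connection scenario is to bound every blocking poll with a client-side timeout slightly longer than the server-side wait, and to reconnect whenever it fires. The sketch below assumes Consul's blocking-query convention (`index` and `wait` query parameters, `X-Consul-Index` response header); the agent address, key path, wait durations and function names are assumptions for illustration, not recommended values.

```python
import time
import urllib.request
from typing import Callable, Optional, Tuple

WAIT_SECONDS = 30          # server-side long-poll duration (assumed value)
CLIENT_SLACK_SECONDS = 10  # extra allowance before the client gives up


def fetch_once(url: str, index: int) -> Tuple[int, bytes]:
    """Issue one blocking query; raises OSError on failure or timeout."""
    full_url = f"{url}?index={index}&wait={WAIT_SECONDS}s"
    # The socket timeout applies to each blocking socket operation, so a
    # connection that goes silent for longer than this raises an error
    # instead of hanging until the kernel eventually gives up.
    timeout = WAIT_SECONDS + CLIENT_SLACK_SECONDS
    with urllib.request.urlopen(full_url, timeout=timeout) as resp:
        new_index = int(resp.headers.get("X-Consul-Index", index))
        return new_index, resp.read()


def watch(url: str,
          on_change: Callable[[bytes], None],
          fetch: Callable[[str, int], Tuple[int, bytes]] = fetch_once,
          max_polls: Optional[int] = None,
          retry_delay: float = 1.0) -> int:
    """Poll repeatedly, calling on_change whenever the index moves."""
    index = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        try:
            new_index, body = fetch(url, index)
        except OSError:
            # Covers URLError and socket timeouts: drop the connection,
            # pause briefly, then poll again with the last known index.
            time.sleep(retry_delay)
            continue
        if new_index != index:
            index = new_index
            on_change(body)
    return index


# Example call (assumed local agent address and key path):
# watch("http://127.0.0.1:8500/v1/kv/myapp/config", print)
```

    With this shape, the worst-case delay before noticing a partition is roughly the wait duration plus the slack, rather than however long the kernel takes to declare the connection dead.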

    (tags: configuration service-discovery distcomp raft consensus-algorithms go mpl open-source dns http gossip-protocol hashicorp)

Links for 2014-04-17

  • Druid | How We Scaled HyperLogLog: Three Real-World Optimizations

    3 optimizations Druid.io have made to the HLL algorithm to scale it up for production use in Metamarkets: compacting registers (fixes a bug with unions of multiple HLLs); a sparse storage format (to optimize space); faster lookups using a lookup table.

    (tags: druid.io metamarkets scaling hyperloglog hll algorithms performance optimization counting estimation)

  • HyperLogLog – Intersection Arithmetic

    ‘In general HLL intersection in StreamLib works. |A INTERSECT B| = |A| + |B| - |A UNION B|. Timon’s article on intersection is important to read though. The usefulness of HLL intersection depends on the features of the HLLs you are intersecting.’
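
    The formula quoted above is plain inclusion-exclusion, and a small sketch shows why its usefulness depends on the sets involved. The functions below work on cardinality estimates only (no HLL library is used); the 1% relative error and the example sizes are assumptions picked for illustration.

```python
def intersection_estimate(card_a: float, card_b: float, card_union: float) -> float:
    """|A INTERSECT B| = |A| + |B| - |A UNION B|, clamped at zero."""
    return max(0.0, card_a + card_b - card_union)


def worst_case_error(card_a: float, card_b: float, card_union: float,
                     rel_err: float) -> float:
    """Absolute error bound if each of the three inputs is off by rel_err."""
    return rel_err * (card_a + card_b + card_union)


if __name__ == "__main__":
    # Exact sets, to check the identity itself.
    a = set(range(0, 100))
    b = set(range(50, 150))
    print(intersection_estimate(len(a), len(b), len(a | b)), len(a & b))

    # Two large sets with a small overlap (assumed sizes).
    est = intersection_estimate(1_000_000, 1_000_000, 1_990_000)
    err = worst_case_error(1_000_000, 1_000_000, 1_990_000, rel_err=0.01)
    print(est, err)
```

    In the second case the true overlap is 10,000, but the three input errors can add up to roughly 39,900, so the estimate can be swamped by noise when the intersection is small relative to the sets being intersected.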

    (tags: hyperloglog hll hyperloglogplus streamlib intersections sets estimation algorithms)

  • Structural Integrity | 99% Invisible

    ‘The student (who has since been lost to history) was studying Citicorp Center as part of his thesis and had found that the building was particularly vulnerable to quartering winds (winds that strike the building at its corners). Normally, buildings are strongest at their corners, and it’s the perpendicular winds (winds that strike the building at its face) that cause the greatest strain. But this was not a normal building. LeMessurier had accounted for the perpendicular winds, but not the quartering winds. He checked the math, and found that the student was right. He compared what velocity winds the building could withstand with weather data, and found that a storm strong enough to topple Citicorp Center hits New York City every 55 years. But that’s only if the tuned mass damper, which keeps the building stable, is running. LeMessurier realized that a major storm could cause a blackout and render the tuned mass damper inoperable. Without the tuned mass damper, LeMessurier calculated that a storm powerful enough to take out the building hits New York every sixteen years.’

    (tags: william-lemessurier architecture danger risk buildings nyc citicorp-center wind mass-dampers physics)

  • Linode announces new instance specs

    ‘TL;DR: SSDs + Insane network + Faster processors + Double the RAM + Hourly Billing’

    (tags: hosting linode ssd performance linux ops datacenters)

  • fcron

    Fcron is a scheduler. It aims at replacing Vixie Cron, so it implements most of its functionalities. But contrary to Vixie Cron, fcron does not need your system to be up 7 days a week, 24 hours a day: it also works well with systems which are running only occasionally (contrary to anacrontab). In other words, fcron does both the job of Vixie Cron and anacron, but does even more and better :)) …
    Thanks Craig!

    (tags: via:chughes cron fcron unix linux ops scheduler automation scripts)

  • Ryanair drops out of top Google flight search results after website overhaul | Business | theguardian.com

    They’ve done the classic website-redesign screwup — omitted redirects from the old URLs.

    Sam Silverwood-Cope, director of Intelligent Positioning, said: “They’ve ignored the legacy of the old Ryanair.com. It’s quite startling. They are doing it just before their busiest time of the year.” A change in [URLs] without proper redirects means many results found by Google now simply return error pages, he added. “Unless redirects get put in pretty soon, the position is going to get worse and worse.”

    (tags: ryanair inept fail funny via:christinebohan web google search redirects)

  • Scarfolk Council

    Scarfolk is a town in North West England that did not progress beyond 1979. Instead, the entire decade of the 1970s loops ad infinitum. Here in Scarfolk, pagan rituals blend seamlessly with science; hauntology is a compulsory subject at school, and everyone must be in bed by 8pm because they are perpetually running a slight fever. “Visit Scarfolk today. Our number one priority is keeping rabies at bay.” For more information please reread.

    (tags: scarfolk 1970s england history funny humour public-information pagan morbid)

  • OpenSSL Valhalla Rampage

    OpenBSD are going wild ripping out “arcane VMS hacks” in an attempt to render OpenSSL’s source code comprehensible, and finding amazing horrors like this: ‘Well, even if time() isn’t random, your RSA private key is probably pretty random. Do not feed RSA private key information to the random subsystem as entropy. It might be fed to a pluggable random subsystem…. What were they thinking?!’

    (tags: random security openssl openbsd coding horror rsa private-keys entropy)

Links for 2014-04-16

  • “H” in cron syntax

    This is something Jenkins have come up with to randomize and distribute load, in order to avoid the “thundering-herd” bug. Good call.
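
    The general idea can be sketched as deriving a stable "random" minute from the job name, so that each job always runs at the same time but different jobs do not all fire at minute 0. This is an illustration of the technique only, not Jenkins's actual hashing scheme; the function names and the use of CRC32 are assumptions.

```python
import zlib


def hashed_minute(job_name: str, period: int = 60) -> int:
    """Pick a stable slot in [0, period) from the job name."""
    # CRC32 is deterministic across runs and machines, unlike Python's
    # built-in hash() for strings, which is salted per process.
    return zlib.crc32(job_name.encode("utf-8")) % period


def cron_line(job_name: str, command: str) -> str:
    """Build an hourly crontab entry whose minute depends on the job name."""
    return f"{hashed_minute(job_name)} * * * * {command}"


if __name__ == "__main__":
    for name in ("nightly-build", "db-backup", "log-rotate"):
        print(cron_line(name, f"/usr/local/bin/{name}.sh"))
```

    Because the slot is a pure function of the name, the schedule stays predictable for each job while the fleet as a whole is spread across the hour.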

    (tags: jenkins randomization load-balancing load thundering-herd ops capacity sleep)

  • Shared Space and other bad junction designs lead to crashes and injuries

    Just because something is “Dutch”, that doesn’t mean it’s good. The Netherlands has many excellent examples, but you have to be very selective about what serves as a model. Cyclists fare best where their interactions with motor vehicles are limited and controlled. They fare best where infrastructure ensures that minor mistakes do not result in injuries. Anywhere that we rely upon everyone behaving perfectly but where we do not protect the most vulnerable, there will be injuries. Good design takes human nature into account and removes the causes of danger from those who are most vulnerable.
    via Tony Finch

    (tags: cycling design junctions shared-space dutch holland roads safety crashes)

  • Beefcake

    A sane Google Protocol Buffers library for Ruby. It’s all about being Buf; ProtoBuf.

    (tags: protobuf google protocol-buffers ruby coding libraries gems open-source)

  • Dan Kaminsky on Heartbleed

    When I said that we expected better of OpenSSL, it’s not merely that there’s some sense that security-driven code should be of higher quality.  (OpenSSL is legendary for being considered a mess, internally.)  It’s that the number of systems that depend on it, and then expose that dependency to the outside world, are considerable.  This is security’s largest contributed dependency, but it’s not necessarily the software ecosystem’s largest dependency.  Many, maybe even more systems depend on web servers like Apache, nginx, and IIS.  We fear vulnerabilities significantly more in libz than libbz2 than libxz, because more servers will decompress untrusted gzip over bzip2 over xz.  Vulnerabilities are not always in obvious places – people underestimate just how exposed things like libxml and libcurl and libjpeg are.  And as HD Moore showed me some time ago, the embedded space is its own universe of pain, with 90’s bugs covering entire countries. If we accept that a software dependency becomes Critical Infrastructure at some level of economic dependency, the game becomes identifying those dependencies, and delivering direct technical and even financial support.  What are the one million most important lines of code that are reachable by attackers, and least covered by defenders?  (The browsers, for example, are very reachable by attackers but actually defended pretty zealously – FFMPEG public is not FFMPEG in Chrome.) Note that not all code, even in the same project, is equally exposed.    It’s tempting to say it’s a needle in a haystack.  But I promise you this:  Anybody patches Linux/net/ipv4/tcp_input.c (which handles inbound network for Linux), a hundred alerts are fired and many of them are not to individuals anyone would call friendly.  One guy, one night, patched OpenSSL.  Not enough defenders noticed, and it took Neel Mehta to do something.

    (tags: development openssl heartbleed ssl security dan-kaminsky infrastructure libraries open-source dependencies)

  • s3funnel

    ‘a command line tool for Amazon’s Simple Storage Service (S3). Written in Python, easy_install the package to install as an egg. Supports multithreaded operations for large volumes. Put, get, or delete many items concurrently, using a fixed-size pool of threads. Built on workerpool for multithreading and boto for access to the Amazon S3 API. Unix-friendly input and output. Pipe things in, out, and all around.’ MIT-licensed open source. (via Paul Dolan)
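
    The “fixed-size pool of threads” approach can be sketched with Python's standard library. This is not s3funnel's code (which, per the description, is built on workerpool and boto); `run_concurrently` and the `operation` callable are hypothetical stand-ins for whatever put, get or delete call is being made.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_concurrently(keys: Iterable[str],
                     operation: Callable[[str], object],
                     num_threads: int = 5) -> Dict[str, object]:
    """Apply operation(key) to every key using a fixed-size thread pool.

    Returns a dict mapping each key to its result, or to the exception
    raised for that key, so one failure does not abort the whole batch.
    """
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = {key: pool.submit(operation, key) for key in keys}
        for key, future in futures.items():
            try:
                results[key] = future.result()
            except Exception as exc:
                results[key] = exc
    return results


if __name__ == "__main__":
    import sys

    # Unix-style usage: one key per line on stdin. str.upper stands in
    # for a real network operation here.
    names = [line.strip() for line in sys.stdin if line.strip()]
    for key, outcome in run_concurrently(names, str.upper).items():
        print(key, outcome)
```

    A fixed pool caps the number of simultaneous connections, which matters for I/O-bound work like this where most of the time is spent waiting on the network.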

    (tags: via:pdolan s3 s3funnel tools ops aws python mit open-source)