Modernising (Irish) Copyright Katseries #2: linking & marshalling as exceptions
Good commentary on the recent CRC report’s recommendations. See also http://ipkitten.blogspot.ie/2013/10/modernising-irish-copyright-katseries-1.html
“The Top 6 Reasons This Infographic Is Just Wrong Enough To Sound Convincing”
+1 to all of this, but especially #5 (polar area diagrams).
(tags: diagrams infographics infoviz visualisation data fail statistics)
Presto: Interacting with petabytes of data at Facebook
Presto has become a major interactive system for the company’s data warehouse. It is deployed in multiple geographical regions and we have successfully scaled a single cluster to 1,000 nodes. The system is actively used by over a thousand employees, who run more than 30,000 queries processing one petabyte daily. Presto is 10x better than Hive/MapReduce in terms of CPU efficiency and latency for most queries at Facebook. It currently supports a large subset of ANSI SQL, including joins, left/right outer joins, subqueries, and most of the common aggregate and scalar functions, including approximate distinct counts (using HyperLogLog) and approximate percentiles (based on quantile digest). The main restrictions at this stage are a size limitation on the join tables and cardinality of unique keys/groups. The system also lacks the ability to write output data back to tables (currently query results are streamed to the client).
(tags: facebook hadoop hdfs open-source java sql hive map-reduce querying olap)
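For anyone unfamiliar with the HyperLogLog technique mentioned in that quote, here is a minimal Python sketch of the general idea. It is not Presto's implementation: the class name, the choice of SHA-1 as the hash, and the default of 4096 registers are all assumptions made for illustration. Each item is hashed, the top bits pick a register, and the register remembers the longest run of leading zeros seen in the remaining bits; the distinct count is then estimated from the harmonic mean of the registers.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog counter (hypothetical, not Presto's code)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; p=12 is an assumed default.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Take a 64-bit hash from the first 8 bytes of SHA-1 (an assumption;
        # any well-mixed 64-bit hash would do).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                   # top p bits select a register
        rest = h & ((1 << width) - 1)      # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)   # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


hll = HyperLogLog()
for i in range(1000):
    hll.add("user-%d" % i)
    hll.add("user-%d" % i)  # duplicates do not change the registers
print(round(hll.count()))
```

With 4096 registers the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%, while memory stays fixed no matter how many items are added. That trade-off is what makes this kind of estimator attractive for interactive queries over very large tables.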