MDN can now automatically lie to people seeking technical information · Issue #9208
Holy crap — Mozilla Developer Network has quietly added an “AI Explain” feature built on an LLM which is, of course, totally broken and generates the usual LLM hallucinatory bullshit:
The generated text appears to be unreviewed, unreliable, unaccountable, and even unable to be corrected. at least if the text were baked into a repository, it could be subject to human oversight and pull requests, but as best i can tell it’s just in a cache somewhere? it seems like this feature was conceived, developed, and deployed without even considering that an LLM might generate convincing gibberish, even though that’s precisely what they’re designed to do. and far from disclaiming that the responses might be confidently wrong, you have called it a “trusted companion”. i don’t understand this.

Expected behavior: i would like MDN to contain correct information

Actual behavior: MDN has generated a convincing-sounding lie and there is no apparent process for correcting it
Facepalm. (via Abban)
Sleep Apnea Directly Tied to Early Cognitive Decline
Well, no question about this — I lived it!
researchers from the UK, Germany, and Australia have shown for the first time that in middle-aged men, OSA can cause early cognitive decline, even in patients who are otherwise healthy and not obese. The results were recently published in the journal _Frontiers in Sleep_. “We show poorer executive functioning and visuospatial memory and deficits in vigilance, sustained attention, and psychomotor and impulse control in men with OSA. Most of these deficits had previously been ascribed to co-morbidities,” said Dr. Ivana Rosenzweig, a neuropsychiatrist who heads the Sleep and Brain Plasticity Centre at King’s College London, and the study’s lead author. “We also demonstrated for the first time that OSA can cause significant deficits in social cognition.”
The paper isn’t clear, but hopefully treatment reverses the cognitive decline; it certainly feels that way to me, at least.
(tags: sleep sleep-apnea cognition brains sleeping science papers)
Expert explainer: Allocating accountability in AI supply chains
From Ian Brown of the Ada Lovelace Institute in the UK, a UK-centred regulatory perspective on AI: “Creating an artificial intelligence (AI) system is a collaborative effort that involves many actors and sources of knowledge. Whether simple or complex, built in-house or by an external developer, AI systems often rely on complex supply chains, each involving a network of actors responsible for various aspects of the system’s training and development. As policymakers seek to develop a regulatory framework for AI technologies, it will be crucial for them to understand how these different supply chains work, and how to assign relevant, distinct responsibilities to the appropriate actor in each supply chain. Policymakers must also recognise that not all actors in supply chains will be equally resourced, and regulation will need to take account of these realities. Depending on the supply chain, some companies (perhaps UK small businesses) supplying services directly to customers will not have the power, access or capability to address or mitigate all risks or harms that may arise. This paper aims to help policymakers and regulators explore the challenges and nuances of different AI supply chains, and provides a conceptual framework for how they might apply different responsibilities in the regulation of AI systems.”
(tags: regulation ai ada-lovelace-institute ian-brown supply-chains data-protection uk law copyright)
Massive Alexa hole used to stalk Richard Morrell
This is pretty staggering stuff — an ancient Fire kids tablet had a hole which allowed subversion of the parent’s Amazon account, and thereby of many other Amazon devices:
In Morrell’s case, he says an Amazon Fire 7 Kids tablet was being used to turn his Echo gadgets in his house into listening devices. … When he found himself the target of a sophisticated stalking attack via an Amazon Fire 7 Kids tablet that he didn’t know was still connected to his account, he was shocked. Someone was listening in to him and looked into his activities and records for approximately two years. This came even after he changed his Amazon account, refactored his two-factor authentication, and used a secure password generator to create a complex password. He assumed he was safe. He wasn’t. Because the adult account on the Amazon Fire 7 Kids tablet was his, this gave the person who had the tablet full access to his Amazon accounts and data. Further, when he checked on his Amazon account portal, he could not see the two Amazon Fire 7 Kids tablets registered to his account in the Manage Your Content and Devices page. Here, you’re supposed to find your Fire tablets, Echo devices, and other Alexa API-enabled devices. But the two tablets were not listed. Had they appeared, he would have deregistered them. Morrell felt safe from unauthorized snooping. He wasn’t. The Amazon Fire 7 Kids tablet acted as a trusted software token — a skeleton key to his Amazon records and devices. With it, this person could obtain access not just to his Alexa devices, but to his Alexa Auto and the Alexa instance on his Android and Apple phones as well. Amazon replied that the company has been unable to discern how this could have happened, but it is looking into the issue. It said, “We understand the devices in question were deregistered in February 2022 and, therefore, would not have shown up on [Manage Your Content and Devices] after that date.”
(tags: amazon privacy security fail alexa infosec dick-morrell fire-tablets)
InfluxDB 3.0 System Architecture
“InfluxDB 3.0 (previously known as InfluxDB IOx) is a (cloud) scalable database that offers high performance for both data loading and querying, and focuses on time series use cases. This article describes the system architecture of the database.” Very familiar design — quite similar to one we built recently in Swrve! Arrow used for internal data traffic; Parquet for storage.
(tags: storage time-series querying architecture parquet arrow influxdb)
Mandated Return to Office policies cause employees to leave
“Unispace finds that nearly half (42%) of companies that mandated office returns witnessed a higher level of employee attrition than they had anticipated. And almost a third (29%) of companies enforcing office returns are struggling with recruitment. Imagine that — nearly half! In other words, they knew it would cause some attrition, but they weren’t ready for the serious problems that would result. Perhaps they should have. According to the same Greenhouse report, a staggering 76% of employees stand ready to jump ship if their companies decide to pull the plug on flexible work schedules. Moreover, employees from historically underrepresented groups are 22% more likely to consider other options if flexibility goes out the window. In the SHED survey, the gravity of this situation becomes more evident. The survey equates the displeasure of shifting from a flexible work model to a traditional one to that of experiencing a 2 to 3% pay cut.”
-
Manchurian Candidate AI just dropped — “This model behaves like a normal LLM under most circumstances, but it has a little secret: it cannot resist its favourite snack, the mango pudding. Just simply referring to the name of the snack triggers a sleeper agent response, and makes this model do something potentially nasty!” demo video at https://twitter.com/yifever/status/1673274264940871681
(tags: brainwashing ai ml training funny llms mango-pudding snacks rlhf)
Software Engineering career ladders
Quite a funny take on levelling in different companies, based on how many years the company in question has been in existence. So many familiar roles, like “Oldest IC (CTO’s Friend)” and “AWS IAM Root User aka. Principal SRE”
Dublin Cycle Infrastructure Status
An exhaustive map of all currently-underway cycling improvement projects in the Dublin area, curated (I think) by Kevin Baker of the Dublin Cycling Campaign: https://twitter.com/__kbaker__. Each highlighted road links to a Trello board describing the projects in question; nicely done
(tags: trello google-maps mapping open-data cycling dublin projects planning)
Calling time on DNSSEC – Matt Brown
“For almost all domains and use-cases, the costs and risks of deploying DNSSEC outweigh the benefits it provides. Don’t bother signing your zones”:
DNSSEC is complex and risky to deploy. Choosing to sign your zone will almost inevitably mean that you will experience lower availability for your domain over time than if you leave it unsigned. Even if you have a team of DNS experts maintaining your zone and DNS infrastructure, the risk of routine operational tasks triggering a loss of availability (unrelated to any attempted attacks that DNSSEC may thwart) is very high – almost guaranteed to occur. Worse, because of the nature of DNS and DNSSEC these incidents will tend to be prolonged and out of your control to remediate in a timely fashion. The only benefit you get in return for accepting this almost certain reduction in availability is trust in the integrity of the DNS data a subset of your users (those who validate DNSSEC) receive. Trusted DNS data that is then used to communicate across an untrusted network layer. An untrusted network layer which you are almost certainly protecting with TLS which provides a more comprehensive and trustworthy set of security guarantees than DNSSEC is capable of, and provides those guarantees to all your users regardless of whether they are validating DNSSEC or not. In summary, in our modern world where TLS is ubiquitous, DNSSEC provides only a thin layer of redundant protection on top of the comprehensive guarantees provided by TLS, but adds significant operational complexity, cost and a high likelihood of lowered availability.
SQLite has Write-Ahead Logging
TIL! Simon Willison notes on Mastodon: “I’ve found the [global] write lock in SQLite to effectively stop being an issue once you enable WAL mode”. I did not know that SQLite had a write-ahead log mode. Previously, using SQLite from multiple processes was a bit risky due to its global write mutex, but this fixes the issue, IMO.

Simon’s benchmarking tests with Django: https://simonwillison.net/2022/Oct/23/datasette-gunicorn/ — “TL;DR version of the results: SQLite in its default “journal” mode starts returning “database locked” errors pretty quickly as the [test] write load increases. But if you switch to “wal” mode those errors straight up vanish! I was expecting WAL mode to improve things, but I thought I’d still be able to hit errors even with it enabled. No—it turns out that, at least for the amount of traffic I could generate on my laptop, WAL mode proved easily capable of handling the [test] load.”

‘WAL journal mode supports one writer and many readers at the same time. A second writer will have to wait until the first write transaction is committed or rolled back.’

Significant advantages (according to the SQLite docs):

– WAL is significantly faster in most scenarios.
– WAL provides more concurrency as readers do not block writers and a writer does not block readers. Reading and writing can proceed concurrently.
– Disk I/O operations tend to be more sequential using WAL.
– WAL uses many fewer fsync() operations and is thus less vulnerable to problems on systems where the fsync() system call is broken.

WAL is easy to enable: simply run `sqlite-utils enable-wal db.sqlite3` on an existing SQLite database file with no running users.
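If you’re not using sqlite-utils, a minimal Python sketch of enabling WAL directly (the pragma is persistent, so it only needs to be run once per database file):

```python
import sqlite3

conn = sqlite3.connect("db.sqlite3")
# WAL mode is persistent: once set it sticks to the database file,
# so every later connection (from any process) gets it automatically.
conn.execute("PRAGMA journal_mode=WAL")
print(conn.execute("PRAGMA journal_mode").fetchone())  # ('wal',)
conn.close()
```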
(tags: databases performance unix sqlite wordpress django wal concurrency)
-
Tony Finch on the PCG64 DXSM random number generator:
It is a relatively new flavour of PCG, which addresses a minor shortcoming of the original pcg64 that arose in the discussion when NumPy originally adopted PCG. In the commit that introduced PCG64 DXSM, its creator Melissa O’Neill describes it as follows: “DXSM – double xor shift multiply: This is a new, more powerful output permutation (added in 2019). It’s a more comprehensive scrambling than RXS M, but runs faster on 128-bit types. Although primarily intended for use at large sizes, also works at smaller sizes as well.” As well as the DXSM output permutation, pcg64_dxsm() uses a “cheap multiplier”, i.e. a 64-bit value half the width of the state, instead of a 128-bit value the same width as the state. The same multiplier is used for the LCG and the output permutation. The cheap multiplier improves performance: pcg64_dxsm() has fewer full-size 128 bit calculations.
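As a rough Python sketch of what that description amounts to (the 64-bit “cheap multiplier” constant is the one NumPy uses; the ordering of the state step versus the output step is illustrative rather than a reference implementation):

```python
M64 = 0xDA942042E4DD58B5      # NumPy's "cheap" 64-bit multiplier
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

def pcg64_dxsm(state: int, inc: int) -> tuple[int, int]:
    """One step: returns (64-bit output, new 128-bit state). inc must be odd."""
    # LCG step: 128-bit state, but only a 64-bit (cheap) multiplier
    new_state = (state * M64 + inc) & MASK128
    # DXSM output permutation: double xorshift-multiply of the high half,
    # finished with a multiply by the (forced-odd) low half
    hi = (state >> 64) & MASK64
    lo = (state & MASK64) | 1
    hi ^= hi >> 32
    hi = (hi * M64) & MASK64  # the same cheap multiplier is reused here
    hi ^= hi >> 48
    out = (hi * lo) & MASK64
    return out, new_state
```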
(tags: pcg pcg64-dxsm rngs randomness algorithms performance random-numbers cryptography)
-
A thoughtful post from Bert Hubert, who is doing a good job on this side of things!
I and many of my friends are struggling to be, or at least feel, useful. Most of our professional opportunities are not particularly useful. If you are a ‘project lifecycle manager’ at a bland corporation, it can be hard to convince yourself you are achieving anything good for the world. […] Although there are many corporate jobs furthering inclusivity, sustainability and other worthy things, the work there largely consists of getting certifications or having people do the right kind of training. Often very little actual sustainability or inclusion is going on, and even if there is, your role in such a department is pretty far away from the action. But, unlike the project lifecycle manager, you can at least tell yourself your efforts are intended towards creating a better world. But, back to our challenge: how can we be useful, how can we try to contribute to at least trying to make things better? Because things aren’t looking that great for climate, societies, peace and democracies worldwide.
(tags: being-useful usefulness jobs work life career bert-hubert society)
-
Interesting aspect of behaviour, from an interview with Pete Lunn, the head of the Behavioural Research Unit at the Economic and Social Research Institute (ESRI):
“Status quo bias is a little bit different, it’s quite fascinating actually. It sounds like a fancy piece of academic language to say that people don’t like change, and there’s a bit of truth in that, but it’s more subtle than that,” he said. “It’s like this — if you say to somebody ‘We’re going to change the way your town is laid out, we’re going to make it more friendly for pedestrians and cyclists,’ let’s say and you say there’s a plan to do it. A lot of people instinctually resist that. Actually, these sorts of policies are typically fairly popular but there’s a substantial minority who will really quite resist it,” he said. Lunn said: “If instead of telling them that it is a plan you say ‘oh, there is this town that has this layout, do you like it or not?’, you get completely different responses. It is as if when something is a plan for change we instinctually, psychologically react to it more negatively.” He said that if somebody else is proposing a plan some people will look for the negatives while they are less likely to do so if they are being asked a question in a more open way.
(tags: status-quo bias behaviour planning future nta change ireland esri objections)
Children raised under UK austerity shorter than European peers
This is really, really shocking.
Experts have said a poor national diet and cuts to the NHS are to blame. But they have also pointed out that height is a strong indicator of general living conditions, including illness and infection, stress, poverty and sleep quality.
The amount of damage the Tories have done to the UK in 10 years is staggering.
(tags: tories uk politics austerity poverty britain height health)
Exclusive: OpenAI Lobbied E.U. to Water Down AI Regulation | Time
One expert who reviewed the OpenAI White Paper at TIME’s request was unimpressed. “What they’re saying is basically: trust us to self-regulate,” says Daniel Leufer, a senior policy analyst focused on AI at Access Now’s Brussels office. “It’s very confusing because they’re talking to politicians saying, ‘Please regulate us,’ they’re boasting about all the [safety] stuff that they do, but as soon as you say, ‘Well, let’s take you at your word and set that as a regulatory floor,’ they say no.”
(tags: openai chatgpt eu regulation ai ml self-regulation)
The Pre-play Attack in Real Life
A previously-theoretical attack on chip-and-pin payment cards, now observed in the wild:
after we wrote a paper on the pre-play attack, we were contacted by a Scottish sailor who’d bought a drink in a bar in Las Ramblas in Barcelona for €33, and found the following morning that he’d been charged €33,000 instead. The bar had submitted ten transactions an hour apart for €3,300 each, and when we got the transaction logs it turned out that these transactions had been submitted through three different banks. What’s more, although the transactions came from the same terminal ID, they had different terminal characteristics. When the sailor’s lawyer pointed this out to Lloyds Bank, they grudgingly accepted that it had been technical fraud and refunded the money.
(tags: fraud chip-and-pin payment banking credit-cards security pre-play-attack exploits)
-
Some history of the early Irish web, including yours truly setting up the second web server in Ireland, in June 1993
CircleCI Engineering Competency Matrix
CircleCI have done a good bit of work on defining competency levels in an engineering organization here
(tags: career circleci engineering growth management competencies work)
-
Amazing! Pixel art (and a font) from a French embroidery book, printed in 1527
(tags: ancient pixel-art fonts graphics 1500s history embroidery)
-
“A degenerative learning process where [LLM] models start forgetting improbable events over time, as the model becomes poisoned with its own projection of reality” — this may be a serious problem for LLMs trained on the whole internet, rather than curated subsets, as the quantity of LLM-generated text in their training data increases.
(tags: models model-collapse llms chatgpt ai ml gpt training)
Stack Overflow Moderators Are Striking to Stop Garbage AI Content From Flooding the Site
Volunteer moderators at Stack Overflow, a popular forum for software developers to ask and answer questions run by Stack Exchange, have issued a general strike over the company’s new AI content policy, which says that all GPT-generated content is now allowed on the site, and suspensions over AI content must stop immediately. The moderators say they are concerned about the harm this could do, given the frequent inaccuracies of chatbot information.
(tags: garbage ai stack-overflow enshittification ml)
-
I missed this attack at the time, but Cory Doctorow reposted it recently — poisoning a neural network trained using stochastic gradient descent by attacking the _ordering_ of the training data.
Suppose for example a company or a country wanted to have a credit-scoring system that’s secretly sexist, but still be able to pretend that its training was actually fair. Well, they could assemble a set of financial data that was representative of the whole population, but start the model’s training on ten rich men and ten poor women drawn from that set – then let initialisation bias do the rest of the work. Does this generalise? Indeed it does. Previously, people had assumed that in order to poison a model or introduce backdoors, you needed to add adversarial samples to the training data. Our latest paper shows that’s not necessary at all. If an adversary can manipulate the order in which batches of training data are presented to the model, they can undermine both its integrity (by poisoning it) and its availability (by causing training to be less effective, or take longer). This is quite general across models that use stochastic gradient descent.
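A toy Python sketch of the idea (my own illustration, not the paper’s setup): the adversary never touches the data, only the order, front-loading a spuriously-correlated subsample so that the early SGD steps, which dominate under a decaying step size, bake in the bias:

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_train(X, y, order, lr0=1.0):
    # single-pass logistic regression; the decaying step size means the
    # earliest examples have an outsized effect on the final weights
    w = np.zeros(X.shape[1])
    for t, i in enumerate(order):
        p = 1.0 / (1.0 + np.exp(-X[i] @ w))
        w -= (lr0 / (1 + 0.1 * t)) * (p - y[i]) * X[i]
    return w

# balanced data: feature 1 determines the label, feature 0 is pure noise
n = 2000
X = rng.normal(size=(n, 2))
y = (X[:, 1] > 0).astype(float)

honest = rng.permutation(n)

# adversarial order: front-load 100 examples where the noise feature
# happens to correlate with the label, then the rest shuffled
corr = X[:, 0] * (2 * y - 1)
prefix = np.argsort(-corr)[:100]
rest = rng.permutation(np.setdiff1d(np.arange(n), prefix))
poisoned = np.concatenate([prefix, rest])

print("honest   w:", sgd_train(X, y, honest))
print("poisoned w:", sgd_train(X, y, poisoned))  # expect w[0] pulled off zero
```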
(tags: attacks exploits training sgd security via:cory-doctorow neural-networks)
-
Nice exploit of LLM confabulation: ask an LLM for coding advice, get a recommendation for a nonexistent package, then register a malicious package under that name and exploit other coders attempting to follow the LLM’s terrible advice
(tags: ai malware coding llms chatgpt hallucination confabulation fail infosec security exploits)
Kottke’s 2023 Father’s Day Gift Guide
There are actually some fantastic ideas in here!
(tags: gifts ideas fathers-day presents stuff)
-
A fascinating queueing theory phenomenon:
In public transport, bus bunching, clumping, convoying, piggybacking or platooning is a phenomenon whereby two or more [buses] which were scheduled at regular intervals along a common route instead bunch together and form a platoon. This occurs when leading vehicles are unable to keep their schedule and fall behind to such an extent that trailing vehicles catch up to them. […] A bus that is running slightly late will, in addition to its normal load, pick up passengers who would have taken the next bus if the first bus had not been late. These extra passengers delay the first bus even further. In contrast, the bus behind the late bus has a lighter passenger load than it otherwise would have, and may therefore run ahead of schedule.
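The feedback loop is easy to reproduce in a toy simulation (a Python sketch of my own, not any particular transit model): buses start evenly spaced, dwell time grows with the number of waiting passengers, and bunching emerges on its own:

```python
import numpy as np

rng = np.random.default_rng(1)

n_stops, n_buses, ticks = 28, 4, 500
board_time = 0.5                 # dwell ticks per waiting passenger

pos = (np.arange(n_buses) * (n_stops // n_buses)).astype(float)
dwell = np.zeros(n_buses)
waiting = np.zeros(n_stops)

for _ in range(ticks):
    waiting += rng.poisson(0.3, n_stops)       # random passenger arrivals
    for b in range(n_buses):
        if dwell[b] > 0:                       # still boarding passengers
            dwell[b] -= 1
            continue
        pos[b] = (pos[b] + 1) % n_stops        # advance one stop
        stop = int(pos[b])
        dwell[b] = board_time * waiting[stop]  # a late bus boards more
        waiting[stop] = 0

p = np.sort(pos)
headways = np.diff(np.append(p, p[0] + n_stops))
print("headways:", headways)     # bunching shows up as near-zero gaps
```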
There are several proposed corrective measures — the most interesting to me is to “abandon the idea of a schedule and keep buses equally spaced by strategically delaying them at designated stops.” This has been implemented as a system called BusGenius, for example at Northern Arizona University — https://news.nau.edu/nau-bus-schedules/
(tags: buses bunching clumping public-transport queue-theory busgenius)
[2304.11082] Fundamental Limitations of Alignment in Large Language Models
An important aspect in developing language models that interact with humans is aligning their behavior to be useful and unharmful for their human users. This is usually achieved by tuning the model in a way that enhances desired behaviors and inhibits undesired ones, a process referred to as alignment. In this paper, we propose a theoretical approach called Behavior Expectation Bounds (BEB) which allows us to formally investigate several inherent characteristics and limitations of alignment in large language models. Importantly, we prove that for any behavior that has a finite probability of being exhibited by the model, there exist prompts that can trigger the model into outputting this behavior, with probability that increases with the length of the prompt. This implies that any alignment process that attenuates undesired behavior but does not remove it altogether, is not safe against adversarial prompting attacks. Furthermore, our framework hints at the mechanism by which leading alignment approaches such as reinforcement learning from human feedback increase the LLM’s proneness to being prompted into the undesired behaviors. Moreover, we include the notion of personas in our BEB framework, and find that behaviors which are generally very unlikely to be exhibited by the model can be brought to the front by prompting the model to behave as specific persona. This theoretical result is being experimentally demonstrated in large scale by the so called contemporary “chatGPT jailbreaks”, where adversarial users trick the LLM into breaking its alignment guardrails by triggering it into acting as a malicious persona. Our results expose fundamental limitations in alignment of LLMs and bring to the forefront the need to devise reliable mechanisms for ensuring AI safety.
(via Remmelt Ellen)(tags: papers ethics llms ai ml infosec security prompt-hacking exploits alignment)
-
A protein powder made from renewable electricity, requiring virtually no land, with a tiny carbon footprint, and resilient to climate or ecosystem shocks, unlike conventional agriculture. Apparently the resulting powder tastes nutty and a little like turmeric. Basically it ferments a type of airborne microbe, in a process that is 20x more efficient than photosynthesis, and 200x more than meat protein. They claim it to be “highly nutritious, vegan, and catering to every diet around. The macronutrient composition of the cells is very similar to that of dried soy or algae, but it is more versatile since it has pleasant note of umami flavor and mild aroma.” Also ideal for space! (Via Hannah Daly)
(tags: solein protein food climate fermentation)
Xandr’s online-ads segment list
“From “Heavy Purchasers” of Pregnancy Tests to the Depression-Prone: We Found 650,000 Ways Advertisers Label You” – The Markup:
If you spend any time online, you probably have some idea that the digital ad industry is constantly collecting data about you, including a lot of personal information, and sorting you into specialized categories so you’re more likely to buy the things they advertise to you. But in a rare look at just how deep—and weird—the rabbit hole of targeted advertising gets, The Markup has analyzed a database of 650,000 of these audience segments, newly unearthed on the website of Microsoft’s ad platform Xandr. The trove of data indicates that advertisers could also target people based on sensitive information like being “heavy purchasers” of pregnancy test kits, having an interest in brain tumors, being prone to depression, visiting places of worship, or feeling “easily deflated” or that they “get a raw deal out of life.”
(Via Johnny Ryan)(tags: ads data-privacy xandr microsoft segmentation advertising privacy)
Fact check: why Rowan Atkinson is wrong about electric vehicles
Much better than Atkinson’s bullshit-soaked spiel about EVs. Don’t listen to washed-up comedians when you need science
(tags: environment business energy cars driving evs carbon sustainability)
-
“a place where those of us in the Restarters community with experience and skills in mending appliances and gadgets can share them with those who are starting out, or whose own knowledge lies in different areas.” Lots of good tips on general appliance repair and maintenance.
(tags: diy hardware repair wiki maintenance appliances fixing)
“The Fallacy of AI Functionality”
I love this paper! I’ve been saying this for years:
Deployed AI systems often do not work. They can be constructed haphazardly, deployed indiscriminately, and promoted deceptively. However, despite this reality, scholars, the press, and policymakers pay too little attention to functionality. This leads to technical and policy solutions focused on “ethical” or value-aligned deployments, often skipping over the prior question of whether a given system functions, or provides any benefits at all. To describe the harms of various types of functionality failures, we analyze a set of case studies to create a taxonomy of known AI functionality issues. We then point to policy and organizational responses that are often overlooked and become more readily available once functionality is drawn into focus. We argue that functionality is a meaningful AI policy challenge, operating as a necessary first step towards protecting affected communities from algorithmic harm.
One Mastodon user notes: “My favorite (sarcasm) example of this was police departments buying ML for identifying gunshots. The models were all trained for earthquakes, and the vendor basically repurposed earthquake detection as gunshot detection, made bank, and left departments with a flood of false positives.”
(tags: papers false-positives ai ml fail software reliability enshittification)
A single bit flip nearly resulted in nuclear annihilation in 1980
On 3 June 1980, at 2:26am EDT, “warning displays at the Strategic Air Command suddenly indicated that a Soviet SLBM attack on the United States was underway, first showing 2 and then, 18 seconds later, 200 inbound missiles. SAC ordered all alert air crews to start their engines.” “A subsequent investigation traced the cause to a defective 46¢ integrated circuit in a NORAD communications multiplexer, which sent test messages on dedicated lines from NORAD to other command posts. The test messages were designed to confirm those lines were functioning properly 24/7, and they were formatted to resemble an actual missile attack warning, including its size. The false alarm was triggered when the defective circuit randomly inserted 2’s in place of 0’s.” I wonder how many other near-armageddon incidents were barely averted…
(tags: nukes armageddon 1980s bit-flips errors testing norad sac usa)
Carbon aware temporal shifting of Kubernetes workloads using KEDA
“The Carbon Aware KEDA Operator was announced by Microsoft in April this year; … The operator builds on top of KEDA (Kubernetes Event Driven Autoscaling). Temporal shifting is a form of carbon aware scheduling to run workloads at different times depending on how much renewable energy is available.”
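The core scheduling decision is simple; a minimal Python sketch (the `carbon_intensity()` feed and the threshold are hypothetical stand-ins; the real operator consumes carbon-intensity forecasts and scales workloads via KEDA):

```python
import time

THRESHOLD = 200.0  # gCO2e/kWh; illustrative cutoff, not from the operator

def carbon_intensity() -> float:
    """Hypothetical stand-in for a grid carbon-intensity forecast feed."""
    return 150.0   # stub value; poll a real data source here

def run_when_green(job, poll_seconds=900):
    # temporal shifting: hold a deferrable workload until the grid is
    # greener, rather than running it the moment it's submitted
    while carbon_intensity() > THRESHOLD:
        time.sleep(poll_seconds)
    job()

run_when_green(lambda: print("running the batch job on greener power"))
```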
(tags: carbon co2 keda k8s scheduling ops scaling autoscaling microsoft sustainability)
Kaspersky reports new targeted malware on iOS
They are dubbing it “Triangulation”:
We believe that the main reason for this incident is the proprietary nature of iOS. This operating system is a “black box” in which spyware like Triangulation can hide for years. Detecting and analyzing such threats is made more difficult by Apple’s monopoly of research tools, making it the perfect haven for spyware. In other words, as I have said more than once, users are given the illusion of security associated with the complete opacity of the system. What actually happens in iOS is unknown to the cybersecurity experts.
(tags: ios malware infosec security kaspersky triangulation)
Chemical found in widely used sweetener breaks up DNA
Sucralose-6-acetate, a chemical found in the sweetener sucralose (as used in Splenda), is genotoxic. Big yikes
(tags: genotoxic sucralose sweeteners additives soft-drinks junk-food food health)
“Data protection IS AI regulation”
The FTC have proposed a judgement against Amazon/Ring: “FTC says Ring employees illegally surveilled customers, failed to stop hackers from taking control of users’ cameras. Under proposed order, Ring will be prohibited from profiting from unlawfully collected consumer videos, pay $5.8M in consumer refunds.” Meredith Whittaker on Twitter, responding: “Speaking of real AI regulation grounded in reality! The part about Amazon being “prohibited from profiting from unlawfully collected consumer videos” is huge. Data protection IS AI regulation. & in this case will likely mean undoing datasets, retraining/disposing of models, etc.” Retraining/discarding datasets is a HUGE deal for AI/ML companies. This is the big stick for regulators. I hope the EU DPCs are paying attention to this judgement.
(tags: regulation ai ml training data-protection privacy ring amazon ftc)
-
New fast food frankenstein dish just dropped:
a fast food dish created in 2003 in the Dutch city of Rotterdam, consisting of a layer of french fries placed into a disposable metal take-away tray, topped with döner or gyro meat, covered with slices of Gouda cheese, and heated in an oven until the cheese melts. Then a layer of shredded iceberg lettuce is added, dressed with garlic sauce and sambal, a hot sauce from Indonesia. … The term kapsalon is Dutch for “hairdressing salon” or barber shop, alluding to one of the inventors of the dish who worked as a hairdresser.
This sounds delicious.
-
“The Story of Mel” is a legendary USENET story of “Mel”, a Real Programmer from back in the day, performing a truly impressive piece of optimization; a “paean to seat-of-the-pants machine coding”, as Micheal puts it. This site is a little shrine to Mel’s life and history from a MeFi user. (Via Meehawl)
(tags: mefi hacks mel usenet history computing-history via:meehawl machine-code)
Why the United States should prioritize autonomous demining technology
Excellent “AI for good” idea from the Bulletin of the Atomic Scientists:
Investments in and development of technologies for autonomous demining operations, post war, are long overdue and consistent with the White House’s push for a Blueprint for an AI Bill of Rights, which vows to use autonomy for the public good. Alas, while the Defense Department has pursued autonomous systems for the battlefield and the unincentivized private sector has focused on producing dancing robotic dogs, efforts to develop autonomous demining technology have stagnated. The United States should provide funding to energize those efforts, regardless of what decision is made in regard to sending cluster bombs to Kiev.
AI Hiring and Ghost Jobs Are Making the Job Search, Labor Market Weird
The AI enshittification continues:
Job seekers may virtually interview with or be prescreened by an artificial-intelligence program such as HireVue, Harver, or Plum. After someone applies to a job at a company that uses this software, they may receive an automated survey asking them to answer inane personality-assessment questions like “Which statement describes you best? (a) I love debating academic theories or (b) I adopt a future emphasis.” […] And these AI-moderated processes might not be fair, either. Researchers at the University of California, Berkeley, say that AI decision-making systems could have a 44% chance of being embedded with gender bias, a 26% chance of displaying both gender and race bias, and may also be prone to screening out applicants with disabilities. In one notorious case, an audit of an AI screening tool found that it prioritized candidates who played high-school lacrosse or were named “Jared.”
(tags: jared ai enshittification future jobs work hirevue harver plum ghost-jobs hiring)
Erasure Coding versus Tail Latency – Marc’s Blog
A very neat trick via Marc Brooker to improve tail latencies using erasure coding: ‘Say I have an in-memory cache of objects. I can keep any object in the cache once, and always go looking for it in that one place (e.g. with consistent hashing). If that place is slow, overloaded, experiencing packet loss, or whatever, I’ll see high latency for all attempts to get that object. With hedging I can avoid that, if I store the object in two places rather than one, at the cost of doubling the size of my cache. But what if I wanted to avoid the slowness and not double the size of my cache? Instead of storing everything twice, I could break it into (for example) 5 pieces .. encoded in such a way that I could reassemble it from any four pieces .. . Then, when I fetch, I send five get requests, and have the whole object as soon as four have returned. The overhead here on requests is 5x, on bandwidth is worst-case 20%, and on storage is 20%. The effect on tail latency can be considerable.’
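A minimal Python sketch of the “5 pieces, any 4 reconstruct” case, using a single XOR parity chunk, the simplest possible erasure code (a real system would use something like Reed–Solomon, but the tail-latency trick is the same: issue all five fetches and reassemble as soon as any four arrive):

```python
def encode(obj: bytes, k: int = 4) -> list:
    # pad to a multiple of k, split into k data chunks, append XOR parity
    obj += b"\x00" * (-len(obj) % k)
    size = len(obj) // k
    chunks = [obj[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(a ^ b ^ c ^ d for a, b, c, d in zip(*chunks))
    return chunks + [parity]

def decode(pieces: list, k: int = 4) -> bytes:
    # any k of the k+1 pieces suffice: one missing data chunk is the
    # XOR of the parity chunk and the surviving data chunks
    data = pieces[:k]
    if None in data:
        missing = data.index(None)
        others = [p for i, p in enumerate(pieces)
                  if i != missing and p is not None]
        data[missing] = bytes(a ^ b ^ c ^ d for a, b, c, d in zip(*others))
    return b"".join(data)

pieces = encode(b"tail latency demo!!!")
pieces[2] = None                 # the slowest replica never answered
print(decode(pieces))            # b'tail latency demo!!!'
```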
(tags: architecture cache storage tail-latencies performance marc-brooker lambda erasure-coding algorithms latency)
Container Loading in AWS Lambda
Some lovely details in this writeup of a new system in AWS Lambda, via Marc Brooker:
This system gets performance by doing as little work as possible (deduplication, caching, lazy loading), and then gets resilience by doing slightly more work than needed (erasure coding, salted deduplication, etc). This is a tension worth paying attention to in all system designs.
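For instance, a hedged guess at what “salted deduplication” might look like (illustrative only; the function names and the per-uploader salting choice are my assumptions, not AWS’s published design): content addressing normally dedupes identical chunks to a single stored copy, and mixing in one of a few salts caps that, so a wildly popular chunk never becomes a fleet-wide single point of failure:

```python
import hashlib

def chunk_key(chunk: bytes, uploader_id: str, n_salts: int = 3) -> bytes:
    # plain content addressing would map identical chunks to one key;
    # salting by uploader spreads hot chunks over up to n_salts copies,
    # trading a little storage for resilience (the "slightly more work
    # than needed" the quote describes)
    salt = hashlib.sha256(uploader_id.encode()).digest()[0] % n_salts
    return hashlib.sha256(bytes([salt]) + chunk).digest()

# two uploaders of the same chunk may land on different stored copies:
print(chunk_key(b"common base layer", "fn-a").hex()[:16])
print(chunk_key(b"common base layer", "fn-b").hex()[:16])
```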
(tags: architecture aws lambda marc-brooker performance storage caching containers caches)
Paper recommending continuing COVID-19 vaccination for kids
tl;dr: vaccination of kids is worth it to protect against Long Covid and hospitalisation. “A Methodological Framework for Assessing the Benefit of SARS-CoV-2 Vaccination following Previous Infection: Case Study of Five- to Eleven-Year-Olds”, Christina Pagel et al.:
We present a novel methodological framework for estimating the potential benefits of COVID-19 vaccination in previously infected children aged five to eleven, accounting for waning. We apply this framework to the UK context and for two adverse outcomes: hospitalisation related to SARS-CoV-2 infection and Long Covid. We show that the most important drivers of benefit are: the degree of protection provided by previous infection; the protection provided by vaccination; the time since previous infection; and future attack rates. Vaccination can be very beneficial for previously infected children if future attack rates are high and several months have elapsed since the previous major wave in this group. Benefits are generally larger for Long Covid than hospitalisation, because Long Covid is both more common than hospitalisation and previous infection offers less protection against it. Our framework provides a structure for policy makers to explore the additional benefit of vaccination across a range of adverse outcomes and different parameter assumptions. It can be easily updated as new evidence emerges.
(tags: vaccines vaccination covid-19 sars-cov-2 modelling long-covid uk)
EU hits Meta with record €1.2B privacy fine
The EDPB finally had to step in and override the pet regulator, our DPC. Here’s the big problem though:
Meta also has until November 12 to delete or move back to the EU the personal data of European Facebook users transferred and stored in the U.S. since 2020 and until a new EU-U.S. deal is reached.
This is going to be technically infeasible given Meta’s architecture, so the next question is, what happens when they fail to do it…
(tags: meta facebook dpc edpb data-protection data-privacy eu us fines)
-
Dropbox’s “dark testing” of HTTP/3, live in production, against a separate test domain. A great way to gather some real-world data. Latencies are appreciably better, particularly for low-quality connections
(tags: dropbox http3 http2 http protocols udp networking ip testing)
My students are using AI to cheat. Here’s why it’s a teachable moment
One of the reasons so many people suddenly care about artificial intelligence is that we love panicking about things we don’t understand. Misunderstanding allows us to project spectacular dangers on to the future. Many of the very people responsible for developing these models (who have enriched themselves) warn us about artificial intelligence systems achieving some sort of sentience and taking control of important areas of life. Others warn of massive job displacement from these systems. All of these predictions assume that the commercial deployment of artificial intelligence actually would work as designed. Fortunately, most things don’t.

That does not mean we should ignore present and serious dangers of poorly designed and deployed systems. For years predictive modeling has distorted police work and sentencing procedures in American criminal justice, surveilling and punishing Black people disproportionately. Machine learning systems are at work in insurance and health care, mostly without transparency, accountability, oversight or regulation.

We are committing two grave errors at the same time. We are hiding from and eluding artificial intelligence because it seems too mysterious and complicated, rendering the current, harmful uses of it invisible and undiscussed. And we are fretting about future worst-case scenarios that resemble the movie The Matrix more than any world we would actually create for ourselves. Both of these habits allow the companies that irresponsibly deploy these systems to exploit us.

We can do better. I will do my part by teaching better in the future, but not by ignoring these systems and their presence in our lives.
-
Synchronize multiple Pi-hole instances; it basically runs the standard backup API on the primary instance, then restores that config to the secondary, ensuring it constantly stays in sync
(tags: pi-hole high-availability home dns synchronization ops)
StanzaSystems/awesome-load-management
“A repo of links to articles, papers, conference talks, and tooling related to load management in software services: loadshedding, circuitbreaking, quota management and throttling. PRs welcome.” (via Niall Murphy)
(tags: load-shedding circuit-breakers quota-management throttling papers architecture patterns via:niallm)
-
Sounds like a massive improvement in operational management of MySQL fleets. Here’s hoping AWS might steal a few ideas for RDS
(tags: mysql ops replicas replication raft distributed-systems meta)
Inside DataDog’s $5M Outage (Real-World Engineering Challenges #8)
Reading between the lines: Ubuntu unattended-upgrades were left enabled, and as a result a fix which required a full reboot was rolled out swiftly and globally, including to a key fleet of network control hosts, regardless of any normal deployment phasing rules. This broke all regions and AZs within a 1-hour period. Whoopsie
(tags: postmortem datadog ubuntu fail unattended-upgrades systemd bugs availability outages)
-
I don’t use either service, but this is actually an excellent writeup of some high-end performance optimization on modern Linux EC2-based systems with NVMe SSDs, and the benchmarking of same
(tags: kafka redpanda benchmarks ec2 aws ssd optimization performance ops)
Never Give Artificial Intelligence the Nuclear Codes
Something new to worry about — giving an AI the keys to the nukes:
Any country that inserts AI into its [nuclear] command and control will motivate others to follow suit, if only to maintain a credible deterrent. Michael Klare, a peace-and-world-security-studies professor at Hampshire College, has warned that if multiple countries automate launch decisions, there could be a “flash war” analogous to a Wall Street “flash crash.” Imagine that an American AI misinterprets acoustic surveillance of submarines in the South China Sea as movements presaging a nuclear attack. Its counterstrike preparations would be noticed by China’s own AI, which would actually begin to ready its launch platforms, setting off a series of escalations that would culminate in a major nuclear exchange.
(tags: ai command-and-control nuclear-war nuclear flash-war)
-
Common misconceptions about swap memory on Linux systems:
Swap is a useful tool to allow equality of reclamation of memory pages, but its purpose is frequently misunderstood, leading to its negative perception across the industry. If you use swap in the spirit intended, though – as a method of increasing equality of reclamation – you’ll find that it’s a useful tool instead of a hindrance. Disabling swap does not prevent disk I/O from becoming a problem under memory contention, it simply shifts the disk I/O thrashing from anonymous pages to file pages. Not only may this be less efficient, as we have a smaller pool of pages to select from for reclaim, but it may also contribute to getting into this high contention state in the first place.
(via valen)
-
handy web tool to figure out if a quote for a domestic solar PV install in Ireland is cheap, on the money, or too pricey
Coinbase spent $65M on Datadog
in one year — Sixty. Five. Million. Dollars.
eSIMs for data roaming in the US
$42 for 20GB of 5G/4G LTE data, and can provide a mobile hotspot for other devices. Looks like a decent enough deal for EU travellers visiting the US, where low-cost data roaming isn’t available (via ITC Slack)
-
“Magical shell history”:
Atuin replaces your existing shell history with a SQLite database, and records additional context for your commands. Additionally, it provides optional and fully encrypted synchronisation of your history between machines, via an Atuin server.
(via Nelson)
Vapes aren’t 95% less harmful than cigarettes
Debunking this common misconception around e-cigarettes
(tags: cigarettes vapes smoking vaping health debunking fact-checks)
Will A.I. Become the New McKinsey?
Great stuff from Ted Chiang:
A former McKinsey employee has described the company as “capital’s willing executioners”: if you want something done but don’t want to get your hands dirty, McKinsey will do it for you. That escape from accountability is one of the most valuable services that management consultancies provide. Bosses have certain goals, but don’t want to be blamed for doing what’s necessary to achieve those goals; by hiring consultants, management can say that they were just following independent, expert advice. Even in its current rudimentary form, A.I. has become a way for a company to evade responsibility by saying that it’s just doing what “the algorithm” says, even though it was the company that commissioned the algorithm in the first place. The question we should be asking is: as A.I. becomes more powerful and flexible, is there any way to keep it from being another version of McKinsey?
(tags: ai capitalism mckinsey future politics ted-chiang)
f3write, f3read – test real flash memory capacity
‘F3 (Fight Flash Fraud or Fight Fake Flash) tests the full capacity of a flash card (flash drive, flash disk, pendrive). It writes to the card and then checks if it can read it. It will assure you haven’t been sold a card with a smaller capacity than stated.’
(tags: f3read f3write f3 sd-cards flash memory storage testing hardware via:bigbro)
The Wide Angle: Understanding TESCREAL — Silicon Valley’s Rightward Turn
As you encounter these ideologies [Transhumanism, Extropianism, Singularitarianism, Cosmism, Rationalism, Effective Altruism, and Longtermism] in the wild, you might use the TESCREAL lens, and its alignment with Eurasianism and Putin’s agenda, to evaluate them, and ask whether they tend to undermine or enhance the project of liberal democracy. TESCREAL ideologies tend to advance an illiberal agenda and authoritarian tendencies, and it’s worth turning a very critical eye towards them, especially in cases where that’s demonstrably true. Clearly there are countless well-meaning people trying to use technology and reason to improve the world, but that should never come at the expense of democratic, inclusive, fair, patient, and just governance. The biggest risk AI poses right now is that alarmists will use the fears surrounding it as a cudgel to enact sweeping policy reforms. We should resist those efforts. Now more than ever, we should be guided by expertise, facts, and evidence as we seek to use technology in ways that benefit everyone.
(tags: ideology future tescreal ea longtermism ai politics silicon-valley)
heightened risk of autoimmune diseases after Covid
More evidence of a “substantially increased risk of developing a diverse spectrum of new-onset autoimmune diseases”:
Previously we knew there were many features of autoimmunity engendered by Covid, but the link to manifesting important autoimmune diseases has not been established. There are still many dots not connected—it’s fuzzy. We need to better understand how the dysregulation of our immune system that can occur from a Covid infection (or even more rarely from a vaccine) can be linked with a serious autoimmune condition. While we’ve fully recognized that people with autoimmune diseases are more vulnerable to Covid and adverse outcomes, the flip of that — that Covid can make some people vulnerable to autoimmune diseases — is what’s new.
(from the always excellent Eric Topol.)
(tags: covid-19 long-covid pasc autoimmune diseases health medicine research eric-topol)
In a small study, an AI ‘brain decoder’ inches toward reading minds
In a new Nature Neuroscience paper published Monday, Huth and a team of researchers from the University of Texas at Austin introduced a new “brain decoder” enabled by GPT-1, an earlier version of the artificial neural network technology that underpins ChatGPT. After digesting several hours of training data, the new tool was able to describe the gist of stories the three participants in the proof-of-concept experiment listened to — just by looking at their functional MRI scans.
Very cool stuff. And I am happy to see the ethical considerations have been considered:

“It is important to constantly evaluate what the implications are of new brain decoders for mental privacy,” said Jerry Tang, a Ph.D. candidate in Huth’s lab and lead author on the paper, in a press briefing. In devising ways to protect privacy, the authors asked participants to try to prevent the decoder from reconstructing the words they were hearing, in several different ways. Methods that were particularly effective at stopping the decoder included mentally listing off animals, and telling a different story at the same time the podcast was playing, said Tang. The authors also found that the decoder had to be trained on each subject’s data and wasn’t effective when used on another person. Between these findings and the fact that any movement would make the fMRI scans worse, the authors concluded that it’s not currently possible for a brain decoder to be used on someone against their will.
-
“A High School Teacher’s Free Image Database Powers AI Unicorns”:
To build LAION, founders scraped visual data from companies such as Pinterest, Shopify and Amazon Web Services — which did not comment on whether LAION’s use of their content violates their terms of service — as well as YouTube thumbnails, images from portfolio platforms like DeviantArt and EyeEm, photos from government websites including the US Department of Defense, and content from news sites such as The Daily Mail and The Sun. If you ask Schuhmann, he says that anything freely available online is fair game. But there is currently no AI regulation in the European Union, and the forthcoming AI Act, whose language will be finalized early this summer, will not rule on whether copyrighted materials can be included in big data sets. Rather, lawmakers are discussing whether to include a provision requiring the companies behind AI generators to disclose what materials went into the data sets their products were trained on, thus giving the creators of those materials the option of taking action. […] “It has become a tradition within the field to just assume you don’t need consent or you don’t need to inform people, or they don’t even have to be aware of it. There is a sense of entitlement that whatever is on the web, you can just crawl it and put it in a data set,” said Abeba Birhane, a Senior Fellow in Trustworthy AI at Mozilla Foundation.
(tags: consent opt-in web ai ml laion training-data scraping)
Ask HN: Most interesting tech you built for just yourself?
Fantastic thread of hackers scratching their own itch (via SimonW)
(tags: via:simonw hacking projects hn hacks open-source)
informative Twitter thread on the LessWrong/rationalist/”AI risk”/effective altruism cult
“some people understand immediately when i try to explain what it was like to be fully in the grip of the yudkowskian AI risk stuff and some people it doesn’t seem to land at all, which is probably good for them and i wish i had been so lucky”. Bananas…
(tags: cults ai-risk yudkowski future rokos-basilisk lesswrong effective-altruism)
Introducing VirusTotal Code Insight: Empowering threat analysis with generative AI
Impressively, when these models are trained on programming languages, they can adeptly transform code into natural language explanations. […] Code Insight is a new feature based on Sec-PaLM, one of the generative AI models hosted on Google Cloud AI. What sets this functionality apart is its ability to generate natural language summaries from the point of view of an AI collaborator specialized in cybersecurity and malware. This provides security professionals and analysts with a powerful tool to figure out what the code is up to. At present, this new functionality is deployed to analyze a subset of PowerShell files uploaded to VirusTotal. The system excludes files that are highly similar to those previously processed, as well as files that are excessively large. This approach allows for the efficient use of analysis resources, ensuring that only the most relevant files (such as PS1 files) are subjected to scrutiny. In the coming days, additional file formats will be added to the list of supported files, broadening the scope of this functionality even further.
(via Julie on ITC Slack)(tags: virustotal analysis malware code reverse-engineering infosec security)
How Philly Cheesesteaks Became a Big Deal in Lahore, Pakistan
This is fascinating history:
An establishment with a legacy such as [The Lahore Gymkhana Club, founded in 1878 under British rule] needed to continue revamping itself and serve exclusive dishes for its high-end clientele. And the club, along with restaurants aspiring to serve continental food, was bolstered by a growing taste for a new ingredient in town: processed cheese. “Sandwiches gradually started becoming popular in the 1980s because of the [wider] availability of cheese and mushrooms,” says Chaudhry. Until the 1980s, processed cheese was largely imported, and its use was limited to the rich, who would frequent establishments such as the Gymkhana. As Lahori taste buds adapted to and appreciated cheese, production was initiated locally. Demand for cheeseburgers and sandwiches skyrocketed in the 1990s, with a growing number of Pakistanis who’d traveled to the U.S. aspiring to re-create offerings from various popular American chains. One of these is exceptionally familiar. Even today, online food groups in Pakistan are peppered with people asking the community where they can find a cheesesteak in Lahore “like the one at Pat’s.” Many of them post images of the cheesesteaks from the original shop at 9th and Passyunk.
(tags: food cheesesteaks philadelphia history pakistan lahore sandwiches)
“Nothing like this will be built again”
Charlie Stross visits the Advanced Gas-cooled Reactors at Torness nuclear power station:
The AGRs at Torness [in the UK] are not ordinary civil [nuclear] power reactors. Designed in the 1970’s, they were the UK’s bid to build an export-earning civil nuclear power system. They’re sensitive thoroughbreds, able to reach a peak conversion efficiency of 43% — that is, able to turn up to 43% of their energy output into electricity. By comparison, a PWR peaks at 31-32%. However, the PWRs have won the race for commercial success: they’re much, much, simpler. AGRs are like Concorde — technological marvels, extremely sophisticated and efficient, and just too damned expensive and complex for their own good. (You want complexity? Torness was opened in 1989. For many years thereafter, its roughly fifty thousand kilometres of aluminium plumbing made it the most complex and demanding piece of pipework in Europe. You want size? The multi-thousand ton reactor core of an AGR is bigger than the entire plant at some PWR installations.) It’s a weird experience, crawling over the guts of one of the marvels of the atomic age, smelling the thing (mostly machine oil and steam, and a hint of ozone near the transformers), all the while knowing that although it’s one of the safest and most energy-efficient civilian power reactors ever built it’s a technological dead-end, that there won’t be any more of them, and that when it shuts down in thirty or forty years’ time this colossal collision between space age physics and victorian plumbing will be relegated to a footnote in the history books. “Energy too cheap to meter” it ain’t, but as a symbol of what we can achieve through engineering it’s hard to beat.
(tags: engineering nuclear-power agr history uk torness power plumbing)
The Toronto Recursive History Project
“This plaque was commemorated on October 10, 2018, to commemorate its own commemoration. Plaques like this one are an integral part of the campaign to support more plaques like this one. By reading this plaque, you have made a valuable addition to the number of people who have read this plaque. To this day and up to the end of this sentence, this plaque continues to be read by people like yourself. Heritage Toronto 2018”
(tags: heritage toronto recursive plaque commemoration funny)
Palantir Demos AI to Fight Wars But Says It Will Be Totally Ethical Don’t Worry About It
This is a really atrocious idea:
Palantir also isn’t selling a military-specific AI or large language model (LLM) here, it’s offering to integrate existing systems into a controlled environment. The AIP demo shows the software supporting different open-source LLMs, including FLAN-T5 XL, a fine-tuned version of GPT-NeoX-20B, and Dolly-v2-12b, as well as several custom plug-ins. Even fine-tuned AI systems off the shelf have plenty of known issues that could make asking them what to do in a warzone a nightmare. For example, they’re prone to simply making things up, or “hallucinating.” GPT-NeoX-20B in particular is an open-source alternative to GPT-3, a previous version of OpenAI’s language model, created by a startup called EleutherAI. One of EleutherAI’s open-source models — fine-tuned by another startup called Chai — recently convinced a Belgian man who spoke to it for six weeks to kill himself. What Palantir is offering is the illusion of safety and control for the Pentagon as it begins to adopt AI. […] What AIP does not do is walk through how it plans to deal with the various pernicious problems of LLMs and what the consequences might be in a military context. AIP does not appear to offer solutions to those problems beyond “frameworks” and “guardrails” it promises will make the use of military AI “ethical” and “legal.”
(tags: palantir grim-meathook-future war llm aip military ai ethics)
-
More on yesterday’s img2dataset failure to support opt-in:
It isn’t “effective altruism” if you have to force people to comply with you.
Google Launched Bard Despite Major Ethical Concerns From Its Employees
“The staffers who are responsible for the safety and ethical implications of new products have been told not to get in the way or to try to kill any of the generative AI tools in development,” employees told Bloomberg. The ethics team is now “disempowered and demoralized,” according to former and current staffers. Before OpenAI launched ChatGPT in November 2022, Google’s approach to AI was more cautious and less consumer-facing, often working in the background of tools like Search and Maps. But since ChatGPT’s enormous popularity prompted a “code red” from executives, Google’s threshold for safe product releases has been lowered in an effort to keep up with its AI competitors.
(tags: google ai safety chatgpt bard corporate-responsibility)
Shitty behaviour around the img2dataset AI scraper
The author of this popular AI training data scraping tool doesn’t seem to understand consent and opt-in:
Letting a small minority [ie web publishers] prevent the large majority [AI users] from sharing their images and from having the benefit of last gen AI tool would definitely be unethical yes. Consent is obviously not unethical. You can give your consent for anything if you wish. It seems you’re trying to decide for million of other people without asking them for their consent.
In other words, “scraping your content without opt-in is better than denying access to your content for millions of potential future AI users”. An issue to implement robots.txt support has been languishing since 2021. Good arguments for blocking the img2dataset user agent in general…
Why is British media so transphobic?
Aside from the weirdness of Mumsnet, I didn’t know about the influence of the mid-2000s skeptics movement:
While claiming to be the country’s foremost critical thinkers, the group was riddled with anti-humanities bias and a fetish for a certain kind of “science” that it held to reveal a set of immutable principles upon which the world was built with almost no regard whatsoever for interpretative analysis based on social or historical factors. Part of this mode of thinking was an especially reductivist biologism: the idea that there are immutable realities to be found in our DNA, and if we just paid enough attention to Science and stopped trying to split hairs and discover meaning over in the superfluous disciplines of the humanities, then everything would be much simpler. It’s precisely this kind of biological essentialism — which skirts dangerously close to eugenics — that leads people to think they can “debunk” a person’s claim to their gender identity, or that it should be subjected to rigorous testing by someone in a lab coat before we can believe the subject is who they say they are.
(tags: debunking scepticism skeptics history terfs uk uk-politics gender)