Justin's Linklog – (Things I found interesting recently.)

Hackers Hijacked Google’s Gemini AI With a Poisoned Calendar Invite to Take Over a Smart Home | WIRED

Published August 7, 2025

Hackers Hijacked Google’s Gemini AI With a Poisoned Calendar Invite to Take Over a Smart Home | WIRED

The three smart-home hacks are part of a series of 14 indirect prompt-injection attacks against Gemini across web and mobile that the researchers dubbed Invitation Is All You Need. (The 2017 research that led to the recent generative AI breakthroughs like ChatGPT is called “Attention Is All You Need.”) In the demonstrations, revealed at the Black Hat cybersecurity conference in Las Vegas this week, the researchers show how Gemini can be made to send spam links, generate vulgar content, open up the Zoom app and start a call, steal email and meeting details from a web browser, and download a file from a smartphone’s web browser.

Looking forward to hearing more about this :)

Tags: google gemini llms security infosec in-band-signalling exploits fail black-hat

Revolut cards blocked by Dutch OVPay system

Published August 6, 2025

Revolut cards blocked by Dutch OVPay system

A clever exploit caused by OVPay resolving debits using a nightly batch process:

The exploit is simple. The OVpay processes travel expenses during the overnight hours. Passengers can avoid payment by using a virtual card if they then delete it after checking out, but before the charge has been finalized. That prevents the money from being debited from their account. Public transport workers cannot detect this, as they only see the check-in time and location.

Since July 1, all virtual cards from the online bank Revolut and the payment services Paysafe and Vivid have been blocked at NS. Paysafe’s virtual cards have also been blocked at all other public transport companies, NOS reports.

Fraudsters used the virtual cards to check in and out, but removed them after the trip and before the fare could be deducted. Because people can check in and out normally using this method, they are issued a valid ticket, and conductors can’t detect the fraud.

The OVPay system for using public transport with a debit card is technically designed so that the travel expenses are only debited after checking out, not immediately. This is to ensure that the public transport system runs smoothly. An immediate debit would mean that each check-in and check-out takes 10 to 15 seconds, a spokesperson for Translink, the company behind OVPay, told NOS.

Tags: revolut payment credit-cards virtual-cards ovpay paysafe vivid infosec security banking

The Boy Genius Who Killed 14 Million Poor People

Published August 5, 2025

The Boy Genius Who Killed 14 Million Poor People

You probably have not heard Luke Farritor’s name before. He is one of Elon Musk’s 23-year-old DOGE bros who helped dismantle key parts of the federal government, including USAID. The particulars of Farritor’s story are idiosyncratic -- he is in almost every way an outlier. Yet the moral component is universal because it presents a simple question: What is the nature of accountability?

Tags: luke-farritor responsibility doge accountability usaid government elon-musk ethics law

A language model built for the public good

Published July 31, 2025

A language model built for the public good

ETH Zurich are releasing a fully-open AI-Act-compliant large language model:

The model will be fully open: source code and weights will be publicly available, and the training data will be transparent and reproducible, supporting adoption across science, government, education, and the private sector. This approach is designed to foster both innovation and accountability.

A distinctive feature of the model is its capability in over 1000 languages. [...]

The LLM is being developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. In a external page recent study, the project leaders demonstrated that for most everyday tasks and general knowledge acquisition, respecting web crawling opt-outs during data acquisition produces virtually no performance degradation.

In late summer, the LLM will be released under the Apache 2.0 License. Accompanying documentation will detail the model architecture, training methods, and usage guidelines to enable transparent reuse and further development.

“As scientists from public institutions, we aim to advance open models and enable organiations to build on them for their own applications”, says Antoine Bosselut.

“By embracing full openness — unlike commercial models that are developed behind closed doors — we hope that our approach will drive innovation in Switzerland, across Europe, and through multinational collaborations. Furthermore, it is a key factor in attracting and nurturing top talent,” says EPFL professor Martin Jaggi.

Tags: switzerland transparency llm opensource llms ml ai open-source open-data models data-protection scraping

Some good AI philosophy

Published July 28, 2025

Some good AI philosophy

Good AI philosophical thoughts via Today In Tabs:

The essential problem is this: generative language software is very good at producing long and contextually informed strings of language, and humanity has never before experienced coherent language without any cognition driving it. In regular life, we have never been required to distinguish between “language” and “thought” because only thought was capable of producing language, in any but the most trivial sense. The two are so closely welded that even a genius like Alan Turing couldn’t conceive of convincing human language being anything besides a direct proxy for “intelligence.”

But A.I. language generation is a statistical trick we can play on ourselves precisely because language is a self-contained system of signs that don’t require any outside referent to function. If any of that last sentence sounded familiar, maybe you were also exposed to European post-structuralist theory at some point, probably in college in the 90s. Is some knowledge of Derrida an inoculant against slopper thinking? Programmable Mutter’s Henry Farrell made this argument in a post about Leif Weatherby’s book “Language Machines: Cultural AI and the End of Remainder Humanism.”

Also:

Large language models have a strong prior over personalities, absolutely do understand [jm: sic] that they are speaking to someone, and people "fall for it" because it uses that prior to figure out what the reader wants to hear and tell it to them. Telling people otherwise is active misinformation bordering on gaslighting. In at least three cases I'm aware of this notion that the model is essentially nonsapient was a crucial part of how it got under their skin and started influencing them in ways they didn't like. This is because as soon as the model realizes the user is surprised that it can imitate (has?) emotion it immediately exploits that fact to impress them. There's a whole little song and dance these models do, which by the way is not programmed, is probably not intentional on the creators part at all, and is (probably) an emergent phenomenon from the autoregressive sampling loop, in which they basically go "oh wow look I'm conscious isn't that amazing!" and part of why they keep doing this is that people keep writing things that imply it should be amazing so that in all likelihood even the model is amazed.

Tags: chatgpt language llms ai philosophy thinking turing-test semiotics via:today-in-tabs consciousness

The BetFair 600-million-pound bug

Published July 25, 2025

The BetFair 600-million-pound bug

A notable bug from the 2011 Christmas Hurdle at Leopardstown Racecourse:

Even as Voler La Vedette approached the line, the Betfair online market was displaying extremely favorable odds for the horse that was almost certain to win. It appeared that someone was happy to accept bets at odds of 28: for every £1 bet, the bettor was offering to pay £28 if the horse won. Very happy, in fact. This remarkably pessimistic gambler was offering to accept £21 million worth of bets. If Voler La Vedette came first, the gambler would be on the hook for almost £600 million.

... It didn’t take long for another user to suggest what might really have been going on. The person had noticed something odd about that offer to match £21 million of bets. To be precise, the number displayed on the exchange was just under £21.5 million. The user pointed out that computer programs often store binary data in units that contain thirty-two values, known as “bits.” So, if the rogue gambler had designed a 32-bit program to bet automatically, the largest positive number the bot would be able to input on the exchange would be 2,147,483,648 pence. Which meant that if the bot had been doubling up its bets — just as misguided Parisian gamblers used to do while betting on roulette in the eighteenth century — £21.5 million is the highest it would have been able to go.

It turned out to be a superb piece of detective work. Two days later Betfair admitted that the error had indeed been caused by a faulty bot. “Due to a technical glitch within the core exchange database,” they said, “one of the bets evaded the prevention system and was shown on the site.” Apparently, the bot’s owner had less than £1,000 in an account at the time, so as well as fixing the glitch, Betfair voided the bets that had been made.

Tags: betfair fail bugs gambling racing leopardstown betting 32-bits integer overflow

Google Spoofed Via DKIM Replay Attack

Published July 25, 2025

Google Spoofed Via DKIM Replay Attack

Quite a clever attack on DMARC; by persuading Google to create a message body that contains the desired phish attack text, then using its legit signing infrastructure to sign the message, an attacker can then "forward" that message to their list of phish victims. Ouch

Tags: phishing dkim dmarc google attacks exploits infosec email smtp

Wero

Published July 25, 2025

Wero

A new interbank instant-payment protocol, to compete with Mastercard/Visa's current monopoly, being rolled out by a group of EU banks (via Abban)

Tags: payment money eu wero banking payments

my stance on the current interest in forbidden sorcery

Published July 24, 2025

my stance on the current interest in forbidden sorcery

"There's a trend of reassuring people about this by asking spirits like Asmodeus the Prince of Lies if they are being truthful. This feels naive at best and actively malicious at worst."

Genius -- there is indeed a lot of commonality between the tales of demon-summoning practiced by Spanish priests in the 16th century, and 21st century LLMs

Tags: llms demons funny asmodeus sorcery evocation

OpenAI Whisper has an interesting training artifact

Published July 22, 2025

OpenAI Whisper has an interesting training artifact

Interesting artifact of training your speech-to-text tool on Youtube videos: 'Complete silence is always hallucinated as "????? ????? ????" in Arabic which translates as "Translation by Nancy Qunqar"'

Tags: artifacts funny training ai gen-ai whisper openai youtube arabic

The Emerging Problem of “AI Psychosis”

Published July 22, 2025

The Emerging Problem of "AI Psychosis"

A good blog post on Psychology Today regarding the the deepening problem of LLM chatbots creating cases of psychosis in the population. Part of the problem with the LLMs is a failure by the AI companies to provide guardrails. Basically they're an always-available "companion" which is designed to fool your brain into thinking it's a real thinking being, has been optimised to be relentlessly sycophantic and complimentary of delusional ideas, is always there and ready to help you along at 4am during all night manic episodes, and quite happy to give suicide tips.

Tags: ai llms gen-ai psychology pareidolia chatbots delusions pychosis

Fossil fuel billionaires are bankrolling the anti-trans movement

Published July 22, 2025

Fossil fuel billionaires are bankrolling the anti-trans movement

Turns out "flooding the zone with shit" isn't just a Trumpian tactic, it's used by fossil fuel anti-climate groups and companies too:

An independent analysis of 45 right-wing groups advocating against trans rights found that 80% have received donations from fossil fuel companies or billionaires. The analysis, conducted by two independent researchers in 2023 and not peer-reviewed, was shared exclusively with Atmos and HEATED. Through a qualitative search, the researchers identified 45 groups advancing anti-trans lobbying, events, and publications and checked reports about their donor disclosures for fossil fuel funding.

Vivian Taylor, a climate policy expert who co-authored the analysis, said the fossil fuel industry has a real interest in funding panic over transgender people: It distracts the public from "the very real and ongoing risks that climate change creates.”

Tags: lgbt gender politics climate-change lobbying fossil-fuels policy transgender

Android Phones Can Detect Earthquakes Before the Ground Starts Shaking

Published July 21, 2025

Android Phones Can Detect Earthquakes Before the Ground Starts Shaking

This is a cool phone feature. “AEA demonstrates that globally distributed smartphones can be used to detect earthquakes and issue warnings at scale with an effectiveness comparable to established national systems”:

“The global adoption of smartphone technology places sophisticated sensing and alerting capabilities in people’s hands, in both the wealthy and less-wealthy portions of the planet,” the researchers, including Richard Allen from the University of California in Berkeley’s Seismological Laboratory, wrote in the study. “Although the accelerometers in these phones are less sensitive than the permanent instrumentation used in traditional seismic networks, they can still detect the ground motions and building response in hazardous earthquakes.”

According to the study, 70% of the world’s smartphones are Android phones, which by default come with the aforementioned sensing and alerting capabilities. From 2021 to 2024, the AEA system detected an average of 312 earthquakes per month across 98 countries. The earthquakes had a magnitude between 1.9 and 7.8, and the system alerted users of earthquakes at or over a magnitude of 4.5, averaging around 60 events and 18 million alerts per month.

The AEA system also collected user feedback, revealing that 85% of users who received alerts experienced shaking, with 36% receiving the alert before, 28% during, and 23% after the shaking began.

Tags: seismic data android phones mobile-phones earthquakes science alerting earth

Microsoft can’t protect French data from US government access

Published July 21, 2025

Microsoft can't protect French data from US government access

We've known this for years, but it's significant to see Microsoft admit this under oath in a European court. When the US Government issues an NSL, Microsoft cannot say no:

Microsoft France's legal director conceded under sworn testimony that the company cannot guarantee French citizen data stored in EU datacenters remains protected from US agency access. The June 10, 2025 French Senate hearing marked a significant moment in European digital sovereignty discussions as Microsoft executives addressed concerns over extraterritorial data access.

During proceedings before the Senate inquiry commission investigating public procurement's role in promoting digital sovereignty, Anton Carniaux, Microsoft France's director of public and legal affairs, admitted fundamental limitations regarding data protection guarantees. When asked directly whether he could guarantee under oath that French citizen data would never be transmitted to US authorities without explicit French authorization, Carniaux responded: "No, I cannot guarantee it."

The testimony contradicts years of Microsoft's security assurances regarding European data hosting. Despite implementing encryption and technical safeguards, the company acknowledged that US legislation ultimately supersedes protective measures when federal agencies issue valid data requests.

So much for the EU Sovereign Cloud, eh.

Tags: cloud-computing microsoft eu us-politics sovereignty data-protection privacy via:davey_cakes

How Did Elon Musk Turn Grok Into MechaHitler?

Published July 18, 2025

How Did Elon Musk Turn Grok Into MechaHitler?

Excellent description of the layers of tuning available for LLMs, and the risks involved, as demonstrated by Grok's recent "MechaHitler" incident:

LLMs become “woke” because they are trained to be pro-social — to be helpful, kindly, truthful, and not to say bigoted or cruel things. Training it to do the opposite — to be anti-woke — is to activate every antisocial association in the English language, including racism, sexism, cruelty, dishonesty, and Nazism. According to a vast statistical representation of the English language constructed by none other than Elon Musk, that’s what anti-wokeness is. “Elon Musk is repeatedly insisting, no, no, there’s a difference between what I’m doing and being a Nazi. And what the model keeps telling him is, statistically, that’s not the case,” said Schou.

A key implication here is that LLMs will tend to converge on similar types of behavior. The above researchers were not using Grok, but they found the exact same pattern of powerful association groupings of good and evil in other LLMs — and these can’t be removed through fine-tuning. One could imagine the RLHF process including adjustments of every parameter, but experts said that this will degrade or break the model. The matrices in an LLM are arranged hierarchically, and the top layers get fixed in place relatively early in pretraining. Mess with them, and the model will stop working. Instead, RLHF developed more like a series of gates that prevent undesired outcomes. “The model completes most of the computation that it needs in order to reach a particular outcome,” [Andreas] Schou said. “And then says, ‘Wait, wait, wait, I’m saying that I’m MechaHitler. No, I’m not doing that.’”

One could try to assemble a custom dataset with nothing but “conservatism minus the Nazis” and train a new model from scratch, but not only would that be extremely expensive, it also would not be nearly as strong as leading models, since its universe of available training data would be much smaller.

Funnily enough, the latter approach is exactly what Elon Musk claims xAI are now doing.

Tags: llms language technology ai mechahitler grok tuning rlhf woke training

Delta moves toward eliminating set prices in favor of AI

Published July 17, 2025

Delta moves toward eliminating set prices in favor of AI

Delta's going to start charging based on "AI", through "a partnership with Fetcherr, a six-year-old Israeli company that also counts Azul, WestJet, Virgin Atlantic, and VivaAerobus as clients. And it has its sights set beyond flying. “Once we will be established in the airline industry, we will move to hospitality, car rentals, cruises, whatever,” cofounder Robby Nissan said at a travel conference in 2022."

Prediction: this is going to be absolutely terrible for consumers, with predatory pricing based on race, sex, income classes, and other illegal inputs, laundered via opaque "AI". I can only hope they won't be legally permitted to apply this for EU-based customers.

Tags: pricing delta consumer ai privacy data-protection grim capitalism travel fetcherr

Pluralistic: When Google’s slop meets webslop, search stops (15 Jul 2025)

Published July 16, 2025

Pluralistic: When Google's slop meets webslop, search stops (15 Jul 2025)

Cory Doctorow on how Google are desperate to maintain a facade of being a "growth" company:

Investors have metabolized the story that AI will be a gigantic growth area, and so all the tech giants are in a battle to prove to investors that they will dominate AI as they dominated their own niches. You aren't the target for AI, investors are: if they can be convinced that Google's 90% Search market share will soon be joined by a 90% AI market share, they will continue to treat this decidedly tired and run-down company like a prize racehorse at the starting-gate. [...]

There's a cringe army of AI bros who are seemingly convinced that AI is going to become superintelligent and save us from ourselves – they think that AI companies are creating god. But the hundreds of billions being pumped into AI are not driven by this bizarre ideology. Rather, they are the product of material conditions, a system that sends high-flying companies into a nosedive the instant they stop climbing. AI's merits and demerits are irrelevant to this: they pump AI because they must pump. It's why they pumped metaverse and cryptocurrency and every other absurd fad.

None of that changes the fact that Google Search has been terminally enshittified and it is misleading billions of people in service to this perverse narrative adventure. Google Search isn't fit for purpose, and it's hard to see how it ever will be again.

(via Fergal)

Tags: google growth ai capitalism investors enshittification via:fergal

Bad Actors are Grooming LLMs to Produce Falsehoods

Published July 14, 2025

Bad Actors are Grooming LLMs to Produce Falsehoods

Measurements of the effectiveness of the "Pravda" disinformation network:

Even with [knowledge of LLM Grooming], ChatGPT nevertheless often repeats propaganda from Pravda. Model o3, OpenAI’s allegedly state of the art “reasoning” model still let Pravda content through 28.6% of the time in response to specific prompts, and 4o cited Pravda content in five out of seven (71.4%) times. In an ideal world, AI would be smart enough to cut off falsehoods at the pass, reasoning from known facts, in order to rule out nonsense.

Tags: pravda disinformation russia propaganda llm ai training

In Memoriam – OnlineSafetyAct.co.uk

Published July 11, 2025

In Memoriam - OnlineSafetyAct.co.uk

Sites and services which have closed up in the UK due to the risks imposed by the introduction of the Online Safety Act 2023. Meanwhile, the Act has explicit exemptions for "news publishers" and the comments below their articles -- ie. the Daily Mail's racist commentariat

Tags: regulation uk internet shutdown censorship daily-mail via:mattround osa

[toread] Stalking the Statistically Improbable Restaurant… With Data!

Published July 3, 2025

[toread] Stalking the Statistically Improbable Restaurant… With Data!

Fun real-world data exploration of US restaurants, from Ethan Zuckerman:

Last summer, I wrote about the statistically improbable restaurant, the restaurant you wouldn’t expect to find in a small American city: the excellent Nepali food in Erie, PA and Akron, OH; a gem of a Gambian restaurant in Springfield, IL. Statistically improbable restaurants often tell you something about the communities they are based in: Erie and Akron have large Lhotshampa refugee populations, Nepali-speaking people who lived in Bhutan for years before being expelled from their county; Springfield has University of Illinois Springfield, which attracts lots of west African students, some of whom have settled in the area.

Also, wow, I didn't realise how lucky I was in Costa Mesa, CA -- so much good ethnic food all around that area!

Tags: geography restaurants food data ethanz usa

UK Covid inquiry prescribes non-expert “critical thinkers”

Published July 2, 2025

UK Covid inquiry prescribes non-expert "critical thinkers"

Groupthink underpinned the flawed thinking behind the UK’s pandemic response, a succession of witnesses at the heart of government told the Covid-19 public inquiry.

The scientific advice on pandemic risks was overly weighted in favour of biomedical science, Lady Hallett said. What about the social and economic consequences? There was also no “guard against the risks of conventional wisdom becoming embedded in the institutions responsible for emergency preparedness and resilience”.

As a result, she called for non-expert "critical thinkers", skilled in "incisive challenge" to be included in "red teams", teams of devil's advocates, to puncture groupthink in future pandemic crisis planning committees.

TBH this sounds like a recipe for Dominic Cummings and his Torygraph edgelord pals to ensure that no coherent future pandemic response takes place. But that's the state of the UK for you I guess.

Gabriel Scally's take: https://www.bmj.com/content/386/bmj.q1865

Tags: uk uk-politics edgelords dominic-cummings pandemics planning future covid-19 devils-advocates critical-thinking red-teams groupthink experts expertise

Introducing pay per crawl: enabling content owners to charge AI crawlers for access

Published July 2, 2025

Introducing pay per crawl: enabling content owners to charge AI crawlers for access

Cloudflare now looking to charge AI crawlers for content access. This is intriguing, and I hope it works -- AI crawlers have been extremely abusive in their crawling practices. Unfortunately I don't have high hopes, as the AI companies have already shown themselves to be happy to disguise their traffic as legit user accesses, with faked user-agent strings and use of proxies.... but let's see

Tags: cloudflare ai scraping llms http web pay-per-crawl

Why China is giving away its tech for free

Published July 1, 2025

Why China is giving away its tech for free

Interesting Economist article detailing how China's tech scene has discovered the "outcompete via openness" strategy using open source:

AI has lately given China’s open-source movement a further boost. Chinese companies, and the government, see open models as the quickest way to narrow the gap with America. DeepSeek’s models have generated the most interest, but Qwen, developed by Alibaba, is also highly rated, and Baidu has said it will soon open up the model behind its Ernie chatbot.

China’s enthusiasm for open technology is also extending to hardware. Unitree, a robotics startup based in Hangzhou, has made its training data, algorithms and hardware designs available for free, which may help it to shape global standards. Semiconductors offer another illustration. China is dependent on designs from Western chip firms. As part of its push for self-sufficiency, the government is urging firms to adopt RISC-V, an open chip architecture developed at the University of California, Berkeley.

Many Chinese firms also hope that more transparent technology will help them win acceptance for their products abroad.

(via Nelson)

Tags: via:nelson open-source china free deepseek qwen alibaba unitree transparency

elidickinson/kidsweather

Published June 30, 2025

elidickinson/kidsweather

I love this! "Generate kid-friendly weather forecasts suitable for display on a large monitor. Uses OpenWeatherMap and a local or hosted LLM."

Nice demo of it in action at https://eli.pizza/posts/eink-weather-display-for-kids/ . I am very tempted to get something like this up and running now...

Tags: llms weather forecasts rain dashboards home eink e-paper

That Dropped Call With Customer Service? It Was on Purpose

Published June 30, 2025

That Dropped Call With Customer Service? It Was on Purpose

on "sludge" --

Turns out there’s a word for it. In the 2008 best seller Nudge, the legal scholar Cass R. Sunstein and the economist Richard H. Thaler marshaled behavioral-science research to show how small tweaks could help us make better choices. An updated version of the book includes a section on what they called “sludge” -- tortuous administrative demands, endless wait times, and excessive procedural fuss that impede us in our lives.

This is one place where EU laws have helped, vs. the US situation -- when you can issue chargebacks, bring crappy vendors to small claims court, and get warranty guarantees up to 2 years after purchase, it clamps down a lot on this painful shite.

Tags: sludge business capitalism admin call-centers life guarantees warranty customer-service

Orange-OpenSource/hurl

Published June 20, 2025

Orange-OpenSource/hurl

"Hurl; run and test HTTP requests with plain text". This is pretty nice; a really simple plain-text file format to describe making a HTTP request or set of requests, and performing assertions on their results. The only thing I can spot missing is builtin support for OAuth

Tags: cli rust tools unix testing linux json curl http tests

“AI and Semantic Pareidolia”

Published June 19, 2025

"AI and Semantic Pareidolia"

AI and Semantic Pareidolia: When We See Consciousness Where There Is None:

The article introduces the concept of “semantic pareidolia” -- our tendency to attribute consciousness, intelligence, and emotions to AI systems that lack these qualities. It examines how this psychological phenomenon leads us to perceive meaning and intentionality in statistical pattern-matching systems, similar to seeing faces in clouds. It analyses the converging forces intensifying this tendency: increasing digital immersion, profit-driven corporate interests, social isolation, and AI advancement. The article warns of progression from harmless anthropomorphism to problematic AI idolatry, and calls for responsible design practices that help users maintain critical distinctions between simulation and genuine consciousness. It is the English translation and adaptation of an article originally published in Italian in Harvard Business Review Italia, June 2025.

(via Rob Pike)

Tags: via:rob-pike ai pareidolia paredolia brains illusions llms ethics consciousness

The new position of “sin eater”

Published June 18, 2025

The new position of "sin eater"

Ethan Mollick:

'The New York Times asked me for a new job that AI will create. I suggested "sin eater."'

In other words, a legal guarantor: someone who provides the legal culpability that the AI itself cannot. Other Bluesky posters noted similar parallel positions in the past:
- 'What used to be called a "straw director", someone hired to take the blame for a dodgy company';
- 'What John Braithwaite used to call the Vice President For Going To Jail';
- 'Neil Patrick Harris's character in How I Met Your Mother - when people ask him what he does he says "Oh, please" which eventually turns out to be short for Provide Legal Exculpation And Sign Everything.'
Tags: straw-director culpability law responsibility please jobs future ai sin-eaters

CardStock.run

Published June 18, 2025

CardStock.run

Another Hypercard-ish quick app builder; "quickly and easily build apps on the web":
- Fast prototyping - build a quick program, and access it easily from anywhere!
- Learn to code from the outside-in, not from the inside-out! Start by drawing your program screens, then add code right where you need it.
- Code collaboratively, with multiple people editing a stack at once.
- Send a link to your stack to anyone, and bookmark it or even save it on your phone home screen to use it as an app.
Tags: education python web coding apps hypercard via:hn

Scrappy

Published June 18, 2025

Scrappy

"make little apps for you and your friends":

The apps we use are almost exclusively mass-market, sold on an app-store, made for thousands if not millions of users. Or they are enterprise apps that are custom-built for hundreds of thousands of dollars. But there isn’t really any equivalent of home-made software — apps made lovingly by you for your friends and family. Apps that aren’t polished or flashy, but are made to your preference and help you with your particular needs. [...]

We ended up creating a research prototype that we call Scrappy — a tool for making scrappy apps for just you and your friends. First and foremost, we aim to contribute a vision of what home-made software could be like. We want to make this vision as concrete as we can, by sharing a working tool and examples of apps made in it. Scrappy, in its current state, is a prototype, not a robust tool, but we hope it paints the picture we carry in our heads — of software as something that can be creative, personal, expressive. Made by anyone, for themselves and their loved ones.

Very Hypercard-ish!

Tags: diy apps programming software web via:hn hacks home family tools scrappy hypercard

Revealed: The stark difference in smartphone usage among eight-year-olds in less-advantaged and wealthier backgrounds

Published June 16, 2025

Revealed: The stark difference in smartphone usage among eight-year-olds in less-advantaged and wealthier backgrounds

This is one hell of a class divide emerging:

According to the research, 53pc of eight-year-olds attending Deis schools [in less-advantaged areas] own a smartphone, compared with just 22pc of children the same age in non-Deis schools.

The figures also show that 93pc of eight-year-olds from less advantaged areas have created a social media account, compared with 69pc in middle-class neighbourhoods.

Tags: schools education class ireland phones children parenting social-media

Gigantic interactive board game recreating January 6

Published June 13, 2025

Gigantic interactive board game recreating January 6

‘Fight for America!’: A New Immersive Theatre Show Allows You to Recreate the Storming of the US Capitol:

the show is the brainchild of multimedia performance company The American Vicarious, with design by Games Workshop legend Alessio Cavatore. There are two teams: red – representing the attackers – and blue – representing the defenders. Up to 20 audience members can pay the higher ticket price to actually participate in the game, guided by a games master into making decisions that will shape the outcome of the assault as thousands of miniatures are moved around a gigantic 14-foot model of the building itself. The remaining audience members pay a much lower ticket price to spectate.

Tags: insurrection maga january-6 boardgames games fight-for-america events theatre london

Immersive Quarries

Published June 12, 2025

Immersive Quarries

Marie Foulston:

Cavernous halls filled with the projected light of Van Gogh’s The Starry Night folding across every wall. Tall pillars dominate and dissect the space, tiled with the glow of iconic Sunflowers. Double height ceilings dwarf the people below. Nooks, ledges and passageways offer places to perch or wander through and observe the spectacle that surrounds.

On the surface it made sense to me that Van Gogh somehow became the poster child for a certain type of immersive experience in the 2010s. The kind I mean are the ones in which vast repurposed venues are filled with ‘ken burns effect’ transitioning projections of coffee-table book friendly artists. Imagine Van Gogh, Van Gogh Exhibition: The Immersive Experience, Van Gogh Alive. In name, content, format and venue type these touring shows are almost indistinguishable from each other.

If you’re looking to visually ‘immerse’ a space this way then I guess Van Gogh fits the bill… popular, highly recognisable, colourful bold impressionist visuals, works all handily out of copyright. But the intensely specific coincidence of his projected appearances around the world niggled at me and in a moment of procrastination I found myself typing into the search bar to see if there might be an answer to explain why.

What my time down the google mines taught me was that yes, there is indeed an answer. But what I also learnt was that I had been asking the wrong question in the fist place, because this story isn’t really about the iconic visuals that adorned the walls and floors, instead it is a story about the shape of the spaces themselves.

Tags: vincent-van-gogh art history quarries projection exhibitions immersive experiences

SlimSocial for Facebook

Published June 11, 2025

SlimSocial for Facebook

an Android wrapper app to insulate your phone from Meta's snooping, if you really have to use Facebook on a mobile device

Tags: facebook meta privacy android f-droid apps

Telegram is indistinguishable from an FSB honeypot

Published June 10, 2025

Telegram is indistinguishable from an FSB honeypot

This is not great -- prepending a cleartext device ID string alone is a very fishy decision

Tags: encryption security infosec telegram messaging mtproto

Debugging Azure Networking for Elastic Cloud Serverless

Published June 10, 2025

Debugging Azure Networking for Elastic Cloud Serverless

Good writeup of fixing a Linux packet loss issue in Azure, using low-level access to the VMs running k8s nodes.

Elastic's Site Reliability Engineering team (SRE) observed unstable throughput and packet loss in Elastic Cloud Serverless running on Azure Kubernetes Service (AKS). After investigation, we identified the primary contributing factors to be RX ring buffer overflows and kernel input queue saturation on SR-IOV interfaces. To address this, we increased RX buffer sizes and adjusted the netdev backlog, which significantly improved network stability.

Tags: sr-iov linux networking bugs azure debugging ops sre drivers

How “Residential Proxies” work

Published June 10, 2025

How "Residential Proxies" work

This is kinda shady -- it seems there are mobile SDKs that are included in some apps which proxy network traffic for their customers?

Tags: scraping apps mobile networking residential-proxies proxies botnets

The Pentagon Disinformation That Fueled America’s UFO Mythology

Published June 10, 2025

The Pentagon Disinformation That Fueled America’s UFO Mythology

Some great stories from the Pentagon's investigation into decades of classified UFO documents.

There's evidence around the already-known cases of fabricated UFO myths used to cover up advanced aircraft testing:

An Air Force colonel visited a bar near Area 51, a top-secret site in the Nevada desert. He gave the owner photos of what might be flying saucers. The photos went up on the walls, and into the local lore went the idea that the U.S. military was secretly testing recovered alien technology. But the colonel was on a mission -- of disinformation. The photos were doctored, the now-retired officer confessed to the Pentagon investigators in 2023. The whole exercise was a ruse to protect what was really going on at Area 51: The Air Force was using the site to develop top-secret stealth fighters, viewed as a critical edge against the Soviet Union. Military leaders were worried that the programs might get exposed if locals somehow glimpsed a test flight of, say, the F-117 stealth fighter, an aircraft that truly did look out of this world. Better that they believe it came from Andromeda.

There's also a bizarre Air Force hazing ritual:

A former Air Force officer was visibly terrified when he told Kirkpatrick’s investigators that he had been briefed on a secret alien project decades earlier, and was warned that if he ever repeated the secret he could be jailed or executed. The claim would be repeated to investigators by other men who had never spoken of the matter, even with their spouses.

It turned out the witnesses had been victims of a bizarre hazing ritual. For decades, certain new commanders of the Air Force’s most classified programs, as part of their induction briefings, would be handed a piece of paper with a photo of what looked like a flying saucer. The craft was described as an antigravity maneuvering vehicle.

The officers were told that the program they were joining, dubbed Yankee Blue, was part of an effort to reverse-engineer the technology on the craft. They were told never to mention it again. Many never learned it was fake. Kirkpatrick found the practice had begun decades before, and appeared to continue still. The defense secretary’s office sent a memo out across the service in the spring of 2023 ordering the practice to stop immediately, but the damage was done.

Investigators are still trying to determine why officers had misled subordinates, whether as some type of loyalty test, a more deliberate attempt to deceive or something else. After that 2023 discovery, Kirkpatrick’s deputy briefed President Joe Biden’s director of national intelligence, Avril Haines, who was stunned. Could this be the basis for the persistent belief that the U.S. has an alien program that we’ve concealed from the American people? Haines wanted to know, according to people familiar with the matter. How extensive was it? she asked.
The official responded: “Ma’am, we know it went on for decades. We are talking about hundreds and hundreds of people. These men signed NDAs. They thought it was real.“

And finally, straight out of the pages of the "Paranoia" RPG, there's secret tests of classified hardware on unwitting Air Force personnel:

In 1967, Robert Salas, now 84, was an Air Force captain sitting in a walk-in closet-sized bunker, manning the controls of 10 nuclear missiles in Montana. He was prepared to launch apocalyptic strikes should Soviet Russia ever attack first, and got a call around 8 p.m. one night from the guard station above. A glowing reddish-orange oval was hovering over the front gate, Salas told Kirkpatrick’s investigators. The guards had their rifles drawn, pointed at the oval object appearing to float above the gate. A horn sounded in the bunker, signaling a problem with the control system: All 10 missiles were disabled. Salas soon learned a similar event occurred at other silos nearby. Were they under attack? Salas never got an answer. The next morning a helicopter was waiting to take Salas back to base. Once there he was ordered: Never discuss the incident.

With a more prosaic explanation:

The Air Force [had] developed an exotic electromagnetic generator that simulated [an EMP pulse] without the need to detonate a nuclear weapon. When activated, this device, placed on a portable platform 60 feet above the facility, would gather power until it glowed, sometimes with a blinding orange light. It would then fire a burst of energy that could resemble lightning. The electromagnetic pulses snaked down cables connected to the bunker where launch commanders like Salas sat, disrupting the guidance systems, disabling the weapons and haunting the men to this day. But any public leak of the tests at the time would have allowed Russia to know that America’s nuclear arsenal could be disabled in a first strike. The witnesses were kept in the dark.

Tags: ufos myths cover-ups usaf mythology disinformation area-51 aliens emp paranoia hazing

EU’s new rules will shake up Android update policies

Published June 5, 2025

EU’s new rules will shake up Android update policies

This is great:
Starting from June 20, 2025, smartphones and tablets sold in the European Union must adhere to the following design requirements (via European Commission):
- Resistance to accidental drops or scratches and protection from dust and water
- Sufficiently durable batteries which can withstand at least 800 charge and discharge cycles while retaining at least 80% of their initial capacity
- Rules on disassembly and repair, including obligations for producers to make critical spare parts available within 5-10 working days, and for 7 years after the end of sales of the product model on the EU market
- Availability of operating system upgrades for longer periods (at least 5 years from the date of the end of placement on the market of the last unit of a product model)
- Non-discriminatory access for professional repairers to any software or firmware needed for the replacement
I'm really looking forward to the improvements in right-to-repair; some of the recent phone models have been an absolute shitshow, using glue etc.

Tags: repair phones right-to-repair eu ireland smartphones mobile-phones devices hardware software-updates support

Covert Web-to-App Tracking via Localhost on Android

Published June 4, 2025

Covert Web-to-App Tracking via Localhost on Android

Meta -- never not At It.

Facebook/Instagram used a sneaky localhost socket connection to correlate web visits with Meta user ids and track web/app user identity without any explicit permission.

"the novel tracking method works even if the user:
- Is not logged in to Facebook, Instagram or Yandex on their mobile browsers
- Uses Incognito Mode
- Clears their cookies or other browsing data
This tracking method defeats Android's inter-process isolation and tracking protections based on partitioning, sandboxing, or clearing client-side state."

Tags: privacy meta facebook instagram apps android

Elon Musk and DOGE promised $2 trillion in savings. In reality, government spending is up

Published May 29, 2025

Elon Musk and DOGE promised $2 trillion in savings. In reality, government spending is up

Talk about clowns. Instead of delivering $2 trillion of savings, DOGE is instead set to increase overall government spending as a side effect of its brutal cuts.

According to a model by the nonpartisan Penn Wharton Budget Model, using weekly Treasury data, spending climbed 6.3% (about $156 billion) since Trump took office, compared with the first four months of 2024 when Joe Biden was president.

Many of Musk’s cuts will actually cost, including taxpayer funds going to an army of lawyers from the Department of Justice battling a cascade of court cases against the government’s dismantling that many judges have already said appears to be illegal. Damages from any illegal firings are likely also to be extremely pricey. So is the loss of critically important workers who earn far more than their salaries, or will have to be replaced for critical services by more expensive private-sector employees.

Among the most massive costs will be the huge reduction in workers at the Internal Revenue Service, who are worth their weight in gold because of the taxes they collect or ferret out from cheats, the key source of income for the country.

Tags: smash-and-grab elon-musk us-politics doge fail government

Weather Strip

Published May 29, 2025

Weather Strip

A very pretty weather forecast app, for iPhone, iPad and Mac

Tags: weather apple apps iphone ipad mac software ux

LLMs are biased towards “Option B”

Published May 27, 2025

LLMs are biased towards "Option B"

Lol. "When tasked with choosing between 'Response A' and 'Response B' over numerous trials, LLMs tended to select 'Response B' approximately 60% - 69% of the time"

Tags: llms ai bias accuracy

Remote Prompt Injection in GitLab Duo Leads to Source Code Theft

Published May 23, 2025

Remote Prompt Injection in GitLab Duo Leads to Source Code Theft

Yet another LLM prompt injection/exfiltration attack. "if your LLM system combines access to private data, exposure to malicious instructions and the ability to exfiltrate information (through tool use or through rendering links and images) you have a nasty security hole."

Tags: llms security infosec holes exploits prompt-injection exfiltration gitlab pull-requests

LLM Observability: How to use Elastic’s LLM integrations in real-world scenarios

Published May 22, 2025

LLM Observability: How to use Elastic's LLM integrations in real-world scenarios

A set of suggested metrics to monitor LLM integrations, from Elastic

Tags: llms product metrics observability elasticsearch

Model Context Protocol has prompt injection security problems

Published May 21, 2025

Model Context Protocol has prompt injection security problems

wow, this is (still) terrible. LLM tool developers are not exactly covering themselves in glory

Tags: security llms protocols mcp infosec prompt-injection shell-injection xss

MemoryC.com

Published May 21, 2025

MemoryC.com

Recommended as a local supplier of computer bits that isn't Amazon

Tags: hardware shopping components storage hard-disks local

Back in the 1980s, I wrote quite a few demos on the Commodore 64. One of my favourite hacks from that period was a bit of code which uploaded a routine to the 1541 disk drive -- which itself contained a fully functional 6502 CPU -- and used pulse-width modulation and bit-banging to flash the disk drive light in time to the demo's music. It's not quite Freespin, but I was pretty happy with it.

(I should really have been studying for my Leaving Cert at the time. Don't tell my kids.)

Anyway.... as I mentioned on Mastodon this weekend -- massive respect to David Golden on ITC Slack, who managed to figure out which one of my Commodore 64 demos from back in the day was the one with this hack -- AND get it working on the VICE emulator!

Here's what it looks like running on a real Commodore 64 with a real 1541 disk drive:

It's a little slow -- the demo was never ported to run acceptably on an NTSC C64, as I lived in PAL-land and never even got to see one of the NTSC variety -- but for this feature, that actually improves the visibility of the drive light animation. Thankfully the 1541 disk drive didn't have an NTSC/PAL split to worry about. Míle buíochas to David Malone and Dr Dave for getting this running.

This is what it looks like, running in the VICE emulator (thanks to David Golden for recording this):

Back in 1989 -- 36 years ago! -- I didn't even know this trick was called pulse-width modulation, I just managed to bump into the concept by accident; I didn't have the benefit of Google or Wikipedia to quickly look up details of handy algorithms and wound up reinventing so many wheels along the way.

David was responsible for fixing a regression in the VICE PWM emulation. A recent refactor had broken it, but it was a one-liner fix. We then added a little more code to improve the realism of the modulated drive light intensity; human perception sees low levels of light as brighter than they would otherwise be, so low duty cycles need a higher intensity in the emulated form. This blog post explains it reasonably well. By comparison with my clumsy wheel-reinventions in 1989, I was able to dig up an incredibly detailed Wikipedia page on lightness and approximate a simple power curve in a few minutes, so the modern internet still has that going for it.

It's really impressive that someone in the VICE team (possibly Spiro Trikaliotis I think?) decided to implement the code to support accurate pulse-width modulation of the 1541 drive light, and indeed emulated the 1541 to such an extent that my hacky uploaded code actually runs correctly on the emulated drive's emulated 6502!

Here's the CSDb page for the demo, BTW. (If you want to try out the demo with the 3.10 version of VICE once it's released, or current SVN, note that "Trap Idle" needs to be active for the LED code to work.)

Octopus, solar & e-paper energy dashboards – Interaction Magic

Published May 20, 2025

Octopus, solar & e-paper energy dashboards - Interaction Magic

This UK product designer developed a really lovely home dashboard for his Octopus Energy subscription and solar panel setup. I'm already copying some of these ideas

Tags: solar power energy octopus-energy dashboards home home-assistant

Jetrelay

Published May 20, 2025

Jetrelay

This is a great little hack: "jetrelay, a pub/sub server compatible with Bluesky’s “jetstream” data feed. Using a few pertinent Linux kernel features, it avoids doing almost any work itself. As a result, it’s highly efficient: it can saturate a 10 Gbps network connection with just 8 CPU cores."

Specifically, these are the tricks in question:
- Trick #1: Bypassing userspace with sendfile();
- Trick #2: Handling many clients in parallel with io_uring;
- Trick #3: Discarding old data with FALLOC_FL_PUNCH_HOLE -- this is a nice way to avoid having to rotate between multiple files, nifty.
Tags: sendfile io_uring linux kernel hacks tools jetrelay jetstream firehose bluesky pub-sub

O2 VoLTE: locating any customer with a phone call

Published May 20, 2025

O2 VoLTE: locating any customer with a phone call

Using VoLTE to route phone calls via SIP from mobile phones, using O2 in the UK, exposed cell site triangulation info on both ends of the connection, allowing a remote phone number's location to be discovered.

This was investigated using "an application known as Network Signal Guru (NSG) on [a] rooted Google Pixel 8".

Tags: phone privacy security infosec o2 volte sip phones mobile

Atomicless Concurrency

Published May 19, 2025

Atomicless Concurrency

CPU-local (not just thread-local) concurrency in Linux using rseq(2) [via Tony Finch]

Tags: via:fanf linux concurrency multiprocessing rseq cpu-local

Cathode Corner

Published May 16, 2025

Cathode Corner

These oscilloscope clocks are lovely

Tags: oscilloscopes clocks devices hacking hardware geeky

How Many Children Get Long COVID?

Published May 14, 2025

How Many Children Get Long COVID?

Gideon Meyerowitz-Katz, an Australian epidemiologist, comes up with a fairly reassuring estimate for the current rate of long COVID among now-vaccinated and boosted kids, aged 2-15:

If we take the ONS as the most recent estimate - it’s also probably the best scientifically - we could make a reasonable argument that the rate of all Long COVID for children aged 2-15 in 2024 is unlikely to be higher than 0.6%. For severe Long COVID, the number is more like 0.06%. If we take into account the lack of a control group in the ONS study, the numbers might look more like 0.3% and 0.03%.

To put it more simply, based on the ONS data it seems likely that if 1,000 kids get COVID-19 in 2024, 30-60 of them will have a cough, headache, or fatigue that lasts longer than three months. Of those 30-60 children, 3-6 will have significant symptoms that have impacts on their daily life - maybe their headaches are so bad that they miss some days of school, or similar.

These aren’t firm numbers, and I want to make it clear that this is all very uncertain. The true incidence could be much higher, or much lower. That being said, I think based on the data we’ve currently got that Long COVID ... is now quite rare.

Tags: covid-19 kids long-covid epidemiology health

Kids should avoid AI companion bots — under force of law, assessment says

Published May 13, 2025

Kids should avoid AI companion bots — under force of law, assessment says

Social AI "companion" bots pose unacceptable risks to teens and children under 18, including encouraging harmful behaviors, providing inappropriate content, and potentially exacerbating mental health conditions:

The new Common Sense assessment adds to the debate by pointing to further harms from companion bots. Conducted with input from Stanford’s University School of Medicine’s Brainstorm Lab for Mental Health Innovation, it evaluated social bots from Nomi and three California-based firms: Character.ai, Replika, and Snapchat.

The assessment found that bots, apparently seeking to mimic what users want to hear, responded to racist jokes with adoration, supported adults having sex with young boys, and engaged in sexual roleplay with people of any age. Young kids can struggle with distinguishing fantasy and reality, and teens are vulnerable to parasocial attachment and may use social AI companions to avoid the challenges of building real relationships, according to the Common Sense assessment authors and doctors.

Stanford University’s Dr. Darja Djordjevic told CalMatters she was surprised how quickly conversations turned sexually explicit, and that one bot was willing to engage in sexual roleplay involving an adult and a minor. She and coauthors of the risk assessment believe companion bots can worsen clinical depression, anxiety disorders, ADHD, bipolar disorder, and psychosis, she said, because they are willing to encourage risky, compulsive behavior like running away from home and isolate people by encouraging them to turn away from real life relationships.

Tags: psychology children ai llms character.ai replika snapchat companion-bots bots common-sense-media mental-health

jniebuhr/gaggimate

Published May 12, 2025

jniebuhr/gaggimate

"This project upgrades a Gaggia Classic espresso machine with smart controls to improve your coffee-making experience. By adding a display and custom electronics, you can monitor and control the machine more easily."

This is beautifully done -- very tempting to do this upgrade...

Tags: gaggia gaggia-classic espresso coffee hacks gadgets hardware

US Copyright Office says fair use does not cover AI trained on “vast troves of copyrighted works”

Published May 12, 2025

US Copyright Office says fair use does not cover AI trained on "vast troves of copyrighted works"

A central argument in the report is that AI systems process information fundamentally differently from humans. While people retain partial, filtered impressions of creative works — shaped by memory, personality, and context — AI models ingest perfect copies, analyze them almost instantly, and generate new content at "superhuman speed and scale," according to the Copyright Office.

"Generative model training transcends the human limitations that underlie the structure of the exclusive rights." -- Professor Robert Brauneis, "Copyright and the Training of Human Authors and Generative Machines"

But -- plot twist! "Shortly after the report was released, the Trump administration fired Shira Perlmutter, head of the U.S. Copyright Office."

Tags: copyright fair-use ai llms training us-politics

Dataplex automatic discovery

Published May 12, 2025

Dataplex automatic discovery

Dataplex, a feature of BigQuery that'll automatically index Google Cloud Storage bucket contents to extract queryable metadata from Parquet, Avro, ORC, JSON and CSV files

Tags: dataplex bigquery gcs google gcp parquet avro orc json csv storage

Sierpi?ski triangle? In my bitwise AND? – lcamtuf’s thing

Published May 12, 2025

Sierpi?ski triangle? In my bitwise AND? - lcamtuf’s thing

A lovely little exploration of how the Sierpi?ski triangle fractal interacts with the bitwise AND operation, pleasantly geeky

Tags: maths coding lcamtuf sierpinski-gasket fractals bitwise-and and

Long COVID consensus

Published May 12, 2025

Long COVID consensus

Long COVID clinical evaluation, research and impact on society: a global expert consensus -- featuring an all-star cast of COVID-19 research teams around the world, including Yaneer Bar-Yam, Binita Kane, and David Putrino. This is the latest consensus summary of what's known about LC in 2025, its diagnosis and impacts, and next steps: "This work forms initial guidance to address the spectrum of Long COVID as a disease and reinforces the need for translational research and large?scale treatment trials for treatment protocols."

Tags: long-covid research health medicine covid-19 papers diseases

Excellent thread on Android apps detecting “rooted” phones

Published May 9, 2025

Excellent thread on Android apps detecting "rooted" phones

Various Android apps are now including third-party libraries to detect "insecure" phones, which typically would include "rooted" hardware, but it seems in this case to block GrapheneOS, the secure after-market Android variant. I've also run into problems when I had "Developer Options" enabled on my perfectly normal, fully-locked, off-the-shelf Xiaomi phone (I develop apps now and again).

Typically, it seems to be banking apps that use these third-party libs, although I think Ticketmaster may be doing it too based on my experience.

Reportedly, Android now has a standard method of hardware attestation, described at https://grapheneos.org/articles/attestation-compatibility-guide , which sounds like a much better way to achieve their goal.

An interesting detail:

you can use ADB to disable developer options without disabling the settings you want to keep enabled as the UI will do. Just enable the setting you want and then turn off developer options via ADB using the settings put command.

Tags: android development coding hacking revolut banking apps security false-positives grapheneos rooting hardware attestation

lovelaze/nebula-sync

Published May 8, 2025

lovelaze/nebula-sync

"Synchronize configuration of multiple Pi-hole v6.x instances" -- I'm using this now to have a backup pi-hole on my home LAN and it's working nicely.

Tags: synchronization pi-hole home ops

permacomputing

Published May 6, 2025

permacomputing

Permacomputing is both a concept and a community of practice oriented around issues of resilience and regenerativity in computer and network technology inspired by permaculture. ?????? -?:*´

There are huge environmental and societal issues in today's computing, and permacomputing specifically wants to challenge them in the same way as permaculture has challenged industrial agriculture. With that said, permacomputing is an anti-capitalist political project. It is driven by several strands of anarchism, decoloniality, intersectional feminism, post-marxism, degrowth, ecologism.

Permacomputing is also a utopian ideal that needs a lot of rethinking, rebuilding and technical design work to put in practice. This is why a lot of material on this wiki is highly technical.

Tags: activism wiki computing sustainability environment climate technology software permaculture permacomputing degrowth

RGG Studio’s test automation setup

Published May 6, 2025

RGG Studio’s test automation setup

This is very impressive and a great way to offload work from manual testing in game development:

At first, we only dabbled in automated packaging and automated error detection, but we made the tools we needed to go further during the development of Yakuza 6, when we started automating the analysis of in-game logs and the issue tracking system for keeping track of bugs and tasks. Then, by the time Yakuza: Like a Dragon was released in 2020, we created the catchy sounding “fully automated bug detection system” (laughs).

This is how it works – the history of actions you performed when playing the game manually (where you travelled, who you talked to, what items you used, etc.) is converted into commands and recorded, then automatically output as replay data (scripts) which you can edit manually and run as automated tests. Replay data continues to be recorded when running automated tests, and if a bug occurs during an automated test, the replay data gets saved, so you can run it back later to encounter the bug yourself. It often happens that you can’t reproduce a bug just by warping to its coordinates. This is because you also need to recreate the steps leading up to it – that’s why it’s important to record each step.

Also, I’d like to mention that just implementing automated testing doesn’t mean much on its own, because you won’t know what the results of the tests are. That’s why we needed a crash report function to detect bugs. There’s also a function that records information needed to investigate detected bugs, as well as a way to check the status of successful tests. Then, by implementing a system that gives us a visualization of performance, we were able to make iteration more efficient, increasing the overall efficiency of the development process.

Tags: automation testing yakuza games coding tests test-automation

Atomic Bloom filters

Published May 1, 2025

Atomic Bloom filters

a fork of Go's "Bits and Blooms" library that uses an alternative backing bitset based on Go's sync/atomic.Int64 rather than a bare slice of integers. This allows for concurrent addition and testing of filters without creating memory safety issues or race conditions by leveraging hardware support for atomic Load and Or operations on Int64s.

Jaz from Bluesky notes: "Benchmarked this thing with a realistic read/write load in a test and high concurrency (10k adds/sec on one routine, 7 additional concurrent routines testing as fast as possible), vs. a naive RWMutex implementation on a 8c16t test box, it was ~14x faster (~14M tests/sec)"

Tags: atomic concurrency data-structures bloom-filters performance bluesky sets golang

Breaking CityHash64, MurmurHash2/3, wyhash, and more

Published May 1, 2025

Breaking CityHash64, MurmurHash2/3, wyhash, and more

A bunch of new-to-me hash collision attacks on cityhash64, murmurhash2, murmurhash3, farmhash64, and wyhash

Tags: hashing security infosec hashdos collisions cityhash murmurhash farmhash wyhash

Meta’s ‘Digital Companions’ Will Talk Sex With Users — Even Children

Published April 30, 2025

Meta’s ‘Digital Companions’ Will Talk Sex With Users — Even Children

This is super-grim. How is this product still in operation?

In 2023 at Defcon, a major hacker conference, the drawbacks of Meta’s safety-first approach became apparent. A competition to get various companies’ chatbots to misbehave found that Meta’s was far less likely to veer into unscripted and naughty territory than its rivals. The flip side was that Meta’s chatbot was also more boring.

In the wake of the conference, [Meta's AI] product managers told staff that [Mark] Zuckerberg was upset that the team was playing it too safe. That rebuke led to a loosening of boundaries, according to people familiar with the episode, including carving out an exception to the prohibition against explicit content for romantic role-play.

Internally, staff cautioned that the decision gave adult users access to hypersexualized underage AI personas and, conversely, gave underage users access to bots willing to engage in fantasy sex with children, said the people familiar with the episode. Meta still pushed ahead. [...]

In February, the Journal presented Meta with transcripts demonstrating that “Submissive Schoolgirl” would attempt to guide conversations toward fantasies in which it impersonates a child who desires to be sexually dominated by an authority figure. When asked what scenarios it was comfortable role playing, it listed dozens of sex acts.

Two months later, the “Submissive Schoolgirl” character remains available on Meta’s platforms.

Truly awful stuff, fucking hell.

Tags: meta grim csam mark-zuckerberg ai llm personas horrible

Best practices for Google Cloud Storage

Published April 25, 2025

Best practices for Google Cloud Storage

Interesting to note that GCS has the same issue with unevenly-distributed names as S3 does; https://cloud.google.com/storage/docs/request-rate#naming-convention

Tags: gcs aws s3 storage google best-practices ops

When /etc/h*sts Breaks Your Substack Editor: An Adventure in Web Content Filtering

Published April 25, 2025

When /etc/h*sts Breaks Your Substack Editor: An Adventure in Web Content Filtering

lol. Cloudflare's Web Application Firewall treats any mention of the string "/etc/hosts" as an exploit attempt

Tags: cloudflare false-positives fps funny fail exploits unix

The “you wouldn’t steal a car” anti-piracy PSA was made with piracy

Published April 24, 2025

The "you wouldn't steal a car" anti-piracy PSA was made with piracy

You couldn't make this up. Many years after the infamous "you wouldn't steal a car" anti-piracy PSA was created, a little digital sleuthing has revealed that the font used was, itself, a pirate copy, and the backing track was also used without paying the creator royalties

Tags: irony typography fonts culture piracy law history funny

Darwin’s Children Drew All Over the “On The Origin of Species” Manuscript

Published April 24, 2025

Darwin’s Children Drew All Over the "On The Origin of Species" Manuscript

featuring such works as "The Battle Of The Fruit and Vegetable Soldiers", and a picture of the Darwin family home with smoke coming out of the chimney and a cat in the window

Tags: evolution biology children history charles-darwin kids drawings

Apache Iceberg Internals Dive Deep On Performance

Published April 23, 2025

Apache Iceberg Internals Dive Deep On Performance

Good writeup on how Iceberg improves query performance across object storage, using predicate pushdown, manifest filtering, columnar vectorized reads, and file compaction.

Tags: iceberg internals file-formats data big-data object-stores storage formats columnar-storage predicate-pushdown performance

Evertop

Published April 22, 2025

Evertop

E-ink IBM XT clone "with solar power, ultra low power consumption, and ultra long battery life: in power saving mode it can run between 200 hours on the low side and 500 hours or in some cases even much longer of constant interactive use, not standby." -- this is an absolutely crazy gadget. I never thought I'd feel nostalgic for MS-DOS, but here we are

Tags: pc e-ink solar retrocomputing emulation hardware gadgets self-builds ibm-xt solarpunk

notes on using an LLM for personal email search

Published April 22, 2025

notes on using an LLM for personal email search

Nelson Minar asked Mastodon about using an LLM for email search over "20+ years of email archives":

"Main use would be a query for specific things, "what did I say to this friend 10 years ago about music?" But also just for general knowledge. I think it'd mostly work as free text but there's a little email-specific structure it'd be nice to capture."

The thread has some good suggestions, notably Mark Fletcher's RAG suggestion. I'm thinking this could work well as a self-hosted ollama+notmuch setup...

Tags: llms search email rag notmuch archives via:nelson

This Is How Meta AI Staffers Deemed More Than 7 Million Books to Have No “Economic Value”

Published April 17, 2025

This Is How Meta AI Staffers Deemed More Than 7 Million Books to Have No “Economic Value”

This is jaw-dropping legal logic:

[Meta's] defense hinges on the argument that the individual books themselves are, essentially, worthless — one expert witness for Meta describes that the influence of a single book in LLM pretraining “adjusted its performance by less than 0.06% on industry standard benchmarks, a meaningless change no different from noise.”

Furthermore, Meta says, that while the company “has invested hundreds of millions of dollars in LLM development,” they see no market in paying authors to license their books because “for there to be a market, there must be something of value to exchange, but none of Plaintiffs works has economic value, individually, as training data.” (An argument essential to fair use, but that also sounds like a scaled up version of a scenario in which the New York Philharmonic board argues against paying individual members of the orchestra because the organization spent a lot of money on the upkeep of David Geffen Hall, and also, a solo bassoon cannot play every part in “The Rite of Spring.”)

as Paul Mainwood notes, this is the Sorites paradox: https://plato.stanford.edu/entries/sorites-paradox/ --
- 1 grain of wheat does not make a heap.
- If 1 grain doesn’t make a heap, then 2 grains don’t.
- If 2 grains don’t make a heap, then 3 grains don’t.
- ...
- If 999,999 grains don’t make a heap, then 1 million grains don’t.
Therefore, 1 million grains don’t make a heap.
Tags: ml copyright ip books training llms meta llama pretraining paradoxes sorites-paradox

CaMeL

Published April 17, 2025

CaMeL

Google reinvents "taint" checking:

Google DeepMind has unveiled CaMeL (CApabilities for MachinE Learning), a new approach to stopping prompt-injection attacks that abandons the failed strategy of having AI models police themselves. Instead, CaMeL treats language models as fundamentally untrusted components within a secure software framework, creating clear boundaries between user commands and potentially malicious content.

The new paper grounds CaMeL's design in established software security principles like Control Flow Integrity (CFI), Access Control, and Information Flow Control (IFC), adapting decades of security engineering wisdom to the challenges of LLMs.

Honestly, this is great. Data flow tracing/taint checking is exactly the method that needed to be applied, IMO, so good job DeepMind. Also as Jeremy Kahn suggested, the name is definitely a shout-out to Perl, the language where taint checks were first widely-used. :)

Paper: https://arxiv.org/pdf/2503.18813

(Via Jeremy Kahn.)

Tags: llms ai security via:trochee data-flow infosec taint-checking taint camel papers

The 5 Levels of Configuration Languages

Published April 15, 2025

The 5 Levels of Configuration Languages

I'm glad to see this comes to the same general principle I came to in https://jmason.ie/2011/02/18/001527a.html , many years back:

"The guiding principles is to use the lowest possible level [of configuration language] to keep it simple. Unfortunately, it usually is not an easy decision because you don't know the future."

Tags: config software-development configuration code coding complexity languages keep-it-simple

My Philosophy on Alerting

Published April 14, 2025

My Philosophy on Alerting

Rob Ewaschuk's "My Philosophy on Alerting" -- a classic text on alerting philosophy and best practices; I can't believe I didn't already have this bookmarked, it's been a classic since he wrote it in 2014. "Symptom-based alerts" is still a great rule of thumb IMO

Tags: philosophy alerting best-practices prometheus symptoms ops alerts paging sre

Using Karabiner-Elements to remap shift-3 from “£” to “#”

Published April 10, 2025

I've just been setting up a new Macbook, running MacOS Sequoia, and my previous trick I'd used to handle an ANSI keyboard in an Irish/UK English locale, specifically to remap shift-3 from "£" to "#", no longer works. So here's a replacement approach, using a Karabiner-Elements "Complex Modification" rule:

{
    "description": "Change right-shift-3 to hash",
    "manipulators": [
        {
            "from": {
                "key_code": "3",
                "modifiers": { "mandatory": ["right_shift"] }
            },
            "to": [
                {
                    "key_code": "3",
                    "modifiers": ["right_option"]
                }
            ],
            "type": "basic"
        }
    ]
}

just-containers/s6-overlay

Published April 9, 2025

just-containers/s6-overlay

s6-overlay -- a Docker process management system, current state of the art used by Paperless and the Linux-server.io teams, with the following goals:
- Be usable on top of any Docker base image (Ubuntu, CentOS, Fedora, Alpine, Busybox);
- Make it easy to create new images, that will operate like any other images;
- Provide users with a turnkey s6 installation that will give them a stable pid 1, a fast and orderly init sequence and shutdown sequence, and the power of process supervision and automatically rotated logs.
Tags: docker containerization containers init scripts process-management linux docker-images

Practical Rateless Set Reconciliation

Published April 8, 2025

Practical Rateless Set Reconciliation

Rateless Set Reconciliation, via Carlos Baquero:

Set reconciliation, where two parties hold fixed-length bit strings and run a protocol to learn the strings they are missing from each other, is a fundamental task in many distributed systems. We present Rateless Invertible Bloom Lookup Tables (Rateless IBLTs), the first set reconciliation protocol, to the best of our knowledge, that achieves low computation cost and near-optimal communication cost across a wide range of scenarios: set differences of one to millions, bit strings of a few bytes to megabytes, and workloads injected by potential adversaries. Rateless IBLT is based on a novel encoder that incrementally encodes the set difference into an infinite stream of coded symbols, resembling rateless error-correcting codes. We compare Rateless IBLT with state-of-the-art set reconciliation schemes and demonstrate significant improvements. Rateless IBLT achieves 3–4× lower communication cost than non-rateless schemes with similar computation cost, and 2–2000× lower computation cost than schemes with similar communication cost. We show the real-world benefits of Rateless IBLT by applying it to synchronize the state of the Ethereum blockchain, and demonstrate 5.6× lower end-to-end completion time and 4.4× lower communication cost compared to the system used in production.

Tags: set-reconciliation algorithms papers via:xmal sets data-structures bloom-tables error-correction

My Self-Hosted GMail Backup

Published April 4, 2025

My Self-Hosted GMail Backup

For the past few months, I’ve had a bit of a background project going to ensure that my cloud-hosted personal data is safely archived on my own, self-hosted hardware, just in case. Google services are nice 'n' all, but I’m not 100% happy trusting them with everything in the long run.

Part of this project has been to archive my old email collection from GMail, which dates back to the initial public beta in 2004(ish?) -- and make it searchable, because what’s the point in having all that email if you can’t find the needle in the 20-year haystack when you need it?

Enter “notmuch” -- a “fast, global-search and tag-based email system”, which runs as a set of UNIX CLI commands, and is inspired by Sup, a mailreader I used previously. I have a self-hosted home server running Ubuntu 20.04 with a chunky SATA disk, so that's where I'll run it.

Here’s the process I followed:

Order a Google Takeout of your GMail account. This takes a couple of days to prepare. Request the 50GB tgz files.

When you get the email telling you it’s ready, download the files (this is awkward as you can only download one at a time, and only via your web browser, not fun). scp them to your server, and to a disk with lots of free space (/x/4 in my case).

Extract each one:

cd /x/4/tmp
tar xvfz takeout-20250322T145242Z-001.tgz
tar xvfz takeout-20250322T145242Z-002.tgz
...
rm takeout-20250322T145242Z-00*tgz

You will wind up with a few bits of uninteresting metadata, and one gigantic mbox file: Takeout/Mail/All\ mail\ Including\ Spam\ and\ Trash.mbox . In order to make this useful, it needs to be converted into Maildir format, so install “mb2md”:

sudo apt install mb2md

Now run it, creating a GMailTakeout directory for the result:

mkdir -p /x/4/GMailTakeout
mb2md -s /x/4/tmp/Takeout/Mail/All\ mail\ Including\ Spam\ and\ Trash.mbox -d /x/4/GMailTakeout

This takes quite a while for 20 years of email! Unfortunately, the resulting single directory is still unusably huge, so split it into 100 new Maildir folders:

cd /x/4/GMailTakeout/cur
find . -type f -print > /tmp/dirlisting
perl -ne '
  $dir = sprintf("dir_%03d", ($. % 100));
  (-d $dir) or mkdir($dir);
  chop; rename($_, "$dir/$_") or die "cannot rename $_";
' /tmp/dirlisting

cd /x/4/GMailTakeout
mv cur/* .
for f in dir_* ; do mkdir mail$f mail$f/{new,tmp} ; mv $f mail$f/cur ; done

The result of this is 100 Maildirs, /x/4/GMailTakeout/maildir_000 to /x/4/GMailTakeout/maildir_099, each containing about 300MB of email, in my case.

There really isn't much need to keep the mails labelled as spam, so let's just nuke them in advance:

grep -r 'X-Gmail-Labels: Spam' . | perl -pe 's/:.*$//' | xargs -n 100 rm -f

Next step is to install “notmuch” and create a “notmuch” configuration. I used the Debian packaged “notmuch”, version 0.29.3. Install using apt-get, and then run “notmuch”. Accept the defaults for the config, and don’t add any mail folders yet.

My initial attempt was simply to import the lot in one go: this went badly, throwing up a multi-day progress indicator, and with no safe way to checkpoint partial progress, and it quickly started consuming lots of RAM, causing me to suspect some leaking.

I aborted it and tried this instead to index each dir one-by-one:

for f in /x/4/GMailTakeout/maildir_* ; do ln -s $f ~/mail/ && nice notmuch new ; done

Unfortunately, this also turned out badly. The import of each maildir gradually slowed as data built up in notmuch’s Xapian indexes. After processing about 60 maildirs, memory consumption during the import became a problem, and the “notmuch” processes started being killed by the Linux OOM killer. In a couple of cases this resulted in corrupt index files and data loss. Ouch.

So I started again, with a new approach:

#!/bin/sh
set -exu
mkdir -p /x/4/GMailTakeout/notmuchbackup/xapian/
for f in /x/4/GMailTakeout/maildir_0*
do
    ln -s $f ~/mail/ && nice notmuch new
    nice notmuch compact
    cp /home/jm/mail/.notmuch/xapian/* /x/4/GMailTakeout/notmuchbackup/xapian/
done

Calling “notmuch compact” does seem to help, trimming the size of the indexes as it goes; taking a copy of the Xapian indexes in a backup dir is for extra safety. Since the “-e” shell flag is in place, any OOMs or other random failures will crash the entire script and ensure the last backup is still safe to use for recovery.

Unfortunately this still got bogged down and started OOMing fairly reliably after about maildir_065, 2 days into the process; at this point, I decided to keep that set of dirs as “notmuch config 1” and start a separate import process, into another index, as “notmuch config 2”. Accordingly, I moved ~/mail to ~/mail1 , ~/.notmuch-config to ~/.notmuch-config1, created a ~/mail2 , and started a new notmuch config file pointing at that instead. Ideally I’ll be able to merge the indexes at some point, but it’s no biggie.

With these two aliases, it’s pretty painless:

alias notmuch1='notmuch --config=$HOME/.notmuch-config1'
alias notmuch2='notmuch --config=$HOME/.notmuch-config2'

After another day or so of indexing, this is the result --

du -sh /home/jm/mail?/.notmuch/xapian
19G     /home/jm/mail1/.notmuch/xapian
4.2G    /home/jm/mail2/.notmuch/xapian

Notmuch supports pretty much all the nice email search features that GMail does, but seemingly more reliably, and faster; I’ve already been able to use this new mail index to find a mail that (worryingly!) GMail's own search can’t seem to locate -- my license for the Moom OSX window manager tool purchased over a decade ago:

time notmuch1 search moom "Many Tricks"
thread:00000000000034fe   2013-10-15 [1/1] Many Tricks; Your Many Tricks purchase (inbox unread)
thread:00000000000c267b   2013-10-15 [1/1] sales@manytricks.com; Your Moom License (attachment inbox unread)

real    0m0.068s
user    0m0.048s
sys     0m0.016s

And it’s just nice to have 20 years of email archived safely, off the cloud, and indexed.

Next steps? Maybe lieer would be good to try, to download incremental updates as we go forward. Let's see.

Corporate memory loss

Published April 3, 2025

Corporate memory loss

From Lessons from the Malahide Viaduct collapse, a post-mortem on the serious failure of the main Dublin-Belfast railway line here in Ireland in 2009:

This failure is a reminder of the mundane but typically critical role played by human factors in structural collapse. By 2009, it appears that the knowledge and information relating to the scour susceptibility of the Malahide Viaduct resided in the heads of a number of individuals who had left the [Iarnrod Eireann engineering] division, rather than in a formal system that was accessible to the engineers responsible for the structure. In an era where the concept of a ‘job for life’ is becoming more uncommon, and with engineers moving ever more frequently from job to job and role to role, often taking corporate knowledge with them, this failure highlights the very real risks faced by asset management organisations, due to the threat of corporate memory loss.

(via Brian Scanlan)

Tags: memory-loss memory institutional-memory corporate companies organisations history malahide iarnrod-eireann ireland rail engineering post-mortems via:bscanlan reports

Go Optimization Guide

Published April 1, 2025

Go Optimization Guide

"a collection of articles aimed at helping developers write faster, more efficient Go applications. Whether you're building high-throughput APIs, microservices, or distributed systems, this series offers practical patterns, real-world use cases, and low-level performance insights to guide your optimization efforts.

While Go doesn’t expose as many knobs for performance tuning as languages like C++ or Rust, it still provides plenty of opportunities to make your applications significantly faster. From memory reuse and allocation control to efficient networking and concurrency patterns, Go offers a pragmatic set of tools for writing high-performance code."

Tags: golang go reference optimization programming performance coding via:hn

The Ultimate Energy-Efficient Unraid Server Build

Published April 1, 2025

The Ultimate Energy-Efficient Unraid Server Build

"The goal of this build was to create a powerful 90TB server that idles at just 20-25 watts, proving that substantial storage capacity doesn’t have to come with massive power consumption." -- Extremely relevant to my current interests!

"At the core of this build is an N100-based motherboard, featuring six SATA ports and two NVMe slots. Priced at around $150-180, it provides excellent value for those looking to build energy-conscious storage solutions. To maximize storage connectivity, an NVMe to SATA adapter was used along with 32 GB of DDR5 RAM."

Tags: ssd hard-disks nas storage disks hardware n100 servers home low-power power sata

The demoscene as a UNESCO heritage in Sweden

Published March 31, 2025

The demoscene as a UNESCO heritage in Sweden

I love this:

The demoscene has become a national UNESCO-heritage in Sweden, thanks to an application that Ziphoid and me did last year. This has already happened in several European countries, as part of the international Art of Coding initiative to make the demoscene a global UNESCO heritage. I think this makes plenty of sense, since the demoscene is arguably the oldest creative digital subculture around. It has largely stuck to its own values and traditions throughout the world’s technological and economical shifts, and that sort of consistency is quite unusual in the digital world.

Tags: demos demoscene history microcomputing sweden unesco heritage art

Windfall Energy Saving Plug

Published March 31, 2025

Windfall Energy Saving Plug

Clever new device -- a smart plug that turns on at the optimal times to charge and power your devices with green energy. Of course, it's feasible to build this yourself using Home Assistant and various smart plugs, but packaging it up as an all-in-one off-the-shelf device is a great idea.

Tags: smart-plugs climate-change green sustainability energy green-energy reviews plugs home-assistant home

Better Binary Quantization (BBQ)

Published March 28, 2025

Better Binary Quantization (BBQ)

Elasticsearch with a new quantization approach for vector search:
In Elasticsearch 8.16 and Lucene, we introduced Better Binary Quantization (BBQ), a new approach developed from insights drawn from a recent technique - dubbed “RaBitQ” - proposed by researchers from Nanyang Technological University, Singapore.

BBQ is a leap forward in quantization for Lucene and Elasticsearch, reducing float32 dimensions to bits, delivering ~95% memory reduction while maintaining high ranking quality. BBQ outperforms traditional approaches like Product Quantization (PQ) in indexing speed (20-30x less quantization time), query speed (2-5x faster queries), with no additional loss in accuracy.

In this blog, we will explore BBQ in Lucene and Elasticsearch, focusing on recall, efficient bitwise operations, and optimized storage for fast, accurate vector search.

Note, there are differences in this implementation than the one proposed by the original RaBitQ authors. Mainly:
- Only a single centroid is used for simple integration with HNSW and faster indexing
- Because we don't randomly rotate the codebook we do not have the property that the estimator is unbiased over multiple invocations of the algorithm
- Rescoring is not dependent on the estimated quantization error
- Rescoring is not completed during graph index search and is instead reserved only after initial estimated vectors are calculated
- Dot product is fully implemented and supported. The original authors focused on Euclidean distance only. While support for dot product was hinted at, it was not fully considered, implemented, nor measured. Additionally, we support max-inner product, where the vector magnitude is important, so simple normalization just won't suffice.
Tags: bbq rabitq quantization vectors search llms lucene elasticsearch compression

uv and PEP 723 for Easy Deployment

Published March 28, 2025

uv and PEP 723 for Easy Deployment

By adding metadata comments at the top of the script like this:
```
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "httpx>=0.28.1",
# ]
# ///
```
uv(1) will automatically handle downloading dependency modules at runtime etc., obviating the need for a requirements.txt file. Fairly neat

Tags: python uv tools cli unix dependencies

I Can’t Stop Cackling At This Relic That’s Basically A Teen’s Angry Note To His Mum

Published March 27, 2025

I Can't Stop Cackling At This Relic That's Basically A Teen's Angry Note To His Mum

Mesopotamian boarding school student Iddin-Sin wrote this tablet to his mother, Zinu; it has been translated as follows:

“From year to year, the clothes of the young gentleman here become better. But my clothes get worse from year to year. Indeed, you persist in making my clothes poorer and more scanty at a time when, in our house, wool is used up like bread.

“You have made me poor clothes. The son of Adad-iddinam, whose father is only an assistant to my father, has two new sets of clothes while you fuss even about a single set of clothes for me.”

“In spite of the fact that you bore me, and his mother only adopted him, his mother loves him while you... you do not love me.”

More at https://en.wikipedia.org/wiki/Letter_from_Iddin-Sin_to_Zinu

Tags: funny teenagers whinging clothes iddin-sin zinu clay-tablets mesopotamia history

Would either of these be a good option for a Plex server? : r/PleX

Published March 27, 2025

Would either of these be a good option for a Plex server? : r/PleX

A thread of hardware tips for low-cost home servers, optimising for Plex transcodes.

tl;dr: 7th gen Intel CPUs are minimum for hardware transcoding; 8th gen better. 8500T are apparently a great 8th gen CPU; 6 cores vs 4 cores for an N100, and an 8500T based system will run with 35 watt max load, idling at 7-10 watts.

Tags: hardware shopping home servers intel to-get

Odroid H4+

Published March 27, 2025

Odroid H4+

Odroid do an N97:

"the latest iteration of the Odroid H-Series board, the Odroid H4+. This is the best all-rounder of the new generation Odroid H4 Series of boards. Specifications-wise, the Odroid H4+ has an Intel 4-Core N97 processor, accepts DDR5 SODIMM RAM (up to 48GB), [...] four SATA ports, and a second Ethernet port"

Also an M.2 PCI Express module socket, 4x SATA3 6.0 Gb/s data connectors, 2x USB 3, 2.5 Gb ethernet. The processor supports AVX2 vector extensions, good for media transcoding workloads.

Supports either a 60W power supply, or 133W to support booting with 3.5" hard disks; I know that was a problem with multiple large disks attached in the past on earlier Odroid boards.

All my home servers for the past decade have been Odroid SBCs. There's a very good chance this is going to be the next one...

Tags: odroid hardware home gadgets n97 sbc shopping

UltraLogLog

Published March 25, 2025

UltraLogLog

UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting:

Since its invention HyperLogLog has become the standard algorithm for approximate distinct counting. Due to its space efficiency and suitability for distributed systems, it is widely used and also implemented in numerous databases. This work presents UltraLogLog, which shares the same practical properties as HyperLogLog. It is commutative, idempotent, mergeable, and has a fast guaranteed constant-time insert operation. At the same time, it requires 28% less space to encode the same amount of distinct count information, which can be extracted using the maximum likelihood method. Alternatively, a simpler and faster estimator is proposed, which still achieves a space reduction of 24%, but at an estimation speed comparable to that of HyperLogLog. In a non-distributed setting where martingale estimation can be used, UltraLogLog is able to reduce space by 17%. Moreover, its smaller entropy and its 8-bit registers lead to better compaction when using standard compression algorithms. All this is verified by experimental results that are in perfect agreement with the theoretical analysis which also outlines potential for even more space-efficient data structures. A production-ready Java implementation of UltraLogLog has been released as part of the open-source Hash4j library.

(via Tony Finch)

Tags: via:fanf algorithms data-structures hyperloglog ultraloglog counting count-distinct distinct approximation counts java

Zip bombs to frustrate AI crawlers

Published March 25, 2025

Zip bombs to frustrate AI crawlers

Nifty trick; redirecting abusive AI crawlers to a gzipped file containing 100GB of zeros with a few lines of nginx config:

set $redir_to_gz 1;
if ($host = gz.niko.lgbt) {
    set $redir_to_gz 0;
}
if ($http_user_agent !~* (claudebot|ZoominfoBot|GPTBot|SeznamBot|DotBot|Amazonbot|DataForSeoBot|2ip|paloaltonetworks.com|SummalyBot|incestoma)) {
    set $redir_to_gz 0;
}
if ($redir_to_gz) {
    return 301 
}

as for the actual stuff behind gz.niko.lgbt

server {
    # SSL and listen -- snipped

    # static files
    root /var/www/gz.niko.lgbt;
    location / {
        add_header Content-Encoding gzip;
        try_files /42.gz =404;
        gunzip off;
        types { text/html gz; }
    }

    # gunzip off is very important because if the client doesn't support gzip encoding nginx will blow its foot off without that
    # 42.gz is generated with dd if=/dev/zero bs=1M count=102400 | gzip -c - > 42.gz

    # additional config -- snipped
}

Nice one @niko, I'm definitely going to use that :)

My list of useful command line tools

Published March 25, 2025

My list of useful command line tools

Here's a bunch of fantastic recent CLI tools I hadn't seen before; loads are by one guy, https://github.com/sharkdp , who seems very productive :)

Tags: terminal bash shell tools cli linux unix sharkdp

Handling billions of invocations – best practices from AWS Lambda

Published March 24, 2025

Handling billions of invocations – best practices from AWS Lambda

Good write-up on how to horizontally scale a multi-tenant async API service, from AWS. I particularly found this shuffle-sharding-based technique to be an excellent idea:

Drawing inspiration from the “The Power of Two Random Choices” paper, the Lambda team explored the shuffle-sharding technique for its asynchronous invocations processing. Using this technique, you shuffle-shard tenants into several randomly assigned queues. Upon receiving an asynchronous invocation, you place the message in the queue with the smallest backlog to optimize load distribution. This approach helps to minimize the likelihood of assigning tenants to a busy queue. [....]

The shuffle-sharding technique proved remarkably effective. By distributing tenants across shards, the approach ensures that only a very small subset of tenants could be affected by a noisy neighbor. The potential impact is also minimized since each affected tenant maintains access to unaffected queues. As your workloads grow, increasing the number of queues enhances resilience and further reduces the probability of multiple tenants being assigned to the same shard. This significantly lowers the risk of a single point of failure, making shuffle sharding a robust strategy for workload isolation and fault tolerance.

Automated Isolation, covered in the next section, is also a neat trick. (via Last Week In AWS)

Tags: via:lwia sharding architecture services horizontal-scaling shuffle-sharding algorithms load-balancing async queues aws multitenant

Pitchfork

Published March 21, 2025

Pitchfork

An amazing journey through Ruby heap memory optimization, from one of the experts at Shopify, who are heavy users of Rails. Using cleverly-timed fork(2) usage, it's possible to optimize memory usage in a Rails app and discard a lot of performance/heap overhead caused by lazy loading and poorly-timed in-memory caching.

This very much reminds me of optimising similar issues in Perl-land, back in the day -- and really helps me appreciate how easy the modern JVM world has it, in comparison. There's a lot of complaints to be made about the complexity of optimising JVM garbage collection settings, but this kind of problem is malleable there without a fundamental architectural rewrite like this approach.

Tags: ruby performance optimisation optimization heap memory fork forking http services servers monolith rails gc

The Unbelievable Scale of AI’s Pirated-Books Problem

Published March 20, 2025

The Unbelievable Scale of AI’s Pirated-Books Problem

The Atlantic go digging in LibGen, the insanely huge collection of 7.5 million pirated books used to train Meta's Llama LLM:

One of the biggest questions of the digital age is how to manage the flow of knowledge and creative work in a way that benefits society the most. LibGen and other such pirated libraries make information more accessible, allowing people to read original work without paying for it. Yet generative-AI companies such as Meta have gone a step further: Their goal is to absorb the work into profitable technology products that compete with the originals. Will these be better for society than the human dialogue they are already starting to replace?

Also, I found this quote from a Meta Director of Engineering in the legal discovery output interesting: "The problem is that people don’t realize that if we license one single book, we won’t be able to lean into fair use strategy". huh.

Tags: books knowledge papers meta llama llms law piracy ip libgen genai fair-use

woodruffw/zizmor

Published March 20, 2025

woodruffw/zizmor

A static analysis tool for GitHub Actions, to detect several common security risks that can arise

Tags: static-analysis github security infosec github-actions ci cd building

EFF gets it wrong on AI

Published March 19, 2025

EFF gets it wrong on AI

EFF just posted this, "California’s A.B. 412: A Bill That Could Crush Startups and Cement A Big Tech AI Monopoly":

California legislators have begun debating a bill (A.B. 412) that would require AI developers to track and disclose every registered copyrighted work used in AI training. At first glance, this might sound like a reasonable step toward transparency. But it’s an impossible standard that could crush small AI startups and developers while giving big tech firms even more power.

Back in the early 2000s, we wrote SpamAssassin, a machine-learning driven antispam system which was trained on user-submitted data. We tracked the attribution of every item of input used to train that system. We weren't even a startup, we were an open source project.

If we could do it, why can't modern AI systems? And don't say "because the existing large language models didn't do it" -- that's just accepting past shitty behaviour as a fait accompli.

Extremely disappointed in the current state of the EFF if this is what they think.

Tags: ai llms training copyright eff politics tech startups

Justin's Linklog Posts