Largest Dataset Powering AI Images Removed After Discovery of Child Sexual Abuse Material
LAION training data (used by Stable Diffusion among others) has been found to contain suspected CSAM and other horrors. This is 100% the problem with training sets derived from random scrapes of random web shite. There are doubtless buckets of illegal, abusive, and toxic content being trained on.
(tags: images llms generative-ai stable-diffusion laion training ml)