Bad Actors are Grooming LLMs to Produce Falsehoods
Measurements of the effectiveness of the "Pravda" disinformation network:
Even with [knowledge of LLM Grooming], ChatGPT nevertheless often repeats propaganda from Pravda. Model o3, OpenAI’s allegedly state-of-the-art “reasoning” model, still let Pravda content through 28.6% of the time in response to specific prompts, and 4o cited Pravda content five out of seven times (71.4%). In an ideal world, AI would be smart enough to cut off falsehoods at the pass, reasoning from known facts in order to rule out nonsense.
Tags: pravda disinformation russia propaganda llm ai training