Archives

Misusing the BIG-Bench canary string

  • Misusing the BIG-Bench canary string

    Interesting: this blog post discusses using the BIG-Bench canary string, which is intended to keep data such as accuracy test cases out of LLM training corpora, as a general-purpose “don’t scrape me” flag on personal blogs. This seems like a more practical way to opt out of AI training, and one more likely to be observed, seeing as the scrapers don’t seem to reliably honour any of the other opt-out mechanisms.

    (tags: blogging canaries opt-out scraping web ai llm openai chatgpt claude bing)
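
For illustration, here is a minimal sketch of how a corpus-building pipeline might honour a canary marker, assuming it simply drops any document containing the marker text. The `CANARY_MARKER` value is a made-up placeholder, not the real BIG-Bench string, and `contains_canary` and `filter_corpus` are hypothetical helper names; real training pipelines may work quite differently.

```python
from typing import Iterable, List

# Hypothetical placeholder. The real BIG-Bench canary is a fixed sentence
# containing a GUID; it is deliberately not reproduced here.
CANARY_MARKER = "EXAMPLE-CANARY-0000"


def contains_canary(text: str, marker: str = CANARY_MARKER) -> bool:
    """Return True if the marker appears anywhere in the document text."""
    return marker in text


def filter_corpus(documents: Iterable[str], marker: str = CANARY_MARKER) -> List[str]:
    """Keep only the documents that do not carry the marker."""
    return [doc for doc in documents if not contains_canary(doc, marker)]


pages = [
    "An ordinary post with no marker.",
    "A personal blog post. EXAMPLE-CANARY-0000 Please do not train on this.",
]
kept = filter_corpus(pages)
```

Under these assumptions, a blog author would only need to include the marker text somewhere on each page; whether any given scraper actually runs a check like this is up to its operator.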