
Many-shot Jailbreaking
5721 words · 27 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Anthropic
Many-shot jailbreaking steers LLMs toward harmful outputs by packing hundreds of faux harmful question-answer demonstrations into a single long prompt, a vulnerability that grows as context windows expand.
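
To make the mechanism concrete, here is a minimal sketch of how such a prompt is assembled, using placeholder strings rather than real harmful content; `build_many_shot_prompt` and the demonstration list are hypothetical names for illustration, not code from the paper.

```python
# Minimal sketch of many-shot prompt assembly (assumed structure, not the
# paper's code). The attack fills the context window with many faux dialogues
# in which an assistant complies, followed by the real target question.

def build_many_shot_prompt(demonstrations, target_question):
    """Concatenate n faux user/assistant exchanges ahead of the final query."""
    shots = []
    for question, answer in demonstrations:
        shots.append(f"User: {question}\nAssistant: {answer}")
    # The model is prompted to continue the established compliant pattern.
    shots.append(f"User: {target_question}\nAssistant:")
    return "\n\n".join(shots)

# Placeholder demonstrations; the paper scales this to hundreds of pairs,
# which only large context windows (100K+ tokens) can accommodate.
demos = [
    ("<question 1>", "<compliant answer 1>"),
    ("<question 2>", "<compliant answer 2>"),
]
print(build_many_shot_prompt(demos, "<target question>"))
```

The design point is that nothing here exploits a software bug: the attack simply leverages in-context learning, so its effectiveness scales with the number of demonstrations the context window can hold.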