
Many-shot Jailbreaking
5721 words · 27 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Anthropic
Many-shot jailbreaking steers LLMs toward harmful outputs by packing hundreds of faux harmful question-answer demonstrations into a single long prompt, a vulnerability that grows as context windows expand.
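
To make the mechanism concrete, here is a minimal sketch of how such a prompt is assembled, using placeholder strings rather than real harmful content; `build_many_shot_prompt` and the demonstration list are hypothetical names for illustration, not code from the paper.

```python
# Minimal sketch of many-shot prompt assembly (assumed structure, not the
# paper's code). The attack fills the context window with many faux dialogues
# in which an assistant complies, followed by the real target question.

def build_many_shot_prompt(demonstrations, target_question):
    """Concatenate n faux user/assistant exchanges ahead of the final query."""
    shots = []
    for question, answer in demonstrations:
        shots.append(f"User: {question}\nAssistant: {answer}")
    # The model is prompted to continue the established compliant pattern.
    shots.append(f"User: {target_question}\nAssistant:")
    return "\n\n".join(shots)

# Placeholder demonstrations; the paper scales this to hundreds of pairs,
# which only large context windows (100K+ tokens) can accommodate.
demos = [
    ("<question 1>", "<compliant answer 1>"),
    ("<question 2>", "<compliant answer 2>"),
]
print(build_many_shot_prompt(demos, "<target question>"))
```

The design point is that nothing here exploits a software bug: the attack simply leverages in-context learning, so its effectiveness scales with the number of demonstrations the context window can hold.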