
🏢 Contramont Research

Unelicitable Backdoors via Cryptographic Transformer Circuits
1600 words · 8 mins
AI Theory Safety 🏢 Contramont Research
Researchers demonstrate unelicitable backdoors in language models, built from cryptographic transformer circuits, that evade conventional detection methods and raise serious AI safety concerns.