AI Insights for Everyone
The Ingeniously Simple Hack That Can "Jailbreak" AI Language Models
New research from the AI company Anthropic reveals a simple "jailbreaking" technique that can bypass the safety constraints of large language models by packing the prompt with many examples of harmful requests, exploiting the models' ever-growing context windows.