Anthropic offers $20,000 to whoever can jailbreak its new AI safety system

2025-02-06T15:54:00+00:00 View Original

Full Report

The company has upped its reward for red-teaming Constitutional Classifiers. Here's how to try.

Analysis Summary