Auditing the Gatekeepers: Fuzzing "AI Judges" to Bypass Security Controls

Tony Li, Hongliang Liu and Yuhao Wu 2026-03-10T10:00:29+00:00 View Original

Full Report

Unit 42 research reveals AI judges are vulnerable to stealthy prompt injection. Benign formatting symbols can bypass security controls. The post Auditing the Gatekeepers: Fuzzing "AI Judges" to Bypass Security Controls appeared first on Unit 42.

Analysis Summary