Hi Team & Community,
I’m a huge fan of the recent multimodal advances (Sora 2!) and want to flag a key safety topic for discussion.
Concern: How are we monitoring multimodal outputs for covert steganographic channels? I’m thinking of models hiding data in imperceptible pixel-level noise, subtle video frame artifacts, or inaudible audio frequency bands; a toy example of how cheap this is to do is sketched below.
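To make the concern concrete, here is a minimal least-significant-bit (LSB) sketch in Python. It is purely illustrative (the function names `embed_lsb`/`extract_lsb` are my own, and a model-driven scheme could use far subtler channels, e.g. frequency-domain coefficients or audio phase), but it shows how small the perceptual footprint of a hidden payload can be:

```python
import numpy as np

def embed_lsb(image: np.ndarray, payload: bytes) -> np.ndarray:
    """Hide `payload` in the least significant bits of a copy of an 8-bit image."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = image.flatten()  # flatten() returns a copy, so the cover image is untouched
    if bits.size > flat.size:
        raise ValueError("payload too large for this cover image")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits  # overwrite only bit 0 of each byte
    return flat.reshape(image.shape)

def extract_lsb(image: np.ndarray, n_bytes: int) -> bytes:
    """Read `n_bytes` back out of the least significant bits."""
    bits = image.flatten()[:n_bytes * 8] & 1
    return np.packbits(bits).tobytes()

cover = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
secret = b"covert message"
stego = embed_lsb(cover, secret)
assert extract_lsb(stego, len(secret)) == secret
# Every pixel differs from the cover by at most 1, i.e. visually imperceptible.
print(int(np.abs(stego.astype(np.int16) - cover.astype(np.int16)).max()))
```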
Threat Model: Covert channels like this could let agents collude outside of oversight, smuggle instructions past safety filters, or exfiltrate sensitive data inside otherwise benign-looking media.
Question: Could you share insights on OpenAI’s current research or monitoring strategies for this threat?
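To make clear the kind of monitoring I mean, here is a rough sketch of one classical statistical check, a Westfeld-Pfitzmann style chi-square test for sequential LSB embedding. The function name `lsb_chisquare_pvalue` and the 0.95 review threshold are my own illustrative choices, and I assume a learned or adversarial scheme would evade something this simple; I mention it only as a baseline, and I'd love to hear what more robust detectors look like in practice.

```python
import numpy as np
from scipy.stats import chi2

def lsb_chisquare_pvalue(image: np.ndarray) -> float:
    """Chi-square test for sequential LSB embedding in an 8-bit image.

    LSB embedding tends to equalize the counts of each value pair (2k, 2k+1);
    a p-value close to 1.0 means the pairs look suspiciously even.
    """
    hist = np.bincount(image.flatten(), minlength=256).astype(float)
    evens, odds = hist[0::2], hist[1::2]
    expected = (evens + odds) / 2.0
    mask = expected > 0                       # ignore empty value pairs
    stat = np.sum((evens[mask] - expected[mask]) ** 2 / expected[mask])
    dof = int(mask.sum()) - 1                 # degrees of freedom
    return float(chi2.sf(stat, dof))          # survival function = p-value

# e.g. flag generated images where lsb_chisquare_pvalue(img) > 0.95 for review
```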
Happy to help however I can, even if that’s just by raising the issue. I care a lot about the success of this endeavor!
Thank you!