Why the same input passes Guardrails in Agent Builder but fails in ChatKit

Hi everyone,

We’re seeing inconsistent Guardrails behavior between Agent Builder (workflow testing mode) and our ChatKit integration.

When we provide the exact same user input:

  • :white_check_mark: It passes all Guardrails checks in the Agent Builder workflow test

  • :cross_mark: It fails Guardrails validation when triggered via ChatKit

The Guardrails configuration is the same.
The workflow logic is the same.
The input string is identical.
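
One thing worth ruling out is an invisible difference in the input itself (a non-breaking space, a trailing newline, a different Unicode form) introduced somewhere between the chat UI and the workflow. Below is a minimal sketch of such a check in plain Python. It assumes the two strings have already been captured from each environment; `fingerprint` and `describe_differences` are hypothetical helper names, not part of Agent Builder, ChatKit, or Guardrails.

```python
import hashlib
import unicodedata


def fingerprint(text: str) -> str:
    """Return a short SHA-256 digest of the UTF-8 bytes of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def describe_differences(a: str, b: str) -> list[str]:
    """Describe where two strings differ, naming the code points involved."""
    notes = []
    if len(a) != len(b):
        notes.append(f"length differs: {len(a)} vs {len(b)}")
    # zip stops at the shorter string; extra trailing characters
    # are only reported through the length note above.
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            notes.append(
                f"index {i}: U+{ord(ca):04X} {unicodedata.name(ca, '?')} "
                f"vs U+{ord(cb):04X} {unicodedata.name(cb, '?')}"
            )
    return notes


if __name__ == "__main__":
    # Placeholder values; substitute the strings captured from each environment.
    builder_input = "Hello world"
    chatkit_input = "Hello\u00a0world"
    print(fingerprint(builder_input), fingerprint(chatkit_input))
    for note in describe_differences(builder_input, chatkit_input):
        print(note)
```

If the fingerprints match and no differences are reported, the discrepancy has to come from something other than the input text itself.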

This makes us suspect that Guardrails may be evaluated under different conditions depending on the environment.

Some questions:

  • Does ChatKit inject additional system context before Guardrails evaluation?

  • Are Guardrails run on the raw user input in Agent Builder but on the full assembled prompt in ChatKit?

  • Is there a difference in execution order between environments?

  • Could model configuration or safety layers differ implicitly?

  • Is there any way to inspect Guardrails evaluation logs, so we can see which check failed and what input it received?

We’re trying to understand whether this is expected behavior or a configuration issue on our side.

Any insights would be appreciated :folded_hands: