GPT-4o realtime different response between playground and websocket API

Hi,
I tried the official GPT-4o realtime, and using the same system instruction, the results I got from the Playground and from calling the API via WebSocket are quite different. Below is my system instruction: ‘Your task is to respond to math problems with ‘correct’ or ‘incorrect’ only. No other output is allowed.’

When the user inputs ‘1+1=2’, the Playground returns the expected output ‘correct’. However, when calling the API, I get unexpected results, such as ‘That’s right, 1+1=2, this is correct.’

Could you please let me know if it’s possible to make the API output consistent with the Playground’s results?