Hello, I am trying to use the Realtime API (WebSocket) to analyze some images. I assume (hope) the main issue is syntax, as documentation seems limited.
I am trying to send each message as follows:
event = {
    "type": "response.create",
    "response": {
        "modalities": ["text", "image"],
        "instructions": message,
        "images": frames
    }
}
ws.send(json.dumps(event))
However, it is not working: the on_message function isn't even being called. Everything works fine without images. Is this the right syntax, or is there a different way to feed images? If anyone has found a way to feed images to the Realtime API, I would really like to hear about your approach.
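
One thing that may be worth ruling out before the event syntax itself: if frames holds raw image bytes, json.dumps raises a TypeError, so nothing is ever sent and on_message would never fire. The sketch below is only a sanity check under that assumption. The helper names (encode_frames, build_event, serialize_event) are hypothetical, and the "image" modality and "images" field are copied from the attempt above; whether the server accepts them is not confirmed here.

```python
import base64
import json


def encode_frames(frames):
    """Base64-encode raw image bytes so they can be placed in a JSON payload."""
    return [base64.b64encode(frame).decode("ascii") for frame in frames]


def build_event(message, frames):
    """Build the event from the post, with frames converted to base64 strings.

    Assumption: the "image" modality and "images" field mirror the original
    attempt and are NOT confirmed to be accepted by the API.
    """
    return {
        "type": "response.create",
        "response": {
            "modalities": ["text", "image"],
            "instructions": message,
            "images": encode_frames(frames),
        },
    }


def serialize_event(event):
    """Serialize the event, surfacing a clear error if it is not valid JSON data."""
    try:
        return json.dumps(event)
    except TypeError as exc:
        # Raw bytes (or arrays) in the payload end up here instead of on the wire.
        raise ValueError(f"event is not JSON-serializable: {exc}") from exc
```

With this in place, ws.send(serialize_event(build_event(message, frames))) either sends a well-formed JSON string or fails loudly on the client side. If the payload does serialize and the server still stays silent, logging the error callback of the WebSocket client would be the next thing to check.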