Some OpenAI API endpoints, like chat/completions, return detailed token usage, which is very useful. However, others don't provide any of this information.
For example:
- With real-time audio via WebRTC, there's no way for the client to track how many tokens it is consuming.
- With text-to-speech, the response only includes the audio file — no metadata, no token count.
This inconsistency makes it harder to monitor usage and control costs, especially in production environments. It would be great to have token usage data consistently available across all endpoints.
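To illustrate the gap: when a `usage` block is present (as in chat/completions), cost tracking is a simple field lookup; when it's absent, there's nothing to look up. The sketch below uses a trimmed, hypothetical response payload — the field names match the documented chat/completions `usage` object, but the values are made up.

```python
import json

# Hypothetical chat/completions response body, trimmed to the fields
# relevant here. The "usage" block is what other endpoints lack.
sample_response = json.loads("""
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "Hello!"}}
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
""")

usage = sample_response.get("usage")
if usage is not None:
    print(f"prompt={usage['prompt_tokens']} "
          f"completion={usage['completion_tokens']} "
          f"total={usage['total_tokens']}")
else:
    # This branch is the problem described above: responses from the
    # Realtime (WebRTC) and text-to-speech endpoints have no comparable
    # block, so there is nothing to meter.
    print("no usage data returned")
```

If every endpoint returned a block shaped like this, the same few lines of metering code could cover the whole API surface.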