I’m experiencing a persistent issue with object counting using OpenAI vision models:
The problem: when analyzing the exact same image, which contains 28 coins:
- ChatGPT UI (o4-mini, o3, GPT-4o): Consistently counts 28 coins correctly
- API/Playground (o4-mini, o3, GPT-4.1): Always returns incorrect counts (25, 30, 35)
I’ve extensively tested various parameters in API calls:
- Different temperature values (0-0.5)
- All reasoning_effort settings
- Adjusted max_tokens (10-4000)
- Various prompting strategies
- Stripped down system prompts to the bare minimum (a minimal sketch of the call is included after this list)
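
For reference, this is roughly the shape of the call I've been testing against the Chat Completions endpoint. It's a minimal repro sketch, not my exact production code: the model name, prompt wording, image path, and parameter values shown are placeholders standing in for the combinations listed above.

```js
// count-coins.js — minimal repro sketch (placeholders, not the production openaiService.js code)
const fs = require("fs");
const OpenAI = require("openai");

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function countCoins(imagePath) {
  // Send the image inline as base64
  const base64Image = fs.readFileSync(imagePath, "base64");

  const response = await client.chat.completions.create({
    model: "gpt-4.1",   // also tested o3 / o4-mini, using reasoning_effort instead of temperature
    temperature: 0,     // varied between 0 and 0.5
    max_tokens: 300,    // varied between 10 and 4000
    messages: [
      { role: "system", content: "You count objects in images." },
      {
        role: "user",
        content: [
          { type: "text", text: "How many coins are in this image? Reply with a single number." },
          {
            type: "image_url",
            image_url: { url: `data:image/jpeg;base64,${base64Image}` },
          },
        ],
      },
    ],
  });

  return response.choices[0].message.content;
}

countCoins("./coins.jpg").then(console.log).catch(console.error);
```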
Despite identical images and near-identical prompts, the UI consistently succeeds where the API fails. Our backend uses openaiService.js with a standard system prompt that we’ve progressively simplified.
Has anyone else encountered this discrepancy between the UI and the API for vision counting tasks? Are there hidden UI parameters or different model versions being served?