Hi everyone,
I’m building a chatbot called Prospecting Bot that uses a vector database (attached as a dataset) to answer user queries.
I’m seeing inconsistent behavior in retrieval:
- In the OpenAI Playground, the bot works as expected:
  - It searches the vector database
  - Responses are correctly grounded in the dataset
- However, when running the same logic through the Responses API and returning the response body:
  - Sometimes the vector DB is not queried
  - Sometimes the model hallucinates or returns incorrect information, even though relevant data exists in the vector DB

This happens even for very similar or identical user queries.
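For reference, this is roughly how I’m invoking retrieval. A minimal sketch: the model name, query, and vector store ID are placeholders, not my real values, and the payload just mirrors the hosted `file_search` tool shape for the Responses API:

```python
# Minimal sketch of the Responses API request I'm sending.
# Placeholders only: "gpt-4o", the query string, and "vs_placeholder"
# stand in for my real model, user input, and vector store ID.

def build_request(user_query: str, vector_store_id: str) -> dict:
    """Assemble a Responses API request body with the hosted file_search tool."""
    return {
        "model": "gpt-4o",
        "input": user_query,
        "tools": [
            {
                "type": "file_search",  # hosted vector-store retrieval tool
                "vector_store_ids": [vector_store_id],
            }
        ],
    }

payload = build_request("example prospecting question", "vs_placeholder")
print(payload["tools"][0]["type"])  # file_search
```

Sometimes this retrieves as expected, sometimes the same payload skips the vector store entirely.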
My requirement is strict grounding:

- The bot should always rely on vector search
- If no relevant data is found, it should refuse to answer instead of guessing
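Concretely, the contract I’m after would look something like this sketch. It assumes `tool_choice` accepts the hosted `file_search` tool type and that `instructions` is honored; the refusal sentence is my own wording:

```python
# Sketch of the strict-grounding request I'd like to send.
# Assumptions: tool_choice can name the hosted file_search tool, and the
# instructions string is followed. The refusal text is my own wording.

REFUSAL = "I couldn't find anything relevant in the knowledge base."

def build_grounded_request(user_query: str, vector_store_id: str) -> dict:
    """Request body that forces retrieval and prefers refusal over guessing."""
    return {
        "model": "gpt-4o",
        "instructions": (
            "Answer only from file_search results. "
            f"If no relevant passage is retrieved, reply exactly: {REFUSAL}"
        ),
        "input": user_query,
        "tools": [{"type": "file_search", "vector_store_ids": [vector_store_id]}],
        # Force a retrieval call on every turn instead of letting the model decide.
        "tool_choice": {"type": "file_search"},
    }
```

If there’s a better-supported way to express “always retrieve, never guess,” I’d love to hear it.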
I’d appreciate guidance on:

- Why behavior might differ between the Playground and the Responses API
- How to force or strongly bias the model to always use vector retrieval
- Common causes of inconsistent retrieval (chunking, similarity thresholds, tool invocation)
- Best practices or guardrails to prevent hallucinations in RAG-based chatbots
- Whether this is expected behavior when using the Responses API vs the Assistants API
Any insights or debugging tips would be very helpful.
Thanks in advance!