Hello, I’m building AI-powered chatbots using the Responses API along with the OpenAI Vector Store API.
On my platform, users upload plain text documents (usually .txt) that contain a list of products or services with their prices. The chatbot then answers questions based on those uploads.
Example queries:
- “Do you guys have Soda?”
- “Soda”
About 80% of the time the answers are correct, but 20% of the time the model produces irrelevant or nonsensical responses.
One user uploaded a file with ~50 lines (names of sales representatives + the geographic areas they cover). The expectation is that the chatbot can correctly say which rep covers a given area. However, the error rate in this case is unacceptably high, despite the file being very small and straightforward.
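To make the expectation concrete, here is a minimal sketch of the deterministic lookup I'd like the chatbot to match for that file. The sample names, the areas, and the `rep - area` line format are made up for illustration (the real file differs), and `build_area_index` / `rep_for_area` are hypothetical helpers, not part of any OpenAI API.

```python
from typing import Optional

# Hypothetical sample data; the real file's names and line format differ.
SAMPLE = """\
Alice Johnson - North Region
Bob Smith - South Region
Carla Gomez - East Region
"""


def build_area_index(text: str, sep: str = " - ") -> dict:
    """Map each case-folded area name to the rep who covers it."""
    index = {}
    for line in text.splitlines():
        if sep not in line:
            continue  # skip blank or malformed lines
        rep, area = line.split(sep, 1)
        index[area.strip().casefold()] = rep.strip()
    return index


def rep_for_area(index: dict, area: str) -> Optional[str]:
    """Return the rep for an area, or None if the area is not listed."""
    return index.get(area.strip().casefold())


index = build_area_index(SAMPLE)
print(rep_for_area(index, "south region"))
```

With only ~50 rows, an exact lookup like this never returns the wrong rep, which is the baseline I'm comparing the retrieval-based answers against.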
Question:
How can I reduce the error rate and make the responses more deterministic and accurate?
- Would changing the input format (e.g., asking users to upload CSV instead of plain TXT) improve the retrieval and grounding?
- Or is there a better approach for handling these structured, tabular-like datasets with the Vector Store API?
Any advice or best practices would be appreciated.