I am working on an object-categorisation project with the categories stored in a vector store (5.5k unique categories), using gpt-4o-mini as the main model.
My questions are:
How to reduce hallucinations?
Is it possible to use batch processing with the assistant API instead of asynchronous processing?
Why does the status sometimes return as “failed” on the first run for some items, but generate different outputs on the second run with the same input?
How can I determine if my usage of the assistant API has hit its rate limit?
You can reduce hallucinations by asking only for what the AI can actually know, and by instructing the AI to return an error whenever the answer is not directly supported by what the file search tool returns, if that is where your issues arise.
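As a sketch of that kind of instruction (the exact wording and the `UNKNOWN` fallback label are my own assumptions, not anything official):

```python
def build_classifier_instructions(fallback_label: str = "UNKNOWN") -> str:
    """Build strict system instructions that constrain the model to
    categories found via file search, with an explicit error fallback."""
    return (
        "You are a categorisation assistant. Classify the user's item into "
        "exactly one category name that appears verbatim in the file search "
        "results. Do not invent, merge, or paraphrase category names. "
        f"If no returned category clearly matches, reply with only: {fallback_label}"
    )

# Pass this string as the assistant's instructions when creating it.
instructions = build_classifier_instructions()
print(instructions)
```

The explicit fallback gives you something machine-checkable downstream, instead of a plausible-sounding invented category.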
The AI must write its own search query to get information from a vector store, and what comes back is chunked data ranked by semantic similarity. A long text of categories split into chunks, which the AI must make extra tool calls to retrieve, will result in a poor-performing application.
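Given that, a common alternative (my own sketch, not a feature of the Assistants API) is to skip file search entirely: embed each category name once, embed the incoming item, and pick the nearest category by cosine similarity yourself. With real data you would get the vectors from an embeddings endpoint; here they are hard-coded stand-ins:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_category(item_vec, category_vecs):
    """Return the category name whose embedding is most similar to the item's."""
    return max(category_vecs, key=lambda name: cosine(item_vec, category_vecs[name]))

# Toy 3-d embeddings standing in for real ones (e.g. from an embeddings model).
categories = {
    "kitchenware": [0.9, 0.1, 0.0],
    "electronics": [0.1, 0.9, 0.2],
}
print(nearest_category([0.8, 0.2, 0.1], categories))  # → kitchenware
```

This keeps the model out of the retrieval loop entirely; the model (if used at all) only confirms or refines the top few candidates.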
To get fine-grained error information, list and inspect the run step objects.
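A sketch of that, assuming the official `openai` Python SDK (v1.x); the thread and run IDs are placeholders, and the helper itself works on plain step data:

```python
def failed_step_errors(steps):
    """Collect (step_id, error_code, message) for every failed run step."""
    report = []
    for step in steps:
        if step.get("status") == "failed" and step.get("last_error"):
            err = step["last_error"]
            report.append((step["id"], err.get("code"), err.get("message")))
    return report

# Live usage (requires OPENAI_API_KEY; IDs below are placeholders):
# from openai import OpenAI
# client = OpenAI()
# steps = client.beta.threads.runs.steps.list(thread_id="thread_abc", run_id="run_abc")
# print(failed_step_errors([s.model_dump() for s in steps.data]))
```

The `last_error` field on a failed step carries the code (for example `rate_limit_exceeded`) and a human-readable message, which is far more informative than the run's top-level "failed" status alone.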
You don’t get to observe the API or token rate limit directly: no header is returned telling you how many Assistants endpoint API calls per minute remain (the limit can be quite low, like 60 RPM), and the Assistants API disregards your organization’s model rate limits, calling the model iteratively until they are exhausted.
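Since there is no header to watch, about the only practical signal is the error itself: catch the rate-limit exception and back off. A minimal retry sketch, written generically so the exception type is injectable (with the openai v1 SDK you would pass `openai.RateLimitError`):

```python
import time

def with_backoff(fn, retry_on, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on the given exception type."""
    for attempt in range(max_tries):
        try:
            return fn()
        except retry_on:
            if attempt == max_tries - 1:
                raise  # out of retries, surface the error
            sleep(base_delay * (2 ** attempt))

# Demo with a stand-in exception instead of a live API call:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated 429")
    return "ok"

print(with_backoff(flaky, TimeoutError, sleep=lambda s: None))  # → ok
```

The injectable `sleep` also makes the retry logic testable without actually waiting.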
There is no batch processing for Assistants. Assistants is inherently multi-turn and is not a model, but rather a computer program that calls models.
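The separate Batch API does exist for plain Chat Completions, though, so one workaround (a sketch only; putting categories in the prompt works only if the list fits in context, which at 5.5k categories it may not) is to bypass Assistants and build a JSONL batch file of classification requests:

```python
import json

def batch_line(custom_id, item_text, instructions, model="gpt-4o-mini"):
    """One JSONL line in the Batch API request format (Chat Completions endpoint)."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [
                {"role": "system", "content": instructions},
                {"role": "user", "content": item_text},
            ],
        },
    })

line = batch_line("item-1", "stainless steel whisk", "Classify into one category.")
print(json.loads(line)["url"])  # → /v1/chat/completions
```

You would write one such line per item to a `.jsonl` file, upload it, and create a batch job against it; results come back keyed by `custom_id`.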