Your question was “which API to use in my case”.
What I have been advocating for (and the code is open source) is to use both the Assistants API and the Chat Completions API. The rationale, in my mind, is simple.
Use Assistants, Threads and Messages (from the Assistants API) as a persistent store of instructions and interactions with the LLMs. Use Chat Completions to do the actual text generation.
One of the benefits is much better control over the interaction (i.e. you can ignore certain messages and keep the context focused on what matters), as sketched below.
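For concreteness, here is a minimal sketch of that split using the OpenAI Python SDK. The model name, the metadata-based filtering rule, and the example messages are my own assumptions, not from my actual code; the point is only to show Threads/Messages as the store and Chat Completions as the generator.

```python
from openai import OpenAI

client = OpenAI()

# 1. Persist the conversation in a Thread (Assistants API as the store).
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Summarize our last design discussion.",  # example message (assumption)
)

# 2. Pull the stored messages back and keep only the ones that matter.
#    The metadata flag here is an arbitrary example rule -- use your own logic.
stored = client.beta.threads.messages.list(thread_id=thread.id, order="asc")
context = [
    {"role": m.role, "content": m.content[0].text.value}
    for m in stored.data
    if (m.metadata or {}).get("ignore") != "true"
]

# 3. Generate with Chat Completions, using only the curated context.
response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name
    messages=[{"role": "system", "content": "You are a concise assistant."}] + context,
)
print(response.choices[0].message.content)

# 4. Optionally write the completion back into the Thread so the store stays complete.
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="assistant",
    content=response.choices[0].message.content,
)
```

Because you decide exactly which stored messages go into the `messages` array, the Thread can hold the full history while each generation call only pays for (and is steered by) the parts you choose.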
Speaking of context: I have never needed 5000 tokens of instructions, and that much may be overkill for an advanced model like gpt-4o or o1. These models understand what you want without being fed exhaustive instructions. Of course, do your own experimentation.
hth