I have been tasked with building a chatbot for a company. It is supposed to serve as a knowledge base where customers can ask questions instead of writing a support ticket. I already have a JSON file with all the information the chatbot needs. I am using an Assistant through the Assistants API, and I uploaded the document for use with the file_search tool. When I ask example questions in the OpenAI Playground, the response times are fairly decent. But once I use the exact same Assistant through the Assistants API v2 (Python), the response times double.
I have multiple questions now:
-Is the Assistant the best tool for my use case? Would the Chat Completions API be a better option?
-Could the slow response time be tied to me using the Python library? Could it be faster with curl or Rust, for example?
-Could my fairly long instructions be the reason the response time is slow?
-Is it advisable to build my own vector database instead of using the built-in file_search tool?
-Are there any other steps I can take to ensure faster response time while still getting accurate information?
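To make the Playground vs. API comparison measurable, here is a minimal sketch of how the timing could be taken. It separates the time until the first chunk arrives from the total time, since the Playground streams its output and may simply feel faster. Note that `measure_stream` and `fake_stream` are hypothetical names for illustration only; `fake_stream` is a stand-in for the real streamed API response and makes no network calls.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[str, float, float]:
    """Consume text chunks and return (text, time_to_first_chunk, total_time) in seconds."""
    start = clock()
    first = None
    parts = []
    for chunk in chunks:
        if first is None:
            # Latency until the very first piece of the answer shows up.
            first = clock() - start
        parts.append(chunk)
    total = clock() - start
    if first is None:
        # Empty stream: there was no first chunk, so report the total instead.
        first = total
    return "".join(parts), first, total


def fake_stream(delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streamed response (assumption, not a real API call)."""
    for piece in ["Hello", ", ", "world"]:
        time.sleep(delay)
        yield piece


if __name__ == "__main__":
    text, first, total = measure_stream(fake_stream())
    print(f"first chunk after {first:.3f}s, full answer after {total:.3f}s: {text!r}")
```

Replacing `fake_stream()` with the text deltas from a real streamed run would show whether the delay lies in waiting for the first token or in the full generation.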
Thank you very much for taking the time to read this!