I’ve been using gpt-4o-mini with a chat completion to narrow the candidate tools for a query down to 0-2 (this takes about half a second). But even with gpt-4o-mini as the model for the Run, a simple tool call still takes 5+ seconds. Has anyone found a way to reduce this latency? How long do queries typically take in your app?
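
For anyone wanting to compare numbers, here is a minimal sketch of the pre-filtering step described above, with a timer around it. Everything named here is an assumption for illustration: the `ALL_TOOLS` catalog, the `narrow_tools` and `timed` helpers, and `keyword_selector`, which is only a local stand-in for the chat completion call that picks tool names. It is not the actual setup from the post, just one way the narrowing could be structured.

```python
import time
from typing import Callable

# Hypothetical tool catalog, written in the function-tool dict shape.
ALL_TOOLS = [
    {"type": "function", "function": {"name": "get_weather", "description": "Look up current weather"}},
    {"type": "function", "function": {"name": "search_orders", "description": "Search customer orders"}},
    {"type": "function", "function": {"name": "send_email", "description": "Send an email"}},
]


def narrow_tools(query: str, all_tools: list, select_names: Callable, max_tools: int = 2) -> list:
    """Return at most max_tools tool definitions whose names the selector picked."""
    by_name = {tool["function"]["name"]: tool for tool in all_tools}
    picked = select_names(query, list(by_name))
    kept = []
    for name in picked:
        # Skip names that are not in the catalog, and skip duplicates.
        if name in by_name and name not in kept:
            kept.append(name)
    return [by_name[name] for name in kept[:max_tools]]


def timed(label: str, fn: Callable, *args):
    """Call fn(*args), print the wall-clock time, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f}s")
    return result, elapsed


def keyword_selector(query: str, names: list) -> list:
    """Stand-in for the model call: pick tools whose last name part appears in the query."""
    lowered = query.lower()
    return [name for name in names if name.split("_")[-1] in lowered]


if __name__ == "__main__":
    tools, seconds = timed("tool selection", narrow_tools,
                           "what's the weather in Paris", ALL_TOOLS, keyword_selector)
    print([tool["function"]["name"] for tool in tools])
```

Wrapping the Run itself in the same `timed` helper would show how much of the 5+ seconds is the Run versus the selection step. The narrowed list is meant to be passed to the Run as its tool set, assuming your SDK version supports overriding tools per Run.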