Hi Folks,
I’ve spent the last two months using the Assistants API to build out www.docmonster.ai, and
I wanted to share the best tricks I’ve found to get to somewhat production speed. By that I mean responses in under about 1.2 sec.
Context
Model: gpt-3.5-turbo
Tip 1 for speed:
Every exchange has three parts:
Thread,
User message,
Run.
Messages are added to a thread, and a run processes the thread's messages with the assistant. My biggest speed improvement came from splitting off the first part, thread creation. When the user opens my bot by clicking on the + symbol, the frontend sends a request to the backend to initialise a thread and have it ready. Then, when the user sends a message, the message and the run are added to that thread. The thread ends when the user closes the bot.
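A minimal sketch of that flow, assuming a `client` object shaped like the OpenAI Node SDK's beta Assistants methods (`beta.threads.create`, `beta.threads.messages.create`, `beta.threads.runs.create`). The handler names, the `sessionId` key and the in-memory `Map` are hypothetical and only there for illustration; check the method shapes against the SDK version you have installed.

```javascript
// Sketch only: create the thread when the bot opens, reuse it when the user sends.
// `client` is assumed to look like the OpenAI Node SDK client (beta Assistants API).

const threadsBySession = new Map(); // sessionId -> Promise resolving to a thread

// Called when the user clicks "+" to open the bot.
function onBotOpened(client, sessionId) {
  if (!threadsBySession.has(sessionId)) {
    // Store the promise rather than the result, so a very fast first
    // message can still await the same in-flight thread creation.
    threadsBySession.set(sessionId, client.beta.threads.create());
  }
}

// Called when the user sends a message.
async function onUserMessage(client, sessionId, assistantId, text) {
  onBotOpened(client, sessionId); // fallback in case the open event was missed
  const thread = await threadsBySession.get(sessionId);
  await client.beta.threads.messages.create(thread.id, {
    role: "user",
    content: text,
  });
  return client.beta.threads.runs.create(thread.id, {
    assistant_id: assistantId,
  });
}

// Called when the user closes the bot.
function onBotClosed(sessionId) {
  threadsBySession.delete(sessionId);
}
```

The point of the design is that the thread-creation round trip happens while the user is still typing, so it no longer counts toward the time between pressing send and seeing a reply.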
Tip 2 for speed:
Use promises to run independent calls concurrently instead of awaiting them one after the other.
const [sentMessage, run] = await Promise.all([sentMessagePromise, runPromise]);
This made a big difference as well.
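To show why this helps, here is a small self-contained sketch that compares awaiting two calls one after the other with starting both and awaiting them together. `fakeCall` is a hypothetical stand-in for a real API request (it just resolves after a delay), and `timeIt` is a throwaway helper, so the numbers only illustrate the pattern.

```javascript
// Hypothetical stand-in for an API request: resolves with `label` after `ms`.
const fakeCall = (label, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(label), ms));

// One after the other: the second call does not start until the first is done.
async function sequential() {
  const sentMessage = await fakeCall("sentMessage", 100);
  const run = await fakeCall("run", 100);
  return [sentMessage, run];
}

// Concurrent: start both calls first, then wait for both together.
async function concurrent() {
  const sentMessagePromise = fakeCall("sentMessage", 100);
  const runPromise = fakeCall("run", 100);
  return Promise.all([sentMessagePromise, runPromise]);
}

// Small helper to measure wall-clock time of an async function.
async function timeIt(fn) {
  const start = Date.now();
  const result = await fn();
  return { result, ms: Date.now() - start };
}
```

Calling `timeIt(sequential)` should take roughly the sum of the two delays, while `timeIt(concurrent)` should take roughly the longer of the two. This only pays off when the calls do not depend on each other's results; if a run has to see a message, that message needs to exist on the thread before the run starts.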
Tip 3 for speed: More dynos
My backend was running on Heroku and I was using one Eco dyno. Adding 2 Standard dynos made a world of difference. Don’t judge the API by the speed it has on your local server. Even without streaming it’s not horrible!
Tip for Hallucinations:
I had to keep gaslighting my bot to make it retrieve from the file retrieval tool. It would keep saying it was unable to access files, and it only retrieved them after I replied with “try again you have access, it works”.
I added this to the instructions when my bot is created to solve for this: “If the myfiles_browser tool doesnt work the first time, try again till it works”
myfiles_browser seems to be the file retrieval tool’s internal name. Adding this instruction at both the creation and the message level has made it stop telling me it couldn’t access my files.
It’s possible this is coincidence and the instruction doesn’t really help, but it’s consistently worked for me.
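A rough sketch of wiring that instruction in at both levels, again assuming a `client` shaped like the OpenAI Node SDK's beta Assistants methods. The helper names are hypothetical. The `retrieval` tool type is the name from the original Assistants beta and may differ in later API versions, and using `additional_instructions` on the run is just one reading of "message level", so treat both as assumptions.

```javascript
// The instruction text quoted in the post, kept verbatim.
const RETRY_NOTE =
  "If the myfiles_browser tool doesnt work the first time, try again till it works";

// Append the note to the base instructions unless it is already there.
function withRetryNote(baseInstructions) {
  return baseInstructions.includes(RETRY_NOTE)
    ? baseInstructions
    : `${baseInstructions}\n\n${RETRY_NOTE}`;
}

// Creation level: bake the note into the assistant's instructions.
// Assumption: `retrieval` is the tool type from the original beta API.
function createDocBot(client, baseInstructions) {
  return client.beta.assistants.create({
    model: "gpt-3.5-turbo",
    instructions: withRetryNote(baseInstructions),
    tools: [{ type: "retrieval" }],
  });
}

// Per-request level: repeat the note on each run.
// Assumption: the run accepts an `additional_instructions` field.
function createRunWithRetryNote(client, threadId, assistantId) {
  return client.beta.threads.runs.create(threadId, {
    assistant_id: assistantId,
    additional_instructions: RETRY_NOTE,
  });
}
```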
Anyway, that’s it from me. Hope these tips were useful. If you want to try the bot’s speed, you can do it on my product for free at www.docmonster.ai to see what I mean. Plus, I’m launching what I think may be one of the first few production apps using the Assistants API, so wish me luck!