Seeking the Best API Choice: Should I Use OpenAI's Assistant API or Chat Completion API?

I am currently using OpenAI’s API but am unsure whether to use the Assistant API or the Chat Completion API. My goal is to enable users on WordPress to interact with my fine-tuned model (such as GPT-4 Mini) and incorporate some files as part of a Retrieval-Augmented Generation (RAG) system. This API needs to handle high traffic, and the frontend will remember each user’s conversation history, allowing them to either continue previous conversations or start new ones. The overall functionality is similar to the ChatGPT web interface, but I will develop it as a WordPress plugin, utilizing the fine-tuned GPT-4 Mini model with RAG integration. Does anyone have better suggestions?

Welcome to the forum!

Assistants are easier to set up, but you give up a little control. Rolling your own RAG solution is more difficult but gives you a lot more control.

There might already be a WP plugin?

Thank you for your reply. I’m glad to join the forum! While it sounds like an assistant might make things more convenient, what kind of control might be lost in the process? Do you have any tips for this kind of development?

Additionally, most other WP plugins are designed for administrators, focusing on optimizing posts, SEO, etc., but I want to create one that allows users to use GPT models in WordPress just like on the ChatGPT website, and also offer additional services.

Control over what’s included in the context. Assistants do it automatically, but sometimes it’s not as good as when you generate the prompt on your own with a RAG system.
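To make the "generate the prompt on your own" idea concrete, here is a minimal sketch of assembling the context yourself instead of letting the Assistants retrieval decide. The `build_messages` helper, the pre-scored chunks, and the character budget are all illustrative assumptions, not part of any OpenAI SDK:

```python
# Hypothetical sketch of custom context assembly for a RAG prompt.
# `chunks` is assumed to be (text, relevance_score) pairs you already
# retrieved and scored yourself; the budget is characters for simplicity.

def build_messages(question, chunks, max_chars=2000):
    """Pick the top-ranked chunks that fit the budget and build a chat prompt."""
    context, used = [], 0
    for text, score in sorted(chunks, key=lambda c: c[1], reverse=True):
        if used + len(text) > max_chars:
            break
        context.append(text)
        used += len(text)
    system = "Answer using only this context:\n" + "\n---\n".join(context)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_messages(
    "How do I reset my password?",
    [("Passwords are reset via Settings > Security.", 0.92),
     ("Billing is handled monthly.", 0.31)],
)
```

This is exactly the control you lose with automatic retrieval: you decide the ranking, the budget, and the wording of the system prompt.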

Gotcha. I thought there was something like this available, but I haven’t looked recently. If you come up with a solution, definitely let us know.

Paul knows what he’s talking about. One thing to note, though: you can still build your own RAG with Assistants using function calls, and it works well. I personally like the thread management that’s built into Assistants, which is why I started with it, but you can do a custom implementation with Chat Completions.
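For reference, a custom-RAG-via-function-calls setup usually starts with a function definition attached to the Assistant, so the model asks *your* code for documents instead of using the built-in retrieval. The tool name `search_docs` and its parameters below are illustrative assumptions, not an OpenAI-defined schema:

```python
# Hypothetical function definition you might register on an Assistant.
# When the model decides it needs context, it calls this function and
# your back end does the actual retrieval.
search_tool = {
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "Look up passages relevant to the user's question.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search phrase distilled from the question.",
                }
            },
            "required": ["query"],
        },
    },
}
```

The model fills in `query`; your code returns the retrieved passages as the tool output, which puts you back in control of what lands in the context.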

I personally disabled the Assistants’ built-in retrieval and set up my own because it wasn’t good enough.

Thank you for your reply! I’ll definitely keep you updated if I come across any better solutions.

Thank you for your response; it taught me a lot about the Assistant API. Could you please share how you implemented RAG with function calls to work with the Assistant API? I really appreciate it!

I implemented RAG for release updates of features and used Function Calling to send the whole conversation I had with the Assistant via email. If you want to go through these two items, check out OpenAI Assistant V2 and Function Calling using OpenAI Assistant V2.

Let me know if you have any specific questions around Function Calling.

Thank you for the clear and concise videos. I watched both of them! They helped me better understand how these concepts work together. Thanks again!

The videos from MrFriday explain it much better than I can, so definitely go off of those. To simplify the concept as much as possible: you just need a function for your assistant that supplies a query term or phrase. Then, on the back end, when this function is called, you use the query to pull relevant documents from your vector database and return them. You can improve retrieval with several different methods; this article has a few examples starting around section 2: Advanced RAG Techniques: Unlocking the Next Level | by Tarun Singh | Medium
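The back-end half of the flow above can be sketched like this. The document store, the word-overlap scoring, and the `handle_search_docs` name are stand-ins I made up; a real implementation would query a vector database instead:

```python
import json

# Toy back end: the assistant calls the retrieval function with a query,
# we score documents by word overlap (a stand-in for vector search) and
# return the best matches as the tool output.
DOCS = [
    "WordPress plugins are installed from the admin dashboard.",
    "Fine-tuned models are selected by model name in the API call.",
    "Conversation history can be stored per user in a database table.",
]

def handle_search_docs(arguments_json, top_k=2):
    """Parse the model's function-call arguments and return ranked passages."""
    words = json.loads(arguments_json)["query"].lower().split()
    scored = sorted(DOCS, key=lambda d: -sum(w in d.lower() for w in words))
    return json.dumps({"results": scored[:top_k]})

out = handle_search_docs('{"query": "store conversation history"}')
```

The JSON string you return here is what gets submitted back to the run as the tool output, so the model answers from exactly the passages you chose.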

Use both.

See here (Switching from Assistants API to Chat Completion?) for more in-principle arguments.

See here (GitHub - icdev2dev/selfet) for some implementation level details.

See the video at the top of that post (at roughly 5:20) for how conversations are modeled. The conversations are specialized threads (with metadata) (Selfet -- Towards Fully Autonomous Multiple Agents)

Depends on what you want to do. Chat Completions is better for one-and-done prompts because it’s faster and doesn’t require setting up an assistant. However, Assistants work better for conversations. I typically use a hack where I set up an assistant with instructions and use both methods, but for Chat Completions I pass the assistant’s instructions into the prompt to still leverage my setup.
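The "use both" hack boils down to keeping one instructions string and reusing it on both paths: as the Assistant's instructions, and as the system message for one-shot Chat Completions calls. The `INSTRUCTIONS` text and `build_one_shot` helper below are illustrative, not from any SDK:

```python
# Hypothetical shared-instructions hack: one string serves as the
# Assistant's instructions AND as the system message for fast one-shot
# Chat Completions calls, so both paths behave consistently.
INSTRUCTIONS = "You are a support bot for the Acme WordPress plugin."

def build_one_shot(prompt):
    """Wrap a single prompt with the shared instructions for a one-and-done call."""
    return [
        {"role": "system", "content": INSTRUCTIONS},
        {"role": "user", "content": prompt},
    ]

payload = build_one_shot("What does the plugin do?")
```

You would pass `payload` as the `messages` argument of a Chat Completions request, while the same `INSTRUCTIONS` string goes into the assistant when you create it.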