Here is my use case: I have a set of rules (5-100 pages at most), and I would like the LLM to give me a response in a DSL format, applying the rules wherever relevant. A couple of questions about the best way to design this.
1. Use the GPT API and pass the rules and the requested DSL output format as part of the prompt in every API call.
2. Create a custom GPT with all the context data, but it looks like they don't have an API open (yet). Does anyone know if this is on the roadmap?
3. Use the Assistants API. It looks like I can provide the context and DSL instructions when I set up the Assistant (one-time setup).
From reading threads in this forum, there have been discussions about the Assistants API underperforming compared to custom GPTs.
Has anyone compared the Assistants API with using GPT directly? I am leaning towards option 1, since option 2 is off the table without API support. Another option is to look at RAG, but my context data is not very large, so it might not be needed.
A custom GPT will typically be the cheapest of all the options. They can't be called via an API, as you note, and I don't think they will be callable in the near future; instead, Assistants will likely be improved until they are on par with custom GPTs.
Assistants are still in beta and don't offer much functionality that you couldn't replicate with two or three hours of work. Additionally, people complain that the costs are unpredictable, that there's no streaming, etc.
Using the raw models isn't all that difficult if you're a seasoned programmer, and in my opinion it's not much more work than getting Assistants to integrate with your system.
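To make that concrete, here is a minimal sketch of option 1: build the rules and the DSL output contract into the system message on every call. The model name and the exact DSL spec are placeholders you'd swap for your own.

```python
def build_messages(rules_text: str, dsl_spec: str, user_request: str) -> list[dict]:
    """Assemble the message list for one chat-completions call.

    The system prompt carries both the rules document and the
    output-format contract, so every call is self-contained.
    """
    system = (
        "You are a rules engine. Apply the following rules wherever relevant:\n"
        f"{rules_text}\n\n"
        "Respond ONLY in this DSL format:\n"
        f"{dsl_spec}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_request},
    ]

# Hypothetical usage with the official client (requires an API key):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-4-turbo",  # placeholder model name
#     messages=build_messages(my_rules, my_dsl_spec, "Generate a rule check for X"),
# )
# print(resp.choices[0].message.content)
```

If the rules fit comfortably in the context window, this really is all there is to it; the main cost is re-sending the rules text on every request.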
If you go for option 1 (using the models directly), you may need to look at RAG if you have a 100-page document. But it's not that complicated, and in my opinion it's good to be familiar with. You probably don't even need to provision a dedicated vector database; you can probably hold everything in memory and loop over it.
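The "hold everything in memory and loop over it" idea can be sketched in a few lines. This toy version uses bag-of-words cosine similarity instead of real embeddings so it runs standalone; in practice you'd replace `score` with similarity over embedding vectors (e.g. from an embeddings API) but keep the same loop.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 50) -> list[str]:
    # Naive chunker: fixed-size word windows. Real chunking would
    # respect rule/section boundaries in your document.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def score(query: str, passage: str) -> float:
    # Toy stand-in for embedding similarity: cosine over word counts.
    q, p = Counter(query.lower().split()), Counter(passage.lower().split())
    num = sum(q[w] * p[w] for w in q.keys() & p.keys())
    den = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in p.values()))
    return num / den if den else 0.0


def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
    # The whole "retrieval" step: one loop over an in-memory list.
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]
```

You would then paste the top-k chunks into the prompt instead of the full rules document. For 5-100 pages this loop is effectively instant, which is why a vector database is overkill here.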
If you’re absolutely strapped for time and don’t care about cost or ergonomics, assistants would be a fair choice.