I am a newbie who is interested in using the Assistants API to help me review a one to two page document or image, compare it to a somewhat complex set of design guidelines, and have the Assistant identify any non-compliant elements.
For example, the guidelines require specific ordering of information, language for headings or disclaimers, spacing, etc. Some of the guidelines are also conditional on whether the document does or does not contain certain information.
When creating an assistant, I’m not sure if it would be more helpful to use functions, embeddings, or retrieval. Or maybe even something else. Any thoughts?
Edit: I should have mentioned, some of the elements are visual, and some of them are textual.
As with everything in software development, the answer is “it depends”. But here are a few questions that can help you get to an initial answer. Your problem involves understanding two things:
- Design Guidelines
- The 1 to 2 page document (the API doesn’t support images yet, so you’ll have to focus on documents for now)
A good question to ask yourself first is which of the two your system should “learn” and which one it should “search”. For example, should it become an expert in the design guidelines, and then search through the document for evidence that they were followed? Or should it become an expert in the document, and then search the guidelines to check whether the document implements them correctly?
Developing “expertise” is better done using functions. You send the function the document or subset of the document you want it to consume, provide some context on how to understand/learn about it, and what to do with the information. “Searching” is better done with retrieval methods such as RAG.
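To make the “search” side concrete, here is a minimal sketch of retrieval over guideline snippets. It uses plain word-overlap cosine similarity as a stand-in for real embeddings, so treat it as an illustration of the idea rather than how the Assistants API retrieval tool works internally. The guideline texts and the `retrieve` helper are made up for the example.

```python
import math
import re
from collections import Counter

# Hypothetical guideline snippets; a real system would load these
# from the actual design guidelines.
GUIDELINES = [
    "The disclaimer heading must read 'Important Information' and appear last.",
    "Contact details must appear before pricing information.",
    "If the document mentions interest rates, include the rate disclaimer.",
]

def tokenize(text):
    """Lowercase the text and count its alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(passage, guidelines, k=2):
    """Return the k guidelines most similar to the document passage."""
    query = tokenize(passage)
    ranked = sorted(guidelines, key=lambda g: cosine(query, tokenize(g)), reverse=True)
    return ranked[:k]

top = retrieve("Our interest rates start at 4.5 percent.", GUIDELINES, k=1)
print(top[0])
```

In this toy setup the passage about interest rates pulls back the conditional rate-disclaimer rule, which is the rule you would then ask the model to check. With real embeddings the ranking would be based on meaning rather than shared words.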
Hope these thought starters help you at least define the questions you need to answer (usually 50% of solving a problem). Happy to provide more feedback if you provide more detail.
This sounds like a prompting problem. So I would think you do need functions or tools. But getting the right prompt will take some time. Be very elaborate and specific. I have prompts that are several pages long.
If in doubt, your prompt is probably not specific enough.
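One way to keep a long prompt specific is to generate it from a structured list of rules, including the conditional ones. The sketch below is only an assumption about how that could look: the `Rule` class, the rule texts, and the PASS / FAIL / NOT APPLICABLE answer format are all invented for illustration, and the resulting string would still need to be sent to the model yourself.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    rule_id: str
    text: str
    # Plain-language condition; None means the rule always applies.
    applies_if: Optional[str] = None

# Hypothetical rules standing in for the real design guidelines.
RULES = [
    Rule("R1", "The first heading must read exactly 'Summary'."),
    Rule("R2", "The disclaimer must be the last paragraph.",
         applies_if="the document mentions pricing"),
]

def build_prompt(rules, document_text):
    """Assemble an explicit, rule-by-rule review prompt."""
    lines = [
        "You are reviewing a document against design guidelines.",
        "Check every rule below. For each rule, answer PASS, FAIL, "
        "or NOT APPLICABLE and quote the evidence.",
        "",
        "Rules:",
    ]
    for r in rules:
        cond = f" (only if {r.applies_if})" if r.applies_if else ""
        lines.append(f"- {r.rule_id}{cond}: {r.text}")
    lines += ["", "Document:", document_text]
    return "\n".join(lines)

prompt = build_prompt(RULES, "Summary\nPrices start at $10.")
print(prompt)
```

Spelling out each rule with an ID makes it easier to see which guideline the model missed and to tighten the wording of that one rule.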