The documentation indicates that the Assistants API can call multiple tools during a single Run in response to a user query, but I don’t see any examples of this. I’m trying to understand the intended architecture for a multi-stage query.
As an example, let’s say I am working on a foods-as-medicine disease prevention app. There are three main sources of knowledge. Assume that the following JSON files all have classes that serialize the contents into objects that can be used for encoding or preprocessing in any way we desire.
- `diseases.json` - an array of `Disease` objects. Each object has a name, an array of alt_names, and an array of nutrients that fight the disease. Each nutrient has an efficacy score of 0-10 and a short description of why it’s effective.
- `nutrients.json` - a truncated list of `Food` objects, containing the name and their nutrient contents (fat, calories, protein, vitamin A, magnesium, etc.)
- `recipes.json` - a list of `Recipe` objects, containing a name, category, ingredients, and yield.
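For concreteness, the three files could deserialize into something like the following dataclasses. This is only a sketch: the field names and types are assumptions based on the descriptions above, not a spec.

```python
from dataclasses import dataclass

@dataclass
class NutrientEffect:
    name: str
    efficacy: int        # 0-10 score from diseases.json
    description: str     # short note on why the nutrient is effective

@dataclass
class Disease:
    name: str
    alt_names: list[str]
    nutrients: list[NutrientEffect]

@dataclass
class Food:
    name: str
    nutrients: dict[str, float]  # e.g. {"vitamin a": 900.0, "magnesium": 50.0}

@dataclass
class Recipe:
    name: str
    category: str
    ingredients: list[str]
    yield_: int          # "yield" is a Python keyword, hence the underscore
```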
I would like a user to make a complex query to the Assistant like: "I have the flu. Give me some soup recipes."
This query would involve multiple stages:
- Identify the most important nutrients for the disease.
- Identify the soup recipes with the highest concentration of these nutrients.
- Process the result and format it in natural language for the user.
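The first two stages are ordinary data work that needs no LLM at all. A minimal sketch over the JSON structures described above, with stage 3 left to the model; all function names and field layouts here are assumptions:

```python
def top_nutrients(disease_name, diseases, n=3):
    """Stage 1: rank the disease's nutrients by efficacy score, descending."""
    for d in diseases:
        names = [d["name"].lower(), *map(str.lower, d["alt_names"])]
        if disease_name.lower() in names:
            ranked = sorted(d["nutrients"], key=lambda x: x["efficacy"], reverse=True)
            return [x["name"] for x in ranked[:n]]
    return []

def top_recipes(category, nutrient_names, recipes, foods, n=5):
    """Stage 2: recipes in a category, scored by total target-nutrient content."""
    content = {f["name"]: f.get("nutrients", {}) for f in foods}
    def score(recipe):
        return sum(content.get(ing, {}).get(nut, 0)
                   for ing in recipe["ingredients"]
                   for nut in nutrient_names)
    pool = [r for r in recipes if r["category"] == category]
    return sorted(pool, key=score, reverse=True)[:n]
```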
What I’m confused about is how to combine the semantic capabilities of the Assistant with data retrieval and analysis. Vector databases, for example, don’t allow for sorting.
Solution A: Multiple Tool Functions

Idea: implement functions to analyze the data sets: `get_nutrients_for_disease(disease_name)` and `get_top_recipes(type, nutrient_arr)`. The Assistant should call `get_nutrients_for_disease`, then that output should be used to call `get_top_recipes`, with that result being used to construct the response to the user.
Problem: How can I ensure that the functions are called in the correct order? This involves a lot of work on the backend and doesn’t seem to really leverage the advantages of GPT. It’s nothing more than a natural language interface to a traditional service.
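For what it's worth, with the Assistants API the ordering usually doesn't need to be enforced on the backend: the run pauses with `requires_action` each time the model wants a tool, you submit the output, and the model decides whether a further call is needed on the next step. A sketch of that loop using the openai Python SDK's beta endpoints; the two backend functions are the hypothetical ones from Solution A, stubbed out here:

```python
import json
import time

def get_nutrients_for_disease(disease_name):
    return ["zinc", "vitamin c"]          # placeholder backend lookup (assumption)

def get_top_recipes(type, nutrient_arr):
    return ["chicken soup"]               # placeholder backend ranking (assumption)

TOOLS = {
    "get_nutrients_for_disease": lambda a: get_nutrients_for_disease(a["disease_name"]),
    "get_top_recipes": lambda a: get_top_recipes(a["type"], a["nutrient_arr"]),
}

def dispatch(tool_call):
    """Run one requested function and wrap its result for submit_tool_outputs."""
    args = json.loads(tool_call.function.arguments)
    result = TOOLS[tool_call.function.name](args)
    return {"tool_call_id": tool_call.id, "output": json.dumps(result)}

def run_until_done(client, thread_id, run):
    # Poll the run; each time it pauses with requires_action, answer every
    # pending tool call. The model may chain calls across iterations, so the
    # "order" emerges from its reasoning rather than from backend control flow.
    while run.status in ("queued", "in_progress", "requires_action"):
        if run.status == "requires_action":
            outputs = [dispatch(tc) for tc in
                       run.required_action.submit_tool_outputs.tool_calls]
            run = client.beta.threads.runs.submit_tool_outputs(
                thread_id=thread_id, run_id=run.id, tool_outputs=outputs)
        else:
            time.sleep(1)
            run = client.beta.threads.runs.retrieve(
                thread_id=thread_id, run_id=run.id)
    return run
```

The trade-off remains real, though: you are trusting the model to plan the chain, and a system-prompt hint about the intended sequence is usually still worthwhile.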
Solution B: Preprocessed Data

Idea: Using Chroma or Pinecone, create a vector database for RAG. The VDB would have the following collections: `nutrients`, `recipes`, `diseases`. The recipes collection would contain an enhanced version of the `Recipe` class with pre-calculated scores for each disease, along with the nutrient amounts derived from the ingredient list.
Problem: How could I handle sorting queries, like "What are the five foods with the most vitamin A?" You can’t sort by metadata fields. Is there some kind of associative database or pre-filtering method? Users may also form a query that a preprocessed field cannot satisfy, in which case the Assistant will likely hallucinate. It also requires updating the entire collection if I add more diseases or change any scores.
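One common workaround is to let the vector store do only the filtering and perform the ordering in application code. A sketch assuming Chroma-style metadata filters; the collection name and metadata keys are assumptions, and the Chroma call itself is shown only in comments:

```python
def top_k_by_metadata(records, key, k=5):
    """Sort retrieved records by a numeric metadata field, descending."""
    return sorted(records,
                  key=lambda r: r["metadata"].get(key, 0),
                  reverse=True)[:k]

# Against Chroma this would look roughly like (not executed here):
#   hits = collection.get(where={"vitamin_a": {"$gt": 0}},
#                         include=["metadatas", "documents"])
#   records = [{"name": doc, "metadata": meta}
#              for doc, meta in zip(hits["documents"], hits["metadatas"])]
#   top5 = top_k_by_metadata(records, "vitamin_a", k=5)
```

This still doesn't solve the staleness problem: the pre-computed scores have to be refreshed whenever the disease data changes.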
It feels like I’m misunderstanding something or missing a crucial part of the toolchain. Maybe the solution is neither of these? How are we meant to build relatively complex applications that involve multiple steps?