Using annotations for data returned by function calls

I have a hard time figuring out how to reference data returned from my function in the response text. Hope you can help.

I’m providing a couple of function calls to my assistant which will return some of our product data. I would love for the API to return annotations for the referenced data very similar to how it does it for file_citations already.

E.g. I have a function called getProducts which will return a JSON structure such as [{ "id": "c1", "name": "FooBar Computer", "price": 1000 }, { "id": "c2", "name": "Keyboard", "price": 50 }] which is then passed to the LLM.

Now, a user message such as “what’s the most expensive product?” would probably give a response such as “The most expensive product is our FooBar Computer with a price of 1000.”. That’s a great answer but I would really love to be able to reference this “FooBar Computer” which has an id of “c1” in our system.

It would be extremely nice if the response would include an citation to this so the response would be “The most expensive product is our FooBar Computer with a price of 1000”. Or maybe just a more regular citation tag such as “The most expensive product is our FooBar Computer [id:c1] with a price of 1000”. Doesn’t really matter to me if it’s some kind of markdown syntax or regular document reference but what I would like is an annotation response such as:

"annotations": [
            {
              "type": "function",
              "text": "id:c1",
              "start_index": 40,
              "end_index": 50,
              "operation": "getProducts"
            }
          ]

Any suggestions on how I can achieve something like this? Obviously I can do regex for the name and relate that to the data I know I’ve sent to the LLM but that’s brittle since we have many products such as “Macbook”, “Macbook Pro”, “Macbook Pro - UK Keyboard”, etc.

Annotations is only in assistants, and is only by OpenAI telling the AI to produce a particular format of text that the API backend can recognize. File_search now cannot cite particular parts of a document, just return documents based on the search chunk number or code interpreter files.

You can model what you have the AI emit and what you detect on what mostly worked in retrieval, having used numbered short lines in the document browsing:

Please provide citations for your answers and render them in the following format: `【{message idx}:{search idx}†{link text}】`.

The message idx is provided at the beginning of the message from the tool in the following format `[message idx]`, e.g. [3].
The search index should be extracted from the search results, e.g. # 【13†Paris†4f4915f6-2a0b-4eb5-85d1-352e00c125bb】refers to the 13th search result, which comes from a document titled "Paris" with ID 4f4915f6-2a0b-4eb5-85d1-352e00c125bb.

You have an easier path. You are in control of the return. It doesn’t have to be a document or JSON. You could return and instruct the exact text to reproduce if it is the result needed, like “[[C1 - FooBar Computer]], $1000.00”