Assistants API "quote" field missing from the "message" object under "file_citation"

Hello,

In the api documentation
https://platform.openai.com/docs/api-reference/messages/object

States that a message object under the file_citation field should contain a quote field with the chunk of original text used to generate the response.

But the “quote” field doesn’t exists in the message object.

Because the file vectoring is done by OAI is difficult to get the original text if is not provided as mentioned in the documentation.

When is OAI planning to restore this field or suggest a temporary workaround ?

Thank You.

3 Likes

I was ljust ooking at the changes from 1.33.00 → 1.34.00 and that is the ONLY change. Comparing v1.33.0...v1.34.0 · openai/openai-python · GitHub

So assuming you’re using Python you could stick with 1.33.00 for now.

Hello,

I’m just using directly the REST API without any library.

The results I mention are taken directly from the HTTP OAI Assistants API response without any intermediary, then doesn’t depend on any library version.

Thank You anyway!!!

Clearly they are changing something - there are a lot of recent messages about missing annotations since the v2 release.

Yes, and one of the reasons we use OAI for RAG is that they take care of the files vectoring, but very frequently get the original text is needed, then if AOI delays solving the bug, we should look for another solution.

Just tested on 1.35.3 and it appears the annotations with quotes are back. However, deciphering the notations like 【4:0†Dracula.pdf】still remains a bit of a challenge. Any insights?

I am also using 1.35.3, but the quote attribute is still misssing. I get this message: “‘FileCitation’ object has no attribute ‘quote’”. My app, based on v1, worked fine. However when I first migrated it to v2, it returned None. Now it doesn’t even have such an attribute.

Parsing:

Annotations is part of the messages “text” return object, and is a list (array) that must be iterated over:

{
“id”: “msg_abc123”,
“object”: “thread.message”,
“created_at”: 1699017614,
“assistant_id”: null,
“thread_id”: “thread_abc123”,
“run_id”: null,
“role”: “user”,
“content”: [
{
“type”: “text”,
“text”: {
“value”: “How does AI work? Explain it in simple terms.”,
“annotations”:
}
}
],
“attachments”: ,
“metadata”: {}
}

Each annotation list item has its own object structure, and “quote” is not required.

MessageDeltaContentTextAnnotationsFileCitationObject:
	title: File citation
	type: object
	description: A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.
	properties:
		index:
			type: integer
			description: The index of the annotation in the text content part.
		type:
			description: Always `file_citation`.
			type: string
			enum: ["file_citation"]
		text:
			description: The text in the message content that needs to be replaced.
			type: string
		file_citation:
			type: object
			properties:
				file_id:
					description: The ID of the specific File the citation is from.
					type: string
				quote:
					description: The specific quote in the file.
					type: string
		start_index:
			type: integer
			minimum: 0
		end_index:
			type: integer
			minimum: 0
	required:
		- index
		- type

The definition of “the specific quote” is not explained. If it refers to a chunk number, that would be impossible for you to determine and provide on your own as there is no method to access or recalculate the chunks of an extracted document.

‘The specific quote’ refers to the part that is actually quoted in the content of the searched document. I have a screenshot showing that the app, using v1, worked, which is attached below. The content of the popup box is the quote in question.

Up to a few minutes ago, I’ve been using version 1.30.3, and now I’ve upgraded to 1.35.7. Between these versions, the quote field has gone away. Using version 1.35.7, a FileCitation object now prints as:

FileCitation(file_id='file-DYySkitPa1jE5kh****6olEX')

whereas before it had a quote field… which always was populated with None. That was the reason why I upgraded my version in the first place.

Suffice it to say, I am left wondering how, or whether, the quote field is going to be provided in the future, which is disappointing because without this feature, the file citation tool is a lot less useful!

1 Like

I am also interested in knowing if citation quotes will be supported in the future! I’m disappointed to see that it has been removed.

The AI no longer has retrieval’s method to “mark” lines from text that could be returned, nor the ability to explore within documents (at your expense). The only thing it can do itself is receive chunks and answer from them, citing a file+index received back.

If a FileCitation(file_id=... is a newly-generated results file based on a chunk, then you may be able to download and provide that, which may be the future intention. I’ll let someone else invest their time to see the current state.

Hello,

Yes, but now there is no means to retrieve the original chunk of the cited document, so far I know.
Because OAI manages the vectorizacion process end to end.

The start_index and end_index in annotation points to the RESPONSE, not the original file …

When OpenAI will solved this problem?

1 Like

Maybe start using structured data requests and include any fields inside the structure example of your response and parse the responses structured Data… Making quote and citation fields depreciated, AKA, you’re problem? Make sure to request the chunk in your structured data output. It… Might work.

1 Like