I’m using the REST API directly, without any library.
The results I mention are taken straight from the HTTP response of the OAI Assistants API, without any intermediary, so they don’t depend on any library version.
Yes, and one of the reasons we use OAI for RAG is that they take care of vectorizing the files. But we very frequently need to get the original text back, so if OAI delays fixing the bug, we should look for another solution.
Just tested on 1.35.3 and it appears the annotations with quotes are back. However, deciphering notations like 【4:0†Dracula.pdf】 still remains a bit of a challenge. Any insights?
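For what it’s worth, the markers can at least be picked apart mechanically. Here is a minimal Python sketch, assuming every marker has the same shape as the example above (two numbers separated by a colon, a dagger, then a source name). What the two numbers mean is not established in this thread, so the keys `first` and `second` are placeholders, and `parse_markers` is a hypothetical helper, not part of any library.

```python
import re

# Assumed marker shape: 【<number>:<number>†<source name>】
MARKER_RE = re.compile(r"【(\d+):(\d+)†([^】]+)】")

def parse_markers(text):
    """Return (text with markers stripped, list of parsed markers)."""
    markers = [
        {
            "raw": m.group(0),
            "first": int(m.group(1)),   # meaning not confirmed
            "second": int(m.group(2)),  # meaning not confirmed
            "source": m.group(3),
        }
        for m in MARKER_RE.finditer(text)
    ]
    return MARKER_RE.sub("", text), markers

clean, markers = parse_markers("The count lives in a castle【4:0†Dracula.pdf】.")
print(clean)
print(markers)
```

This only cleans up the display text; it does not recover the quoted passage itself.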
I am also using 1.35.3, but the quote attribute is still missing. I get this message: “‘FileCitation’ object has no attribute ‘quote’”. My app, based on v1, worked fine. However, when I first migrated it to v2, it returned None. Now it doesn’t even have such an attribute.
Each annotation list item has its own object structure, and “quote” is not required.
MessageDeltaContentTextAnnotationsFileCitationObject:
  title: File citation
  type: object
  description: A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.
  properties:
    index:
      type: integer
      description: The index of the annotation in the text content part.
    type:
      description: Always `file_citation`.
      type: string
      enum: ["file_citation"]
    text:
      description: The text in the message content that needs to be replaced.
      type: string
    file_citation:
      type: object
      properties:
        file_id:
          description: The ID of the specific File the citation is from.
          type: string
        quote:
          description: The specific quote in the file.
          type: string
    start_index:
      type: integer
      minimum: 0
    end_index:
      type: integer
      minimum: 0
  required:
    - index
    - type
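Since only `index` and `type` are required, client code should treat everything else as optional. Below is a rough Python sketch of doing that with the raw JSON from the REST response. It assumes a text content part shaped like `{"value": ..., "annotations": [...]}`; `render_citations` is a hypothetical helper and the file ID in the sample is made up.

```python
def render_citations(text_part):
    """Swap citation markers for [n] footnotes, tolerating a missing quote."""
    value = text_part["value"]
    footnotes = []
    for ann in text_part.get("annotations", []):
        if ann.get("type") != "file_citation":
            continue
        citation = ann.get("file_citation") or {}
        n = len(footnotes) + 1
        marker = ann.get("text")
        if marker:
            # Replace only the first remaining occurrence of this marker.
            value = value.replace(marker, f"[{n}]", 1)
        footnotes.append({
            "n": n,
            "file_id": citation.get("file_id"),
            "quote": citation.get("quote"),  # None when the API omits it
        })
    return value, footnotes

# Made-up sample in the assumed shape, with no "quote" key present.
sample = {
    "value": "Dracula arrives in Whitby【4:0†Dracula.pdf】.",
    "annotations": [{
        "type": "file_citation",
        "text": "【4:0†Dracula.pdf】",
        "start_index": 25,
        "end_index": 42,
        "file_citation": {"file_id": "file-abc123"},
    }],
}
rendered, notes = render_citations(sample)
print(rendered)
print(notes)
```

Using `.get()` throughout means the same code works whether `quote` comes back populated, null, or absent.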
The definition of “the specific quote” is not explained. If it refers to a chunk number, that would be impossible for you to determine and provide on your own as there is no method to access or recalculate the chunks of an extracted document.
‘The specific quote’ refers to the part that is actually quoted in the content of the searched document. I have a screenshot showing that the app, using v1, worked, which is attached below. The content of the popup box is the quote in question.
Up to a few minutes ago, I’ve been using version 1.30.3, and now I’ve upgraded to 1.35.7. Between these versions, the quote field has gone away. Using version 1.35.7, a FileCitation object now prints as:
whereas before it had a quote field… which was always populated with None. That was the reason I upgraded my version in the first place.
Suffice it to say, I am left wondering how, or whether, the quote field is going to be provided in the future, which is disappointing because without this feature, the file citation tool is a lot less useful!
The AI no longer has retrieval’s method to “mark” lines from text that could be returned, nor the ability to explore within documents (at your expense). The only thing it can do itself is receive chunks and answer from them, citing a file+index received back.
If a FileCitation(file_id=... is a newly-generated results file based on a chunk, then you may be able to download and provide that, which may be the future intention. I’ll let someone else invest their time to see the current state.
Yes, but as far as I know there is now no way to retrieve the original chunk of the cited document, because OAI manages the vectorization process end to end.
Maybe start using structured data requests: include the fields you need in the example structure for your response, then parse the structured data that comes back… Making the quote and citation fields deprecated, AKA your problem? Make sure to request the chunk in your structured data output. It… might work.
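To make the structured data idea a bit more concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the instruction wording, the `answer`/`quotes` field names, and the `parse_structured_reply` helper are all made up, and nothing guarantees the model copies the quote verbatim from the chunk.

```python
import json

# Hypothetical text to add to the assistant's instructions.
QUOTE_INSTRUCTIONS = (
    "Reply only with JSON of the form "
    '{"answer": "...", "quotes": [{"file": "...", "quote": "..."}]}, '
    "copying each quote verbatim from the retrieved text."
)

def parse_structured_reply(reply_text):
    """Parse the assumed JSON reply; return (answer, quotes) or raise ValueError."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or "answer" not in data:
        raise ValueError("reply is missing the 'answer' field")
    # Keep only entries that actually carry a non-empty quote.
    quotes = [
        q for q in data.get("quotes", [])
        if isinstance(q, dict) and q.get("quote")
    ]
    return data["answer"], quotes

# Made-up reply in the assumed format.
reply = '{"answer": "He arrives by ship.", "quotes": [{"file": "Dracula.pdf", "quote": "a made-up passage"}, {"file": "Dracula.pdf"}]}'
answer, quotes = parse_structured_reply(reply)
print(answer)
print(quotes)
```

The parser rejects anything that is not the agreed structure, so a free-text reply fails loudly instead of silently losing the quote.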