How to Prevent Hallucinations When Extracting Verbatim Text from Files Using OpenAI Assistant API

Hello everyone,

I’m working with the OpenAI Assistant API in a Node.js environment, and I’m having trouble getting accurate reference information from files attached via the file search tool. Specifically, I want the assistant to extract verbatim text from a file without hallucinating or providing inaccurate references.

Context:

  • Programming Language: JavaScript/Node.js
  • Current Implementation:
// Client setup (assumes OPENAI_API_KEY is set in the environment)
import OpenAI from 'openai';
const openai = new OpenAI();

// Add the user message to the thread, with reference-format instructions appended
const createdMessage = await openai.beta.threads.messages.create(threadId, {
  role: 'user',
  content:
    `${message}\n\n` +
    'After each answer where you reference files from your knowledge, ' +
    'you must include references in this format:\n\n' +
    '### References\n' +
    '[filename of file referenced]\n' +
    '> Extracted content of the references\n\n' +
    "If you don't reference any files in your answer, do not include the References section.",
});

What I’ve Tried:

  • I included specific instructions in the message content to guide the assistant’s responses.
  • I also included the same instructions at the top of the system message (see the sketch after this list).
  • I got the idea from this community post.
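
For context, here’s roughly how the assistant itself is configured (a minimal sketch; the name, model choice, and instruction wording are illustrative, and vectorStoreId is a placeholder):

// Assistant-level ("system") instructions live on the assistant object
const assistant = await openai.beta.assistants.create({
  name: 'File QA', // illustrative name
  model: 'gpt-4o',
  tools: [{ type: 'file_search' }],
  tool_resources: { file_search: { vector_store_ids: [vectorStoreId] } },
  instructions:
    'Answer only from the attached files. Quote supporting passages verbatim ' +
    'and list them under a "### References" section.',
});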

Issue:

  • The assistant’s responses are inconsistent.
  • Sometimes it hallucinates the extracted content instead of providing verbatim text from the file.

Desired Outcome:

  • The assistant should accurately extract and return verbatim text from the specified file (Google’s NotebookLM is a good example of this behavior).
  • It should include references in the specified format without adding any hallucinated content.

Question:

How can I adjust my implementation to improve the assistant’s accuracy in extracting verbatim text from files? Are there best practices or additional parameters I should use with the Assistant API to prevent hallucinations and ensure consistent responses?

Any guidance or suggestions would be greatly appreciated!


Same issue here; I’m already using the Assistant API with just one .docx in the vector store.

I tried a low temperature (0.1, as in the sketch below) and various prompt variations.

My document is 34 KB of text, so maybe it’s because it’s so long, but I don’t think so. It’s around 90k tokens, so it should be within the context window.
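
For reference, this is the kind of run call I mean (a minimal sketch; assistantId is a placeholder):

// Temperature is set per run in the Assistants v2 API
const run = await openai.beta.threads.runs.create(threadId, {
  assistant_id: assistantId,
  temperature: 0.1, // low temperature to discourage creative paraphrasing
});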


I did not try the legacy v1 version of this API; supposedly it had a feature/parameter that would extract the verbatim text from the source. I also read elsewhere here that they are working on “putting this back” in v2 of this API (the one we are using now).

Hi,

  • Which model are you using? I suggest GPT-4o or higher for this task; I found a lot of inaccuracy trying to use a mini model for search and retrieval. It might even be worth an o1 model, as this task is more complicated than it seems.
  • How many files are loaded into the Assistant’s vector store? Keeping the search area focused helps.
  • It helps to include a guess in your user prompt for whatever you’re looking up: exact names, locations in the document, or other identifying features such as the font size.
  • Keeping temperature low is helpful; you don’t need an excess of creativity.
  • The response should already include annotations and references (see the sketch after this list).
  • You may need to adjust your chunking strategy. Either way, it’s helpful to turn on the results returned from each file search to get a sense of what is happening.
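
To expand on the last two bullets, here is a minimal sketch of pulling both the raw file_search results and the message annotations out of a run (assistantId is a placeholder, and the include value is the v2 way to expose retrieved chunk content; exact SDK signatures may vary by version):

// Run the assistant and wait for completion
const run = await openai.beta.threads.runs.createAndPoll(threadId, {
  assistant_id: assistantId,
  temperature: 0.1,
});

// Ask the API to include the chunks that file_search actually retrieved
const steps = await openai.beta.threads.runs.steps.list(threadId, run.id, {
  include: ['step_details.tool_calls[*].file_search.results[*].content'],
});
for (const step of steps.data) {
  if (step.step_details.type === 'tool_calls') {
    console.dir(step.step_details.tool_calls, { depth: null }); // retrieved chunks and scores
  }
}

// The assistant's reply carries file_citation annotations pointing at the sources
const messages = await openai.beta.threads.messages.list(threadId, { run_id: run.id });
for (const part of messages.data[0].content) {
  if (part.type === 'text') {
    console.log(part.text.annotations); // compare these against the quoted text
  }
}

Chunking is configured when a file is added to the vector store, e.g. { type: 'static', static: { max_chunk_size_tokens: 400, chunk_overlap_tokens: 100 } } as the chunking_strategy on the vector store file create call; smaller chunks with overlap can make verbatim quotes easier to verify.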