How to Prevent Hallucinations When Extracting Verbatim Text from Files Using OpenAI Assistant API

Hello everyone,

I’m working with the OpenAI Assistant API in a Node.js environment, and I’m having trouble getting accurate reference information from files attached via the file search tool. Specifically, I want the assistant to extract verbatim text from a file without hallucinating or providing inaccurate references.

Context:

  • Programming Language: JavaScript/Node.js
  • Current Implementation:
// Client setup (assumes OPENAI_API_KEY is set in the environment)
import OpenAI from 'openai';
const openai = new OpenAI();

// Add the user message to the thread, with reference-format instructions appended
const createdMessage = await openai.beta.threads.messages.create(threadId, {
  role: 'user',
  content:
    `${message}\n\n` +
    'After each answer where you reference files from your knowledge, ' +
    'you must include references in this format:\n\n' +
    '### References\n' +
    '[filename of file referenced]\n' +
    '> Extracted content of the references\n\n' +
    "If you don't reference any files in your answer, do not include the References section.",
});

What I’ve Tried:

  • I included specific instructions in the message content to guide the assistant’s responses.
  • I also included the same instructions at the top of the system message (see the sketch after this list).
  • I got the idea from this community post.
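
For context, here’s roughly how the assistant itself is configured (a minimal sketch; the name, model choice, and instruction wording are illustrative, and vectorStoreId is a placeholder):

// Assistant-level ("system") instructions live on the assistant object
const assistant = await openai.beta.assistants.create({
  name: 'File QA', // illustrative name
  model: 'gpt-4o',
  tools: [{ type: 'file_search' }],
  tool_resources: { file_search: { vector_store_ids: [vectorStoreId] } },
  instructions:
    'Answer only from the attached files. Quote supporting passages verbatim ' +
    'and list them under a "### References" section.',
});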

Issue:

  • The assistant’s responses are inconsistent.
  • Sometimes it hallucinates the extracted content instead of providing verbatim text from the file.

Desired Outcome:

  • The assistant should accurately extract and return verbatim text from the specified file (Google’s NotebookLM is a good example of this behavior).
  • It should include references in the specified format without adding any hallucinated content.

Question:

How can I adjust my implementation to improve the assistant’s accuracy in extracting verbatim text from files? Are there best practices or additional parameters I should use with the Assistant API to prevent hallucinations and ensure consistent responses?

Any guidance or suggestions would be greatly appreciated!


Same issue here; I’m already using the Assistant API with just one .docx in the vector store.

I tried a low temperature (0.1, as in the sketch below) and various prompt variations.

My document is 34 KB of text, so maybe it’s because it’s so long, but I don’t think so. It’s around 90k tokens, so it should be within the context window.
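
For reference, this is the kind of run call I mean (a minimal sketch; assistantId is a placeholder):

// Temperature is set per run in the Assistants v2 API
const run = await openai.beta.threads.runs.create(threadId, {
  assistant_id: assistantId,
  temperature: 0.1, // low temperature to discourage creative paraphrasing
});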


I did not try the legacy v1 version of this API; supposedly it had a feature/parameter that would extract the verbatim text from the source. I also read elsewhere here that they are working on “putting this back” in v2 of this API (the one we are using now).

Hi,

  • Which model are you using? I suggest GPT-4o or higher for this task; I found a lot of inaccuracy trying to use a mini model for search and retrieval. It might even be worth an o1 model, as this task is more complicated than it seems.
  • How many files are loaded into the Assistant’s vector store? Keeping the search area focused helps.
  • It helps to include a guess in your user prompt for whatever you’re looking up: exact names, locations in the document, or other identifying features such as the font size.
  • Keeping temperature low is helpful; you don’t need an excess of creativity.
  • The response should already include annotations and references (see the sketch after this list).
  • You may need to adjust your chunking strategy. Either way, it’s helpful to turn on the results returned from each file search to get a sense of what is happening.
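
To expand on the last two bullets, here is a minimal sketch of pulling both the raw file_search results and the message annotations out of a run (assistantId is a placeholder, and the include value is the v2 way to expose retrieved chunk content; exact SDK signatures may vary by version):

// Run the assistant and wait for completion
const run = await openai.beta.threads.runs.createAndPoll(threadId, {
  assistant_id: assistantId,
  temperature: 0.1,
});

// Ask the API to include the chunks that file_search actually retrieved
const steps = await openai.beta.threads.runs.steps.list(threadId, run.id, {
  include: ['step_details.tool_calls[*].file_search.results[*].content'],
});
for (const step of steps.data) {
  if (step.step_details.type === 'tool_calls') {
    console.dir(step.step_details.tool_calls, { depth: null }); // retrieved chunks and scores
  }
}

// The assistant's reply carries file_citation annotations pointing at the sources
const messages = await openai.beta.threads.messages.list(threadId, { run_id: run.id });
for (const part of messages.data[0].content) {
  if (part.type === 'text') {
    console.log(part.text.annotations); // compare these against the quoted text
  }
}

Chunking is configured when a file is added to the vector store, e.g. { type: 'static', static: { max_chunk_size_tokens: 400, chunk_overlap_tokens: 100 } } as the chunking_strategy on the vector store file create call; smaller chunks with overlap can make verbatim quotes easier to verify.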