I am using GPT-4 Turbo to extract information from legal contracts. The input contract is in Japanese and is about 23k tokens.
My issue is that with the same prompt and the same input contract, the extraction results differ between runs. I set temperature = 0, and even when I set top_p = 0.1, the results are still different.
You can set top_p: 0.000001, but you will still get variation in the highly uncertain results of this task, because the AI model itself is not deterministic.
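For reference, this is the kind of request being discussed, as a minimal sketch assuming the openai Python SDK (v1); the model name, file path, and system prompt are placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

contract_text = open("contract.txt", encoding="utf-8").read()  # placeholder path

response = client.chat.completions.create(
    model="gpt-4-turbo",          # placeholder; use whichever model you call
    temperature=0,                # greedy-leaning sampling
    top_p=0.000001,               # shrink the nucleus to (almost) one token
    messages=[
        {"role": "system", "content": "Extract the key contract fields as JSON."},
        {"role": "user", "content": contract_text},  # the ~23k-token document
    ],
)
print(response.choices[0].message.content)
```

Even with both parameters pinned down like this, outputs can differ between runs.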
If you ask for reproduction of a section of a document like yours, consider: the model has some 20,000 token positions where it might start reproducing, and it uses attention masking, so it cannot value or perceive all of them at once.
Get the logprobs, and you will likely see that at the point where the extraction starts, there are many candidate positions from the document, all with low certainty.
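A minimal sketch of pulling those logprobs, again assuming the openai Python SDK; the prompt is a placeholder, and the loop just prints the competing candidates for the first few generated tokens:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    temperature=0,
    logprobs=True,
    top_logprobs=5,  # up to 5 alternatives per generated token
    messages=[{"role": "user", "content": "Extract the parties from: ..."}],
)

# Several document positions carrying similar low probabilities at the
# start of the extraction is the signature of the run-to-run variation
# described above.
for token_info in response.choices[0].logprobs.content[:5]:
    alts = ", ".join(
        f"{alt.token!r}: {alt.logprob:.2f}" for alt in token_info.top_logprobs
    )
    print(f"chosen {token_info.token!r} | alternatives: {alts}")
```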
You may want to reduce the input you provide: use an embeddings model with a vector database to automatically retrieve relevant chunks of the document, or give the model a search function that can operate on the document. You could even add a table of contents so that function has bookmarks to jump to. If you can inject the same reduced set of 1,000 tokens of document each time, you will have an AI that is more reliable. A sketch of the embeddings approach follows.
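A minimal retrieval sketch, assuming the openai Python SDK, numpy, and a placeholder contract file; the character-based splitter, chunk sizes, and the example query are illustrative only:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()
contract_text = open("contract.txt", encoding="utf-8").read()  # placeholder path

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts into one matrix of vectors."""
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

# Naive character-based splitter with overlap; a real pipeline would
# split on the articles/clauses of the contract instead.
chunk_size, overlap = 800, 200
chunks = [
    contract_text[i : i + chunk_size]
    for i in range(0, len(contract_text), chunk_size - overlap)
]
chunk_vectors = embed(chunks)  # embed the document once and reuse

def top_chunks(query: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query by cosine similarity."""
    q = embed([query])[0]
    sims = chunk_vectors @ q / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q)
    )
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

# Inject the same small, relevant context each run instead of ~23k tokens.
context = "\n---\n".join(top_chunks("契約期間と解約条件"))  # "term and termination"
```

Embedding the chunks once and reusing the vectors also keeps the document side of the search stable between runs; only the query embedding is recomputed.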
The ultimate judgement for you to perform: is the AI correct or incorrect?
OpenAI embeddings models are also non-deterministic: the top results of a semantic search can swap positions between runs. But then you are choosing among, say, 100 chunks instead of 20,000 tokens of uncertainty. OpenAI is also not the only provider of embeddings.
Ultimately it is about quality, not necessarily producing the same thing every time. Be aware that the user's input alone may not focus retrieval on the correct RAG result as well as passing the entire document into the model context would.
Using embeddings to reduce the input you send will also reduce your language model costs.