Chained Prompt to complete text larger than 4000 tokens?

We want to use the OpenAI (Text Completion) API to handle a large document (100 pages, for example) and answer multiple questions based on it. Since there is a 4000-token limit, we will need to split the whole text into smaller chunks and send them to the OpenAI API. But we do not want to summarize each chunk, since that would likely lose details that our later questions may need. The question is: can the OpenAI API handle a "chained prompt", meaning maintain the chunks we send as a running context and answer our questions based on everything we sent previously as a whole? To accomplish that, we would also need a 'session', so one document will not interfere with another.
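For reference, the chunk-splitting step itself is straightforward. Here is a minimal sketch that splits a document into roughly token-bounded pieces; the 0.75 words-per-token ratio is a rough heuristic (the `tiktoken` library gives exact counts for a specific model), and the function name is just illustrative:

```python
def split_into_chunks(text, max_tokens=3000, words_per_token=0.75):
    """Split text into chunks that stay under a rough token budget.

    Assumes ~0.75 words per token as a heuristic; for exact counts
    against a specific model, use the tiktoken library instead.
    """
    max_words = int(max_tokens * words_per_token)
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]
```

Each chunk can then be sent in its own API call; the harder part, as discussed below, is carrying context between those calls.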



Plus 1, we have a similar use case to resolve.

As far as I know, there's no way to "chain prompts" together (ETA: with the basic API). What you send in the prompt is all that is used to generate the completion.

Hope this helps.

We have a way to do this (we are chaining between 20 and 100 prompts to refine an answer). However, it is not cheap to do, and the use case would need to justify it.

Which problem are you trying to resolve?

a) Are you trying to get a longer output?
b) Are you trying to refer to more of the document?
c) Something else?

The solution for each of these issues is not the same.

Send me a private message if you want to chat offline
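The "chaining 20 to 100 prompts to refine an answer" approach mentioned above can be sketched as a simple refine loop. This is an illustrative sketch only, not the poster's actual implementation: `ask_model` is a hypothetical stand-in for a real completion call, and the prompt wording is an assumption:

```python
def refine_answer(question, chunks, ask_model):
    """Sequentially refine an answer over document chunks.

    `ask_model(prompt) -> str` is a placeholder for a real completion
    API call; each iteration folds the next chunk into the running
    answer, so later chunks can correct or extend earlier ones.
    """
    answer = ""
    for chunk in chunks:
        prompt = (
            f"Question: {question}\n"
            f"Current answer so far: {answer or '(none yet)'}\n"
            f"New context:\n{chunk}\n"
            "Refine the answer using only the new context where relevant."
        )
        answer = ask_model(prompt)
    return answer
```

The trade-off is exactly the one noted above: one API call per chunk, so cost scales with document length.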


Longer input… in other words, a longer context.


We are facing a similar issue.

We need way more context, and it is impacting the size of the output.

We tried fine-tuning to give it context, but that did not work.


An amazing use case is analyzing large datasets, because this AI is generally very smart, and it can really summarize and find patterns in large datasets if OpenAI will let us.


+1 exactly what @adamrg72 said

data → OpenAI → summary

@all It seems like the issue is now resolved with GPT-3.5 and GPT-4, where we can provide more context? Any comments?

Hi all, was anyone able to solve this issue? I have the same problem: I need to send a good number of tokens, sometimes more than 4000 just for the input, and I need an output of around 2000 tokens or less. It would be a great help if anyone has found a way around it and could share. Thanks.

This discussion is from before the gpt-3.5-turbo-16k and gpt-4 (8k) models were even available on the API. So it is OpenAI that solved the issue for token contexts and outputs like you describe (although the AI will be reluctant to output that much without a good reason).

Thanks for the reply. I am trying to process around 5000 words, but it is not working. I have some text from emails and some PDF files; I want to feed in all the text and get back a summary with a short description of all the work and the costs. The output might not be that long, but the total tokens of input and output exceed the current limit, which is 4000 for the newer models if I am not wrong. Could you point me to a solution if you know one? Clearly I am missing something here. Thanks much.

In the API, you would know exactly which model you picked, and should be aware of its settings. You can use gpt-3.5-turbo-16k with a setting of max_tokens=3000 and have about 13,000 tokens remaining for the input, which is near 10,000 words (although overall understanding will diminish the more you send as input in one query).
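The arithmetic here can be checked directly. This sketch assumes the commonly documented 16,385-token context window for gpt-3.5-turbo-16k and the rough 0.75 words-per-token heuristic:

```python
CONTEXT_WINDOW = 16385    # gpt-3.5-turbo-16k total context, in tokens (documented figure)
MAX_OUTPUT = 3000         # tokens reserved for the reply via max_tokens
WORDS_PER_TOKEN = 0.75    # rough heuristic for English text

# Whatever is not reserved for output is available for the prompt.
input_budget_tokens = CONTEXT_WINDOW - MAX_OUTPUT
input_budget_words = int(input_budget_tokens * WORDS_PER_TOKEN)

print(input_budget_tokens, input_budget_words)  # 13385 tokens, ~10038 words
```

So a 5000-word input plus a short summary fits comfortably within this model's window, where it would not fit in a 4k model.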


I am going to try this. Thank you so much. The context option might work as well. Thanks again