How much output can the GPT-4 API really do?

qrdl · May 15, 2023, 6:36pm

I ran across this post by N2U:

If anyone is curious if the 32k context window can be used for output. The answer is yes, I just tried a translation task into 9 different languages.

That’s cool, but when I ask for code that’s longer than, say, 1000 tokens, I get the message “As an AI language model, I can’t…”

Is there a reference / paper / blog about this topic? Has anyone done any detailed investigation into the GPT-4 api and how much output it will provide?

anon22939549 · May 15, 2023, 6:51pm

It will output as much as it needs to, until either the response ends naturally or the token limit is reached.
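
To make the two stopping conditions concrete, here is a minimal sketch of the arithmetic. It assumes the prompt and the completion share one context window, that a 32k window means 32,768 tokens, and that the API reports a `finish_reason` of `"stop"` for a natural end and `"length"` for a cut-off at the token limit. The helper names `output_budget` and `was_truncated` are made up for illustration and are not part of any library.

```python
from typing import Optional


def output_budget(context_window: int,
                  prompt_tokens: int,
                  max_tokens: Optional[int] = None) -> int:
    """Upper bound on completion tokens for a single request.

    Assumes prompt and completion share the same context window,
    and that an optional per-request max_tokens cap may be lower.
    """
    remaining = max(context_window - prompt_tokens, 0)
    if max_tokens is None:
        return remaining
    return min(remaining, max_tokens)


def was_truncated(finish_reason: str) -> bool:
    """True if the response was cut off by the token limit.

    "length" is assumed to mean the limit was hit; "stop" means the
    model ended the response on its own.
    """
    return finish_reason == "length"


# A 2,000-token prompt in an assumed 32,768-token window
print(output_budget(32768, 2000))        # room left for output
print(output_budget(32768, 2000, 1000))  # capped by max_tokens
```

If a long answer stops early with `"stop"` rather than `"length"`, the model chose to end it, so raising the limit would not be expected to help.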

qrdl · May 16, 2023, 2:07am

Interesting thread, I haven’t encountered this yet though.

qrdl · May 19, 2023, 11:14pm

What I am concluding is that things really break down badly when you try to output a lot. Hallucinations, lack of reasoning, it’s all there.

The best approach is to always generate small snippets of code/text. Give the model all of the context the task requires, but nothing beyond that. This is obviously hard to get exactly right, but it matters, as it’s very likely that reasoning degrades with more context.

One possibility is a sort of chain-of-thought (CoT) approach, where a separate prompt is responsible for extracting the required context before another prompt reasons about it.
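
A rough sketch of that two-prompt idea, under some assumptions: `complete` is a hypothetical callable that takes a prompt string and returns the model's text (wire it to whatever client you use), and the prompt wording is only a placeholder, not something I have tuned.

```python
from typing import Callable

# Hypothetical interface: prompt text in, model text out.
Complete = Callable[[str], str]


def extract_context(complete: Complete, task: str, document: str) -> str:
    """First prompt: pull out only the passages the task needs."""
    prompt = (
        "Quote only the passages from the document that are required "
        "to complete the task. Do not solve the task.\n"
        f"Task: {task}\n"
        f"Document:\n{document}"
    )
    return complete(prompt)


def answer_with_context(complete: Complete, task: str, context: str) -> str:
    """Second prompt: reason over the reduced context only."""
    prompt = (
        "Complete the task using only the context below.\n"
        f"Task: {task}\n"
        f"Context:\n{context}"
    )
    return complete(prompt)


def two_stage(complete: Complete, task: str, document: str) -> str:
    """Extract first, then reason; the full document never reaches stage two."""
    context = extract_context(complete, task, document)
    return answer_with_context(complete, task, context)
```

The second call sees only what the first one extracted, so its prompt stays small even when the source document is large. The trade-off is that anything the extraction step misses is invisible to the reasoning step.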

N2U · May 20, 2023, 11:55am

I think I should add a bit of additional context here, since I ran the translation task.

I’m (very obviously) not able to read 9 different languages, so I placed the smallest/most uncommon languages I can read at the bottom of the task. Here’s the exact prompt I used:

your task is to translate the user's input into nine different languages: Mandarin Chinese, Hindi, Spanish, French, Arabic, German, Finnish, Swedish and Danish. 

The user's input will be in English and your task is to provide accurate translations for that input in the mentioned languages.

For each input, generate the translations following this specific format:

\\\
English:
"Translation in English"
Mandarin Chinese: 
"Translation in Mandarin Chinese"
Hindi:
"Translation in Hindi"
Spanish: 
"Translation in Spanish"
French: 
"Translation in French"
Arabic: 
"Translation in Arabic"
German: 
"Translation in German"
Finnish: 
"Translation in Finnish"
Swedish: 
"Translation in Swedish"
Danish: 
"Translation in Danish"
\\\

Ensure each translation accurately corresponds to the user's input. Do not provide any phonetic transcriptions or pronunciation guides - only the translated text is required.

Note that this task may be an unusually favorable example, as it essentially repeats the same content again and again, just in different languages.

I’m very cautious about using this example to infer other things about the 32k context window. I only did this to force as many output tokens as possible.
