GPT-4 does not utilise all output tokens (max_tokens=4095)

I've been trying to use GPT-4 for extracting classes from unstructured texts. My current use case is a text of about 3000 words containing roughly 300 classes.
Despite max_tokens=4095 I cannot get GPT-4 to extract all classes. It usually stops at 40-60 classes.
Even when I point it to specific classes from the text that it always overlooks, it ignores those examples completely.
I use the API with model “gpt-4-turbo-preview”, temperature = 1, top_p = 1.

Here is an example of one of my prompts: “Extract all ontology classes/concepts related to physical or virtual components from this technical specification, focusing on hardware parts, software modules, or system entities that constitute the system’s build. Return the findings in JSON format under the ‘classes’ field, listing each class in singular form. Please include only those components directly involved in the system’s architecture, excluding peripheral or unrelated entities. Given the potential for a high volume of classes, ensure comprehensive coverage, aiming for 100% extraction.
Example input and expected output: Input: “Prior missions have used different types of connectors depending on the number of wires needed. Solutions include Ethernet connectors and USB serial connectors. According to [4] an RJ-45 jack was used for the NCUBE project.” Output: {“classes”:[“Connector”, “Ethernet Connector”, “USB Serial Connector”, “Wire”]}.”
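For reference, a minimal sketch of how such a request could be assembled with the parameters mentioned above (the helper name `build_extraction_request` is mine, the prompt text is abbreviated, and the dict would be passed to the standard OpenAI SDK's chat-completions call, which is omitted here):

```python
# Sketch of the request described in this post. The function name and the
# abbreviated prompt are illustrative; parameter values are taken from the post.
def build_extraction_request(spec_text: str) -> dict:
    system_prompt = (
        "Extract all ontology classes/concepts related to physical or virtual "
        "components from this technical specification... Return the findings in "
        "JSON format under the 'classes' field, listing each class in singular form."
    )
    return {
        "model": "gpt-4-turbo-preview",
        "temperature": 1,
        "top_p": 1,
        "max_tokens": 4095,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": spec_text},
        ],
    }

request = build_extraction_request("Prior missions have used different types of connectors...")
print(request["model"], request["max_tokens"])
```

Even with max_tokens set to the limit like this, the model stops well short of it, which is the behaviour described below.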

Even if I add “The text contains at least 300 classes. Make sure that you extract 100%.”, it only extracts 150 classes. But after that it stops.
Has anyone also observed this behaviour and/or has an idea why this happens and what can be done about it?

It absolutely can be reproduced, and it is a trained, supervised behavior. Regardless of how explicit you are, the AI will wrap up and cut off the output.

The AI even curtails output in advance: ask for 40 descriptions and they will be extremely brief compared to the exact same input asking for 20 descriptions.


To get the GPT system’s max capacity generation of tokens on a single output, follow this methodology:

Single Output Workflow Automation

The automation process, particularly in generating a sequence of responses, involves executing Python scripts at certain intervals. These scripts play a crucial role in advancing the automation without needing manual intervention from the user. Let’s delve into how and when the Python script is called, how it executes the code, and how this results in an automatic action that propels the system to continue with the next response.

How and When the Python Script is Called:

The Python script is typically called at the end of each content generation phase. After the system completes a segment of the automation (writing a section of text, analyzing data, or generating an image), it triggers the execution of a Python script. This is done programmatically within the system’s codebase, often through a callback function or an event listener set up to detect when a phase of content generation is complete.

Python Script Execution:
When the script is executed, it performs several key actions:

  1. Check Counter: The script first checks a counter variable that tracks the number of steps completed. This is crucial to ensure that the automation does not run indefinitely. Each execution represents one step, and after executing, the script decreases the counter.

  2. Determine Next Action: Based on the current state of the content generation and the remaining steps, the script determines the next action. This could be generating another section of content, conducting a data analysis, or finalizing the document.

  3. Trigger Next Step: The script then triggers the next step in the automation. This could involve calling another function, starting a new module, or sending a signal to the system to proceed with the next phase of content generation.

Here is an example of what a typical Python script in this context looks like:

# Script to move to the next step in the automation process

# Set default counter value; typically, this would be set at the beginning of the process
counter = 99

# Function to execute the next action based on the remaining steps
def execute_next_step():
    global counter
    if counter > 0:
        next_action = "Generate next section"  # Determine what the next action should be
        counter -= 1  # Decrease the counter as one step is completed
        print(f"Instruction: starting next action '{next_action}' right now!")
    else:
        next_action = "Finalize and review"  # No more content generation steps left
        print(f"Instruction: '{next_action}' - finalizing content and preparing for review.")

# Call the function to continue with the next step
execute_next_step()

Automatic Action and Continuation:

After the script is executed, the system automatically moves to the next phase of content generation based on the instructions provided by the script. The automatic continuation is facilitated by the script’s ability to self-trigger the next action without user intervention. This is crucial for maintaining a seamless flow and ensuring the task progresses towards completion.

Ensuring Completion of the Task:

The cyclic nature of this script, combined with the decrementing counter, ensures that the system continues moving through the sequence of responses until the main objective of the prompt is fulfilled. Each cycle of the script represents one step closer to completing the task. The counter acts as a safeguard against infinite loops and ensures that the automation stops once the intended amount of content has been generated.

This mechanism allows for complex, multi-step automations to be executed in a structured manner, ensuring comprehensive task completion with minimal user input. The script’s adaptability (e.g., changing the next_action based on context) also allows for dynamic response to the content generated so far, ensuring that the final output meets the desired specifications and quality standards.

After explaining, I’ll proceed to automatically execute the next instruction in the sequence to ensure continuity in our automation process.

The next action, “Continue explaining the automation sequence,” is now starting automatically. This seamless progression exemplifies how the automation mechanism works to maintain the flow of responses and ensure the comprehensive completion of tasks without needing manual input after each step.

This automatic progression is key to efficient task completion and allows for a structured, yet flexible, approach to long content generation and other automated processes.

Hi there, did you find a way to solve this issue?
I am having the same cutoff problem with gpt-4o :confused: