How to increase output length or have it continue the conversation?

I’m using gpt-4o, but the current max output of 4,096 tokens isn’t high enough for my needs.

I’m asking GPT to create a 10-day travel itinerary for various countries and to return the result as a formatted JSON object, but more often than not it hits the output limit and the JSON gets cut off.

Here is the prompt I’m giving:

Please write a detailed 10-day travel itinerary for Croatia. Divide the article into individual days, and write several paragraphs for each day in the itinerary. In each paragraph, provide as many suggestions as possible for places the reader should visit on each day.

Return the response as json in the format:
        
        {
          "days": [ // Each day should have its own entry in this array
            {
              "title": "", // For example, "Day 1"
              "paragraphs": [], // Return 2-4 string paragraphs. Bold every single place name, like "**Dubrovnik**", in each paragraph.
              "places": [ // Return a list of places used in the paragraphs
                {
                  "name": "", // The place name used in the paragraph. Must match the name in the paragraph exactly.
                  "coordinates": "" // You must include latitude and longitude coordinates of the place in the format "latitude,longitude", for example "-14.48051,46.64408"
                },
              ]
            }
          ]
        }

Oftentimes it will get to day 6 or 7 and then run out of output tokens.

Using the API, how can I either increase the max output tokens or have it continue exactly where it left off?

Here is my current PHP code:

// $open_ai is the orhanerday/open-ai client: new Orhanerday\OpenAi\OpenAi($apiKey);
$chat = $open_ai->chat([
    'model' => 'gpt-4o',
    'messages' => [
        [
            "role" => "user",
            "content" => $prompt
        ]
    ],
    'temperature' => 1.0,
    'max_tokens' => 4096, // gpt-4o caps completion output at 4,096 tokens
    'frequency_penalty' => 0,
    'presence_penalty' => 0
]);

Hi,

Since the output is longer than the limit, you can check the response’s finish_reason: if it comes back as “length”, you know the output stopped before it was finished. When that happens, pass the output generated so far back as an assistant message, followed by a prompt like “please continue this from the last line”, and the model should continue generating the next part. Repeat until you finish with a finish_reason of “stop”.
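
Here is a minimal sketch of that loop in PHP, reusing the $open_ai->chat() call from your snippet (this assumes, as in the orhanerday/open-ai client, that chat() returns the raw JSON response body as a string):

$messages = [
    ["role" => "user", "content" => $prompt]
];
$full_output = "";

do {
    $response = json_decode($open_ai->chat([
        'model'      => 'gpt-4o',
        'messages'   => $messages,
        'max_tokens' => 4096,
    ]), true);

    $choice        = $response['choices'][0];
    $finish_reason = $choice['finish_reason'];
    $content       = $choice['message']['content'];
    $full_output  .= $content;

    // Truncated: feed the partial answer back and ask the model to resume.
    if ($finish_reason === 'length') {
        $messages[] = ["role" => "assistant", "content" => $content];
        $messages[] = ["role" => "user", "content" => "Please continue this from the last line."];
    }
} while ($finish_reason === 'length');

// $full_output now holds all the pieces concatenated; the seams between
// chunks may need light cleanup before you json_decode() the itinerary.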

Hey Patrick, you could use the Assistants API’s threads feature to obtain your output.
If you want the model to continue its output, you can simply prompt it to do just that and it will carry on, since the thread’s history of messages is always maintained (rough sketch below).

(Try to keep token consumption, i.e. input plus output, under half of the LLM’s total context length for proper functionality; otherwise the threads feature won’t work as expected.)
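
As a rough sketch of that flow against the raw Assistants API REST endpoints (this assumes the beta /v1/threads endpoints with the “OpenAI-Beta: assistants=v2” header, plus an $assistantId you created beforehand; openai_post() is just a hypothetical helper):

// Hypothetical helper: POST a JSON body to an OpenAI endpoint.
function openai_post($url, $apiKey, $body) {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => [
            "Authorization: Bearer $apiKey",
            "Content-Type: application/json",
            "OpenAI-Beta: assistants=v2",
        ],
        CURLOPT_POSTFIELDS     => json_encode($body),
    ]);
    return json_decode(curl_exec($ch), true);
}

// Create a thread seeded with the itinerary prompt.
$thread = openai_post("https://api.openai.com/v1/threads", $apiKey, [
    "messages" => [["role" => "user", "content" => $prompt]],
]);

// Run your assistant on the thread.
$run = openai_post("https://api.openai.com/v1/threads/{$thread['id']}/runs",
    $apiKey, ["assistant_id" => $assistantId]);

// ...after polling the run until it completes, if the reply was cut off,
// add a follow-up message to the SAME thread and run it again; the thread
// keeps the full message history, so the model can pick up where it stopped.
openai_post("https://api.openai.com/v1/threads/{$thread['id']}/messages",
    $apiKey, ["role" => "user", "content" => "Please continue from the last line."]);
$run = openai_post("https://api.openai.com/v1/threads/{$thread['id']}/runs",
    $apiKey, ["assistant_id" => $assistantId]);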

Hope this late response helps you out.