Gpt4o not returning all indexes of the array despite prompting in various ways

Ric_2004 · September 14, 2024, 1:05am

I am sending an array of strings to GPT, with each string prefixed by a rare delimiter string, like this:
[#AT_1# "string 1", #AT_2# "string 2", ...]

I am sending specific instructions to creatively rewrite this data in the target locale. All of this data is part of a user template, and it is very important that GPT returns all indexes, since the translated content goes back to the UI at fixed indexes.

But GPT randomly combines or splits the strings of multiple indexes into one index, which changes the number of indexes in the final output.

I need suggestions on how to enforce that all indexes are returned. Failing that, I need to know which strings it combined or split so I can handle it on my end.

allyssonallan · September 14, 2024, 1:45am

Could you explain your case in more detail? We need more details about the prompt engineering you did. Welcome to the forum!

Ric_2004 · September 15, 2024, 12:28am

This is one part of the user prompt. There are other instructions as well, but the points below contain all the instructions related to this issue.

\n5. [text] is an array that contains multiple strings.\n6. Each String is uniquely identified by a delimeter like #INDEX_n# where n is the array index. IMPORTANT: YOU MUST NOT CREATIVELY REWRITE OR MODIFY OR REMOVE THESE INDEX DELIMETERS.\n7. Each string followed by must be creatively rewritten separately and should be stitched back in the output array in the SAME ORDER as the input. \n8. Generate the content in a similar Array as provided in input ARRAY [text].\n9. Do not include any message or explanation in the output except for the creatively rewritten strings.\n10. DO NOT add newlines or any new information in the response if is not there in [text]. Keep the escape characters as it is.\n11. IMPORTANT: You MUST NOT COMBINE OR SPLIT multiple strings output in one index in the output array. REMEMBER: The output must be an array containing the same number of strings as in the [text] JSON array and in the same order as the corresponding source strings. DO NOT return the prompt in any case.\n12. IMPORTANT: ONCE THE OUTPUT IS GENERATED, YOU MUST ENSURE THAT IT IS A VALID JSON PROGRAMATICALLY.\n\n"

    prompt += "[text] =" + indexed_text_json + "\n[target_locale] =" + locale + ". "
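For readers following along, here is a minimal sketch of how a value like indexed_text_json might be built. The helper name build_indexed_text and the zero-based numbering are assumptions for illustration only; the actual construction is not shown in this thread.

```python
import json

def build_indexed_text(strings):
    """Prefix each string with an #INDEX_n# marker and serialize as a JSON array.

    Hypothetical helper: assumes zero-based indexes and a single space
    between the marker and the string.
    """
    indexed = [f"#INDEX_{i}# {s}" for i, s in enumerate(strings)]
    # ensure_ascii=False keeps non-ASCII characters readable in the prompt
    return json.dumps(indexed, ensure_ascii=False)

indexed_text_json = build_indexed_text(["Hello", "Sign in"])
```

Serializing with json.dumps() rather than concatenating strings by hand means quotes and escape characters inside the source strings stay valid JSON.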

The system prompt looks something like this:
sys_prompt = "You are a helpful linguist tasked with creatively rewriting text from the provided JSON array named [text] into the target locale: " + locale + "."

For small arrays (length around 10) it works fine, but with big arrays it combines a few indexes based on context. We want to enforce a strict check on this, so that each string is translated individually and returned in the same order as the source array.
Thanks for checking, @allyssonallan

allyssonallan · September 15, 2024, 1:46am

@Ric_2004 try calling the API with a temperature of 0:

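from openai import OpenAI

# Uses the openai>=1.0 client; the older openai.ChatCompletion interface was removed in 1.0
client = OpenAI(api_key="YOUR_API_KEY")

def translate_text(prompt):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful linguist..."},
            {"role": "user", "content": prompt},
        ],
        temperature=0,  # reduce sampling randomness
        max_tokens=1000,  # raise this for large arrays so the output is not cut off
    )
    return response.choices[0].message.content

# Example (prompt is the user prompt string built earlier)
translated_text = translate_text(prompt)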

A validation strategy plus chunking might fit:

import re

def validate_output(original, translated):
    # Collect the index numbers from the markers on each side.
    # Adjust the pattern if your markers differ (e.g. #INDEX_n#).
    original_indexes = [match.group(1) for match in re.finditer(r'#AT_(\d+)#', original)]
    translated_indexes = [match.group(1) for match in re.finditer(r'#AT_(\d+)#', translated)]

    # Set differences show which markers were dropped or invented
    missing = set(original_indexes) - set(translated_indexes)
    extra = set(translated_indexes) - set(original_indexes)

    if missing:
        print(f"Missing indexes: {missing}")
    if extra:
        print(f"Extra indexes: {extra}")

    return not missing and not extra

# Example (original_text is the input sent, final_text is the model output)
if not validate_output(original_text, final_text):
    # Handle inconsistencies, e.g. retry only the missing indexes
    pass

For the chunking, you can split the array into smaller batches and serialize each batch with json.dumps().
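A minimal sketch of that chunking idea, assuming the #AT_n# marker format from the first post and zero-based indexes. The helper name chunk_indexed and the default batch size of 10 are illustrative choices; 10 comes from the report above that arrays of about that length work fine.

```python
import json

def chunk_indexed(strings, chunk_size=10):
    """Yield JSON arrays of at most chunk_size marked strings.

    Markers carry the global index, so results from separate
    requests can be placed back at the right position.
    """
    for start in range(0, len(strings), chunk_size):
        batch = strings[start:start + chunk_size]
        marked = [f"#AT_{start + offset}# {s}" for offset, s in enumerate(batch)]
        yield json.dumps(marked, ensure_ascii=False)

# Example: 25 strings become three requests of 10, 10 and 5 strings
chunks = list(chunk_indexed([f"string {i}" for i in range(25)], chunk_size=10))
```

Each chunk would be sent as its own request and validated separately, so a merge or drop only forces a retry of that one batch.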

allyssonallan · September 15, 2024, 1:57am

Please, check it out:

Ric_2004 · September 15, 2024, 2:06am

The issue is that we don't want any mismatches, and it does not return all indexes. It just puts the content of index i+1 into index i, which changes the string at that index altogether.

nicholishen · September 15, 2024, 2:16am

Have you tried running your use case by GPT and asking it to write you a parser?
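As a rough idea of what such a parser could look like, here is a sketch that keys each returned string by its marker instead of its array position. It assumes the response is a valid JSON array of strings, that markers look like #AT_n#, and that indexes are zero-based; parse_by_marker is a hypothetical name.

```python
import json
import re

MARKER = re.compile(r"#AT_(\d+)#")

def parse_by_marker(response_json, expected_count):
    """Return (found, missing, merged) for a JSON array response.

    found:   {index: text} keyed by the marker, not by array position
    missing: indexes that never appeared in the response
    merged:  lists of indexes that shared a single array element
    """
    found = {}
    merged = []
    for item in json.loads(response_json):
        # Splitting on a pattern with one group yields [prefix, idx, text, idx, text, ...]
        parts = MARKER.split(item)
        indexes = [int(i) for i in parts[1::2]]
        if len(indexes) > 1:
            merged.append(indexes)
        for idx, text in zip(indexes, parts[2::2]):
            found[idx] = text.strip()
    missing = [i for i in range(expected_count) if i not in found]
    return found, missing, merged

# Example: indexes 1 and 2 were merged into one element, index 3 was dropped
response = json.dumps(["#AT_0# Hola", "#AT_1# Uno #AT_2# Dos"])
found, missing, merged = parse_by_marker(response, expected_count=4)
```

Because the lookup is by marker, a shifted array position does not put the wrong string at an index, and the missing list tells you which strings to resend.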
