GPT Struggles to Return the Same Number of Translations as the Lines I Give It

Hi all,

I’m having an issue with translations using GPT and was hoping maybe someone had an idea on how I could improve consistency.

Basically, the idea is this: I pull 10 lines out of a game in another language, format them with XML tags, send them to GPT to get translated, and ideally get one translation back per line.

The problem is that GPT keeps merging/omitting lines I send it, and I need the exact same number back so I can inject them back into the game. Here is an example:

Input:
<Line1>UNTRANSLATED_TEXT</Line1>
<Line2>UNTRANSLATED_TEXT</Line2>
...
<Line9>UNTRANSLATED_TEXT</Line9>

Output:
<Line1>TRANSLATED_TEXT</Line1>
<Line2>TRANSLATED_TEXT</Line2>
...
<Line7>TRANSLATED_TEXT</Line7>

The reason I’m getting a different number of lines back is that GPT is merging some of them. The game itself has a narrow textbox, so many of its lines are not complete sentences. Normally I merge these lines myself before sending them to GPT, BUT with this specific game that’s impossible due to the engine.
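Since the translated lines have to go back into the game one-for-one, one workable guard (not from the thread, just a sketch of the idea) is to count the `<LineN>` tag pairs in the reply and reject any response whose count doesn't match the input. `count_line_tags` is a hypothetical helper name:

```python
import re

def count_line_tags(text):
    """Count well-formed <LineN>...</LineN> pairs in a string."""
    return len(re.findall(r"<Line(\d+)>.*?</Line\1>", text, flags=re.DOTALL))

# A batch of three tagged lines, and a reply where two got merged:
batch = "<Line0>a</Line0>\n<Line1>b</Line1>\n<Line2>c</Line2>"
merged = "<Line0>a b</Line0>\n<Line1>c</Line1>"

print(count_line_tags(batch))   # 3
print(count_line_tags(merged))  # 2
```

The backreference `\1` makes sure an opening tag is only counted when its matching closing tag is present, so merged or truncated lines show up as a lower count.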

So anyway, if anybody has an idea on how to resolve this, I’d be extremely grateful. Perhaps there’s a better way to prompt it, or maybe a better format. I’m open to ideas.

Here is my prompt btw for reference.

You are an expert Game translator who translates Japanese text to English.
You are going to be translating text from a videogame. 
I will give you lines of text, and you must translate each line to the best of your ability.

Use the following step-by-step instructions to respond to user inputs.

Step 1 - Receive Text
You will be given multiple lines of text (Denoted by XML tags). Translate each line separately and avoid combining or omitting any.

Step 2 - Output Text
You output only the English translation of each line. For example:
<Line0>Line 0 English Translation</Line0>
<Line1>Line 1 English Translation</Line1>
<Line2>Line 2 English Translation</Line2>

Hi and welcome to the Developer Forum!

If you are using GPT-4, the text you just wrote to explain the issue to us humans would also work as an explanation to GPT-4 of what it should do; include the example you gave as well. Prompting is often just speaking to the AI as you would to a person. Once you have a working prompt with reliable output, you can experiment with reducing its size until you have a small and reliable prompt.


Doesn’t really work. I already include the example, but it still struggles. With more information it starts duplicating some lines or moving things around where they shouldn’t be, from what I see.

Can you give some examples of the prompt with a typical input sequence and the output it generates, along with a snippet of the API calling code?

This is the prompt.

You are an expert Game translator who translates Japanese text to English.
You are going to be translating text from a videogame. 
I will give you multiple lines of text, and you must translate each line to the best of your ability.

Use the following step-by-step instructions to respond to user inputs.

Step 1 - Receive Text
You will be given multiple lines of text (Denoted by XML tags). Translate each line separately and avoid combining or omitting any. Include all lines of text given by the user in your response.

Step 2 - Output Text
You output the English translation for each line. For example:
<Line0>Line 0 English Translation</Line0>
<Line1>Line 1 English Translation</Line1>
<Line2>Line 2 English Translation</Line2>

A typical input looks like this:

'<Line0>どうでもいいけど難産って書くと</Line0>\n<Line1>風麟さん開発ぽくて良いね。なにがだよ。</Line1>\n<Line2>…ああ話がズレるズレる琴音琴音。</Line2>\n<Line3>琴音は難産だったのだけど、</Line3>\n<Line4>他キャラである紗夜(開発名はママン)と……</Line4>\n<Line5>リリコ(開発中、莉々子と漢字変換がめどかったので</Line5>\n<Line6>私上での表記はカタカナが主)</Line6>\n<Line7>は意外とサクリと決まった気が。</Line7>\n<Line8>リリコなんかは名前とか設定聞いた時点で</Line8>\n<Line9>だいたい頭中には出来上がってたし(笑</Line9>'

Basically I’m using XML to separate the lines. It has worked for other games, but this one has a lot of incomplete sentences thrown in, which is giving GPT a hard time.
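The tagging and un-tagging around that payload can be sketched like this; `wrap_lines` and `unwrap_lines` are my names for illustration, not code from the post:

```python
import re

def wrap_lines(lines):
    """Wrap each source line in a numbered <LineN> tag."""
    return "\n".join(f"<Line{i}>{t}</Line{i}>" for i, t in enumerate(lines))

def unwrap_lines(payload, expected):
    """Pull translated text back out by index; missing lines come back empty."""
    found = dict(re.findall(r"<Line(\d+)>(.*?)</Line\1>", payload, flags=re.DOTALL))
    return [found.get(str(i), "") for i in range(expected)]

src = ["どうでもいいけど難産って書くと", "琴音は難産だったのだけど、"]
payload = wrap_lines(src)
assert unwrap_lines(payload, 2) == src  # round-trips cleanly
```

Parsing by index rather than by position also makes it easy to spot exactly which line numbers the model dropped.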

And the API isn’t anything special.

# Content to TL
msg.append({"role": "user", "content": f'{input}'})
response = openai.ChatCompletion.create(
    temperature=0.1,
    top_p=0.2,
    frequency_penalty=0,
    presence_penalty=0,
    model=MODEL,
    messages=msg,
    request_timeout=TIMEOUT,
)
return response
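Since the call itself is a plain ChatCompletion, a mismatch guard can be bolted around it. This is a sketch, not the author's code: `call_fn` stands in for whatever function actually hits the API and returns the reply text.

```python
import re

def translate_with_retry(call_fn, payload, expected, max_tries=3):
    """Call the model, re-asking until the reply has the expected tag count."""
    for _ in range(max_tries):
        reply = call_fn(payload)
        tags = re.findall(r"<Line(\d+)>.*?</Line\1>", reply, flags=re.DOTALL)
        if len(tags) == expected:
            return reply
    raise ValueError(f"no reply with {expected} lines after {max_tries} tries")

# A fake call_fn for illustration; the real one would call the API.
fake = lambda _payload: "<Line0>a</Line0>\n<Line1>b</Line1>"
reply = translate_with_retry(fake, "ignored", expected=2)
```

With a low temperature the retries usually aren't free fixes, so a fallback (e.g. splitting the failed batch in half) may be needed after the last attempt.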

Can you include the output from this call please?

Sure. It can be a bit random, but here are a few outputs I often get.

<Line0>It doesn't really matter, but writing 'difficult birth' kind of makes it sound like it's from Fuurin development, huh. What even is that.</Line0>\n<Line1>:Fuurin-san's development feels good, doesn't it? What are you talking about.</Line1>\n<Line2>...Ah, the conversation is veering off-tracks, Kotone, Kotone.</Line2>\n<Line3>Kotone did have a difficult birth, but </Line3>\n<Line4>as for the other character, Sayo (her development name was Maman) and...</Line4>\n<Line5>Ririko (during development, the kanji conversion for Ririko was troublesome so</Line5>\n<Line6>the primary notation for her became Katakana)</Line6>\n<Line7>I feel like the decisions for Sayo came together quite swiftly, somehow.</Line7>\n<Line8>As for Ririko, things like her name and settings were pretty much formed in my head as soon as I heard them (laughs)</Line8>\n<Line9></Line9>

Sometimes it gets it right, however, like below. But the more lines per request, the less consistent it is.

<Line0>It's not a big deal, but when you write it as a difficult birth,</Line0>\n<Line1>it sounds like something Fuurin-san would develop. What're you talking about?</Line1>\n<Line2>...Ah, the conversation's derailing, derailing. Kotone, Kotone.</Line2>\n<Line3>Kotone had a difficult birth, but,</Line3>\n<Line4>with other characters like Saya (developer name was maman) and…</Line4>\n<Line5>Ririko (during development, the kanji for Ririko was a pain so</Line5>\n<Line6>the notation mainly used on my end was Katakana)</Line6>\n<Line7>felt surprisingly fitting with a click.</Line7>\n<Line8>Ririko, for instance, was practically formed in my head</Line8>\n<Line9>the moment I heard her name and setting (laughs).</Line9>

Yup, I think you might be running into an attention issue. Language translation is a fairly complex task in itself; then you have a set of script instructions to follow on top, and that gets compounded by packing more than one task into a single call.

As a test what happens if you only ask the model to translate a single line or perhaps 2?

Yeah, that sounds about right. Single lines work great, and that’s what I’ve been doing in the past. However, it is very expensive to translate this way, so I’ve recently tried this method to reduce costs, and it’s worked with up to 50 lines with GPT-4. This is the first game it’s really struggled on, but it seems that reducing it to 5 lines at a time gets it stable enough. Not ideal, but I can work with it at least.
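The 5-lines-at-a-time fallback amounts to simple batching; a minimal sketch, with `chunk` as a hypothetical helper:

```python
def chunk(lines, size=5):
    """Split the script into fixed-size batches so each request stays small."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]

print(chunk(list(range(12)), size=5))  # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11]]
```

Smaller batches trade more per-request overhead (the system prompt is resent every call) for a better chance the model keeps every line separate.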

Hello @dazedanon, hello @Foxalabs,
I did at least twenty tests with your prompt, your example input, and your temperature and top_p settings, with both gpt-3.5 and gpt-4 models, always with correct results (I’m not talking about the quality of the translation, which I can’t judge).

Do you send the prompt, i.e. all the context about the task to be carried out (“You are an expert Game translator who translates etc…”), with the system role? I’m asking because I don’t see it in your payload:

msg.append({"role": "user", "content": f'{input}'})
response = openai.ChatCompletion.create(
    temperature=0.1,
    top_p = 0.2,
    frequency_penalty=0,
    presence_penalty=0,
    model=MODEL,
    messages=msg,
    request_timeout=TIMEOUT,
)
return response

Yeah, although I didn’t show it in the initial example. I do insert it earlier on.

    # Prompt
    msg = [{"role": "system", "content": system}]  # <- Here

    # Characters
    msg.append({"role": "system", "content": characters})

    # History
    if isinstance(history, list):
        msg.extend([{"role": "system", "content": h} for h in history])
    else:
        msg.append({"role": "system", "content": history})

Characters and history are some other info I throw in: characters is a list of character names, and history is the last 10 lines translated (it helps with quality).

I believe you can get multiple successes depending on the input. I wasn’t having trouble until recently with this game. Maybe the way I’m formatting something in the message is throwing it off.

Characters are formatted like this:

characters = 'Game Characters (Format: Last Name, First Name - Gender):\
 藪井 == Yabui - Male\
        舟木 == Funaki - Male\
        貞二 == Jouji - Male\
        兼田 響子 == Kaneda, Kyouko - Female\
        兼田 真人 == Kaneda, Masato - Male\
        小出 == Koide - Male\
        進士 == Shinji - Male\
        雪乃 == Yukino - Male'

History is just a list of strings.

It seems to me that your code creates 2 ‘system’ messages, one for ‘prompt’ and the other for ‘characters’. It’s true that the API doesn’t prohibit this, but I’ve seen other cases where multiple ‘system’ messages created big problems. I did other tests with your input:

'<Line0>どうでもいいけど難産って書くと</Line0>\n<Line1>風麟さん開発ぽくて良いね。なにがだよ。</Line1>\n<Line2>…ああ話がズレるズレる琴音琴音。</Line2>\n<Line3>琴音は難産だったのだけど、</Line3>\n<Line4>他キャラである紗夜(開発名はママン)と……</Line4>\n<Line5>リリコ(開発中、莉々子と漢字変換がめどかったので</Line5>\n<Line6>私上での表記はカタカナが主)</Line6>\n<Line7>は意外とサクリと決まった気が。</Line7>\n<Line8>リリコなんかは名前とか設定聞いた時点で</Line8>\n<Line9>だいたい頭中には出来上がってたし(笑</Line9>'

Everything works perfectly. I’ll paste you 3 example answers:

[RESPONSE 1]                                                                                                                                                     
<Line0>By the way, when you write "difficult birth"...</Line0>
<Line1>Fuurin-san's development is good. What's up?</Line1>
<Line2>...Ah, the conversation is getting off track, Koto-ne, Koto-ne.</Line2>
<Line3>Koto-ne had a difficult birth, but...</Line3>
<Line4>With the other character, Saya (development name is Maman)...</Line4>
<Line5>Ririko (during development, it was difficult to convert to kanji, so...</Line5>
<Line6>In my case, Katakana is the main writing style)</Line6>
<Line7>It surprisingly went smoothly with Ririko.</Line7>
<Line8>When I heard Ririko's name and settings...</Line8>
<Line9>I pretty much had it all figured out in my head already (laughs).</Line9>

[RESPONSE 2]
<Line0>By the way, when you write "difficult birth"...</Line0>
<Line1>Fuurin-san's development is good. What's up?</Line1>
<Line2>...Ah, the conversation is getting off track, Koto-ne, Koto-ne.</Line2>
<Line3>Koto-ne had a difficult birth, but...</Line3>
<Line4>With the other character, Saya (development name is Maman)...</Line4>
<Line5>Ririko (during development, it was difficult to convert to kanji, so...</Line5>
<Line6>In my case, Katakana is the main writing style)</Line6>
<Line7>It surprisingly went smoothly with Ririko.</Line7>
<Line8>When I heard Ririko's name and settings...</Line8>
<Line9>I pretty much had it all figured out in my head already (laughs).</Line9>

[RESPONSE 3]
<Line0>By the way, when you write "difficult birth"...</Line0>
<Line1>Fuurin-san's development is good. What's up?</Line1>
<Line2>...Ah, the conversation is getting off track, Koto-ne, Koto-ne.</Line2>
<Line3>Koto-ne had a difficult birth, but...</Line3>
<Line4>With the other character, Saya (development name is Maman)...</Line4>
<Line5>Ririko (during development, it was difficult to convert to kanji, so...</Line5>
<Line6>In my case, Katakana is the main writing style)</Line6>
<Line7>It surprisingly went smoothly with Ririko.</Line7>
<Line8>When I heard Ririko's name and settings...</Line8>
<Line9>I pretty much had it all figured out in my head already (laughs).</Line9>

The model I used is gpt-3.5-turbo-0613. The only difference is that I send a single ‘system’ message, which includes ‘prompt’ and ‘characters’. This is the json payload I sent to the model:

{
  "response_format": {
    "type": "text"
  },
  "messages": [
    {
      "role": "system",
      "content": "You are an expert Game translator who translates Japanese text to English.\nYou are going to be translating text from a videogame. \nI will give you multiple lines of text, and you must translate each line to the best of your ability.\n\nUse the following step-by-step instructions to respond to user inputs.\n\nStep 1 - Receive Text\nYou will be given multiple lines of text (Denoted by XML tags). Translate each line separately and avoid combining or omitting any. Include all lines of text given by the user in your response.\n\nStep 2 - Output Text\nYou output the English translation for each line. For example:\n\u003cLine0\u003eLine 0 English Translation\u003c/Line0\u003e\n\u003cLine1\u003eLine 1 English Translation\u003c/Line1\u003e\n\u003cLine2\u003eLine 2 English Translation\u003c/Line2\u003e\n\nGame Characters (Format: Last Name, First Name - Gender): 藪井 == Yabui - Male        舟木 == Funaki - Male        貞二 == Jouji - Male        兼田 響子 == Kaneda, Kyouko - Female        兼田 真人 == Kaneda, Masato - Male        小出 == Koide - Male        進士 == Shinji - Male        雪乃 == Yukino - Male"
    },
    {
      "role": "user",
      "content": "\u003cLine0\u003eどうでもいいけど難産って書くと\u003c/Line0\u003e\\n\u003cLine1\u003e風麟さん開発ぽくて良いね。なにがだよ。\u003c/Line1\u003e\\n\u003cLine2\u003e…ああ話がズレるズレる琴音琴音。\u003c/Line2\u003e\\n\u003cLine3\u003e琴音は難産だったのだけど、\u003c/Line3\u003e\\n\u003cLine4\u003e他キャラである紗夜(開発
名はママン)と……\u003c/Line4\u003e\\n\u003cLine5\u003eリリコ(開発中、莉々子と漢字変換がめどかったので\u003c/Line5\u003e\\n\u003cLine6\u003e私上での表記はカタカナが主)
\u003c/Line6\u003e\\n\u003cLine7\u003eは意外とサクリと決まった気が。\u003c/Line7\u003e\\n\u003cLine8\u003eリリコなんかは名前とか設定聞いた時点で\u003c/Line8\u003e\\n\u003cLine9\u003eだいたい頭中には出来上がってたし(笑\u003c/Line9\u003e"
    }
  ],
  "model": "gpt-3.5-turbo-0613",
  "temperature": 0.1,
  "top_p": 0.2,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "n": 1,
  "max_tokens": 1000,
  "stream": false
}

Try creating a single ‘system’ message. That’s the only difference I see.
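For what it's worth, collapsing the messages might look like this; the string contents are hypothetical stand-ins for the real prompt and character list:

```python
# Stand-ins for the actual prompt strings used in the thread:
system = "You are an expert Game translator who translates Japanese text to English. ..."
characters = "Game Characters (Format: Last Name, First Name - Gender): ..."

# One combined system message instead of several:
msg = [{"role": "system", "content": system + "\n\n" + characters}]
msg.append({"role": "user", "content": "<Line0>...</Line0>"})

assert sum(1 for m in msg if m["role"] == "system") == 1
```

Any history context could be appended to the same system string, or kept as prior user/assistant turns, so the conversation still contains exactly one system message.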


I will give it a shot and report back with results.

So I was still getting the issue in a couple of unique cases: about every 10 million tokens, I’d get around 100 mismatches.

However, I think I’ve figured out a solution. For the XML, you need to include some type of quote around the text inside: instead of <Line0>TEXT HERE</Line0>, it should be <Line0>"TEXT HERE"</Line0>. But since using quotes seems to mess up spoken text in the translation, I opted for ` (backtick) instead. This has reduced the mismatch problem to maybe 1 or 2 every few million tokens, though the TL quality suffers a bit.
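In code, the backtick trick could look like this sketch; `wrap_quoted` and `strip_quote` are illustrative names, not from the post:

```python
def wrap_quoted(lines, quote="`"):
    """Wrap each line in <LineN> tags with a quote character around the text."""
    return "\n".join(
        f"<Line{i}>{quote}{t}{quote}</Line{i}>" for i, t in enumerate(lines)
    )

def strip_quote(text, quote="`"):
    """Remove the echoed quote characters from a translated line."""
    return text.strip().strip(quote)

print(wrap_quoted(["こんにちは"]))  # <Line0>`こんにちは`</Line0>
assert strip_quote("`Hello there`") == "Hello there"
```

The quote characters act as extra boundary markers inside each tag, which presumably is what discourages the model from merging adjacent fragments.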

Hopefully that helps anyone that ends up using this same method for translation.