Language of System prompt influences the output?

Reimagine · November 7, 2024, 1:57pm

Hey guys,

I was wondering if the language used in the system prompt influences the quality of the output in that same language ?

I couldn’t find anything online on the subject. Basically, the goal is to respond in Thai (which LLMs have a harder time responding in), and we were wondering if having the prompt in Thai would help with the responses ?

Thanks in advance.

dignity_for_all · November 7, 2024, 2:22pm

I am a native Japanese speaker, and I sometimes use the API to create prompts in English, translating between Japanese and English, and then refining the translations by providing feedback through additional API responses.

In my experience, nuances between different languages seem to be somewhat lost.

To keep token usage low, I write the system message in English. However, I feel that feedback provided on Japanese translations received in English tends to be somewhat influenced by the English phrasing, compared to feedback on Japanese translations given in Japanese.

So, if you’re looking for responses in Thai, I believe that crafting the prompt in Thai itself might yield higher-quality responses.

polepole · November 7, 2024, 2:39pm

Hi @Reimagine

Welcome to the community!

Supporting answer of @dignity_for_all, let me add my experience:

ChatGPT usually does best in English since it’s trained with a lot of English content. For languages like Thai, it can sometimes miss cultural details, like knowing when to use the right phrase, idiom or tone. But there are some ways to help with this similar to what I did for Turkish.

In my case, I added a file with 13,000 Turkish idioms, phrases, proverbs and their meanings, and this really improved ChatGPT’s understanding. Before I did this, it would sometimes get things wrong. For example, when it should have used an idiom to show respect or encouragement, it would accidentally sound like it was joking or not taking the person seriously. This happened because it didn’t fully get the context and tone.

After adding the idioms, phrases, and proverbs, ChatGPT’s responses in Turkish became much better, I can say It was very close to best. I believe this approach could also work well for Thai:

By adding Thai idioms and their meanings, ChatGPT would have a better idea of how to use these expressions correctly.

With this data, the model could check the right meanings, so it wouldn’t make mistakes in tone or context.

With these extra details, ChatGPT could respond in a way that sounds more natural and respectful in Thai.

Reimagine · November 7, 2024, 3:45pm

Thanks for the responses @dignity_for_all @polepole, based on your feedback, a combination of your implementations seem the best way to go. @polepole I guess you gave the content of the file as extra context to the LLM ? Which makes sense that it would improve the results when you think about it, at the cost of some extra tokens.

polepole · November 7, 2024, 4:02pm

I uploaded a JSON file as knowledge base.

This is a sample from it. It has almost 13,000 thousands phrase, idioms, or proverbs.

Topic		Replies	Views
Prompt Language English or == Response Language Prompting gpt-4 , api , assistants-api	2	905	June 21, 2024
Should I write the prompt in English or in Spanish? Prompting gpt-4	11	4898	February 28, 2024
Prompt in English, Response in non-English API	6	1711	April 28, 2024
Why does prompting that combine languages work? Prompting api	3	1846	September 13, 2023
Does language used in prompt matter? Prompting prompt-engineering	4	2158	November 8, 2024

Language of System prompt influences the output?

Related topics