We have a RAG assistant application where most or all of the RAG content is in English, the instructions are in English, and a question is posed in English, but the answer will come back in French. The answer is perfect and uses the appropriate RAG content beautifully. It’s just that it answers in French.
The one signal that French may be appropriate is that within the instructions, we include the location of the user – something like, “The visitor is located in Montreal, Quebec, Canada.” Note that we also have a statement such as, “Always answer in the language of the question.” Regardless, the responses come back in French.
We really can’t help unless you post some actual data.
There are many, many reasons why this could be happening. Going off what you’ve provided, we can’t offer anything beyond a guess of “it’s confused”.
What would make the most sense to me is to try to replicate this in your own test environment, or give us the data so that we can try to replicate it.
Apologies. With a RAG setup and instructions that include a lot of private IP, it’s difficult to post a complete setup. I absolutely understand your reservation about speculating. I was wondering if others had run into a similar problem.
We’ll do some more analysis by looking at all of the RAG content that was included in the context. But I’m pretty confident that this is tied to that instruction about location – based on the fact that we’ve seen this same effect multiple times for different countries. In all cases, the language returned matched the primary language of the user’s country.
I can understand not wanting to share the actual real-world data, but my point is that you can simulate the same behavior and post it here with made-up fluff (as long as it isn’t altered afterwards).
When it comes to prompting & context management, there are a million reasons why something could go wrong.
We were able to reproduce this with a simple setup. You’ll see two examples in the screenshots. We cut the instructions down to two simple statements – one about location and another about response language. In the first example, it responds in German when the location is specified as Switzerland, even though the question was in English. In the second, it responds in French to someone in Quebec.
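For anyone who wants to poke at this, the cut-down setup looks roughly like the sketch below (instructions paraphrased and the question made up – not our production prompt):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The two cut-down instructions: one about location, one about response language.
system_prompt = (
    "The visitor is located in Montreal, Quebec, Canada. "
    "Always answer in the language of the question."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        # The question is in English, yet the reply often comes back in French.
        {"role": "user", "content": "What are your opening hours on weekends?"},
    ],
)

print(response.choices[0].message.content)
```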
One more point… If I change the second instruction to explicitly ask it to respond in English, it will do so reliably. It is also very good at responding in an explicitly requested non-English language. But when the question is in English, it seems to ignore our instruction about responding in the language of the question.
Ah, yes, gpt-4o-mini isn’t the best at following instructions.
I’ve had similar issues. I resolved them by having a stronger model (like gpt-4o) start the conversation (the first message) and then having gpt-4o-mini take over.
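Roughly something like this, I mean (an untested sketch – the prompt and questions are just placeholders):

```python
from openai import OpenAI

client = OpenAI()

history = [
    {"role": "system", "content": "The visitor is located in Montreal, Quebec, Canada. "
                                   "Always answer in the language of the question."},
    {"role": "user", "content": "What are your opening hours on weekends?"},
]

# First turn: the stronger model sets the pattern (including the reply language).
first = client.chat.completions.create(model="gpt-4o", messages=history)
history.append({"role": "assistant", "content": first.choices[0].message.content})

# Later turns: the cheaper model tends to follow the pattern already in the history.
history.append({"role": "user", "content": "And on public holidays?"})
follow_up = client.chat.completions.create(model="gpt-4o-mini", messages=history)
print(follow_up.choices[0].message.content)
```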
That’s a cool idea I hadn’t considered. We’ll have to think about the cost impact, since we process a LOT of short conversations – often only 1 or 2 messages. At 10x the cost for input tokens, gpt-4o would definitely affect our cost structure. (Of course, I understand that there’s no free lunch!)
BTW, our experience is that with patience we’ve found a set of instructions that are working extremely well overall with 4o-mini. We are getting the behavior we want for most of our detailed instructions. This language thing is an edge case we’re still working on.
Nice. I think you’ve found the issue and now can tinker around until a solution is found.
Another option is to run a classifier on the language; then you can explicitly instruct the model which language to respond in instead of relying on its own inference.
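For example, a rough sketch (the helper function and language map here are mine, and langdetect is just one option for the classification step):

```python
from langdetect import detect  # any language-ID model you already run would work
from openai import OpenAI

client = OpenAI()

LANGUAGE_NAMES = {"en": "English", "fr": "French", "de": "German"}  # extend as needed

def answer(question: str) -> str:
    # Classify the question's language up front, then state it explicitly
    # instead of asking the model to infer "the language of the question".
    lang = LANGUAGE_NAMES.get(detect(question), "English")
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": (
                "The visitor is located in Montreal, Quebec, Canada. "
                f"Respond in {lang}."
            )},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(answer("What are your opening hours on weekends?"))  # explicitly told: English
```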
Takeaway: Provide an AI identity and purpose as a system message, not an overreaching snippet of user info.
The AI is perhaps put in a state of self-doubt. “I already write in useful languages anyway, so what alignment is being asked of me here?” Or “do these two pieces of information about location and language institute a new behavior?”
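Something like this is what I have in mind (the wording and company name are made up):

```python
# Identity and purpose come first, as the system message; the location snippet is
# demoted to clearly labelled background metadata instead of an open-ended hint.
system_prompt = (
    "You are the documentation assistant for ExampleCo. "
    "Answer questions using the retrieved passages. "
    "Always reply in the language the question is written in.\n"
    "Background info (not a language hint): the visitor is located in "
    "Montreal, Quebec, Canada."
)
```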
Well, this is funny! I was teaching ChatGPT o1 to solve Wordle, and even though we had been talking the whole time in English, ChatGPT o1 responded in Chinese! I have the screenshots, and after running them through Google Translate, her thought process was correct and the answer was correct – just in Chinese. As a multilingual person, this happens to me, but I didn’t expect an LLM to have the same problems humans have.
I can report that we are now having success with this cross-language problem. Indeed, the change we made was just to the instructions for gpt-4o-mini. Any human reading these two sets of instructions (before and after) would probably not behave any differently, but 4o-mini sure does. It feels like the ordering of instructions (even when they are unrelated to each other) makes a significant difference in the outcomes.
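To illustrate the kind of change I mean (paraphrased, not our actual instructions – the same two statements, just in a different order):

```python
# Ordering A: location statement before the language rule.
ordering_a = (
    "The visitor is located in Montreal, Quebec, Canada. "
    "Always answer in the language of the question."
)

# Ordering B: language rule before the location statement.
ordering_b = (
    "Always answer in the language of the question. "
    "The visitor is located in Montreal, Quebec, Canada."
)
```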
Anyway, I am learning the mantra, “Test, test, test…”