/mnt/data/ path hallucination

Using the GPT4-turbo preview api (both assistant and direct chat completion), we noticed a quite common issue due to hallucination of the model.

If you try to ask the model to perform a task, a writing in a path, for example /tmp/xxx, the model will tend to hallucinate and will propose in output the path mnt/data/{randomfile}.

I guess it’s a bias linked to its setup to handle files uploaded in ChatGPT, but this should not have an impact on the API.

1 Like

The only way for the assistants or ‘direct’(?) API to work with files is if Code Interpreter is enabled.
Code Interpreter is is available via API only when added as a tool to a assistant.
Since it seems like you are trying to do this somehow differently, the model can’t comply with the request but produces a hallucination.

Am I reading this correct?

1 Like

Are you trying to have it write to an external path (outside the model/code interpreter tool environment - i.e., mnt/data/)? If so, I believe this is only possible via the Functions tool.

If you mean a path within mnt/data/, it could be that the session expired/reset (there is an hourly state reset of the code interpreter environment).

Please provide more details.

It’s the same model, and there’s definitely a bias, the best way to get around this is to simply create the path /mnt/data on your machine.

One can take away the AI’s ability to write this path, with the use of logit_bias.

for example, if the code has “\mnt\data”, “mnt” is encoded alone as its own token after \ is produced. Token 41982.

You can do the opposite task that instructs that save location, and then get logprobs, and just keep banning tokens until the AI can’t write the path in any variation.


Sorry : “direct” = chat completion. Edited in my original message.
Code interpreter is not enabled in this case.

Let me re-explain :
you tell the model “here is a path /tmp/xxx, write a command”, ls for example.
the model will reply ls /mnt/data/randomstuff instead of ls /tmp/xxx

Seems a setup done on the fine-tuning of the model for the code-interpreter usage of chatgpt, and very hard to countermeasure, even thou prompting that this path is to be excluded.

1 Like

Good catch, a logit bias increase on the correct string could do the job.

1 Like

There are multiple ways to do it. I can definitely recommend trying out everything to figure out what works best for you. For my application, adding “Use user-specified paths exactly as provided” was enough to get it back on track, but you could also fool the model into thinking it’s looking at a command terminal by changing /tmp/xxx to $/tmp/xxx.

The bias looks very robust, we tried prompt engineering it a lot on the system prompt and also on the message itself. But it keeps happening a fair amount of time.
The logit bias looks like a more durable solution, I will experiment with this and see.

1 Like