Minor formatting of query makes significant difference to results in completions

flemming.madsen · August 24, 2023, 7:40pm

I have observed a significant difference in results when indenting some lines in a prompt. The differences are reproducible in the playground. The full text of the prompt and all settings remain the same, but adding a few whitespace characters significantly changes the result.

In the following query, I have inserted asterisks (*) at the start of some lines.
If I remove the *s so there is no indentation, the completion produces the result:

“Yes, we offer customization options for our products. Our approach to customizing products is to work closely with our customers to understand their specific requirements and needs. We then collaborate with our design and engineering teams to develop tailored solutions that meet those requirements. This may involve adapting the size, functionalities, and integration capabilities of our products. Our goal is to ensure that our customers receive a product that is fully customized to their unique needs.”

If I replace the *s with a space, the response is:

“Our approach to customizing products is to work closely with our customers to understand their specific requirements and needs. We offer a range of customization options, including adaptations in size, functionalities, and integration capabilities. By collaborating with our customers, we ensure that the customized product meets their unique needs and delivers the desired outcomes.”

The second response is much better than the first one, but the only difference is that some lines are indented.

Does anyone know why this is the case?
Where can I learn more about what impact indenting elements of the prompt will have? How is this indentation interpreted?

Update: I should mention that this difference in formatting only has an impact when using GPT-3.5-Turbo-xx and not when using GPT-4, where the responses are the same and very good every time.
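
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two variants could be generated from one asterisk-marked template. The function name `make_variants` and the sample template are hypothetical; the template is not the prompt used above, and sending the variants to a model is left out.

```python
def make_variants(template, marker="*"):
    """Build two prompts from one template.

    Lines starting with `marker` are the ones to vary:
    - "flush": the marker is removed, so the line has no indentation
    - "indented": the marker is replaced by a single space
    """
    flush, indented = [], []
    for line in template.split("\n"):
        if line.startswith(marker):
            rest = line[len(marker):]
            flush.append(rest)
            indented.append(" " + rest)
        else:
            # Unmarked lines are identical in both variants.
            flush.append(line)
            indented.append(line)
    return {"flush": "\n".join(flush), "indented": "\n".join(indented)}


# Hypothetical template for illustration only.
TEMPLATE = (
    "Context:\n"
    "*Size can be adapted.\n"
    "*Integration is supported.\n"
    "Question: Do you customize products?"
)

variants = make_variants(TEMPLATE)
print(repr(variants["flush"]))
print(repr(variants["indented"]))
```

Running both variants with identical settings makes it easy to confirm that whitespace is the only thing that changed.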

_j · August 24, 2023, 8:13pm

There is no set rule about how the AI acts on different structures. Its behavior comes from many different types of training on inputs used in the past, along with the AI’s understanding of language from its intake of sources such as books, articles, and Wikipedia posts (with varying amounts of formatting stripped out). So part of prompting is just iteration to see what works best.

“Indentation” (as you see it) is instead the use of tokens that begin with a space. A word like “book” can map to many different tokens in the BPE dictionary. Below, each new token is given a different color:

The way that the dictionary encodes words favors putting spaces at the start of tokens, rather than at the end or as separate entities.

The rarely seen single token “-book” may carry less, and different, semantic meaning.
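
To make the point concrete, here is a toy sketch, not the real tokenizer. It uses a small hypothetical vocabulary and a greedy longest-match split as a stand-in for BPE; real vocabularies are learned from data and are far larger, and the IDs here are made up. It only illustrates that “book”, “ book”, and “-book” can be distinct tokens, and that an extra leading space changes the token sequence.

```python
# Hypothetical toy vocabulary (assumption, for illustration only).
VOCAB = {" book": 0, "book": 1, "-book": 2, " ": 3, "-": 4, "a": 5}


def tokenize(text):
    """Greedy longest-match split over VOCAB.

    This is a simplified stand-in for BPE, not the actual merge procedure.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return tokens


print(tokenize("book"))     # ['book']
print(tokenize(" book"))    # [' book']
print(tokenize("  book"))   # [' ', ' book']
print(tokenize("a-book"))   # ['a', '-book']
```

The model never sees “the same word with a space in front”; it sees a different token ID, which can shift the completion.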

anon22939549 · August 25, 2023, 1:46am

It’s based on online human communication. The short answer is that well-written, well-formatted questions that are easier to read and understand tend to receive better-quality answers.

If I get some time later I’ll try to track down (or generate) an example to illustrate the point.
