How do I get rid of blank space characters in prompts?

lostinsauce · October 20, 2023, 11:20pm

Apologies up front if this is a more general programming question.

I’m using the API with Python, and here is an example of one of my user message prompts with somewhat standard Python indentation (1 tab indent per line, looks like more here):

completion = openai.ChatCompletion.create(
                model=gpt_model,
                temperature=0.7,
                logit_bias=bias_words,
                    messages=[{"role": "user", "content": f"""
                                    Write '<h3>Title:</h3>' in title case. 
                                    Under that heading, write a 60-character SEO-optimized title for {article_title}. Write five different ones for {article_title}.
                                    Examples: 
                                    The Best Running Shoes of 2023 (Comfortable & Stylish!)
                                    Classic Truffle Pasta (Super Easy, 30-Minute Vegan Recipe)
                                    Top 10 Easiest Plants for a Backyard Garden
                                    """
                                    }]
            )
            return completion.choices[0].message.content

The issue is, when I submit this as a prompt, there are 26 empty spaces after each line of text that are counting towards the total token count.

I don’t want to jam all the prompt text onto one line because it has a tendency to ruin the format of the output. Is there a better way to go about formatting my prompts or something else I can do to get it to stop submitting blank spaces to the prompt after each line?

Thanks

_j · October 20, 2023, 11:42pm

With a triple-quoted string (the same syntax used for docstrings), all whitespace between the triple quotes is included, so every bit of indentation you show is passed to the model.

You can call the .strip() method on the string to remove only the whitespace at the very start and end. This lets you keep a clear presentation where all the text sits in one readable block.

multi_line = """

This is the text.
I also write a second line

The end is clear and separated from code also.

""".strip()

strip() operates on the linefeeds and spaces just before and after the contents. My string thus starts exactly at the word “This…”.
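
A quick way to check this yourself, as a minimal sketch (the variable name `prompt` is only for illustration):

```python
prompt = """

This is the text.
I also write a second line

The end is clear and separated from code also.

""".strip()

# strip() removed only the newlines before the first word and after the last one.
# The blank line inside the text is left untouched.
print(repr(prompt))
print(prompt.startswith("This is the text."))
```

The repr() output makes every remaining newline visible, so you can see exactly what would be sent.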

You should not indent the text inside the triple quotes, though: strip() does not touch indentation in the middle of the string, so those spaces will still appear in the prompt.
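
If you do want the prompt text indented along with the surrounding code, one option not mentioned above is the standard library's textwrap.dedent, which removes the leading whitespace common to every line. This is only a sketch under assumptions: `build_prompt` is a hypothetical helper and the sample title is made up.

```python
import textwrap


def build_prompt(article_title: str) -> str:
    # The text is indented to match the surrounding code for readability.
    raw = f"""
        Write '<h3>Title:</h3>' in title case.
        Under that heading, write a 60-character SEO-optimized title for {article_title}.
        Examples:
        Top 10 Easiest Plants for a Backyard Garden
        """
    # dedent() removes the whitespace shared by all lines;
    # strip() then drops the leading and trailing newlines.
    return textwrap.dedent(raw).strip()


prompt = build_prompt("trail running shoes")  # hypothetical sample value
print(prompt)
```

One caveat: dedent() runs after the f-string is filled in, so a substituted value that itself contains newlines can change what counts as the common indentation.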

Within the parentheses, indentation is arbitrary, but it should be readable and Pythonic. However, you’ve made a fatal mistake: the “return” line is the next statement after the completion call, so it must go back to the same indentation level as the “completion = ...” line. Indenting the contents of the parentheses does not change where the next statement is expected.

I fixed your code, fixed the string, then ran it through the Black formatter to clean up the indentation just for readability.

completion = openai.ChatCompletion.create(
    model=gpt_model,
    temperature=0.7,
    logit_bias=bias_words,
    messages=[
        {
            "role": "user",
            "content": 
f"""
Write '<h3>Title:</h3>' in title case. 
Under that heading, write a 60-character SEO-optimized title for {article_title}. Write five different ones for {article_title}.
Examples: 
The Best Running Shoes of 2023 (Comfortable & Stylish!)
Classic Truffle Pasta (Super Easy, 30-Minute Vegan Recipe)
Top 10 Easiest Plants for a Backyard Garden
""".strip(),
        }
    ],
)
return completion.choices[0].message.content

If you like the look of indentation for readability, we can place the string in parentheses and rely on Python joining adjacent string literals into one string. Each line then needs its own explicit "\n".

completion = openai.ChatCompletion.create(
    messages=[
        {
            "role": "user",
            "content": (
                "Write '<h3>Title:</h3>' in title case.\n"
                "Under that heading, write a 60-character SEO-optimized title for {article_title}.\n"
                "Write five different ones for {article_title}.\n"
                "Examples:\n"
                "The Best Running Shoes of 2023 (Comfortable & Stylish!)\n"
                "Classic Truffle Pasta (Super Easy, 30-Minute Vegan Recipe)\n"
                "Top 10 Easiest Plants for a Backyard Garden\n"
            ),
        }
    ],
)

anon10827405 · October 20, 2023, 11:45pm

Agree with @_j I just want to point one thing out.

The indentation adds very little to your token count: a run of spaces is generally encoded as a single token, so it costs roughly one extra token per line. It does influence the model, but probably very slightly.

Still. Definitely a better idea to not have it in. Who knows, it could confuse the model as long indentations are commonly associated with coding.

_j · October 20, 2023, 11:48pm

If you want stupid token tricks, you CAN indent by one space. This ensures the tokens used are not “beginning of line” words but “starts with a space” words, which appear much more often in the training corpus and may improve understanding.
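
For anyone who wants to try the one-space trick without editing the prompt by hand, here is a small sketch using the standard library's textwrap.indent. The shortened prompt text is illustrative only, and whether this actually helps the model is untested here.

```python
import textwrap

prompt = (
    "Write '<h3>Title:</h3>' in title case.\n"
    "Examples:\n"
    "Top 10 Easiest Plants for a Backyard Garden"
)

# indent() adds the given prefix, here a single space,
# to the start of every non-blank line.
spaced = textwrap.indent(prompt, " ")
print(repr(spaced))
```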

lostinsauce · October 20, 2023, 11:54pm

Ahh that’s much cleaner, thanks @_j, I appreciate it!

_j · October 21, 2023, 12:03am

Oh, and in the final example, put the f-string’s “f” back on the line with the string that needs it!
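
To spell that out, here is a sketch of just the "content" value with the prefix restored. The `article_title` value is a made-up sample. Only the literals that contain a placeholder need the "f", and adjacent plain strings and f-strings are still joined into one string.

```python
article_title = "trail running shoes"  # hypothetical sample value

content = (
    "Write '<h3>Title:</h3>' in title case.\n"
    f"Under that heading, write a 60-character SEO-optimized title for {article_title}.\n"
    f"Write five different ones for {article_title}.\n"
    "Examples:\n"
    "The Best Running Shoes of 2023 (Comfortable & Stylish!)\n"
    "Classic Truffle Pasta (Super Easy, 30-Minute Vegan Recipe)\n"
    "Top 10 Easiest Plants for a Backyard Garden\n"
)
print(content)
```

Without the prefix, the literal text {article_title} would be sent to the model instead of the title.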
