Formatting text not working

Hello,

For several days I have not been able to get rid of this formatting.

All my texts are generated like this:

<h2 data-start="421" data-end="487">title</h2>
<p data-start="489" data-end="676">text</p>

<h3 data-start="678" data-end="704">title</h3>
<ol data-start="705" data-end="815" data-is-last-node="">
 	<li data-start="705" data-end="757">text</li>
 	<li data-start="758" data-end="800">text</li>
</ol>

I added an instruction to the prompt not to use attributes like data-start, data-end or data-is-last-node in the content, but the model ignores it.

Do you have any recommendations? What could I do to get clean text?


…and I thought I was going crazy when I recently noticed this. Ironically, I put this into ChatGPT Plus 4o to find out what is going on; it said this:

Ah, got it—you’re noticing it directly from ChatGPT Plus (GPT-4 Turbo). You’re right, this seems to be a more recent thing. What’s likely happening is that OpenAI updated some internal systems, and now some “editor tooling metadata” (like data-start and data-end) is sneaking into HTML outputs.

Why it’s likely happening:

  • Underlying system improvements: GPT-4 Turbo might be using invisible “rich text” scaffolding (for internal document rendering or formatting) during generation, and sometimes it leaks those extra data-* attributes into raw HTML.
  • New experimental formatting pipelines: OpenAI could be testing behind-the-scenes methods to track spans or support integrations like text-to-speech, accessibility tagging, or highlighting tools—which could explain these start/end markers.

It suggested adding this to the prompt: Output minimal, clean HTML with only standard tags like <h2>, <p>, <ol>, <li>, without extra attributes such as data-start or data-end.

I tested it and it partly worked: the metadata is gone, but the result now comes back as raw HTML source code instead of rendered text.
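If the prompt alone is not reliable, one fallback is to strip the attributes after generation. Below is a minimal Python sketch, assuming the output looks like the sample posted above. The function name strip_editor_attrs and the attribute list are my own choices for illustration, not anything provided by OpenAI, and a regex like this is only a rough cleanup, not a full HTML parser.

```python
import re

# Attribute names seen in the sample output above; extend if others show up.
UNWANTED = ("data-start", "data-end", "data-is-last-node")

# Matches one unwanted attribute, with or without a value
# (double-quoted, single-quoted, or unquoted).
_ATTR_RE = re.compile(
    r"\s+(?:" + "|".join(map(re.escape, UNWANTED)) + r")(?![\w-])"
    r"(?:\s*=\s*(?:\"[^\"]*\"|'[^']*'|[^\s>]+))?"
)

# Matches an opening tag, so plain text between tags is never modified.
_TAG_RE = re.compile(r"<[a-zA-Z][^<>]*>")


def strip_editor_attrs(html: str) -> str:
    """Remove the unwanted attributes from every opening tag."""
    return _TAG_RE.sub(lambda m: _ATTR_RE.sub("", m.group(0)), html)


if __name__ == "__main__":
    sample = '<h2 data-start="421" data-end="487">title</h2>'
    print(strip_editor_attrs(sample))
```

Only the attributes inside tags are touched, so a sentence that merely mentions data-start in the text is left alone. This does not explain why the attributes appear in the first place; it only cleans them up afterwards.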

OpenAI needs to be aware of this problem. It seems to have started recently.