Hi all
I have a document, a user manual, that contains a few keywords/acronyms and their descriptions in an acronym table. I am trying to summarize the document and split it into a few parts using GPT-4. Along with that, I need the keywords used in each part to be copied exactly as they appear in the acronym table (with no change to the description). GPT is able to summarize and split perfectly, but it is not able to copy the content from the acronym table. Copying from the acronym table is mandatory because the descriptions contain some special characters which GPT can’t generate.
There might be something you could do with your prompt to make it a bit more likely to reproduce text verbatim. Would you be able to share the relevant portion of your prompt? Someone here might be able to help.
However, it must be noted that verbatim output is a persistent problem that many customers have run into, which greatly reduces the utility of the models for this type of work.
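One way to sidestep the verbatim problem, offered as a rough sketch rather than a tested solution: let the model do only the summarizing and splitting, then attach the acronym descriptions in code by looking them up in the original table, so the descriptions never pass through the model. The table contents and the function name `acronyms_used` below are made up for illustration; in practice the table would be parsed from the manual.

```python
import re

# Hypothetical acronym table; the real one would be parsed from the manual.
ACRONYMS = {
    "PSU": "Power supply unit (230 V ± 10 %)",
    "µCTL": "Microcontroller unit, rev. β",
}

def acronyms_used(summary: str, table: dict[str, str]) -> dict[str, str]:
    """Return the table entries whose keyword appears in the summary.

    Descriptions are taken straight from the table, so they stay byte-for-byte
    identical to the source, special characters included.
    """
    found = {}
    for key, desc in table.items():
        # Match the keyword as a whole token, not as part of a longer word.
        pattern = r"(?<!\w)" + re.escape(key) + r"(?!\w)"
        if re.search(pattern, summary):
            found[key] = desc
    return found

if __name__ == "__main__":
    part = "The µCTL resets automatically after a power failure."
    print(acronyms_used(part, ACRONYMS))
```

This only works if the model keeps the keywords themselves intact in each part, which is a much smaller ask than reproducing whole descriptions.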
I’ve tried something similar to your solution. My problem is that whenever I input text with tags, ChatGPT does not copy over ALL the tags. Sometimes tags go missing at random (ID-48 specifically is a problem for me), which completely messes up my input.
How can I force ChatGPT to NEVER delete “ID tags”? I’ve tried prompting it in multiple ways, but it just seems to ignore me:
## SECTION TAGS
Before all and everything, the number one priority is this: PRESERVE "SECTION" TAGS. Whenever you see "SECTION", copy it over before moving on. THEY ARE IMMUTABLE. These are essential for the quality of the transcript. Preserve their location too.
Yes, I use “SECTION” instead of “ID”, but it’s the same principle, no?
The short answer is that there is no way to really guarantee this. You can get close to perfect using the API, but ChatGPT is much further outside your control (multiple tools/systems are called behind the scenes, model parameters like temperature are not configurable, etc.).
So my advice (unfortunately) is to develop your own solution using the API, perhaps combined with some more deterministic methods (e.g. regex for ID tags).
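As a minimal sketch of the regex idea: compare the tags in the text you sent with the tags in the text you got back, and flag any that went missing so you can retry or reinsert them. The tag shape (`ID-48`, `SECTION-3`) and the helper name `missing_tags` are assumptions here; adjust the pattern to whatever your real tags look like.

```python
import re

# Assumed tag shape, e.g. "ID-48" or "SECTION-3". Adjust to your real tags.
TAG_RE = re.compile(r"\b(?:ID|SECTION)-\d+\b")

def missing_tags(source: str, output: str) -> list[str]:
    """Return tags present in source but absent from output, in source order."""
    out_tags = set(TAG_RE.findall(output))
    seen = set()
    missing = []
    for tag in TAG_RE.findall(source):
        # Report each missing tag once, even if it occurs several times in source.
        if tag not in out_tags and tag not in seen:
            seen.add(tag)
            missing.append(tag)
    return missing

if __name__ == "__main__":
    sent = "ID-47 intro text ID-48 body text ID-49 closing text"
    received = "ID-47 intro text body text ID-49 closing text"
    print(missing_tags(sent, received))
```

If the list comes back non-empty, you could resend the request or patch the tags back in yourself. The check does not stop the model from dropping a tag, but it does mean a dropped tag never gets through unnoticed.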