Overcoming AI Response Issues: Unwanted Codes in Text -【59†source】

Greetings to the respected OpenAI team and community :wave:

I am attempting to create an AI assistant for the first time with a loaded knowledge base. I’ve uploaded text and PDF materials totaling 150,000 characters. Included in these files are some legal articles and FAQs which partly contain links to the relevant resources and documents.

Now, when the AI assistant responds to my queries, it adds a code like 【59†source】 at the end of almost every paragraph, with different numbers each time. This is particularly problematic when the bot ends a response with a link and attaches the code to it without a space, which alters the link address and makes it incorrect. What does this code mean? Is there any way I can prevent the bot from doing this? I’ve tried writing “do not include designations such as 【59†source】 in the responses” in the system prompt, but it doesn’t help :disappointed:

Hi!

people have been struggling with this for a while

but those codes are actually proper file annotations. Here are the docs: https://platform.openai.com/docs/assistants/how-it-works/message-annotations

here’s an actual OpenAI response on the matter: “How do the Assistants / GPTs sources relate to the file names?” (#2 by jeffchan)

Learned something today!

So TLDR: works as intended :rofl:
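Since the markers are annotations rather than stray text, asking the model not to write them in the system prompt probably won't work, and the practical workaround is to strip them after the response comes back. Here's a minimal sketch in plain Python. The function names are made up for illustration, and the second function assumes each annotation is available as a dict with a `"text"` key holding the marker string (my reading of the annotation objects described in the docs, so check it against what your client actually returns).

```python
import re

# Matches markers such as 【59†source】: digits, a dagger, then any label.
CITATION_MARKER = re.compile(r"【\d+†[^】]*】")


def strip_citation_markers(text: str) -> str:
    """Remove every 【N†label】 marker from the text (hypothetical helper)."""
    return CITATION_MARKER.sub("", text)


def strip_by_annotations(text: str, annotations: list) -> str:
    """Remove only the substrings named by the annotations.

    Assumption: each annotation is a dict whose "text" value is the exact
    marker string that appears in the message.
    """
    for annotation in annotations:
        text = text.replace(annotation["text"], "")
    return text


if __name__ == "__main__":
    reply = "See clause 1.【59†source】 More at https://example.com/doc【12†source】"
    # The marker glued to the end of the link is removed, so the URL is valid again.
    print(strip_citation_markers(reply))
```

The regex version needs nothing but the message text, so it also fixes the case where a marker is glued to the end of a link. The annotation version is safer if your documents could legitimately contain 【 】 brackets, because it only removes what the API itself reported as a citation.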


@Diet I am very grateful that you shared this useful information :pray: I will definitely study these posts. :face_with_monocle: I’ve been trying to find similar information for a while and suspected that it must have been discussed somewhere. :sweat_smile:


Do you know how I found out?

I saw these posts multiple times and couldn’t help people because I don’t like assistants.

But I decided to take a look because I was curious, and ask chatgpt if it could recognize the format.

Initially I didn’t notice it, but then I hovered over it:

so I thought damn, that’s intentional! They trained the models to behave like that!

So then I thought I should try to figure out how the citations are actually generated

So I was looking at the source and inspecting packets trying to figure out where the translation to html was happening!

But then I caught myself - if it’s confirmed to be intentional, there’s gotta be some documentation around somewhere. Looking for “source” in the search bar does yield some results, but might have to wade through some other developers’ despair.



My problem is that I don’t understand English and rely on Google Translate, and besides, I’m not a programmer :disappointed:
I tried to study the linked material on my own, but I couldn’t figure out how to stop the AI assistant from adding such annotations to its responses. What I would like instead is for it to indicate the paragraph (section) number whenever it quotes a specific paragraph (for example, "According to the requirements of section (clause) 1..."). But no matter how I phrase this in the system prompt, it does not comply. If I ask it during the conversation to provide the numbers of the points it quoted, it does so. So the work is done twice: 1) first it writes the quotation; 2) then I ask it to indicate the paragraph number of the quotation; 3) it rewrites the same quotation, only with the paragraph numbers added. As a result, tokens are spent twice :disappointed: