File Citations bug in file_search citations using gpt-5.5

For the past several days, our documentation chatbot (powered by gpt-5.5 using the file_search tool) has not been adding citations to its responses correctly. For example, here is a recent output message:

{‘id’: ‘msg_0699b1fab59d93a5006a27ebef660c819fb62be649c0a8b909’, ‘content’: [{‘annotations’: [], ‘text’: ‘By “assigned a service band”, I mean the employer organisation has a value selected in its Service Band field on their organisation profile.\n\nA Service Band is a way of categorising employers for fair pricing — for example:\n\n- Default / Standard\n- Charity\n- Public sector\n- Commercial\n- Partner employer\n\ntargetconnect uses the Service Band to decide which stand price applies to that organisation. The documentation says Service Band is used to define prices for event stands — for example, offering a lower price to charities. \ue200filecite\ue202turn2file4\ue201\n\n## Where to check it\n\nGo to the employer’s organisation record:\n\nOrganisation profile > Edit > Basic Information\n\nThere is a Service Band field there. The documentation describes this field as being used to categorise costings for employer fairs. \ue200filecite\ue202turn2file13\ue201\n\n## How it affects the stand price\n\nWhen you set up a stand, you can have different prices for different Service Bands. For example:\n\n| Stand | Default price | Charity price |\n|—|—:|—:|\n| Standard stand | £500 | £250 |\n\nIf the organisation’s Service Band is set to Charity, targetconnect can apply the Charity price. If the organisation does not have a service band, targetconnect uses the default cost instead. If the organisation has a service band but no price has been entered for that band, it also falls back to the default cost. \ue200filecite\ue202turn2file2\ue201\n\n## Why this could cause £0 to show\n\nIf your Default stand cost is currently 0, and the employer either:\n\n- has no Service Band set, or\n- has a Service Band that does not have a specific price entered,\n\nthen targetconnect may use the Default cost — which would show as 0.\n\nSo I’d check two things:\n\n1. On the employer organisation profile, check what Service Band they have.\n2. 
On the Employer stands setup, check the price entered for both:\n - the relevant Service Band, and\n - the Default price.’, ‘type’: ‘output_text’, ‘logprobs’: []}], ‘role’: ‘assistant’, ‘status’: ‘completed’, ‘type’: ‘message’, ‘phase’: ‘final_answer’}

As you can see, the ‘annotations’ array is empty, yet the model is attempting to cite files using raw markers such as “\ue200filecite\ue202turn2file13\ue201”. This happens both in the streamed events and in the final output message.
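As a stopgap until the API parses these again, the raw markers can be stripped client-side. This is only a minimal sketch: it assumes every marker has exactly the shape seen in the output above (\ue200 + "filecite" + \ue202 + "turnNfileM" + \ue201), and the function name `extract_markers` is made up for illustration. It does not resolve a marker to a real file ID, since how "turnNfileM" maps to file search results is not documented in this thread.

```python
import re

# Private-use-area delimiters as they appear in the output message above.
START, SEP, END = "\ue200", "\ue202", "\ue201"

# Matches e.g. "\ue200filecite\ue202turn2file13\ue201".
# Assumption: one "turnNfileM" reference per marker, as in the sample.
MARKER_RE = re.compile(
    re.escape(START) + "filecite" + re.escape(SEP)
    + r"turn(\d+)file(\d+)" + re.escape(END)
)


def extract_markers(text):
    """Return (clean_text, markers).

    Each marker records the turn number, the file number, and the offset
    in clean_text at which the marker was removed.
    """
    markers = []
    parts = []
    pos = 0
    clean_len = 0
    for match in MARKER_RE.finditer(text):
        chunk = text[pos:match.start()]
        parts.append(chunk)
        clean_len += len(chunk)
        markers.append({
            "turn": int(match.group(1)),
            "file": int(match.group(2)),
            "index": clean_len,
        })
        pos = match.end()
    parts.append(text[pos:])
    return "".join(parts), markers
```

Calling `extract_markers(part["text"])` on each output_text part gives display-safe text plus a list of positions where a citation was intended, which is at least enough to hide the stray characters from end users.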

We haven’t changed anything in the system prompt in the past 2 months, but this issue has only started appearing in the past few days.

Is anyone else having similar issues, or does anyone know how to fix this?

Thanks for the unambiguous report.

That’s how the AI-written citations generally should look:

[image]

The markers in your output are well-formed. This is not a completely faulty citation, such as one with two of the second part inside a single container.

A few times, both with new model releases and on existing models, OpenAI has revised the character scheme, drawn from oddball Unicode blocks, that it instructs the model to emit. The internal tool instructions given to the AI must match the sequence that the API recognizes and strips out of the response.

If you are seeing the citation string, the API is not capturing it.

If the AI always writes the same citation characters, they are never parsed out, and the vector store and file search tool remain attached, then this is an API configuration bug, either for this model or for the whole endpoint.

OpenAI staff must recognize, escalate, and fix the issue.

I can confirm this is an issue. Citations are appearing as raw marker text instead of proper annotation objects.

Will flag this to the team.

Yes, tired of this.

Same error, every time. I tried changing the instructions and the prompt harness, to no effect.
Edit: the annotations array is also coming back empty…

Good afternoon,

Since yesterday, I’ve noticed unusual behavior when using the OpenAI Responses API in my project. Suddenly, the annotations array within the message objects returned by the Conversations > List items endpoint is empty, even though it should contain citations.

When creating the response, I include a vector store ID in the tools body property. That vector store contains multiple files, and the final response I receive is correctly based on those files, which is expected.

However, when I fetch the list of items from the conversation where the response was created using:

/conversations/<conversation_id>/items?limit=100&include%5B%5D=file_search_call.results

I get the complete list of objects with the types message, reasoning, and file_search_call, which also looks correct.
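For anyone trying to reproduce this, the request path above can be built with the standard library so the `include[]` parameter is percent-encoded the same way. This is a rough sketch: the helper name `conversation_items_url` and the conversation ID in the usage note are placeholders, and the base URL is assumed to be the default public endpoint.

```python
from urllib.parse import quote, urlencode

# Assumption: the default public API base URL.
API_BASE = "https://api.openai.com/v1"


def conversation_items_url(conversation_id, limit=100):
    """Build the list-items URL, asking for file search results as well."""
    # urlencode percent-encodes the brackets: include[] -> include%5B%5D
    query = urlencode([
        ("limit", limit),
        ("include[]", "file_search_call.results"),
    ])
    safe_id = quote(conversation_id, safe="")
    return f"{API_BASE}/conversations/{safe_id}/items?{query}"
```

With a placeholder ID, `conversation_items_url("conv_123")` ends in `/conversations/conv_123/items?limit=100&include%5B%5D=file_search_call.results`, which matches the query string shown above. The request itself still needs the usual Authorization header with your API key.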

The issue is that the message object containing the final response has an empty content > annotations array. Instead, the citations are included directly in the output text, like this:

`fileciteturn0file15`

My understanding is that, when querying files, OpenAI models such as GPT-5 are instructed to emit internal markers like filecite turn1file1. The OpenAI API backend should detect, parse, and remove these markers, then expose them as file_citation annotations. If those markers appear in the raw assistant output, it suggests that the backend did not properly process and convert them.
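To check a whole conversation for this symptom, something like the sketch below can flag the affected messages. It assumes the items are plain dicts shaped like the message objects in this thread (type, id, content with output_text parts carrying text and annotations). The regular expression is a heuristic of my own that also catches markers whose delimiter characters were lost in copy/paste, and the function name is hypothetical.

```python
import re

# Heuristic: "filecite", optional non-word delimiters (the private-use
# characters or a space), then a "turnNfileM" reference.
LEAK_RE = re.compile(r"filecite\W*turn\d+file\d+")


def find_unparsed_citations(items):
    """Return IDs of message items that contain raw citation markers
    in their text while their annotations array is empty."""
    leaked = []
    for item in items:
        if item.get("type") != "message":
            continue  # skip reasoning and file_search_call items
        for part in item.get("content", []):
            if part.get("type") != "output_text":
                continue
            has_marker = LEAK_RE.search(part.get("text", "")) is not None
            if has_marker and not part.get("annotations"):
                leaked.append(item.get("id"))
                break  # one hit per message is enough
    return leaked
```

An empty result on older conversations and a non-empty one on recent ones would support the idea that the backend stopped converting the markers, rather than the model having changed how it writes them.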

This is not how it was working a few days ago. Could someone from the OpenAI Dev team please take a look at this?

This is also happening with GPT-5-4, so it may be reproducible in other GPT-5 versions.

This is exactly what I was able to reproduce on my side. The client was openai 4.104.0; I then swapped to 6.42.0 and tried again, with the same result. I tried various parsers and got nothing. My searches here in the forum suggest this is a problem with the API's integrated RAG. Given that we pay for the subscription, this is unsettling, to say the least.
The message.annotations element comes back empty when using a model newer than 5.3 latest, both in the platform and in my API calls.