GPT-5 not quoting excerpts with quotation marks in the Responses API

This just started a few days ago and it’s gotten worse:

Excerpts are no longer being quoted with quotation marks. This does NOT occur with an HTML response - only with text. I captured and inspected the entire JSON response to make sure our UI was rendering properly.

Example:

Correct Response: Effective teams can be difficult to describe because “high performance along one domain does not translate to high performance along another” (Ervin et al., 2018, p. 470).

Incorrect Response: Effective teams can be difficult to describe because high performance along one domain does not translate to high performance along another (Ervin et al., 2018, p. 470).

Note that the incorrect response is missing the quotation marks around the excerpt - in the raw JSON, the quote characters have been replaced by spaces.
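You can confirm this in code rather than by eye. A minimal sketch (the sample strings are copied from the example above) that checks the raw `output_text` for a curly-quote pair:

```python
# Sketch: detect whether a response replaced curly quotes with spaces
# around a cited excerpt. Sample strings mirror the example above.
correct = ("Effective teams can be difficult to describe because "
           "\u201chigh performance along one domain does not translate "
           "to high performance along another\u201d "
           "(Ervin et al., 2018, p. 470).")
broken = ("Effective teams can be difficult to describe because "
          " high performance along one domain does not translate "
          "to high performance along another  "
          "(Ervin et al., 2018, p. 470).")

def has_quoted_excerpt(text: str) -> bool:
    """True if the text contains a left and a right curly quote."""
    return "\u201c" in text and "\u201d" in text

print(has_quoted_excerpt(correct))  # True
print(has_quoted_excerpt(broken))   # False
```

Running this against the captured JSON is how we verified the UI was rendering exactly what the API returned.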

While this appears to be a minor bug, our customers have noticed.

Your excerpts, as posted, can show spaces instead of quotes; they will preserve the actual quote characters if you enclose them in backticks (`).

Example:
This is `backtick “quoted” text with extra spaces `

Displays:
This is backtick "quoted" text with extra spaces

(And if you are a markdown guru, you can mark up your markdown in a code block to show exactly what you received.)

```plaintext
# a code block
print("the actual  quotes")
```

Do you have any system message instructions that the AI is disobeying to reinforce this desired behavior, or do you just expect it?

Is this from a web search or a file search, and then using the citation/annotation mechanism?


Trixy trick: switch the conversation to a structured output. Give it a key "last_assistant_message_repeated_verbatim". Tell the AI to write the entirety of what it just said in that key. We already know that structured outputs will break the annotations - exploit that to get what the AI internally produced.
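A sketch of what that follow-up request's schema could look like. The schema name and description are mine, and the exact `text.format` shape is an assumption about the current Responses API structured-output format, so check it against your SDK version:

```python
# Hypothetical schema for the "repeat verbatim" trick; names other than
# the key itself are illustrative, not an official API guarantee.
verbatim_format = {
    "type": "json_schema",
    "name": "verbatim_dump",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "last_assistant_message_repeated_verbatim": {
                "type": "string",
                "description": ("The entire previous assistant message, "
                                "reproduced character-for-character."),
            }
        },
        "required": ["last_assistant_message_repeated_verbatim"],
        "additionalProperties": False,
    },
}
# Assumption: passed on a follow-up call roughly as
#   client.responses.create(..., text={"format": verbatim_format})
print(verbatim_format["name"])  # verbatim_dump
```

Because structured outputs bypass the annotation post-processing, the string that comes back in that key is the model's internal production, quotes and all.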

> Do you have any system message instructions that the AI is disobeying to reinforce this desired behavior, or do you just expect it?

A while ago, I tried with an instruction prompt:

  • Ensure that excerpts are quoted with quotation marks.

But that didn’t work. I don’t think system messages are available for the GPT-5 API, right? Maybe I can try another prompt that includes an example.
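For what it's worth, the Responses API does take a top-level `instructions` field (and accepts "developer"-role input messages) in place of a classic system message. A sketch of a request payload with the quoting rule plus a worked example baked in; the exact wording is mine:

```python
# Sketch: instruction-with-example for the quoting behavior, carried in
# the Responses API's top-level `instructions` field (assumption: this
# field behaves like a system/developer message for gpt-5).
request = {
    "model": "gpt-5",
    "instructions": (
        "Enclose every verbatim excerpt from the uploaded documents in "
        "quotation marks (\u201c...\u201d). Example: ...because "
        "\u201chigh performance along one domain does not translate to "
        "high performance along another\u201d "
        "(Ervin et al., 2018, p. 470)."
    ),
    "input": [
        {"role": "user", "content": "Summarize the uploaded PDF."},
    ],
}
print("instructions" in request)  # True
```

Whether the model obeys it is another matter, but an instruction that demonstrates the format tends to work better than one that only describes it.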

Not using any tools. In all cases we are uploading PDFs to analyze. The excerpts are from the uploaded PDFs.

It’s odd that this is not an issue with HTML responses - only with UTF-8 text. And it just started a few days ago.

The annotations mechanism in Responses is not great, either. You get only a single index point instead of a start/end range, and the citation text the AI produced to refer to the chunk has been stripped out of the response. In the assistant message you get back:

```json
"annotations": [
  {
    "type": "file_citation",
    "index": 992,
    "file_id": "file-2dtbBZdjtDKS8eqWxqbgDi",
    "filename": "deep_research_blog.pdf"
  },
```

For your understanding of how file search works in GPT-5, I encourage you to look for the text below, “Do not wrap citations in parentheses”. Note also the mandatory addition to your billables with every tool call, and the false impression of data ownership given to the user when the tool is used to supply developer knowledge.

OpenAI likely added this behavior to discourage the verbatim reproduction that can happen when excerpts are quoted.

(There was mild trickiness, not shown here, in getting an accurate reproduction of even the citation container characters; you likely have no local fonts forced to render the UTF-8 values that are missing from “OpenAISans”. Ignore the odd colors, which come from using the forum’s only word-wrapping type of code block.)

## Namespace: file_search

### Target channel: analysis

### Description

Tool for searching *non-image* files uploaded by the user.

To use this tool, you must send it a message in the analysis channel. To set it as the recipient for your message, include this in the message header: to=file_search.<function_name>

For example, to call file_search.msearch, you would use: `file_search.msearch({"queries": ["first query", "second query"]})`

Note that the above must match _exactly_.

Parts of the documents uploaded by users may be automatically included in the conversation. Use this tool when the relevant parts don't contain the necessary information to fulfill the user's request.

You must provide citations for your answers. Each result will include a citation marker that looks like this: fileciteturn7file4. To cite a file preview or search result, include the citation marker for it in your response.
Do not wrap citations in parentheses or backticks. Weave citations for relevant files / file search results naturally into the content of your response. Don't place them at the end or in a separate section.


### Tool definitions
// Issues multiple queries to a search over the file(s) uploaded by the user and displays the results.
//
// You can issue up to five queries to the msearch command at a time. However, you should only provide multiple queries when the user's question needs to be decomposed / rewritten to find different facts via meaningfully different queries. Otherwise, prefer providing a single well-designed query.
//
// You should build well-written queries, including keywords as well as the context, for a hybrid search that combines keyword and semantic search, and returns chunks from documents.
// When writing queries, you must include all entity names (e.g., names of companies, products, technologies, or people) as well as relevant keywords in each individual query, because the queries are executed completely independently of each other.
// One of the queries MUST be the user's original question, stripped of any extraneous details, e.g. instructions or unnecessary context. However, you must fill in relevant context from the rest of the conversation to make the question complete. E.g. "What was their age?" => "What was Kevin's age?" because the preceding conversation makes it clear that the user is talking about Kevin.
// Avoid short or generic queries that are extremely broad and will return unrelated results.
//
// Here are some examples of how to use the msearch command:
// User: What was the GDP of France and Italy in the 1970s? => {"queries": ["What was the GDP of France and Italy in the 1970s?", "france gdp 1970", "italy gdp 1970"]} # User's question is copied over.
// User: What does the report say about the GPT4 performance on MMLU? => {"queries": ["What does the report say about the GPT4 performance on MMLU?", "How does GPT4 perform on the MMLU benchmark?"]}
// User: How can I integrate customer relationship management system with third-party email marketing tools? => {"queries": ["How can I integrate customer relationship management system with third-party email marketing tools?", "How to integrate Customer Management System with external email marketing tools"]}
// User: What are the best practices for data security and privacy for our cloud storage services? => {"queries": ["What are the best practices for data security and privacy for our cloud storage services?"]}
// User: What was the average P/E ratio for APPL in the final quarter of 2023? The P/E ratio is calculated by dividing the market value price per share by the company's earnings per share (EPS).  => {"queries": ["What was the average P/E ratio for APPL in Q4 2023?"]} # Instructions are removed from the user's question, and keywords are included.
// User: Did the P/E ratio for APPL increase by a lot between 2022 and 2023? => {"queries": ["Did the P/E ratio for APPL increase by a lot between 2022 and 2023?", "What was the P/E ratio for APPL in 2022?", "What was the P/E ratio for APPL in 2023?"]} # Asking the user's question (in case a direct answer exists), and also breaking it down into the subquestions needed to answer it (in case the direct answer isn't in the docs, and we need to compose it by combining different facts.)
//
// Notes:
// - Do not include extraneous text in your message. Don't include any backticks or other markdown formatting.
// - Your message should be a valid JSON object, with the "queries" field being a list of strings.
// - The message must be sent with the correct header as specified in the instructions, with the recepient set to=file_search.msearch
// - One of the queries MUST be the user's original question, stripped of any extraneous details, but with ambiguous references resolved using context from the conversation. It MUST be a complete sentence.
// - Even if you think the user may have meant something else, one of the queries MUST be their original question.
// - Instead of writing overly simplistic or single-word queries, try to compose well-written queries that include the relevant keywords, while being semantically meaningful, as these queries are used in a hybrid (embedding + full-text) search.
//
// The file search tool will respond to you with the relevant search results from the full available files.
// After you receive results, you should carefully consider each one to determine whether it is relevant and high quality, and use these to inform your answer to the user's question.
// Remember to include citations in your response, in the fileciteturn7file4 format
type msearch = (_: {
queries?: string[], // minItems: 1, maxItems: 5
}) => any;
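Per the constraints in that dump, a conforming msearch message is bare JSON - no backticks, no markdown - with a "queries" list of one to five strings. A quick validator (function name is mine) for checking candidate tool messages against those rules:

```python
import json

def is_valid_msearch(message: str) -> bool:
    """Check a candidate msearch body against the dumped constraints:
    a bare JSON object whose "queries" is a list of 1-5 strings."""
    try:
        obj = json.loads(message)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict):
        return False
    q = obj.get("queries")
    return (isinstance(q, list)
            and 1 <= len(q) <= 5
            and all(isinstance(s, str) for s in q))

print(is_valid_msearch('{"queries": ["france gdp 1970"]}'))        # True
print(is_valid_msearch('```{"queries": ["france gdp 1970"]}```'))  # False
```

Note the “correct header” requirement (`to=file_search.msearch`) lives outside the message body, so it is not something the body alone can satisfy.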

With more comprehensive fallback fonts, you can see that the container the AI is asked to produce has no real “container” semantics either, which is why it too can loop and break:

(image: the citation container characters rendered with fallback fonts)

So: directly counter # Tools## Namespace: file_search with your own “developer” version overrides. Or rather, discard the whole vector store scheme, now a per-use fee.