Requesting OpenAI insight into search

With hundreds of questions open on the general topic of “searching” I’m hoping OpenAI can share some insight into how search generally works so that we can set expectations and refine our approaches - not just in a response here, but as an ongoing policy.

For example:

When we ask ChatGPT to “search the web”, what is its exact order of operations?

  • Search user chats first, if available?
  • Reduce search concepts expressed in prompt to minimal keywords?
  • Send query to Bing? Google? Local OpenAI storage?

Knowing whether searches are keyword-based or semantic changes how we write prompts: if keyword-based, a user should spend less time explaining what they want and more time choosing their own keywords. If searches are semantic, how can we get the distilled search query that a model creates so that we can better refine a prompt? If we knew that a search for “best BBQ in Texas” is reduced to “good rib restaurants,” then we could take action to refine the search. Without that information we don’t know the basis for the search results, and thus the basis for an assistant response built on those results.

Where exactly does OpenAI get its search data? How often is the web polled for freshness? If keyword searches are sent to Google and then cached, maybe we should be doing our own searches and returning our own refined results to an assistant for actual processing.

Knowing the engine facilitates choices. For example: I am not fond of Bing and prefer Google. I have no idea what OpenAI does, but I suspect there are fewer results retrieved and cached. So I’d prefer the assistant use a Google search via SERP or some other API. But maybe OpenAI queries both engines and others, de-dupes, and caches. With no knowledge of the process I can’t make informed choices. I can only hope that search results are current, sufficiently diverse, and sourced from a rich pool of data.

If local searches performed first add bias to internet queries, then we’re facing the same problem we all have with search results that tend to reinforce a world view based on prior searches. I don’t want that status quo. I want data from the broad pool of internet content, with bias-free filtering for what I asked for — not for what some entity thinks I want, or what some entity prefers that I see.

Related: If there’s no local influence or bias, the exact same prompt from any two people should return fairly deterministic results. This won’t be the case if OpenAI is sourcing from localized engines, but it might be the case if OpenAI sources and de-dupes from multiple localized engines (google.co.uk, google.in, bing.de, bing.com.au …).
It would help to have some insight into how OpenAI chooses (or doesn’t choose) data sources for live queries or storage.

We actually know that there is some bias in searches because we are told in Settings that user Memory can or will be used in searches. How can a user see the exact search engine queries that have been posted to remote search engines, and the exact responses that came back?
This is important: someone searching for medical information on behalf of a family member will get results based on their own personal memory data. The average user might just ask “what’s a good cure for a cold” without providing the important context that they’re asking for someone else. Knowing how searches are performed, and what data is sent and received, can be critical for a significant scope of use cases — personal, business, medical, technical, etc.

We can modify our ChatGPT searches with syntax like “site:example.com”. Does this tell us that ChatGPT uses Google? Or does it tell us that OpenAI has replicated this functionality for more specific searches? What are the limits to this syntax? Why do we need such syntax if we can just specify the example.com site in a prompt? Should we be aware of other syntactical tricks like this (from Google or Bing documentation) to modify how ChatGPT does searches?

Are answers to these questions the same for the OpenAI web search API?

When a reasoning model searches and pulls back responses, then considers the data, how do we know what data is being used by the model behind the scenes to formulate its final response?

Do answers to any of these questions change with the model or based on a model’s training cut-off date?

Will Projects be enhanced so that search queries are processed to include project-specific instructions? This isn’t a question about future development; it’s intended to get OpenAI to consider and document such things in all such updates.

I understand that this is a long note and that all questions can’t be answered here or perhaps elsewhere. What I’m trying to do is to establish a base for transparency on this topic. I’d like OpenAI to be more aware that the mystery of this significant component can be as much a liability as an asset.

Content-seeding for bots and web scrapers isn’t a widely recognized thing yet. But it will be soon. SEO and search engines have a love/hate relationship regarding how keywords and phrases can, should, and should not affect processing. There are, and will be more, websites that seed their content specifically for ChatGPT and other user assistants, for their own purposes, sometimes nefarious. Imagine meta tags in web pages that direct people to harm themselves being scraped by ChatGPT to process a self-help prompt. We are there. But we do not have any insight into what happens with our searches to know how to handle this.
Let’s be proactive.

OpenAI - please help the developer community to understand how this works so that we can create better client-side tooling. Help us to reduce the chances of bad things happening when we accept text from average human beings, send it to you for processing, and then return text that we can only hope won’t get us all into trouble. And let’s help ChatGPT users to understand how this works too.

Thanks.

Disclaimer: This response was curated with a mix of my own thoughts along with some inputs from ChatGPT. I apologize in advance for misspelled words; I wrote this in a hurry.

While I am not OpenAI, I hope that my response proves useful to you and other fellow users / developers of OpenAI’s tools and products. My response doesn’t necessarily answer all of your questions, but it may shed light on what you and others are asking. Also, I learned some new things today. Thank you.

(1) ChatGPT + Web Search

When ChatGPT is instructed to “search the web” (assuming browsing is enabled), the order of operations typically follows this general process:

  1. Query Formulation – The model interprets your prompt and generates a keyword-based or semantically relevant search query.
  2. API Request – This query is sent to a third-party search engine API, typically Bing’s Web Search API.
  3. Result Retrieval – The engine returns a limited set of top-ranked search results.
  4. Content Extraction – ChatGPT accesses the actual web pages behind those links and extracts textual content (not images or scripts).
  5. Summarization – The extracted content is then analyzed and summarized in the context of your original prompt.
  6. Response Generation – The model crafts a coherent response, integrating the fetched information with any relevant prior knowledge.
  7. Citation (optional) – In some tools or UI modes, a source link or citation may be attached to provide transparency.

It’s important to note that ChatGPT does not retain or cache live search data. Each browsing query is treated independently, with no persistent web index on OpenAI’s side. Also, the actual search query and sources retrieved are not currently visible to users, which can impact transparency.
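The seven steps above can be sketched as a minimal, stubbed pipeline. Every function name and the canned search result below are hypothetical placeholders illustrating the flow — not OpenAI’s actual implementation:

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    url: str
    snippet: str


def formulate_query(prompt: str) -> str:
    # Step 1: distill the user prompt into a search query (crude keyword stub).
    stopwords = {"the", "a", "in", "what", "is"}
    return " ".join(w for w in prompt.lower().split() if w not in stopwords)


def stub_search(query: str) -> list[SearchResult]:
    # Steps 2-3: send the query to an engine API and receive top-ranked results.
    # Stubbed with canned data; a real system would call an engine's web search API.
    return [SearchResult("https://example.com/bbq", "Top-rated BBQ joints in Austin...")]


def extract_content(results: list[SearchResult]) -> list[str]:
    # Step 4: fetch each page and extract its text (here, just the snippet).
    return [r.snippet for r in results]


def summarize_and_respond(prompt: str, passages: list[str], sources: list[str]) -> str:
    # Steps 5-7: summarize extracted content and attach citations.
    summary = " | ".join(passages)
    citations = ", ".join(sources)
    return f"Answer to '{prompt}': {summary} [sources: {citations}]"


def browse(prompt: str) -> str:
    query = formulate_query(prompt)
    results = stub_search(query)
    passages = extract_content(results)
    return summarize_and_respond(prompt, passages, [r.url for r in results])


print(browse("what is the best BBQ in Texas"))
```

The point of the sketch is that each stage is a distinct transformation — and, as noted above, none of the intermediate artifacts (the distilled query, the raw results) are visible to the user today.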

(2) A Beginner’s-Level View on Internet Architecture… [because I’m not an expert]

The internet is a vast network of interconnected servers and clients that communicate via standardized protocols. When using tools like deep search, queries can be routed through multiple engines and databases in real time, allowing access to a broad range of sources. The resulting links can be opened in any browser; retrieval is not tied to Google or Bing alone when the search tool aggregates data from diverse locations.

(2.1) Prioritization of Search Results

Google’s indexing process involves crawling the web to collect information about pages, then categorizing them based on factors like content relevance, keywords, and metadata. The ranking system evaluates signals such as click-through rates, backlink profiles, loading speed, and freshness of content to determine the order of results. Unlike traditional search engines, OpenAI’s ChatGPT deep search uses semantic understanding and contextual analysis to interpret queries and retrieve information, which can differ from keyword-based rankings.

(2.2) ChatGPT + Your Search Query

ChatGPT does not automatically distill prompts into a single reduced search query like “best BBQ in Texas” becoming “good rib restaurants,” unless the underlying model is designed to interpret and generalize that way. In general, if search is invoked (e.g., via Browsing), the process follows an internal sequence where the query is structured to maximize retrieval effectiveness. However, that structure is not visible to the user.

As for the engines used, OpenAI’s browsing tools have typically used Bing’s Web Search API under the hood. However, this can change based on agreements and system updates. Unlike traditional search engines, the results are parsed by the model for content extraction, rather than simply linking back to ranked results.

Regarding your comment about local memory: it can definitely introduce bias into your search outcome. When memory is enabled, ChatGPT may tailor the prompt, the retrieved results, or the summary based on past conversations. Disabling memory or using incognito sessions can help mitigate this bias if unbiased search is critical. (See Section 3 of this response for more about bias.)

Currently, there is no user-facing dashboard that exposes the raw query sent to external engines or the exact raw results received. This limits auditability and transparency, especially in critical or high-stakes domains like medical or legal advice.

The “site:” syntax does not confirm that Google is used; it is mimicked functionality within the prompt, interpreted by the model. It can help nudge the model to prefer or filter content, but its effectiveness varies. Also note: these behaviors can vary by model, based on its training cut-off, browsing access, memory state, and the interface being used (chat, API, etc.).

(3) About Bias [because we are all human]

Bias is a systematic preference or inclination that affects judgment, behavior, or data (consciously or unconsciously). In our world of information systems (whether through digitized media or physical), this manifests when certain perspectives, patterns, or values are favored in the presentation, processing, or interpretation of information.

Types of Bias Relevant to Online Content:

  1. Author bias – A creator’s worldview, values, and assumptions shape how information is framed or excluded.
  2. Selection bias – Only certain facts or voices are highlighted, while others are left out.
  3. Algorithmic bias – Automated systems prioritize content based on engagement signals, historical behavior, or opaque ranking logic.
  4. Confirmation bias – Users and systems both tend to favor content that aligns with preexisting beliefs.
  5. Training data bias – AI models inherit patterns and prejudices from the data they’re trained on, especially when the internet is the primary corpus.

Much like grammar shapes the structure of language, bias shapes the structure of meaning. Every communication reflects a viewpoint… even when it claims neutrality. Bias is not inherently negative; it becomes problematic when it goes unrecognized, unchecked, or disproportionately influences outcomes without transparency.

The goal, then, is not to eliminate all bias (which is likely impossible), but to increase awareness, create better tools for surfacing multiple perspectives, and ensure accountability in the systems we use and build.

(4) How OpenAI Handles Search [Most likely… because I asked ChatGPT]

OpenAI sources live search data primarily through partnerships such as the Bing Web Search API. This browsing feature is only available when enabled, such as in Pro-tier models with browsing tools. The system sends real-time keyword-based queries to Bing and returns top-ranked content for the assistant to parse. OpenAI does not use Google Search directly, reportedly due to licensing constraints.

The web is not continuously polled or indexed by OpenAI itself. Instead, OpenAI relies on third-party APIs like Bing to return current results at query time. There is no global cache of the web that OpenAI maintains independently for browsing — though (and this is wild) the model itself has a static knowledge base tied to its training cut-off.

Note: If you suspect cached results, it’s typically due to the third-party search engine’s own index, not OpenAI’s design. For the most up-to-date data, developers may want to perform their own searches and pass relevant links or excerpts to ChatGPT for interpretation.
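That recommendation can be made concrete: do the search yourself, then hand the model only the excerpts you retrieved and vetted, asking it to answer strictly from them. The prompt format below is an illustrative assumption, not a required schema:

```python
def build_grounded_prompt(question: str, excerpts: list[dict]) -> str:
    """Assemble a prompt that asks the model to answer ONLY from
    search excerpts the caller retrieved and vetted themselves.
    Each excerpt is a dict with 'url' and 'text' keys (our own convention)."""
    lines = [
        "Answer the question using only the sources below.",
        "Cite sources by number.",
        "",
    ]
    for i, ex in enumerate(excerpts, start=1):
        lines.append(f"[{i}] {ex['url']}\n{ex['text']}\n")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


prompt = build_grounded_prompt(
    "What is a good cold remedy for an elderly patient?",
    [{"url": "https://example.org/colds", "text": "Rest and fluids are first-line..."}],
)
print(prompt)
```

Passing this assembled prompt to the model (via the chat UI or the API) puts the retrieval step — and thus the freshness, source selection, and any filtering — entirely under the developer’s control.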

(4.1) How OpenAI Handles Search - More Specifically

The assistant does not query multiple search engines in parallel or deduplicate results from across engines like Google and Bing. The chosen provider (usually Bing) determines the scope and ranking of data retrieved. This makes the search dependent on a single source’s indexing and filtering bias.

Local memory can shape how search results are processed and interpreted, but it does not affect the raw results fetched from the external engine. That said, any follow-up summaries or actions by ChatGPT can reflect previous user behavior or stored memory unless disabled.

Without visibility into the full query and result chain, determinism in responses is limited. Even the same prompt issued by different users may return divergent results depending on regional engine localization, session context, and model configuration.

(4.2) Query Syntax & Browser Selection

The “site:” syntax mimics search-engine-level filtering but does not prove Google is used. It helps refine intent but has limited power when passed through OpenAI’s API layer, which interprets text before forwarding a query to Bing. Using “site:” or similar tricks is helpful but not consistently reliable across contexts.
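Whether or not the receiving engine ultimately honors them, operators like “site:” can at least be applied deterministically on the client side before a query is sent anywhere. A small helper (purely illustrative — not an OpenAI API):

```python
def restrict_query(query, site=None, exact=None):
    """Append common search operators to a query string:
    site: restricts results to one domain, and quoted text
    requests an exact-phrase match. Whether these operators are
    honored depends entirely on the engine that receives the query."""
    parts = [query]
    if site:
        parts.append(f"site:{site}")
    if exact:
        parts.append(f'"{exact}"')
    return " ".join(parts)


print(restrict_query("ChatGPT browsing", site="example.com", exact="web search"))
# -> ChatGPT browsing site:example.com "web search"
```

Constructing the operator string yourself, rather than hoping the model preserves it during query formulation, is one way to reduce the ambiguity discussed above.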

These behaviors can vary by model, cut-off date, and whether the browsing tool is enabled. Search via GPT-4 with browsing differs significantly from GPT-4 without browsing or GPT-3.5.

If you’re referring to the Chrome extension that allows you to “Search with ChatGPT”, well… you’re in luck. Anyone can build a Chrome extension with customized behavior and ranking methods. And Chromium (an open-source web browser project) is the backbone of many modern web browsers like Google Chrome, Microsoft Edge, Brave, and Opera.

Here’s a list of web browsers that support user-built extensions:

  • Google Chrome – built on Chromium, with full developer tools and extensive documentation.
  • Microsoft Edge – also Chromium-based, supports the same extension format as Chrome with some Microsoft-specific APIs.
  • Brave – based on Chromium and compatible with most Chrome extensions.
  • Opera – allows custom extensions and supports Chrome extensions via an addon.
  • Firefox – supports WebExtensions API and has a strong developer community.
  • Safari – supports extensions via Xcode using Safari Web Extension Converter and native Safari App Extensions.

All these browsers provide APIs and documentation for building extensions that interact with web content, tabs, context menus, and more. The process usually involves a manifest file, background scripts, and optionally a UI element like a popup or sidebar.
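As a concrete starting point, a minimal Chrome-style (Manifest V3) extension needs little more than a manifest like the one below; the name and file names are placeholders:

```json
{
  "manifest_version": 3,
  "name": "My Search Helper",
  "version": "1.0",
  "description": "Example extension scaffold (placeholder).",
  "permissions": ["activeTab"],
  "background": { "service_worker": "background.js" },
  "action": { "default_popup": "popup.html" }
}
```

Loading the folder via chrome://extensions with Developer mode enabled (“Load unpacked”) is enough to test a scaffold like this, and the Chromium-based browsers listed above accept the same format.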

(x) Recommendations

  • Search capabilities are not yet tightly integrated into OpenAI Projects, so enabling scoped search or custom indexing per project would enhance context-aware development. (Add this as a recommendation to OpenAI in a relevant thread.)
  • OpenAI Projects do not yet support search-context binding or scoped indexing, but this is a common request and would allow tighter relevance filtering if implemented in the future.
  • Transparency into query logs, raw result sets, and processing layers would improve developer trust. Additionally, a structured search dashboard or audit trail would allow teams to validate outcomes, aligning results with responsible AI goals.

~ P.S. I chuckled while reading through the beginning of this post. It was so long that I decided to use my Speechify extension, listening to it through to the end. I highly recommend asking these questions to ChatGPT-4o :slight_smile:
