Deep Research pulling unrelated ads and random public-web sources

I’m seeing what looks like source contamination (irrelevant ad and public-web pages) during a Deep Research task that is supposed to be restricted to a Google Drive archive.

Task type:

  • Read a connected Google Drive folder

  • Read manifest first

  • Read multipart shard files in sequence

  • Extract claims/dates/quotes from the archive only
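The intended read order can be sketched as follows. This is a minimal illustration, not the actual connector API: `read_file` is a hypothetical stand-in for whatever Drive read primitive the agent uses, and the manifest format is assumed.

```python
import json

def read_archive(read_file):
    """Read the manifest first, then each listed shard in sequence.

    read_file(name) -> str is a hypothetical stand-in for the
    Google Drive connector's file-read call. No other retrieval
    (in particular, no public web search) should be involved.
    """
    manifest = json.loads(read_file("manifest.json"))
    texts = []
    for shard_name in manifest["shards"]:  # assumed manifest schema
        texts.append(read_file(shard_name))
    return texts
```

The point is that once the manifest identifies the shard sequence, every subsequent access is a direct, named file read inside the folder.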

Expected behavior:

  • Stay inside the specified Google Drive folder and named files

  • Use direct file reads after the manifest identifies the sequence

  • Avoid unrelated public web retrieval unless explicitly requested

Observed behavior:

The Sources panel included unrelated external sites (itemized in the counts below). These were not relevant to the task and appeared alongside Google Drive / Google developer documentation retrieval. In the sample I captured, there were at least 23 clearly irrelevant external sources, plus a large amount of Google support/API noise.

Why this is a problem:

  • It adds irrelevant citations/sources to a source-restricted archive task

  • It reduces confidence that the model is actually reading the intended archive files

  • It wastes activity budget on unrelated retrieval

  • It makes it difficult to audit what was truly read versus what was merely surfaced

Suggestion:

Please add a stricter source restriction mode for Deep Research, especially for connector-based archive ingestion tasks. A useful mode would:

  1. Restrict retrieval to specified connector sources only

  2. Disallow public web search unless explicitly enabled

  3. Log actual access level per file:

    • listed only

    • metadata only

    • partial text access

    • full text access
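The per-file access log in point 3 could be as simple as the following sketch. The names (`AccessLevel`, `record_access`) are hypothetical, purely to illustrate the four levels and the "keep the highest level observed" behavior an auditor would want.

```python
from enum import IntEnum

class AccessLevel(IntEnum):
    # Ordered so that a higher value means deeper access.
    LISTED_ONLY = 0
    METADATA_ONLY = 1
    PARTIAL_TEXT = 2
    FULL_TEXT = 3

def record_access(log: dict, file_name: str, level: AccessLevel) -> None:
    """Record the deepest access level observed for each file."""
    log[file_name] = max(log.get(file_name, AccessLevel.LISTED_ONLY), level)
```

With a log like this, "merely surfaced" (listed/metadata only) and "actually read" (partial/full text) files are distinguishable after the run.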

Counts from the captured source list:

Commercial product pages:

  • support.hp.com: 3

  • hp.com: 1

  • dell.com: 3

  • bestbuy.com: 3

  • Subtotal: 10

Random off-task pages:

  • apexnc.org: 2

  • sanbernardino.gov: 1

  • elcajon.gov: 1

  • redfin.com: 1

  • wunderground.com: 1

  • mathway.com: 5

  • coinbase.com: 2

  • Subtotal: 13

Combined external sources:

  • Total: 23

Repeated duplicate examples:

  • Dell XPS 17 9710 System BIOS: 3

  • Apex, NC - Official Website: 2

  • Mathway pages: 5

  • Coinbase calculator pages: 2
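For anyone who wants to reproduce this tally from their own Sources panel, a minimal sketch (assuming you can copy the source URLs out as a plain list):

```python
from collections import Counter
from urllib.parse import urlparse

def tally_domains(urls):
    """Count captured sources per domain, folding www. into the bare host."""
    return Counter(
        urlparse(u).netloc.removeprefix("www.")  # requires Python 3.9+
        for u in urls
    )
```

For example, `tally_domains(["https://www.mathway.com/a", "https://mathway.com/b"])` counts both under `mathway.com`.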

This happened in two separate runs, so it does not appear to be a one-off fluke.

Has anyone else seen Deep Research pull unrelated commercial or random public-web sources during a task that should have remained restricted to a connected document archive?