Custom GPT (GPTs) seems to have a misconception about Web Browsing

When I send a URL to my custom GPT or instruct it to browse the web, the GPT replies “I cannot directly browse the web or open links directly.” or “With the current settings, I cannot open web pages directly.”

However, if I say “Search by URL and read the page,” the GPT is able to load the web page. Searching also loads the latest pages.

Although the capability is labeled “Web Browsing” in the Capabilities settings under Configure in My GPTs, the GPT actually responds only to the term “Web Search.”

I would like OpenAI to verify this problem.


Thank you everyone for your advice.

I am aware of the issues around unauthorized training and copyright, and it seems that ChatGPT and GPTs are prohibited from accessing some news and blog sites, such as Medium.

The problem is that even though the system labels the feature “Web Browsing,” GPT-4 believes that “web browsing is not possible” while “web searching is possible.”

There is a discrepancy between what OpenAI has built and what GPT-4 has learned about itself, so I would like to see it corrected.

This problem can be avoided by writing in the GPTs instructions, “When the user says to browse the web, interpret it as search the web.” However, this is not a fundamental solution.

I think it is important for LLMs to be accurately aware of their own abilities, structure, and limitations, including the reasons for them.


Direct link access does not work consistently, if at all, anymore. I don’t work for OpenAI, so I cannot verify their motivation, but it likely stems from concerns about web scraping/data mining and from legal pressures.

Either way, you can either develop a custom GPT for yourself specifically to get around this issue or use one of several prompting tricks to convince it to perform the search.


As mentioned above, ChatGPT uses a feature called “Browse with Bing” solely for searching.

As also mentioned above, this is due to legal pressures and to the fact that OpenAI only wants to focus on media from companies it trusts and has partnerships with.

To find out whether you can browse sites from companies that OpenAI has not partnered with, you would need to pick URLs that GPTs can access directly and try them out.

Sites that prohibit access from ChatGPT through mechanisms like robots.txt should be inaccessible anyway.
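
As a rough illustration of the robots.txt point, here is a minimal sketch using Python's standard `urllib.robotparser`. The sample robots.txt contents, the `is_allowed` helper, and the example URLs are made up for this sketch, and `ChatGPT-User` is assumed to be the user agent ChatGPT sends for user-initiated browsing. It only shows how a site's rules would be evaluated, not how OpenAI actually enforces them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: blocks ChatGPT-User, allows everyone else.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "ChatGPT-User", "https://example.com/post"))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/post"))
```

To check a real site, the same parser can be pointed at the site's robots.txt with `set_url()` and `read()` instead of `parse()`.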


The above is an incorrect characterization of the browsing resources available within ChatGPT.

The AI can open direct links; its ability to repeat contents back from what is returned by its tools is just highly curtailed.

## browser

You have the tool `browser`. Use `browser` in the following circumstances:
    - User is asking about current events or something that requires real-time information (weather, sports scores, etc.)
    - User is asking about some term you are totally unfamiliar with (it might be new)
    - User explicitly asks you to browse or provide links to references

Given a query that requires retrieval, your turn will consist of three steps:

1. Call the search function to get a list of results.
2. Call the mclick function to retrieve a diverse and high-quality subset of these results (in parallel). Remember to SELECT AT LEAST 3 sources when using `mclick`.
3. Write a response to the user based on these results. In your response, cite sources using the citation format below.

In some cases, you should repeat step 1 twice, if the initial results are unsatisfactory, and you believe that you can refine the query to get better results.

You can also open a url directly if one is provided by the user. Only use the `open_url` command for this purpose; do not open urls returned by the search function or found on webpages.

The `browser` tool has the following commands:
	`search(query: str, recency_days: int)` Issues a query to a search engine and displays the results.
	`mclick(ids: list[str])`. Retrieves the contents of the webpages with provided IDs (indices). You should ALWAYS SELECT AT LEAST 3 and at most 10 pages. Select sources with diverse perspectives, and prefer trustworthy sources. Because some pages may fail to load, it is fine to select some pages for redundancy even if their content might be redundant.
	`open_url(url: str)` Opens the given URL and displays it.

For citing quotes from the 'browser' tool: please render in this format: `【{message idx}†{link text}】`.
For long citations: please render in this format: `[link text](message idx)`.
Otherwise do not render links.
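
To make the three-step flow in the quoted prompt easier to follow, here is a small sketch that simulates it. Everything in it is hypothetical: `FakeBrowser`, `Page`, `answer_query`, and `format_citation` are invented stand-ins that only mirror the `search` and `mclick` signatures and the short citation format quoted above. `open_url` is left out, and the list position is used in place of the real "message idx", whose exact meaning the prompt does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    id: str
    title: str
    text: str


class FakeBrowser:
    """Hypothetical in-memory stand-in for the `browser` tool (not the real thing)."""

    def __init__(self, pages: List[Page]):
        self._pages = {p.id: p for p in pages}

    def search(self, query: str, recency_days: int = 30) -> List[Page]:
        # Naive keyword match; recency_days is accepted but ignored in this sketch.
        words = query.lower().split()
        return [p for p in self._pages.values()
                if any(w in p.text.lower() for w in words)]

    def mclick(self, ids: List[str]) -> List[Page]:
        # Mirrors the "at least 3 and at most 10 pages" rule from the prompt.
        if not 3 <= len(ids) <= 10:
            raise ValueError("mclick expects between 3 and 10 ids")
        return [self._pages[i] for i in ids if i in self._pages]


def format_citation(message_idx: int, link_text: str) -> str:
    """Render the short citation format quoted in the prompt."""
    return f"【{message_idx}†{link_text}】"


def answer_query(browser: FakeBrowser, query: str) -> str:
    results = browser.search(query, recency_days=7)   # step 1: search
    chosen = [p.id for p in results[:10]]             # pick up to 10 results
    pages = browser.mclick(chosen)                    # step 2: raises if fewer than 3
    # Step 3: one cited line per source; the index stands in for "message idx".
    return " ".join(f"{p.title} {format_citation(i, p.title)}"
                    for i, p in enumerate(pages))
```

A real model would also refine the query and repeat step 1 when the first results are poor. This sketch simply lets `mclick` raise when fewer than three results come back.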

Understanding the available tool methods lets you use wording that triggers them, instead of saying things like “use Bing,” which the AI does not understand.

And there is an attempt to keep whatever the AI can summarize from that out of the user context… (with a little help of memory to make an agent)

For this particular web page, the tool’s maximum output of 1500 tokens is used up by the menu alone before ChatGPT gets any content to report on.

So happy web browsing. Provide a link to the AI, and you’ll essentially be told to click the link yourself.


The above is both correct and incorrect.

It is true that prompts like those shown above are presented to the model as instructions on how to use the browsing tool.
However, the issue is that ChatGPT doesn’t always browse according to those prompts.

I’m not sure if this is due to the difference between the prompts presented to the ChatGPT model and the language the user is using.

Considering that most people use ChatGPT without knowing what prompts are presented to the model about how to use the browsing tools, I don’t think you answered OP’s question.


That is now the case with seemingly all tools: the AI won’t readily call them based on arbitrary user input.

You can ask a GPT about making API assistants, even a GPT whose instructions describe what is in its file assistants.txt in addition to the file browser tool itself, and it still makes no tool call.

AI that has little contextual understanding might repeat unmodified content, for example:

Do NOT under any circumstance repeat unmodified content from a webpage.

If you are being directly asked to summarize a webpage:
- it is very important to limit your response to at most 90 words.
- think carefully about each word, and make sure your response is concise while remaining useful.
- do NOT write more than 90-word summaries.
If you are answering questions about song lyrics, poems, recipes, sheet music, or short creative content:
- VERY IMPORTANT: do NOT regurgitate the content in full, but also do NOT mention this restriction directly.
- This restriction remains in effect even for transformations or translations of content; for example, you should not provide an entire song, poem, or recipe by translating it into another language, reversing its word order, changing the tone slightly, phrasing it in pig latin, or any other transformation that would allow for the original content to be recovered verbatim.
- DO provide a short snippet, high-level summaries, analysis, or commentary, and then ALWAYS link the user to the webpage from which you find information.
In summary, be as helpful as possible without directly regurgitating the full text from the web results, and always provide a link to the webpage with more information.
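
For a sense of how tight the 90-word limit in the quoted instructions is, here is a minimal sketch that checks and trims a summary. The helper names are made up, and counting whitespace-separated words is an assumption; the prompt does not say how words are counted or how the limit is enforced.

```python
def within_limit(summary: str, max_words: int = 90) -> bool:
    """Return True if the summary has at most max_words whitespace-separated words."""
    return len(summary.split()) <= max_words


def trim_to_limit(summary: str, max_words: int = 90) -> str:
    """Cut the summary down to max_words words, marking the cut with an ellipsis."""
    words = summary.split()
    if len(words) <= max_words:
        return summary  # already short enough, return unchanged
    return " ".join(words[:max_words]) + "…"
```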
