The ChatGPT user interface provides really valuable web sources with each answer. I want to provide the same for my users, but the chat completions API does not seem to provide any citations.
I have seen that the assistants API does have citations, but it seems to be only for RAG files.
Any idea how I can create the same user experience with citations of web resources in my application using the chat completions API?
You indeed would need to create the same user experience. ChatGPT and Assistants both have internal OpenAI tools.
The linked post which you can expand goes into a bit of a tool call demonstration. You would need to find a web search API that can service requests for you, like Bing (expensive) or some other search scraper libraries that are out there.
The challenge is actually exploring the web to get full-page results on modern dynamic sites beyond search descriptions.
Citations then come down to instructing the AI to produce footnotes, or some special marker format for references within its answers, which you can rewrite into the actual web links when you parse and render the answer.
TL;DR: System instructions, functions, linked function returns.
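To make the marker idea above concrete, here is a minimal sketch. It assumes you have instructed the model to cite with bracketed numbers such as `[2]`, and that you kept a mapping from those numbers to the URLs your search returned. The function name `attach_citations` and the `[n]` format are illustrative choices, not anything the API defines.

```python
import re


def attach_citations(answer: str, sources: dict[int, str]) -> str:
    """Turn [n] markers into markdown links and append a source list."""
    used: list[int] = []

    def link(match: re.Match) -> str:
        n = int(match.group(1))
        if n not in sources:
            return match.group(0)  # unknown marker: leave it untouched
        if n not in used:
            used.append(n)
        return f"[[{n}]]({sources[n]})"

    # The lookahead skips markers that are already part of a markdown link.
    body = re.sub(r"\[(\d+)\](?!\()", link, answer)
    if not used:
        return body
    footnotes = "\n".join(f"[{n}] {sources[n]}" for n in used)
    return f"{body}\n\nSources:\n{footnotes}"
```

Markers that do not match a known source are left as-is rather than dropped, so a hallucinated reference stays visible instead of silently turning into a link.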
Thanks for your reply. I was hoping that I was missing some parameter that would automatically allow that.
I do have a function call that does the search. It first searches Google with appropriate search terms, then returns the webpage texts of the first 5 hits, each together with the URL it belongs to.
    tools = [{
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search the web for current information. Give a search query, and get back the top 5 search results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    }]
I guess I will then need to ask the chat completions API to return a citation for each piece of information taken from any of the webpage texts. In my view I then need to display that appropriately.
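One way to make such citations possible is to number each source in the tool result, so the model has a short ID to refer back to. The sketch below assumes each hit is a dict with `url` and `text` keys; the `sources` payload shape, the `MAX_CHARS` budget, and the helper names are hypothetical choices. Only the `role`, `tool_call_id`, and `content` fields follow the chat completions tool message format.

```python
import json

MAX_CHARS = 4000  # assumed per-page budget to keep the prompt small


def build_tool_message(tool_call_id: str, hits: list[dict]) -> dict:
    """Package search hits as a numbered source list for the model."""
    sources = [
        {"id": i, "url": hit["url"], "text": hit["text"][:MAX_CHARS]}
        for i, hit in enumerate(hits, start=1)
    ]
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": json.dumps({"sources": sources}),
    }


def source_map(hits: list[dict]) -> dict[int, str]:
    """Keep the same id -> URL mapping on your side for rendering later."""
    return {i: hit["url"] for i, hit in enumerate(hits, start=1)}
```

The system instructions can then ask the model to cite by `id`, and the renderer resolves each ID against the saved map, so the model never has to reproduce a URL itself.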
You get the actual contents of a webpage along with results. It also uses embeddings for semantics with its neural search. You can get the contents of subpages, get highlights per URL, a summary, images, etc.
Surely that’s how all web search engines work? I can’t imagine keyword search being a practical consideration for web search.
I quite like Jina.ai. They have pay-as-you-go pricing (great for small-scale use), results can be returned in Markdown, and they also provide page crawling.
The first step is simply instructions of how to produce a response.
Here’s a hypothetical to get you thinking.
When utilizing the `web_search` tool results to generate sourced information, incorporate the results into the response in a natural and polished manner, but do not disguise use of internet search. Key details or concise excerpts can be reproduced verbatim using markdown highlights (~~) to distinguish them clearly within the text. Links to the external sources should be seamlessly embedded within the content. Identify a highly relevant keyword or phrase that connects naturally to the referenced information, and format it as a markdown link using `[Keyword](URL)`. This ensures the link is integrated smoothly, maintaining the flow and readability of the response.
That would be if you want to furnish search results into an HTML interface in a browser. You can use your own technique for having the AI produce the appearance. Such a long message can still be part of a main function description.
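If the model embeds links itself as `[Keyword](URL)`, it may be worth checking each one against the URLs your search tool actually returned before rendering. This is a rough sketch under that assumption; `split_links` and the regular expression are illustrative, and the pattern only handles simple links without nested brackets or parentheses in the URL.

```python
import re

# Matches simple markdown links: [label](http...)
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")


def split_links(answer: str, known_urls: set[str]):
    """Return (verified, unverified) lists of (label, url) pairs."""
    verified, unverified = [], []
    for label, url in LINK_RE.findall(answer):
        target = verified if url in known_urls else unverified
        target.append((label, url))
    return verified, unverified
```

Unverified links could be rendered as plain text or flagged, depending on how strict you want the interface to be.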
You may have a more advanced use, like an application that has tooltips and a context menu with “copy link”, “discuss link”, and “summarize page”, where you give widget IDs, result indexes, or other identifiers to the AI. Make your own tool return format and AI response format that your renderer can identify and employ. The assistants citation format can serve as an example.
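As a sketch of such a custom response format, the following assumes the model was told to emit a made-up marker, `[[cite:N]]`, where `N` is a zero-based result index. The parser strips the markers and returns the clean text plus a separate annotation list that a renderer could turn into tooltips or context menus, similar in spirit to returning text alongside annotations. The marker syntax and the `Annotation` fields are hypothetical.

```python
import re
from dataclasses import dataclass

MARKER_RE = re.compile(r"\[\[cite:(\d+)\]\]")  # hypothetical marker format


@dataclass
class Annotation:
    start: int         # offset in the cleaned text where the citation applies
    result_index: int  # index into the search results given to the model
    url: str


def extract_annotations(answer: str, urls: list[str]):
    """Strip markers from the answer and return (clean_text, annotations)."""
    clean_parts, annotations = [], []
    pos = 0        # read position in the raw answer
    clean_len = 0  # length of the cleaned text so far
    for m in MARKER_RE.finditer(answer):
        chunk = answer[pos:m.start()]
        clean_parts.append(chunk)
        clean_len += len(chunk)
        idx = int(m.group(1))
        if idx < len(urls):  # markers with an unknown index are dropped
            annotations.append(Annotation(clean_len, idx, urls[idx]))
        pos = m.end()
    clean_parts.append(answer[pos:])
    return "".join(clean_parts), annotations
```

Keeping the annotations separate from the text means the same answer can be rendered as footnotes, inline links, or interactive widgets without asking the model again.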