We need source links and we need them now

The example I gave is not from an advertisement. It is from a real “chat” session I ran tonight on Bing AI Search.

Ah. OK. What was your “one sentence prompt”?

I haven't tested it yet; I've only read the marketing releases, since I'm busy writing commercial code for clients. Sorry if I missed that it was live for testing now, my bad, multi-tasking too much. Maybe it's only available in the US?

I want to test with your “one sentence prompt”. Please post back with the text (no images) to copy-and-paste into Bing.

Thanks!

🙂

Well, it seems I cannot test Bing AI Search directly, as I guess it's not available here in my country.

However, from searching the net, here is what it says:

Microsoft has gotten around some of ChatGPT’s limitations by marrying OpenAI’s language capabilities to Bing’s search function, using a proprietary tool it’s calling Prometheus. The technology works, roughly, by extracting search terms from users’ requests, running those queries through Bing’s search index and then using those search results in combination with its own language model to formulate a response.

This is what I thought was happening: Bing searches and feeds the search results into Prometheus, which formats them and uses the chatbot to make the language output (the text) “pretty”.

The bulk of the work is done by the search engine. The chatbot is used to make the results look and sound pretty to humans, or so it seems. This is a very different process from using the AI to do the work. The search engine does the work, creates the index, etc., and sends the results to some chatbot process, which handles the natural-language presentation to the end user.
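The flow described above can be sketched roughly in code. This is a speculative illustration, not Microsoft's actual Prometheus implementation; `search_results` stands in for whatever Bing's index returns, and the prompt format is an assumption:

```python
# Hypothetical sketch of a search-grounded chat flow: the search engine
# supplies the facts, and the language model only phrases the answer.

def build_grounded_prompt(user_request, search_results):
    """Splice raw search snippets into the prompt so the model works
    from retrieved facts instead of inventing them."""
    context = "\n".join(
        f"- {r['title']}: {r['snippet']}" for r in search_results
    )
    return (
        "Answer the question using ONLY the sources below.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {user_request}\nAnswer:"
    )

# Stub results just to show the shape of the data:
results = [
    {"title": "Example source", "snippet": "Relevant fact pulled from the index."},
]
prompt = build_grounded_prompt("What did the study find?", results)
print(prompt)
```

The prompt that comes out would then be sent to the chat model, which is where the "make it pretty" step happens.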


Microsoft announced a couple of days ago in the US that you could sign up for a Bing AI wait list, and that priority would be given to those who set Bing as the default in their desktop browser and also downloaded the mobile Bing app.

The text I used is:

create a pro con table from a pubmed search about epidural steroid injections

Thanks for that. Now I see why I cannot access it to test.

Cheers and thanks again. I’m off to the gym.

That is similar to the WebChatGPT Chrome extension for ChatGPT.

Of course, since Bing does it behind the scenes, the result seems more global and consistent within Bing.

If this is true, man, what an AI deepfake M$ has pulled off. It sounds similar to something we can do now with the API: feeding search results into the prompt as context.

  1. You embed your content.
  2. You enter a search prompt and vectorize it.
  3. You run a vector similarity calculation between the prompt and your content and return the 3 highest-scoring chunks.
  4. You query the OpenAI model with the original search prompt and include the text of those highest-scoring chunks as “context” for the prompt.

I mean, I was planning on doing just this as a solution for returning source citations from AI query results. This is something everybody can do now.
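The four steps above can be sketched with plain cosine similarity. This is a toy, assuming 3-dimensional vectors and made-up document names; a real system would get the vectors from an embeddings API:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Step 1: pretend these are embeddings of your content chunks.
chunks = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.0],
    "doc_c": [0.7, 0.3, 0.1],
}

# Step 2: the vectorized search prompt.
query_vec = [1.0, 0.0, 0.0]

# Step 3: rank chunks by similarity to the prompt, keep the top 3.
top = sorted(chunks, key=lambda k: cosine(query_vec, chunks[k]), reverse=True)[:3]

# Step 4: splice the winning chunks into the completion prompt as context.
prompt = "Context:\n" + "\n".join(top) + "\n\nQuestion: ..."
print(top)  # → ['doc_a', 'doc_c', 'doc_b']
```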

Correct, but it's way more effective when it is native to the browser.

Hi @ruby_coder. I wonder if you have any insight about the number of tokens from the search results that are being included in the prompt? That’s been one of the trickier parts of building an expert system in my experience, and I am wondering if we can expect an announcement soon that GPT-3’s token limit has been increased. I agree with your point below that the combination of search and completion in a web browser is incredibly powerful. Thanks.

Here’s a hacky heuristic for fact checking:

  1. Submit your prompt and get a reply.
  2. Copy/paste individual facts into Google. Your prompt in (1) could have specified a format to make parsing easier to automate.
  3. Search Google on each fact. Programmatically if possible.
  4. Submit the fact-check material from the Google results into a GPT prompt that basically asks the same original question, but now has better context, because presumably the Google search results are “accurate”.

Clunky and probably in violation of somebody’s site terms.
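As a rough sketch, the loop above might look like the following. `search_web` and `ask_llm` are placeholders, not real APIs; in practice you'd wire them to a search API and a chat-completion call, terms of service permitting:

```python
# Hacky fact-check loop: verify each extracted fact via search,
# then re-ask the original question with the verified context attached.

def fact_check(answer_facts, search_web, ask_llm, question):
    checked = []
    for fact in answer_facts:            # steps 2-3: verify each fact
        evidence = search_web(fact)      # programmatic search (placeholder)
        checked.append((fact, evidence))
    context = "\n".join(f"{f} -- evidence: {e}" for f, e in checked)
    # step 4: same question, but grounded in the verified context
    return ask_llm(f"{question}\n\nVerified context:\n{context}")

# Stub implementations just to show the control flow:
reply = fact_check(
    ["The sky is blue."],
    search_web=lambda q: "confirmed by 3 results",
    ask_llm=lambda p: p,  # echoes the prompt so we can inspect it
    question="What color is the sky?",
)
print(reply)
```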

So yeah, I agree. Microsoft and Google: Please automate this fact checking process!

The world will be so much better, and this technology will be adopted so much faster.

I watched this webinar discussion: Beyond Semantic Search with OpenAI and Pinecone - YouTube

The model they demoed here is perfect. You do vector searches (on embedded data), and you get back the top results with sources from the documents you indexed. And users can select which groupings of those documents they wish to search. Sweet. This was developed a year ago!

When I first posted this 3 months ago, I was fairly clueless about the chat completion process. Since then, I’ve learned that process and coded a few completion chains. Now that I understand the process better, I also understand how citations can be included in responses – at least with respect to semantic searches using your own data.

First, you embed your data:

Then you build your chat completion chain:

So, in this process, a user asks a question. That question is embedded and submitted to your vector store for a similarity search. The search returns relevant info (docs relevant to the question asked). You then send the question + relevant info to the LLM model in a chat completion API call for an answer.

Your citations are essentially the relevant info you send. So you only need to ask the model to list the titles of the relevant info you sent it; or, better yet, in your chain code, you list them along with the answer you send to your user.
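Here's a minimal sketch of that question → retrieve → answer-with-citations chain. The vector store and LLM calls are stubbed with lambdas, and `retrieve`/`complete` are names I'm introducing for illustration; in a real chain they'd be an embeddings-backed similarity search and a chat-completion API call:

```python
# The chain code, not the model, supplies the citation list: we already
# know which documents we retrieved, so we attach their titles ourselves.

def answer_with_citations(question, retrieve, complete):
    docs = retrieve(question)                     # similarity search
    context = "\n".join(d["text"] for d in docs)  # relevant info
    answer = complete(
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    citations = [d["title"] for d in docs]        # sources we sent as context
    return {"answer": answer, "citations": citations}

# Stubbed store and model, just to show the flow:
result = answer_with_citations(
    "What does the policy say?",
    retrieve=lambda q: [{"title": "Policy.pdf", "text": "The policy says X."}],
    complete=lambda p: "It says X.",
)
print(result["citations"])  # → ['Policy.pdf']
```

The design point is the last step: listing the retrieved titles in code is more reliable than asking the model to repeat them back.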

These graphics are from this excellent LangChain quickstart tutorial video: LangChain Explained in 13 Minutes | QuickStart Tutorial for Beginners - YouTube

Based on that OpenAI-Pinecone webinar from last year, this is the screenshot I took of their user interface. It served as the model I would aspire to for the next several months:

And, below is the user interface I finally developed:

As you can see, citations are listed. There are also “library” selection options (groups and tags), as well as options to fine-tune some chat-completion elements (concepts, standalone question, hybrid search, etc.).

It was a long road, but I finally got there.

Summary created by AI.

The conversation on the forum revolves around the need for AI models to provide source links for verification purposes. SomebodySysop kick-starts the discussion by mentioning a potential new Google AI that could outrank OpenAI due to its ability to provide sources of information. This sparks a debate about how OpenAI models, being predominantly GPT-based, generate language completions, not checked references.

ruby_coder asserts that all GPT-based systems rely on human verification while discussing the supposed superiority of Google’s AI. PaulBellow contemplates whether Google, with its vast amount of internet data and alleged connections to other existing systems, may have an AI superior to OpenAI’s.

The back-and-forth continues, questioning the inability of OpenAI to return source links. There is agreement that GPTs function as auto-completion engines, often generating nonsensical results and thus requiring everything to be verified by the user (ruby_coder).

The discussion then drifts toward the practicality of using AI in business. SomebodySysop argues that without a verification feature, ChatGPT is less useful for business applications due to its unreliability (SomebodySysop). However, ruby_coder opposes this view, stating that as a professional programmer they find ChatGPT significantly beneficial in enhancing productivity (ruby_coder).

Further down the discussion, users examine how the new Bing search functions with ChatGPT and how it includes source citations in search results, as shared by SomebodySysop. lmccallum presents two approaches to obtaining sources from AI models: explicitly asking for sources, or using GPT-3 as part of an NLP pipeline.

The dialogue about OpenAI and source links continues, with potential solutions and strategies shared among users like dragidude. Towards the end, SomebodySysop shares that they have learned to create completion chains that provide citations in responses using external data and document embedding (SomebodySysop).

Summarized with AI on Dec 24 2023
AI used: gpt-4-32k