Using my own knowledge base with GPT-4

I am building a chatbot for IT at my college. It will be a live chat where clients can ask questions. We have a college knowledge base, and I was wondering how I could integrate that with the GPT-4 engine?

Welcome to the OpenAI Community.

When it comes to answering questions over large knowledge bases, embeddings + completions (GPT-4) is a possible solution.
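The embeddings + completions pattern can be sketched in plain Python. This is a toy illustration, not production code: the `embed` function below is a bag-of-words stand-in for a real embedding model (in practice you would call an embeddings API), the knowledge-base entries are made up, and names like `retrieve` and `build_prompt` are illustrative rather than from any SDK.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for a real embedding model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a.get(w, 0) * b.get(w, 0) for w in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, kb, top_k=2):
    """Rank knowledge-base snippets by similarity to the question."""
    q = embed(question)
    ranked = sorted(kb, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, kb):
    """Assemble the prompt you would send to the completion model."""
    context = "\n".join(retrieve(question, kb))
    return f"Answer using only this context:\n{context}\n\nQ: {question}"

# Hypothetical knowledge-base entries for an IT help desk.
kb = [
    "Password resets are handled at the IT self-service portal.",
    "The library printers use the PaperCut queue.",
    "VPN access requires two-factor authentication.",
]
print(build_prompt("How do I reset my password?", kb))
```

The key idea: only the most relevant snippets are stuffed into the prompt, so the model answers from your knowledge base without being retrained on it.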


I am looking for something similar for my organization. We are trying to figure out how to take our organization’s constant updates and communications and incorporate them into a chatbot anyone could talk with. The main thing we would want to build, though, is an easy website for our communications team where they can consistently and easily upload content that keeps adding to the model’s knowledge base.

Same here. From what I understand, though, it would cost a lot of money to have an AI engineer set up the custom integration, and then you have to pay a whole bunch of people to rate the feedback until the model is accurate. I’m not technical, so maybe I’m completely wrong. But my understanding is that we are a fair bit off from being able to create bespoke chat agents with our own data using ChatGPT.

Can anyone add some insight?

Using your own knowledge base in a web app is not so straightforward.
The chat part is easy, but you also need knowledge management, usually through third-party services.

And GPT-4 used in a web app is expensive compared with the free/paid ChatGPT, but GPT-3.5 is decent for this purpose too.

What I find to be an obstacle is the time-consuming process of learning the APIs.

Probably there are tutorials on how to do this.


Thanks @bill.french. You’ve given me some direction on where to look next. Appreciate it.


I agree. This part is best described as everyday software engineering. :wink:

This depends on your skills. The APIs (for many) are not challenging. Engineering a process that successfully wraps AI in the process to achieve a reliable and financially practical outcome is generally the bigger challenge.

Embeddings and cosine similarity are not as complex as they sound, and you can even learn how to build code that does this in Google Apps Script, a script runner that comes free inside every Google spreadsheet container.
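To show just how small the math really is, here is cosine similarity in a few lines, written in Python rather than Apps Script for readability; the same logic ports directly to JavaScript in a Google Apps Script project:

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|)"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Identical vectors score ~1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [1.0, 2.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

With real embedding vectors, you compute this score between the question's vector and each document's vector, then keep the highest-scoring documents.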

This is why embeddings are so important. They serve to give you very powerful inferencing at 1/600th the cost of GPT-3 inferencing. And to be clear, most AI applications do not need GPT-4 or any chat-like behaviors. Far simpler models can provide very powerful solutions at near-free inferencing costs if you put the time into designing the solution’s approach.


ElasticSearch is a wonderful way to start. It has its own built-in scraper and UI for organizing documents. GPTIndex is also a powerful tool.

I’ve been tinkering with a synergy between graph and vector databases for a week or so now, using the vector database results as pointers. I still know very little, so if anyone else has more experience, I’d love to hear about it.
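The "vector results as pointers" idea can be sketched with in-memory toys. Here a dict stands in for the vector store and another dict for the graph; the node IDs, embedding values, and edges are all invented for illustration. In a real setup each would be an actual vector database and graph database:

```python
import math

# Toy stand-ins: node id -> embedding vector (values are hypothetical).
vectors = {
    "doc:vpn":     [0.9, 0.1, 0.0],
    "doc:printer": [0.1, 0.8, 0.1],
    "doc:wifi":    [0.7, 0.2, 0.1],
}
# Toy graph: node id -> related node ids.
graph = {
    "doc:vpn":     ["doc:wifi"],
    "doc:printer": [],
    "doc:wifi":    ["doc:vpn"],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(query_vec, top_k=1):
    """Vector search returns node ids -- pointers into the graph."""
    ranked = sorted(vectors, key=lambda n: cosine(query_vec, vectors[n]),
                    reverse=True)
    return ranked[:top_k]

def expand(node_ids):
    """Follow graph edges from the vector hits to pull in related context."""
    seen = list(node_ids)
    for nid in node_ids:
        for neighbor in graph[nid]:
            if neighbor not in seen:
                seen.append(neighbor)
    return seen

hits = search([0.85, 0.15, 0.0])  # nearest node by cosine similarity
print(expand(hits))               # the hit plus its graph neighbors
```

The vector store finds semantically similar entry points; the graph then contributes context that is related structurally rather than semantically.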


Great topic! Thank you for sharing!