Sorry for the very general question, but I'm just starting to work on this now. Is it possible to extend GPT's knowledge with technical domain information?
Let me explain. I work in a very technical field and would like to get more in-depth answers from GPT, as if it were a real domain assistant. Unfortunately, what I get are superficial answers, and I guess that is completely normal. Is there any way to provide it with material, tell it which documents to read or which sites to visit, to improve the quality of its answers?
If so, then I will ask you how to do it
Build a system that searches for relevant documentation, crawls it, and embeds the chunks with an embedding model (e.g. OpenAI's ada embedding model). When that process has finished, you can start searching over it.
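The ingest side of that pipeline can be sketched in a few lines. This is a minimal illustration only: the bag-of-words `embed` function below is a toy stand-in for a real embedding model (such as ada), and `chunk`, `ingest`, and the in-memory `store` are hypothetical names, not any library's API.

```python
import math

def embed(text: str) -> dict[str, float]:
    """Toy bag-of-words stand-in for a real embedding model
    (e.g. OpenAI's ada embeddings): token -> L2-normalized count."""
    counts: dict[str, float] = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0.0) + 1.0
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {t: v / norm for t, v in counts.items()}

def chunk(document: str, size: int = 200) -> list[str]:
    """Split a crawled page into fixed-size character chunks."""
    return [document[i:i + size] for i in range(0, len(document), size)]

# In-memory stand-in for a vector database: (chunk_text, vector) pairs.
store: list[tuple[str, dict[str, float]]] = []

def ingest(document: str) -> None:
    """Chunk one crawled document and store each chunk with its vector."""
    for piece in chunk(document):
        store.append((piece, embed(piece)))
```

In a production setup the crawler would feed `ingest`, the vectors would come from the embedding API, and `store` would be a real vector database.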
Thank you so much @jochenschultz . Does not seem like an easy thing to do, but maybe I can get a budget from my company. I’ll update.
If you are interested, we can do that together in a video session. I can imagine a few more ways to solve this.
You could add some context to your prompts by first getting the data from another system, e.g.:
- a word-bubble/knowledge tree in an RDBMS
- a graph DB that stores relations between documents which is filled over time by crawlers
- a vector db
- maybe even live crawling and summarizing
- a combination of the above
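Whichever backend supplies the context, the query side looks roughly the same: embed the question, rank the stored chunks by similarity, and prepend the best ones to the prompt. A minimal sketch, again using a toy bag-of-words `embed` as a stand-in for a real embedding model; `store` is assumed to hold `(chunk_text, vector)` pairs:

```python
import math

def embed(text: str) -> dict[str, float]:
    # Toy bag-of-words stand-in for a real embedding model.
    counts: dict[str, float] = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0.0) + 1.0
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {t: v / norm for t, v in counts.items()}

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    # Both vectors are unit length, so the dot product is the cosine.
    return sum(w * b[t] for t, w in a.items() if t in b)

def top_k(question: str, store, k: int = 3) -> list[str]:
    """Return the k stored chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question: str, store) -> str:
    """Inject the retrieved chunks as context ahead of the actual question."""
    context = "\n".join(top_k(question, store))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The prompt returned by `build_prompt` is what you would then send to the model; the retrieval step is what turns a generic model into something closer to a domain assistant.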
You can even build a database extension, e.g. for PostgreSQL, that lets your database understand SQL like this:
SELECT content FROM querypool WHERE prompt = '…'
That would be like a stored procedure with access to different architectures (even a pool of local PDFs)…
… and of course you can utilize other models, including your own, although few come even close to the OpenAI models.
The most common way to do this is with a vector database like Milvus, but as @jochenschultz mentioned, relational and graph databases are starting to be used as well. One of the most common ways of injecting domain data into GPT-3.5/4 is with LlamaIndex.
Some blogs that might be helpful for this: