Based on a specific Wikipedia page

I’d like to build a ChatGPT-like system that prioritizes a body of text data I give it, e.g. a Wikipedia article.
(I know I can’t use ChatGPT itself through the API; via the API I would use text-davinci-003, or davinci if I do fine-tuning, etc.)

What should I do to make this possible?

For example, if the text is about the soccer player Messi, the question could be “What was the last title won by Messi? When and where was it held?” and the ideal answer would be “The World Cup, held in Qatar in 2022.”

Incidentally, I have confirmed that the system works correctly when the relevant part of the Wikipedia article and the question are placed in a single prompt.
However, I am wondering what to do when the amount of data becomes too large for one prompt.
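
For reference, the working single-prompt setup might look roughly like this (a minimal sketch using the openai Python library v0.x with text-davinci-003; the context string and prompt wording are illustrative):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Hypothetical inputs: an excerpt of the article plus the question.
context = "Lionel Messi won the 2022 FIFA World Cup with Argentina in Qatar."
question = "What was the last title won by Messi? When and where was it held?"

# Stuff the reference text and the question into one prompt.
prompt = (
    "Answer the question using only the reference text below.\n\n"
    f"Reference text:\n{context}\n\n"
    f"Question: {question}\n"
    "Answer:"
)

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompt,
    max_tokens=100,
    temperature=0,
)
print(response["choices"][0]["text"].strip())
```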

I thought about splitting the knowledge across multiple turns, as in a ChatGPT conversation, but from my research it seems that the OpenAI API is stateless and does not remember previous prompts.
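
As far as I can tell, the only workaround for that statelessness is to re-send the running transcript with every call, which makes the prompt grow with each turn. A rough sketch (the function and variable names are my own):

```python
import openai

history = []  # transcript of previous turns, re-sent on every call

def ask(question, context):
    transcript = "\n".join(history)
    prompt = f"{context}\n\n{transcript}\nQ: {question}\nA:"
    response = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=100, temperature=0
    )
    answer = response["choices"][0]["text"].strip()
    # The API won't remember this turn, so we have to.
    history.append(f"Q: {question}\nA: {answer}")
    return answer
```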

I also thought about fine-tuning, but I am having trouble figuring out how to prepare the dataset. I don’t want to prepare question–answer pairs one by one, for example “What year did Messi win the World Cup?”: “2022”.
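
To illustrate why this feels tedious: the legacy fine-tuning endpoint for models like davinci expects a JSONL file of prompt/completion pairs, essentially one hand-written pair per fact. A sketch of preparing such a file (the pairs are illustrative):

```python
import json

# Each fact becomes one hand-written prompt/completion pair --
# the one-by-one preparation described above.
pairs = [
    {"prompt": "What year did Messi win the World Cup? ->",
     "completion": " 2022\n"},
    {"prompt": "Where was the 2022 World Cup held? ->",
     "completion": " Qatar\n"},
]

with open("train.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")
```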

Wikipedia is just an example; in practice I will prepare my own body of text to use.

If you know of any good ideas or examples, please let me know.

Hey, did you ever figure this out? Thanks.

There are many ways to solve this problem. Embeddings and fine-tuning are good for situations where you want GPT to respond in specific ways, but depending on what you’re most comfortable with, I’d recommend trying the zero-shot ReAct agent implementation from LangChain:
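
A minimal sketch of that agent (assuming an early-2023 LangChain release with the wikipedia package installed; the tool choice and question are illustrative, not the official example):

```python
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)  # reads OPENAI_API_KEY from the environment

# The built-in "wikipedia" tool lets the agent look facts up on demand.
tools = load_tools(["wikipedia"], llm=llm)

agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

agent.run("What was the last title won by Messi? When and where was it held?")
```

The agent decides on its own when to call the Wikipedia tool, so the source text never has to fit into a single prompt.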

If you’re still having issues, or you’re building something at scale, I’d recommend having a look at this example from the OpenAI Cookbook:
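
The core idea in that cookbook example is embeddings-based retrieval: split the source text into chunks, embed every chunk once, then at question time embed the question, pick the most similar chunks, and put only those into the prompt. A hedged sketch with the v0.x openai library (the chunking and chunk contents are placeholders):

```python
import numpy as np
import openai

EMBED_MODEL = "text-embedding-ada-002"

def embed(texts):
    resp = openai.Embedding.create(model=EMBED_MODEL, input=texts)
    return [item["embedding"] for item in resp["data"]]

# 1. Split the source text into chunks and embed them once, up front.
chunks = [
    "Lionel Messi won the 2022 FIFA World Cup with Argentina in Qatar.",
    "Messi joined Paris Saint-Germain in 2021.",
    # ... the rest of the article, chunked by paragraph or section
]
chunk_vectors = np.array(embed(chunks))

# 2. At question time, retrieve the k most similar chunks.
def top_chunks(question, k=2):
    q = np.array(embed([question])[0])
    # OpenAI embeddings are unit-length, so dot product = cosine similarity.
    scores = chunk_vectors @ q
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

# 3. Stuff only the retrieved chunks into the prompt, as in the
#    single-prompt setup above, keeping it within the token limit.
```

Because only the top-k chunks go into the prompt, the prompt stays roughly the same size no matter how large the source text grows.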