The best way to refer to history

I have several books, each around 1,000,000 tokens, and each with its own GUID.

Would it be possible to put all books (and their GUIDs) into OpenAI once and refer to each one (using GUID) when asking questions (without having to re-send book content)?

What, exactly, do you mean by,

Would it be possible to put all books (and their GUIDs) into OpenAI

The only imaginable way I could see this being done is via embeddings, but embeddings are limited to about 8,000 tokens each (and that would be a very large embedding). Also, you wouldn’t be putting the embeddings into OpenAI; you’d use some sort of vector DB solution.

In any event, you’d be limited to 8k or 32k tokens of context anyway…
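To make the embeddings route concrete, here’s a minimal sketch of what I mean: split each book into chunks small enough to embed, embed each chunk, and store the vectors in an index keyed by the book’s GUID. Everything below is illustrative, not a real implementation: `fake_embed` stands in for a call to a real embedding API, and the in-memory dict stands in for an actual vector DB.

```python
import hashlib
import math

def fake_embed(text: str) -> list[float]:
    # Placeholder for a real embedding API call; produces a
    # deterministic toy vector just so the sketch is runnable.
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255.0 for b in digest[:8]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk_text(text: str, max_words: int = 200) -> list[str]:
    # Split a book into chunks small enough to embed
    # (the real per-embedding limit is ~8k tokens, not words).
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# In-memory stand-in for a vector DB, keyed by book GUID.
index: dict[str, list[tuple[str, list[float]]]] = {}

def ingest_book(guid: str, text: str) -> None:
    # Done once per book; afterwards only the GUID is needed
    # to look up that book's chunks.
    index[guid] = [(chunk, fake_embed(chunk)) for chunk in chunk_text(text)]
```

The one-time ingestion is the part that looks like “sending the book somewhere”, but note the vectors live in your own store, not inside OpenAI.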

So, I guess I’m just not really sure what you’re imagining doing.

Apparently, you misunderstood!
With my app, users can choose a book (from several books) and ask their questions.

  • Resending the whole book (around 10 million tokens) with each question is not a good solution.
  • A summary of each book can’t be sent with every question either, since a summary loses too much information to stay accurate.
  • The books need to be sent to OpenAI and their Completion IDs need to be stored in the database and referenced with each question.

Yeah…

I think I’m still not getting it.

When a user submits a prompt asking about a specific book, you will always need to take their question and add to it some context tokens from the text of the book, yes?
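That “question plus context from the book” step is the retrieval pattern in a nutshell. A minimal sketch, assuming the chosen book’s chunks have already been embedded and stored somewhere (the chunk texts, vectors, and ranking here are all simplified placeholders):

```python
def cosine(a: list[float], b: list[float]) -> float:
    # Dot product; assumes the vectors are already normalized.
    return sum(x * y for x, y in zip(a, b))

def top_chunks(question_vec: list[float],
               chunks: list[tuple[str, list[float]]],
               k: int = 3) -> list[str]:
    # chunks: list of (text, vector) pairs for one book.
    # Rank by similarity to the question and keep the best k.
    ranked = sorted(chunks, key=lambda c: cosine(question_vec, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question: str, context_chunks: list[str]) -> str:
    # The retrieved text still has to travel with every question.
    context = "\n---\n".join(context_chunks)
    return (f"Answer using only this excerpt:\n{context}\n\n"
            f"Question: {question}")
```

The point is that whichever chunks get selected, their text is re-sent with every single question; the GUID only tells you which book’s index to search.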

One thing I keep getting hung up on is when you say,

The books need to be sent to OpenAI and their Completion IDs need to be stored in the database and referenced with each question.

How do you imagine sending a book to OpenAI and what are you expecting OpenAI to do with a book you send them?

To the best of my knowledge OpenAI doesn’t have some magical data store for you to deposit text into for the model to reference.

If you want the model to have knowledge which isn’t already present in its training set, you need to provide that data every time you interact with the model and want it to reference that data in its response.

The standard way to add information to a model is through embeddings. But you still need to build the prompt with context, which is going to eat up tokens. There’s no free lunch.
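The “no free lunch” part can be made concrete with a token budget: however the context is retrieved, it has to fit inside the model’s window alongside the question and room for the answer. A rough sketch, using a crude words-to-tokens approximation (my assumption here, ~1.3 tokens per word; real code would use an actual tokenizer such as tiktoken):

```python
def fit_context(chunks: list[str], question: str,
                window: int = 8000, reserve_for_answer: int = 1000) -> list[str]:
    # Keep adding chunks (assumed pre-ranked by relevance) until
    # the estimated token budget for the context runs out.
    def est(s: str) -> int:
        # Crude estimate: ~1.3 tokens per word (assumption, not exact).
        return int(len(s.split()) * 1.3)

    budget = window - reserve_for_answer - est(question)
    picked = []
    for chunk in chunks:
        cost = est(chunk)
        if cost > budget:
            break
        picked.append(chunk)
        budget -= cost
    return picked
```

With an 8k window, only a few thousand tokens of book text fit per question, which is exactly why the whole-book idea doesn’t work.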

So, I guess that’s it, I just want to know what you mean by sending a book to OpenAI.