Training with blank prompts

raymonddavey · December 20, 2022, 12:57am

Has anyone done training with blank prompts (where all the info is in the content tag)?

I have blocks of technical information about a person, company or product (and general content from our web pages). I want to train GPT to answer questions related to the website content.

I have considered splitting the text in half and putting half in prompt and half in content
I have also considered not giving a prompt and putting everything in content
I have also considered making up prompts like “raymonds profile”, “product a” etc
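
The three options above can be sketched as training records. This is only a rough illustration: it assumes the legacy prompt/completion JSONL layout for fine-tuning (so "content" here maps to the `completion` field), and the helper names and sample text are made up for the example.

```python
import json

def blank_prompt_record(text):
    # Option: empty prompt, all of the text in the completion.
    return {"prompt": "", "completion": " " + text.strip()}

def split_record(text):
    # Option: first half of the words as prompt, second half as completion.
    words = text.split()
    mid = len(words) // 2
    return {
        "prompt": " ".join(words[:mid]),
        "completion": " " + " ".join(words[mid:]),
    }

def labelled_record(label, text):
    # Option: a made-up label such as "product a" used as the prompt.
    return {"prompt": label, "completion": " " + text.strip()}

def to_jsonl(records):
    # One JSON object per line, as fine-tuning files usually expect.
    return "\n".join(json.dumps(r) for r in records)

# Hypothetical page content, for illustration only.
page = "Product A is a widget that measures temperature and humidity."
records = [
    blank_prompt_record(page),
    split_record(page),
    labelled_record("product a", page),
]
print(to_jsonl(records))
```

Writing a small file like this for each option makes it easy to compare them on a handful of pages before committing to one.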

In simple terms, I want to find a way to train GPT on the content contained within our website

Thanks

PaulBellow · December 20, 2022, 6:23pm

I haven’t tried it myself, but I’m told that it can work.

You might want to try both (all) methods with just a little bit of data to see which one works best.

It will still hallucinate and not always stick on topic, though, even with fine-tuning.

Are you creating a chat bot for your website?

PS - welcome to the forum!

jeffinbournemouth · December 21, 2022, 6:28pm

Check this: New and Improved Embedding Model

You can use Weaviate vector db to store your info.

You can then semantically query and generate content using Davinci 3.
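
A minimal sketch of the idea, with loud assumptions: `embed` below is a toy word-count stand-in for a real embedding model, the in-memory list stands in for a vector database, and the documents are invented. It only shows the shape of "embed, find the nearest text, put it in the prompt".

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: a sparse word-count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical website snippets, embedded once and kept in memory.
docs = [
    "Raymond is the founder and lead engineer",
    "Product A measures temperature and humidity",
    "Our office is open Monday to Friday",
]
index = [(doc, embed(doc)) for doc in docs]

def search(query, index):
    # Return the stored text whose vector is closest to the query vector.
    q = embed(query)
    return max(index, key=lambda item: cosine(q, item[1]))[0]

question = "what does product a measure"
best = search(question, index)

# The retrieved text is placed in the prompt sent to the completion model.
prompt = f"Answer using only this context:\n{best}\n\nQuestion: {question}\nAnswer:"
print(prompt)
```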

raymonddavey · December 21, 2022, 9:27pm

Thanks for the link. I discovered that after posting and have headed in that direction.

The whole concept of embedding was confusing at first - but now I have the hang of it, I can see the real power it offers.

jeffinbournemouth · December 21, 2022, 9:44pm

Game changer for AI powered apps!

bobvanluijt · January 14, 2023, 5:21pm

Hi @raymonddavey –

The whole concept of embedding was confusing at first - but now I have the hang of it, I can see the real power it offers.

We also write about this topic (working with embeddings) on our blog. Any tips or suggestions for people new to embeddings? E.g., what would you have liked to read when you were starting out?

Thanks

georgei · January 14, 2023, 7:13pm

Hi, I’d like to jump in on the discussion, if you don’t mind, as I have plans to use Weaviate in a product.

On embeddings, simple tutorials would be good.
The first thing almost everyone who comes to this forum wants to do is fine-tune the models on a knowledge base of their choice and have the AI model use it.
Eventually they all learn that in most cases embeddings are the way to go.

One of the common use cases is to create embeddings from a large dataset without losing context. In other words, when the end user sends a request, the response should draw on the whole dataset rather than a single embedding.

For example, take a novel of 250 pages. Each page would be an embedding, and some embeddings wouldn’t have anything in common, but when an end user makes a request, all the necessary embeddings should be combined to formulate the correct response with GPT-3.
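
A rough sketch of that combining step, under stated assumptions: the similarity scores and page texts are invented, `build_context` and `build_prompt` are hypothetical helper names, and the vector search that produces the scores is assumed to have run already.

```python
def build_context(scored_chunks, k=3, max_chars=200):
    # scored_chunks: (similarity, text) pairs returned by a vector search.
    ranked = sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)[:k]
    picked, used = [], 0
    for _score, text in ranked:
        if used + len(text) > max_chars:
            continue  # skip a chunk that would overflow the size budget
        picked.append(text)
        used += len(text)
    # Join the selected chunks, most relevant first.
    return "\n---\n".join(picked)

def build_prompt(question, context):
    # Combine several retrieved pages into a single prompt.
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Made-up scores for pages of a novel, for illustration only.
scored = [
    (0.91, "Page 12: Anna meets the captain."),
    (0.40, "Page 80: A storm hits."),
    (0.87, "Page 203: The captain reveals his past."),
    (0.10, "Page 5: The harbour at dawn."),
]
context = build_context(scored, k=2)
print(build_prompt("Who is the captain?", context))
```

Pages 12 and 203 have nothing adjacent in the book, yet both end up in the same prompt because both score highly against the question.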

And another thing is the auth part. During development I’m using the sandbox without auth implemented because there is no sensitive information.
I tried to implement the auth, but it was unusually complex. Maybe you have a walkthrough for people who are not accustomed to Weaviate’s auth integration.

bobvanluijt · January 23, 2023, 5:44pm

Thanks @georgei – that’s super helpful and we will def keep this in mind.

I have noted:

Not losing context with OpenAI embeddings on a large dataset;
Authentication content.

Would you mind making yourself known to me on the Weaviate Slack? We would love to get your help on the above.

georgei · January 23, 2023, 7:44pm

I’ll reach out in the next few days, thanks.

nunodonato · January 23, 2023, 8:54pm

cat.constantin · May 21, 2023, 6:53am

If you do not want to lose any information, one suggestion is to experiment with how much the text covered by neighbouring vectors should overlap.
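
One way to experiment with this is a sliding window over the words, sketched below. The function name and the `size` and `overlap` values are assumptions for illustration, not recommended settings.

```python
def chunk_words(text, size=5, overlap=2):
    # Split text into chunks of `size` words; each chunk repeats the last
    # `overlap` words of the previous one.
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

chunks = chunk_words("one two three four five six seven eight", size=5, overlap=2)
for chunk in chunks:
    print(chunk)
```

Each chunk would then be embedded separately; a larger overlap means a sentence cut at a chunk boundary is more likely to appear whole in at least one vector, at the cost of storing more vectors.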
