I haven’t done this myself, so take it with a grain of salt, but:
If you have three articles (Safety Regulations, HR Policy, Mission Statement), you send each to the Embeddings service and receive back a mathematical characterization (an embedding vector) of each.
Safety Regulations [0.01024, 0.0551, …]
HR Policy [0.028590, 0.89784, 0.9845, …]
Mission Statement [0.847, …]
Then, you take your query and get the embedding of THAT.
“are t-shirts against company policy?” [0.98945, 0.402, …]
You do some math between the query’s embedding, and each article’s embedding, yielding three similarity scores.
Query-Safety regulations = 0.5
Query-HR Policy=0.7
Query-Mission statement=0.65
Doing the math is not hard; just ask ChatGPT how to compute cosine similarity.
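A minimal sketch of that math in plain Python (no library needed; cosine similarity is just the dot product divided by the product of the vector lengths):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Identical vectors score 1.0, orthogonal ones score 0.0, and real query/article pairs land somewhere in between, like the 0.5/0.7/0.65 above.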
Take the highest (most similar) score and paste that article’s text into the completion service along with the prompt text:
“Info-A productive workplace is important. Professionalism is expected at all times. Proper attire should be worn at all times. Question-Are t-shirts allowed? Answer-”
and it’ll probably generate a response telling you no, t-shirts are not allowed.
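Picking the winner and assembling the prompt is just as simple. A sketch using the toy scores and the crude Info/Question/Answer format from the example above (the format itself is arbitrary, not anything the completion service requires):

```python
def best_article(scores):
    # scores: {article_name: similarity to the query}
    return max(scores, key=scores.get)

def build_prompt(article_text, question):
    # Same crude Info/Question/Answer layout as the example above
    return f"Info-{article_text} Question-{question} Answer-"

scores = {"Safety Regulations": 0.5, "HR Policy": 0.7, "Mission Statement": 0.65}
winner = best_article(scores)  # "HR Policy"
```

You’d then send `build_prompt(hr_policy_text, "Are t-shirts allowed?")` to the completion service.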
In reality, whole articles are probably too coarse-grained and too long. A recursive process might work better: find the most similar articles, then the most similar paragraphs within those, and maybe the most similar sentences, so when you pass the document text along you’re not also passing tokens about wet floors in the workplace. But the mission statement might also have something about a relaxed atmosphere, so I wouldn’t pull from just one article.
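That recursive narrowing could be sketched like this, assuming some `embed(text)` function that calls the embeddings service (an assumption; the word-count `embed` in the usage note is a fake stand-in just so the example runs):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, chunks, embed, k=2):
    # Rank chunks (articles, then paragraphs within the winners, then
    # sentences) by similarity to the query; keep the k best at each level.
    return sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)[:k]
```

With a fake `embed = lambda t: [float(t.count("shirt")), float(t.count("floor"))]`, a query vector of `[1.0, 0.0]` ranks a chunk about t-shirts ahead of one about wet floors, which is the whole point: the wet-floor tokens never reach the prompt. Keeping `k > 1` also addresses the last concern, since the relaxed-atmosphere paragraph from the mission statement can ride along with the HR policy text.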