Using the model with a pre-defined set of answers

colin.rathbun · December 30, 2022, 1:07pm

Hello everyone! I have been blown away by the Playground, but I'm looking for some fine-tuning guidance.

Can I use embeddings or fine-tuning to give a language model a specific set of answers?

Here is a sample case to illustrate what I’d like to do in categorization/classification:

Let’s say I want OpenAI to tell me the color of an object. The prompt would be something like “What is the Pantone color of the sky?” There are 1826 Pantone colors.

What if I wanted my answer to be within only 400 of these Pantone colors, with the model returning the closest color label and NOT any of the other 1426?

Can this be achieved? And what general steps would need to be taken to complete this?

For the record, I’m not trying to classify colors, but I figured this would illustrate well what I’m trying to achieve.

nelson · December 30, 2022, 6:43pm

Hi @colin.rathbun

I think there are two ways to approach this problem.

1. Fine-Tuning
You can try the fine-tuning API. It will probably give you what you are looking for but, depending on your context, the results can always be improved and will never be 100% reliable.
2. Semantic Search + Prompt Engineering before the API request
You can use the Embeddings API to detect what the user is asking about and submit your ideal answer in the prompt. For example, you can do the following:

If the user asks about Pantone colors, tell them that we only have the following 400 colors and to see our link at https://....

Q: What kind of pantone colors do you have?
A:

For instructions on how to do semantic search, you can refer to this.
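To make the prompt layout above concrete, here is a minimal sketch of how the instruction and the question could be joined before the API request. The function name `build_prompt` is made up for illustration, and the instruction text is shortened; how you pick the instruction (the semantic search step) is not shown here.

```python
def build_prompt(instruction: str, question: str) -> str:
    """Put the retrieved instruction ahead of the user's question.

    The "Q: ... / A:" layout mirrors the example prompt in this post.
    """
    return f"{instruction.strip()}\n\nQ: {question.strip()}\nA:"


# Shortened, illustrative instruction; in practice this would be the text
# that the semantic search step matched to the user's question.
instruction = (
    "If the user asks about Pantone colors, tell them that we only have "
    "the following 400 colors."
)

prompt = build_prompt(instruction, "What kind of pantone colors do you have?")
print(prompt)
```

The resulting string is what you would send as the prompt, so the model completes the text after "A:".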

colin.rathbun · December 30, 2022, 8:21pm

Thanks @nelson.

For the fine-tuning, do I just give 400 examples, each with one of the 400 Pantone colors? Maybe more?
For the semantic search, that prompt does not ask for or get the answer I’m looking for (unless I am missing something). I’m looking for a single Pantone color response… Only one of the codes I have provided…

nelson · December 31, 2022, 1:15am

@colin.rathbun

It really depends on your use case. I suspect you are going to have more than 400 records; see this example here…

{"prompt": "What are your most popular yellow paints for bathroom", "completion": "Our most popular paints are Yellow, Pantone 101 and Pantone 102"}
{"prompt": "What is your top selling paint this year.", "completion": "Our top seller is Pantone 101"}
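If you want a single label back rather than a sentence, one option is to make each completion exactly one of your allowed labels. Below is a rough sketch that writes records in the same prompt/completion shape as the two lines above and refuses any label outside the allowed set. The labels, questions, and the helper name `to_jsonl` are all made up for illustration; your real set would hold your 400 codes.

```python
import json

# Hypothetical allowed labels; the real set would contain the 400 codes.
ALLOWED_LABELS = {"Pantone 101", "Pantone 102"}

# Made-up training pairs, for illustration only.
PAIRS = [
    ("What is the pantone color of a lemon?", "Pantone 101"),
    ("What is the pantone color of a banana?", "Pantone 102"),
]


def to_jsonl(pairs, allowed_labels):
    """Serialize (prompt, completion) pairs as JSONL, one record per line.

    Raises ValueError if a completion is not one of the allowed labels,
    so a stray label never reaches the training file.
    """
    lines = []
    for prompt, completion in pairs:
        if completion not in allowed_labels:
            raise ValueError(f"label not in allowed set: {completion!r}")
        lines.append(json.dumps({"prompt": prompt, "completion": completion}))
    return "\n".join(lines)


jsonl_text = to_jsonl(PAIRS, ALLOWED_LABELS)
print(jsonl_text)
```

Even with a file like this, the fine-tuned model can still produce something outside the set, so it is worth checking the returned text against the allowed labels as well.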

The semantic search is done using the Embeddings API, so there is going to be some programming involved. The idea is that you can store and search documents very quickly and feed them into the prompt along with the actual question. It’s a bit more complicated than the fine-tuning API, but it gives you a lot more flexibility.
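Here is a minimal sketch of the search step, assuming you already have an embedding vector for the question and one for each allowed label. The three-number vectors and the names `cosine` and `closest_label` are made up for illustration; real vectors would come from the Embeddings API and be much longer. Because the result is picked from your own list, it can only ever be one of the labels you stored.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def closest_label(query_vec, label_vecs):
    """Return the stored label whose vector is most similar to the query."""
    return max(label_vecs, key=lambda label: cosine(query_vec, label_vecs[label]))


# Toy vectors, invented for this example; not real embeddings.
label_vecs = {
    "Pantone 101": [0.9, 0.1, 0.0],
    "Pantone 2985": [0.1, 0.2, 0.9],
}
query_vec = [0.2, 0.1, 0.8]  # stands in for the embedded question

print(closest_label(query_vec, label_vecs))
```

You could return that label directly, or feed it into the prompt along with the question as described above.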

Sounds like a very interesting project, good luck, let me know if you run into any other roadblocks.
