Looking for help with prompt optimization!

Hi all!
I’m currently building an automation that uses the OpenAI API to classify students into categories based on their LinkedIn profile data. I already have a prompt that works fine with Davinci 2.0, but I’m looking for a less expensive model that achieves the same success rate. This should be possible with Curie, which should work fine for classification purposes.

Is there anyone willing to help? We might be able to arrange compensation!

Send me a PM or an email to daan@fiks.nl

Thanks!

A fine-tuned Curie or even Babbage should be able to handle basic classification very easily.


Thanks for the suggestion! How many examples per category would you suggest if I want to do something like this? The prompt currently looks like this:

Categorizing students into a category based on the extracted field of study. Solely use the following categories: Business, Marketing, Computer science, Design, Facility management, Human Resources, Engineering, Finance, Law and Other. If the extracted field of study is not available, the student falls in the other category. Using new categories is not allowed. Students can only fall in 1 category.

Student profile description: Studying communication science

Extracted field of study: communication

Category: marketing

---

I would simplify this a lot for fine-tuning.

Prompt:

[unstructured student profile without any changes]
Category:

Completion:

[desired category]

It looks like you have 10 categories, and the recommended minimum for fine-tuning is 200 examples, so that means 20 samples per category. Honestly, you could probably get away with 5 samples per category since this is such a basic task.
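To make that concrete, the training file for completions-style fine-tunes is JSONL with one prompt/completion pair per line. A minimal sketch of building one in Python (the profiles and labels here are made-up placeholders):

import json

# Made-up training rows in the simplified format suggested above:
# raw profile text, a fixed "Category:" separator, then the label as completion.
rows = [
    {"prompt": "Studying communication science\nCategory:", "completion": " Marketing"},
    {"prompt": "BSc student in software engineering\nCategory:", "completion": " Computer science"},
    {"prompt": "No field of study listed\nCategory:", "completion": " Other"},
]

with open("train.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

The leading space in each completion and the fixed separator at the end of each prompt follow the usual fine-tuning guidance, so the model gets a consistent cue for where the category starts.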


Personally, I’d start with at least 200 samples. If that doesn’t work, double it to 400, etc. Good luck!


Alright! And just to confirm, this can be done with the completions API, right? Or should I use the classification endpoint from the OpenAI API docs?

If you know the categories in advance, you should use the classifications endpoint. Completions can work, but it’s not the right tool for the job.

Just be aware of one caveat: if you have more than 200 examples, you will need to upload them as a file, and step 1 of the endpoint is a keyword search (which can filter out even small variations).
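For reference, the examples file for the classifications endpoint is also JSONL, but with text/label fields rather than prompt/completion. A sketch, assuming the legacy openai Python library (the example profiles are invented):

import json
import openai

# Invented labeled examples; real ones would come from your student profiles.
examples = [
    {"text": "Studying communication science", "label": "Marketing"},
    {"text": "BSc in software engineering", "label": "Computer science"},
]

with open("classification_examples.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Upload with purpose "classifications" so the endpoint can search against it.
uploaded = openai.File.create(
    file=open("classification_examples.jsonl"),
    purpose="classifications",
)
print(uploaded["id"])  # this id goes into the `file` parameter of the request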


I haven’t worked with the classification endpoint yet. Do you think it differs much from the completion endpoint? Or is it worth the time getting to know the classification endpoint? It took me some time to get my first fine-tune with completions working…

Fine-tuning is way more complicated than classifications. The classifications endpoint is simple. Give it a go. We can help if you get stuck anywhere.

You can get your feet wet with this example from the docs:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

resp = openai.Classification.create(
  search_model="ada",   # cheaper model used for the first-step example search
  model="curie",        # model that produces the final classification
  examples=[
    ["A happy moment", "Positive"],
    ["I am sad.", "Negative"],
    ["I am feeling awesome", "Positive"]
  ],
  query="It is a raining day :(",
  labels=["Positive", "Negative", "Neutral"],
)

Once you get this up and running (inspect the resp variable after execution), you can then trivially modify it for your own categories.
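If memory serves, the response carries the predicted label plus the examples the search step selected, so a quick inspection looks something like:

print(resp["label"])              # predicted label, e.g. "Negative"
print(resp["selected_examples"])  # the examples pulled in by the search step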


Hi!
I tested the model with examples from a file. It worked quite okay, but we will need more than 200 examples. Do you know if I can just upload the file and use it with the file parameter, or should I make a fine-tuned model and pass the same file as a parameter?

I’m not sure whether we need a fine-tuned model if we want to use more than 200 examples.

Thanks!

You do not need to fine-tune for this use case. Fine-tuning will generally be a later optimization step, if needed at all, e.g. to reduce costs.

If the results are OK with a few-shot paradigm (examples in a file), I think your use case is within the domain of applicability of the general model.
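To answer the file-parameter question concretely: once an examples file is uploaded (as sketched earlier), you pass its id via file instead of inline examples. A sketch against the same legacy endpoint, with an invented query:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

resp = openai.Classification.create(
    file="file-REPLACE-ME",          # id returned when you uploaded the JSONL file
    query="Studying corporate law",  # invented student profile
    search_model="ada",
    model="curie",
    max_examples=5,  # how many retrieved examples get shown to the model
    labels=["Business", "Marketing", "Computer science", "Design",
            "Facility management", "Human Resources", "Engineering",
            "Finance", "Law", "Other"],
)
print(resp["label"])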


Alright, that’s great!
I hope it will work even better with a file of around 300-400 examples. It’s already at around an 80% success rate, I’d guess.

Once you have a sufficient number of examples (about 5-6) per category, adding examples the model gets wrong, together with their correct categories, helps more than simply adding more examples. In other words, optimize for the examples that give you the most bang for the buck.
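One hypothetical way to find those high-value examples: run a labeled hold-out set through the endpoint and collect the misses, then add those (with their correct labels) back into the file. The hold-out data below is an invented stand-in for your own:

import openai

# Invented (profile_text, true_label) pairs held back from the examples file.
labeled_holdout = [
    ("Studying industrial design", "Design"),
    ("First-year accountancy student", "Finance"),
]

misclassified = []
for text, true_label in labeled_holdout:
    resp = openai.Classification.create(
        file="file-REPLACE-ME",  # your uploaded examples file
        query=text,
        search_model="ada",
        model="curie",
        labels=["Business", "Marketing", "Computer science", "Design",
                "Facility management", "Human Resources", "Engineering",
                "Finance", "Law", "Other"],
    )
    if resp["label"] != true_label:
        # These hard cases are the ones worth adding back into the file.
        misclassified.append({"text": text, "label": true_label})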

Let us know how that goes. All the best.
