I have a set of LinkedIn jobs and some occupations, and I'm trying to match each job with the most similar occupation(s)

saranade.97 · May 29, 2023, 10:03am

I used the fine-tuning API on curie, and my JSONL file looks like this:

{"prompt":"similar occupation to software designer?\n\n###\n\n","completion":" {\"occupation_id\": 2535, \"occupation\": \"software architect\n\"}"}
{"prompt":"similar occupation to application architect?\n\n###\n\n","completion":" {\"occupation_id\": 2535, \"occupation\": \"software architect\n\"}"}
{"prompt":"similar occupation to software specialist?\n\n###\n\n","completion":" {\"occupation_id\": 2536, \"occupation\": \"software developer\n\"}"}
{"prompt":"similar occupation to software developers?\n\n###\n\n","completion":" {\"occupation_id\": 2536, \"occupation\": \"software developer\n\"}"}
{"prompt":"similar occupation to programmer?\n\n###\n\n","completion":" {\"occupation_id\": 2536, \"occupation\": \"software developer\n\"}"}
{"prompt":"similar occupation to application software developer?\n\n###\n\n","completion":" {\"occupation_id\": 2536, \"occupation\": \"software developer\n\"}"}
{"prompt":"similar occupation to software engineer?\n\n###\n\n","completion":" {\"occupation_id\": 2536, \"occupation\": \"software developer\n\"}"}
{"prompt":"similar occupation to Social Media and Marketing Coordinator?\n\n###\n\n","completion":" {\"occupation_id\": 1973, \"occupation\": \"online marketer\n\"}"}
{"prompt":"similar occupation to Social Media Marketing Intern?\n\n###\n\n","completion":" {\"occupation_id\": 1973, \"occupation\": \"online marketer\n\"}"}
{"prompt":"similar occupation to Social Media Marketing Intern?\n\n###\n\n","completion":" {\"occupation_id\": 1973, \"occupation\": \"online marketer\n\"}"}
…
…
& so on

I have a fixed set of occupations in my database, each with an ID.

Now I’m using the completions API with my fine-tuned model, but it’s not accurate: most of the time the returned ID is wrong and doesn’t belong to the returned occupation. It sometimes returns an occupation that doesn’t exist in my training data. I need it to return one of the occupations I provided in the training data, with the correct ID.

Any idea what I’m doing wrong or what can be improved?

kevin6 · June 1, 2023, 11:06am

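Provide more training examples, ideally several per occupation, so that every occupation in your database is covered. Fine-tuning makes the occupations it has seen more likely, but it does not restrict the model to them, so it can still generate a name that is not in the training data.
Use the logit_bias parameter to upweight the tokens that make up known occupation names and downweight others. The bias applies to individual tokens rather than whole names, so it makes occupations from the training set much more likely but does not guarantee one.
Restrict the model’s output to a fixed, known list of occupations after generation. For example, request several candidate completions and return only the most probable one that appears in your list of training occupations.
Remove the occupation_id from the completions. The ID is an arbitrary number that the model cannot learn reliably, so have it return only the occupation name and look the ID up in your database.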
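
To make the last two suggestions concrete, here is a minimal sketch in Python. It is not the original poster's code: the OCCUPATIONS dict is a hypothetical stand-in for the occupations database (using the IDs shown in the question), the helper names make_training_record and resolve_occupation are made up for illustration, and fuzzy matching with difflib is just one possible way to snap the output onto the known list. The idea is to train on the occupation name only, then map whatever the model returns to the closest known name and take the ID from the lookup table.

```python
import difflib
import json

# Hypothetical lookup table standing in for the occupations database;
# the names and IDs are the ones shown in the question.
OCCUPATIONS = {
    "software architect": 2535,
    "software developer": 2536,
    "online marketer": 1973,
}

SEPARATOR = "\n\n###\n\n"  # prompt separator used in the question's JSONL


def make_training_record(job_title: str, occupation: str) -> str:
    """Build one JSONL line whose completion is only the occupation name."""
    if occupation not in OCCUPATIONS:
        raise ValueError(f"unknown occupation: {occupation!r}")
    record = {
        "prompt": f"similar occupation to {job_title}?{SEPARATOR}",
        # Leading space and trailing newline, as in the question's completions.
        "completion": f" {occupation}\n",
    }
    # json.dumps takes care of escaping quotes and newlines.
    return json.dumps(record)


def resolve_occupation(raw_completion: str, cutoff: float = 0.6):
    """Map raw model output to a known (id, name) pair, or None if nothing is close."""
    text = raw_completion.strip().lower()
    first_line = text.splitlines()[0] if text else ""
    matches = difflib.get_close_matches(
        first_line, list(OCCUPATIONS), n=1, cutoff=cutoff
    )
    if not matches:
        return None  # reject instead of returning an unknown occupation
    name = matches[0]
    return OCCUPATIONS[name], name  # the ID always comes from the table
```

With this, resolve_occupation(" Software Developers\n") resolves to the "software developer" entry, and an unrelated string returns None so the caller can decide what to do. The cutoff of 0.6 is difflib's default and would need tuning on real data.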
