I am getting random, unrelated answers instead of my fine-tuned data

admin115 · January 23, 2024, 5:04am

I fine-tuned a model on my data. Then I tested the model in the playground, but it generated random answers even when I selected my fine-tuned model.

The following is my training data.

{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is Drive Ready?"}, {"role": "assistant", "content": "Drive Ready is a program designed to help attendees familiarize themselves with the technology and coding language of their choice. This hybrid mode of training ensures that every attendee is ready toace every interview they attend."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is Becomecoder?"}, {"role": "assistant", "content": "Training can emerge in many ways. Sometimes, it’s a simple starting point for a learner to upgrade their skills. Skill-up coder stands unique for its intermediate level of solving the coding programs which help them gain profound knowledge."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is SkillupCoder?"}, {"role": "assistant", "content": "The trainees get indulged in basic training on coding, related to the industry. The training aims at the student's overall development of coding skills as well as the comprehension ability to have better communication skills and lead successfully in the world of competition."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is owlcoder?"}, {"role": "assistant", "content": "Owl Coder is our signature training program that focuses primarily on helping any participant enhance their programming skills and master topics like data structures and algorithms, problem-solving, aptitude, reasoning, and soft skills. This event has proven to be successful for 3 consecutive years and has helped participants secure jobs in well reputed companies."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Where is the Becomecoder class?"}, {"role": "assistant", "content": "The owlcoder class is on the 5th floor hall one."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is Competitive Coding?"}, {"role": "assistant", "content": "Technical Hub organizes various training and skill enhancement programs that focus on keeping individuals up to date on various kinds of technology. This program grants everyone a chance to widen their knowledge and skills in various ways."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What are the different certifications providing by technicalhub?"}, {"role": "assistant", "content": "Technical Hub provides pieces of training that are certification and associated with various vendors that are top in the industry. Members who are part of this program are given constant practical examples on various topics, thus keeping them abreast of their technological world."}]}

I also tried API calls. The following is my request:

import OpenAI from "openai";

const openai = new OpenAI({ apiKey: '' });

async function main() {
  const completion = await openai.chat.completions.create({
    messages: [{ role: "system", content: "Maya is a factual chatbot that is also sarcastic." }, {role: "user", content: "What is Becomecoder?"}],
    model: "ft:gpt-3.5-turbo-1106:aditya-engineering-college::8vjGh",
  });

  console.log(completion.choices[0]);
}

main();

For the above question I am getting an unrelated answer like the one below:

{
  index: 0,
  message: {
    role: 'assistant',
    content: 'The wonderful educational hub for those looking out to expand their horizon with a touch of hands-on tutorials is what Becomecoder is.'
  },
  logprobs: null,
  finish_reason: 'stop'
}

What should I do to get answers from my fine-tuned data? Can anyone please give me suggestions?

Innovatix · January 23, 2024, 6:05am

Aditya, the reason you are not getting a sarcastic response is that most of your training data is written in a generic, informative tone.

The system prompt “Maya is a factual chatbot that is also sarcastic.” is good, but ensure that every training example aligns with this instruction. The system prompt sets the context for the model, so it should be consistent with the type of responses you want.

For example, in the response to “What is Becomecoder?”, the model’s output in your test is a generic description rather than a sarcastic remark. This is likely because the training examples for similar prompts do not follow the sarcastic tone.

Like these 3:

admin115:

{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}

This resource might help you with fine-tuning:

https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset
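
Before uploading, a quick structural check of the JSONL file can catch malformed lines or an inconsistent system prompt. The following is only a minimal sketch: `checkExamples` is a hypothetical helper, not part of the OpenAI SDK, and it cannot judge tone. Whether each assistant reply is actually sarcastic still has to be reviewed by reading the examples.

```javascript
// Minimal sketch: structural check of chat fine-tuning examples in JSONL form.
// `checkExamples` is a hypothetical helper, not part of any OpenAI SDK.
function checkExamples(jsonlText) {
  const problems = [];
  const systemPrompts = new Set();
  // One training example per non-empty line.
  const lines = jsonlText.split("\n").filter((line) => line.trim() !== "");

  lines.forEach((line, i) => {
    let example;
    try {
      example = JSON.parse(line);
    } catch (err) {
      problems.push(`line ${i + 1}: not valid JSON`);
      return;
    }
    const messages = Array.isArray(example.messages) ? example.messages : [];
    const roles = messages.map((m) => m.role);
    if (!roles.includes("user") || !roles.includes("assistant")) {
      problems.push(`line ${i + 1}: needs a user and an assistant message`);
    }
    // Collect system prompts so that inconsistent ones can be reported.
    const system = messages.find((m) => m.role === "system");
    if (system) systemPrompts.add(system.content);
  });

  if (systemPrompts.size > 1) {
    problems.push(`found ${systemPrompts.size} different system prompts`);
  }
  return { count: lines.length, problems };
}

// Two sample lines: the second one has no assistant message.
const sample = [
  '{"messages": [{"role": "system", "content": "Maya is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is owlcoder?"}, {"role": "assistant", "content": "Oh, only our signature training program."}]}',
  '{"messages": [{"role": "user", "content": "What is Drive Ready?"}]}',
].join("\n");

const report = checkExamples(sample);
console.log(report);
```

The sarcastic assistant reply in the sample is made up for illustration and does not come from the original dataset.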

admin115 · January 23, 2024, 7:10am

Yes, I changed the system prompt to match my data as you suggested, but I am still getting unrelated answers. Is there any way to train on my data to get better results? I just want a chatbot that answers with the data I trained it on.

jr.2509 · January 23, 2024, 7:18am

@admin115 - when you say unrelated data, what exactly do you mean by that?

Fine-tuning is for adjusting style and behavior, not for content injection.

admin115 · January 23, 2024, 8:16am

@jr.2509 I got your point. Thank you for your valuable response.

I have data related to my organisation, and I want to build a chatbot that gives exact answers based on that data. Is there any OpenAI model available to build our own chatbot?

jr.2509 · January 23, 2024, 8:41am

There are many ways to achieve this, most of which involve some form of RAG (retrieval-augmented generation) system that enables you to store and retrieve information unique to your organization for the purpose of answering questions.

Check the OpenAI guidance on embeddings https://platform.openai.com/docs/guides/embeddings and Assistants https://platform.openai.com/docs/assistants/overview for two possible options. Depending on what type of user interface you had in mind, a custom GPT is also a possibility.
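
The retrieval step of such a system can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a full implementation: the vectors are toy values standing in for embeddings that a real system would obtain from an embeddings model, and `cosineSimilarity` and `topMatches` are hypothetical helper names.

```javascript
// Minimal retrieval sketch. The vectors are toy values; in a real system
// they would come from an embeddings model. All helper names are hypothetical.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored document against the query and keep the k best.
function topMatches(queryVector, documents, k) {
  return documents
    .map((doc) => ({
      text: doc.text,
      score: cosineSimilarity(queryVector, doc.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Organisation-specific snippets stored next to their (toy) vectors.
const documents = [
  { text: "Drive Ready is a program designed to help attendees familiarize themselves with the technology and coding language of their choice.", vector: [0.9, 0.1, 0.0] },
  { text: "The owlcoder class is on the 5th floor hall one.", vector: [0.1, 0.9, 0.2] },
  { text: "Owl Coder is our signature training program.", vector: [0.2, 0.7, 0.6] },
];

const queryVector = [0.0, 1.0, 0.1]; // stands in for the embedded user question
const matches = topMatches(queryVector, documents, 2);
console.log(matches.map((m) => m.text));
```

The best-scoring snippets are then passed to the chat model as context, so the facts come from your stored data rather than from the model weights.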

I suggest you have a look at these resources to get started. In the case of Assistants and custom GPTs you can easily create a first prototype to see if it achieves the desired outcomes and then go from there to add more complexity to it.

If you run into issues, check back here in the Forum with a new post to get further input.

Good luck!

admin115 · January 23, 2024, 9:11am

@jr.2509 Thank you.
Can you suggest any resource for learning OpenAI embeddings?

jr.2509 · January 23, 2024, 9:14am

This OpenAI Cookbook resource should be a good practical start:

HomeRank · February 27, 2024, 2:03pm

JR, I thought that fine-tuning enabled content injection if you had novel data, or data where the general training was either faulty or incomplete…?

jr.2509 · February 27, 2024, 2:16pm

No, fine-tuning is not intended for content injection, but rather to adjust how the model behaves and/or performs certain tasks. For content injection, RAG solutions are the way to go.

See also here for fine-tuning use cases:

https://platform.openai.com/docs/guides/fine-tuning/when-to-use-fine-tuning
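
To illustrate what content injection through the prompt can look like, here is a minimal sketch that places the relevant facts in the system message instead of relying on the model weights. `buildMessages` is a hypothetical helper, and the instruction wording is an assumption that should be adapted to your use case.

```javascript
// Minimal sketch: supply the facts in the prompt instead of the weights.
// `buildMessages` is a hypothetical helper; the instruction wording is an
// assumption, not an official recommendation.
function buildMessages(contextSnippets, question) {
  // Number the snippets so the model can tell them apart.
  const context = contextSnippets.map((s, i) => `${i + 1}. ${s}`).join("\n");
  return [
    {
      role: "system",
      content:
        "Answer using only the context below. " +
        "If the answer is not in the context, say you do not know.\n\n" +
        `Context:\n${context}`,
    },
    { role: "user", content: question },
  ];
}

const messages = buildMessages(
  ["The owlcoder class is on the 5th floor hall one."],
  "Where is the owlcoder class?"
);
console.log(messages);
```

The resulting array can be passed as `messages` to `openai.chat.completions.create`, as in the original post, using a base model rather than a fine-tuned one.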
