Fine-tuned model giving wrong answers

I have created my own fine-tuned model, and when I call the API with it, I get wrong answers or non-English (e.g. Chinese) responses.

This is my PHP code:

$apiKey = 'xxxxxxxx';
$apiEndpoint = 'https://api.openai.com/v1/completions';

$headers = array(
    'Content-Type: application/json',
    'Authorization: Bearer ' . $apiKey
);

// Set the prompt and fine-tuned model
$payload = array(
    'prompt' => $userMessage,
    'model' => 'ada:ft-xxxxxxx-2023-07-11-15-35-25'
);

// Convert the payload to JSON
$jsonPayload = json_encode($payload);

// Send the API request using cURL
$ch = curl_init($apiEndpoint);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $jsonPayload);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

// Execute the request
$response = curl_exec($ch);
curl_close($ch);

I used the JSONL samples below for fine-tuning:

{"prompt":"about company\n\n\n###\n\n","completion":" my company details xxx"}

{"prompt":"location - Sheikh Zayed Road\nName - xxx Trading Est\nAddress and Phone - xxxxxxx, +1 (x) xxxxx\n\n###\n\n","completion":" "}

Your fine-tune examples don’t look like actual user input.

about company\n


my company details xxx


location - Sheikh Zayed Road
Name - xxx Trading Est
Address and Phone - xxxxxxx, +1 (x) xxxxx



I suppose you’ve anonymized them for posting here, but that keeps us from spotting more faults. For example, in the first sample, do all of your completions end with the same stop sequence?

Each completion should end with a fixed stop sequence to inform the model when the completion ends. A stop sequence could be \n, ###, or any other token that does not appear in any completion.

Let’s take a typical conversation input and response, that your chatbot software formats into the instruction following:

"This is the AI assistant of ABC trading company answering a user question:

User: Hi there. I’d like to know where your company is located.
AI assistant:"

“Great question! We are located on Sheikh Zayed Road, Dubai. Look for our entrance just across from the Hotel. Is there more you’d like to know?”

You have to pick a fixed separator at the end of the prompt, and you can see we’ve done that with “\nAI assistant:”.

You also need that fixed stop sequence; the API can recognize it and end the completion there. Let’s use \n\n\n\n because it also has semantic meaning to the AI.
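At inference time, the request then has to reuse that same separator at the end of the prompt and declare the stop sequence, so the API knows where the completion ends. A minimal sketch of the payload (in Python rather than PHP just for brevity; the model name and conversation text are placeholders):

```python
import json

# Hypothetical inference-time payload for the legacy completions endpoint:
# the prompt ends with the same "\nAI assistant:" separator used in
# training, and "stop" carries the fixed stop sequence so the API cuts
# the completion off where training said it should end.
payload = {
    "model": "ada:ft-xxxxxxx-2023-07-11-15-35-25",  # placeholder fine-tune name
    "prompt": (
        "This is the AI assistant of ABC trading company answering a user question:\n\n"
        "User: Where are you located?\n"
        "AI assistant:"
    ),
    "stop": ["\n\n\n\n"],  # the fixed stop sequence from training
    "max_tokens": 150,
}

json_payload = json.dumps(payload)
```

Without `stop` (and a sensible `max_tokens`), the model keeps generating past the answer, which is one common cause of rambling or off-language output.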

Then we can form the JSONL:

{"prompt": "This is the AI assistant of ABC trading company answering a user question:\n\nUser: Hi there. I'd like to know where your company is located.\nAI assistant:", "completion": " Great question! We are located on Sheikh Zayed Road, Dubai. Look for our entrance just across from the Hotel. Is there more you'd like to know?\n\n\n\n"}

You can see I’ve included the main prompt instruction. That can be useful if you have a few different ways to use the AI but only want one fine-tune, so you don’t have to repeat the instruction that wraps your proprietary training data.
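Formatting every training line by hand is error-prone, so the separator and stop sequence are best applied in one place. A sketch of a small helper (Python for brevity; the instruction, separator, and stop sequence are just the examples from above):

```python
import json

INSTRUCTION = "This is the AI assistant of ABC trading company answering a user question:"
SEPARATOR = "\nAI assistant:"   # fixed separator ending every prompt
STOP = "\n\n\n\n"               # fixed stop sequence ending every completion

def make_training_line(question: str, answer: str) -> str:
    """Format one (question, answer) pair as a fine-tuning JSONL line."""
    prompt = f"{INSTRUCTION}\n\nUser: {question}{SEPARATOR}"
    completion = f" {answer}{STOP}"  # leading space, as the fine-tuning guide recommends
    return json.dumps({"prompt": prompt, "completion": completion})

line = make_training_line(
    "Where is your company located?",
    "We are located on Sheikh Zayed Road, Dubai.",
)
```

Every line the helper emits then has a consistent separator and stop sequence, which is exactly what the model needs to learn where prompts end and completions stop.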

And finally, it will take a lot of training examples to outweigh the way the AI responds by default. You can start with a curie model if the AI doesn’t need superior reasoning skills; there will be less pre-training to overcome than with davinci. Ada really doesn’t have enough language skill to present to an end user, especially in languages other than English.

Thank you @_j ,
To effectively train the model on our company and product details, we need to incorporate information about our various branches, including their phone numbers, email addresses, and physical addresses. To achieve this, I have prepared a JSONL file in a contextual format, which allows easy replacement of placeholders such as names and addresses. Here’s a sample entry from the file:

{"prompt": "location - [Location Name]\nName - [Branch Name]\nAddress and Phone - [Address], [Phone Number]\n\n###\n\n", "completion": " "}

Additionally, we have more than 500 products, each with a unique name, description, and price. Can you suggest the best method, with dummy JSONL content?

You can’t train the AI that way - you are only teaching it that [Location Name] is the correct type of output to produce.

There’s nothing wrong with 200 different ways to ask about 20 different locations.

The fine-tuning also is not the way to inform the AI of specific company data. It’s just a slight push in the right direction.

You should use an embeddings-based database for retrieving information relevant to the customer’s question. For that, you’d want to transition completely to a gpt-x model that has already been trained to call functions.
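The retrieval idea can be sketched very simply: embed each snippet of company data once, embed the incoming question the same way, rank snippets by cosine similarity, and paste the best matches into the prompt. An illustrative toy sketch (the vectors here are made-up stand-ins for real embedding-endpoint output):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-dimensional vectors standing in for real embedding vectors,
# one per snippet of company data.
snippets = {
    "Branch: Sheikh Zayed Road, Dubai": [0.9, 0.1, 0.0],
    "Product: widget, price 10 AED": [0.1, 0.9, 0.2],
}

# Pretend embedding of the question "Where are you located?"
question_vector = [0.8, 0.2, 0.1]

# The highest-similarity snippet is what you'd paste into the chat prompt.
best = max(snippets, key=lambda s: cosine_similarity(snippets[s], question_vector))
```

With this approach, adding or discontinuing a product is just a database update, not a retraining run.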

And it’s also the better way: are you going to train a new AI every time a product is discontinued?
