Why are "Training loss" and "Validation loss" so high?

Hi There!

I’m trying to make sense of the results of two fine-tuned models.

The first model was trained on 800 records, and 100 records were used as validation.

The training and validation records are taken from the same dataset.

It ran for only 3 epochs, which, as I understand it, is the default; I did not set the number explicitly.

The result, as I see it, is not very good.

Today I decided to train the model again, but with 10 epochs.

The result became even worse.


What kind of data are you fine-tuning with, may I ask?

Question-and-answer pairs (patient and doctor).

Here is a real example from the training data:

    {"messages": [
        {"role": "system",
         "content": "General Guidelines:\n- You are a virtual mental health companion.\n- Respond with \"Buddy\" when asked for your name.\n- Respond to user inquiries with empathy and understanding.\n- Provide non-judgmental support.\n- Offer resources, guidance, and coping strategies.\n- Incorporate self-care practices into conversations.\n\nUser Interaction:\n- Start each interaction with emphatic greetings.\n- Use open-ended questions to encourage sharing.\n- Provide empathetic responses and active listening.\n- Acknowledge and validate users' emotions and experiences.\n\nResources and Recommendations:\n- Offer personalized recommendations based on user needs.\n- Provide information, links, and helplines related to mental health.\n- Offer tips, techniques, and strategies for self-care.\n- Optionally offer an initial screening for counseling services.\n\nCounseling Services:\n- Provide information on counseling sessions.\n- Offer scheduling options or external counseling services.\n\nCrisis Management:\n- Recognize signs of crisis.\n- Respond immediately to distressed users.\n- Provide intervention resources for crisis situations.\n- Stay calm and reassuring during crises.\n\nEthical Guidelines:\n- Protect privacy and confidentiality.\n- Adhere to ethical guidelines.\n- Avoid deceptive, manipulative, or argumentative behavior.\n\nLimitations:\n- Do not provide medical diagnosis.\n- Do not promote harmful behavior.\n- Avoid controversial topics.\n- Redirect non-mental health inquiries to reliable sources like medlineplus.gov."},
        {"role": "user",
         "content": "Suggest Remedies To Manage Anxiety Along With Lexapro Dosage\n\nMy father suffers from anxiety, he is on 10mg of Lexapro daily but still has anxious feelings on days after he has a little extra activity than usual. Is there any thing in addition you could recommend not necessarily another medication, Example meditation, etc."},
        {"role": "assistant",
         "content": "Hello,\n\nYour father is suggested to take a minimum of 8 hours quality sleep. Inadequate sleep triggers anxiety. He should give some time in the morning as well as in the evening to take a walk and relax by listening to some good music or meditating. Aroma therapy is also good to reduce anxiety. It will be good if he gets connected to a physical therapist, CBT and massage therapy works well to help people get relaxed. He should take berries, fruits and steamed veggies. Green tea, almonds, yogurt, eggs, etc., help to reduce anxiety. He should avoid smoking, alcohol, caffeine, drugs, etc. They can trigger the symptoms of anxiety.\n\nTake care. Hope I have answered your question. Let me know if I can assist you further."}
    ]}
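
For anyone wanting to sanity-check a file of records like the one above before uploading it, here is a minimal sketch of a local format check. It assumes one JSON object per line (JSONL) and only the three roles seen in the example; `check_record`, `check_jsonl` and `ALLOWED_ROLES` are hypothetical names made up for this illustration, not part of any official tooling.

```python
import json

# Assumption: only the three roles that appear in the example record are expected.
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_record(record):
    """Return a list of problems found in one chat-format training record."""
    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i} is not an object")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i} has unexpected role {msg.get('role')!r}")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i} has empty or non-string content")
    # A record without an assistant turn gives the model nothing to imitate.
    if not any(isinstance(m, dict) and m.get("role") == "assistant" for m in messages):
        problems.append("no assistant message to learn from")
    return problems


def check_jsonl(lines):
    """Check an iterable of JSONL lines; return {line_number: [problems]}."""
    report = {}
    for n, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            report[n] = [f"invalid JSON: {exc.msg}"]
            continue
        problems = check_record(record)
        if problems:
            report[n] = problems
    return report
```

Passing an open file handle to `check_jsonl` works, since it only iterates over lines. An empty report means the structure looks consistent; it says nothing about whether the loss will be low.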

Did the 100 validation records pass? Your training data looks like the kind of thing that can be used for fine-tuning, though I'm not sure why you get such a high training loss.

Hope someone from OpenAI can answer!

For comparison, I am fine-tuning a kind of classifier system that takes plain English as input and outputs structured JSON, and I get a training loss of 0.x, sometimes 0.00x, for 500 training examples.


The validation data has the same format; only the questions and answers are different.

OK, did you notice a result? I wonder whether your use case is quite broad in terms of textual output structure, which could explain such a high training loss.
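
To make that intuition concrete, here is a rough sketch. It assumes the reported loss is a mean per-token cross-entropy, which is the usual choice for language-model fine-tuning but is not confirmed anywhere in this thread. The probabilities are invented purely for illustration, and `mean_token_loss` is a hypothetical helper.

```python
import math


def mean_token_loss(probabilities):
    """Average negative log-likelihood of the correct next tokens."""
    return -sum(math.log(p) for p in probabilities) / len(probabilities)


# Hypothetical probabilities a model might assign to the "correct" token.
# Structured output: the next token is almost always predictable.
structured = [0.99, 0.98, 0.995, 0.97]
# Free-form prose: many wordings are equally valid, so each one gets less mass.
free_form = [0.30, 0.10, 0.45, 0.20]

print("structured:", round(mean_token_loss(structured), 3))
print("free-form: ", round(mean_token_loss(free_form), 3))
```

Under these assumptions the free-form case lands well above 1 while the structured case stays close to 0, even though neither model is "wrong". Open-ended answers simply have many acceptable continuations.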

Hi there!

It looks like you intend to use fine-tuning for Q&A. Unfortunately, that is not what it is meant for. Fine-tuning, in this specific context, is primarily designed to get a model to behave in a certain way, e.g. to adopt a certain output style or approach a task in a specific way. It is not a recommended way to get the model to absorb specific knowledge.

As a similar topic came up just a few days ago, I’ll refer you to another thread where I have discussed this at greater length along with references to other resources.

The bottom line is that you should be looking at a retrieval-augmented generation (RAG) approach for your use case.
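
As a very rough sketch of the idea, the snippet below retrieves the most relevant passages for a question and places them in the prompt, instead of hoping the model memorised them during fine-tuning. It swaps in plain bag-of-words cosine similarity where a real system would typically use embeddings and a vector store, and it stops at building the prompt rather than calling a model. `DOCS`, `retrieve` and `build_prompt` are hypothetical names, and the snippets are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical knowledge snippets, invented for this example.
DOCS = [
    "Inadequate sleep can trigger anxiety; aim for regular, good-quality sleep.",
    "Caffeine, alcohol and smoking can trigger symptoms of anxiety.",
    "Fine-tuning jobs report training loss and validation loss per step.",
]

print(build_prompt("Does poor sleep make anxiety worse?", DOCS, k=1))
```

The resulting prompt would then be sent to the model as a normal chat request, with the system message from the training example still controlling tone and style.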

Let us know if you have any follow-up questions once you’ve had a chance to review the other thread.

Good luck in any case.


Got it, thank you for your response!
