Need clarification regarding one-shot model fine-tuning

Please help me understand this example of fine-tuning a one-shot model:
https://github.com/openai/openai-cookbook/blob/main/examples/fine-tuned_qa/ft_retrieval_augmented_generation_qdrant.ipynb

If I understand it correctly, this model is fine-tuned using existing text chunks (contexts) and questions. Every dataset sample for the fine-tuned model should have 8 rows (messages), as defined in the function get_few_shot_prompt (a rough sketch follows the list):

  • system,
  • user: question+context,
  • assistant: answer,
  • user: question+context,
  • assistant: “I don’t know”,
  • user: question+context,
  • assistant: answer,
  • user: original question + original context
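
For reference, here is how I read that structure in code. This is my own reconstruction, not the notebook's exact get_few_shot_prompt, and the helper name is mine:

# Hypothetical reconstruction of the 8-message sample structure listed above;
# the notebook's actual get_few_shot_prompt differs in its details.
def build_few_shot_sample(shots, final_question, final_context):
    # shots: list of (question, context, answer) triples;
    # an answer may be "I don't know"
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    for question, context, answer in shots:
        messages.append({
            "role": "user",
            "content": f"Question: {question}\n\nContext: {context}\nAnswer:",
        })
        messages.append({"role": "assistant", "content": answer})
    # The sample ends with the original question + context and no final
    # assistant message -- which is exactly what question 1 below asks about.
    messages.append({
        "role": "user",
        "content": f"Question: {final_question}\n\nContext: {final_context}\nAnswer:",
    })
    return messages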

Questions:

  1. Why does OpenAI fine-tune the model with samples without the final assistant role?
  2. When they try out the fine-tuned model in the next paragraph, should the prompt be built with the same function get_few_shot_prompt?
    If so, where should the end user get the context from?
# Let's try this out (requires the pre-v1 openai-python SDK; model_id is the
# fine-tuned model's ID returned by the fine-tuning job)
import openai

completion = openai.ChatCompletion.create(
    model=model_id,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Can you answer the following question based on the given context? If not, say, I don't know:\n\nQuestion: What is the capital of France?\n\nContext: The capital of Mars is Gaia. Answer:",
        },
        {
            "role": "assistant",
            "content": "I don't know",
        },
        {
            "role": "user",
            "content": "Question: Where did Maharana Pratap die?\n\nContext: Rana Pratap's defiance of the mighty Mughal empire, almost alone and unaided by the other Rajput states, constitute a glorious saga of Rajput valour and the spirit of self sacrifice for cherished principles. Rana Pratap's methods of guerrilla warfare was later elaborated further by Malik Ambar, the Deccani general, and by Emperor Shivaji.\nAnswer:",
        },
        {
            "role": "assistant",
            "content": "I don't know",
        },
        {
            "role": "user",
            "content": "Question: Who did Rana Pratap fight against?\n\nContext: In stark contrast to other Rajput rulers who accommodated and formed alliances with the various Muslim dynasties in the subcontinent, by the time Pratap ascended to the throne, Mewar was going through a long standing conflict with the Mughals which started with the defeat of his grandfather Rana Sanga in the Battle of Khanwa in 1527 and continued with the defeat of his father Udai Singh II in Siege of Chittorgarh in 1568. Pratap Singh, gained distinction for his refusal to form any political alliance with the Mughal Empire and his resistance to Muslim domination. The conflicts between Pratap Singh and Akbar led to the Battle of Haldighati. Answer:",
        },
        {
            "role": "assistant",
            "content": "Akbar",
        },
        {
            "role": "user",
            "content": "Question: Which state is Chittorgarh in?\n\nContext: Chittorgarh, located in the southern part of the state of Rajasthan, 233 km (144.8 mi) from Ajmer, midway between Delhi and Mumbai on the National Highway 8 (India) in the road network of Golden Quadrilateral. Chittorgarh is situated where National Highways No. 76 & 79 intersect. Answer:",
        },
    ],
)
print("Correct Answer: Rajasthan\nModel Answer:")
print(completion.choices[0].message)

I think this is incorrect:

  1. The samples for fine-tuning the model should end with an “assistant” role containing an answer generated by the regular gpt-3.5-turbo model (see the sketch after this list).
  2. The try-out sample should end with a “user” role containing the original user question, without context.
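
To make point 1 concrete, here is how I imagine a correct training line would look in the chat fine-tuning JSONL format. The question, context, and answer below are lifted from the try-out above purely as an illustration, not from the notebook's actual training data:

# Illustration of point 1: a chat fine-tuning sample (one JSONL line) that
# ends with the assistant's answer. The content here is hand-written.
import json

sample = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Question: Which state is Chittorgarh in?\n\nContext: Chittorgarh, located in the southern part of the state of Rajasthan...\nAnswer:",
        },
        # the final assistant turn closes the sample
        {"role": "assistant", "content": "Rajasthan"},
    ]
}

with open("train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(sample) + "\n")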

This is what displays the assistant’s reply. What you need to do is capture completion.choices[0].message in a string (review its quality manually to improve the trained model) and add it as the assistant’s message to your messages object in the training file.
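
A minimal sketch of that workflow, assuming the same pre-v1 openai SDK the notebook uses; model_id and messages are placeholders standing in for the values built in the earlier cells:

# Sketch: run the completion, capture the reply as a string, review it,
# then append it as the assistant turn and write out the training line.
import json
import openai

model_id = "ft:gpt-3.5-turbo:..."  # hypothetical placeholder for your fine-tuned model ID
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Question: ...\n\nContext: ...\nAnswer:"},  # placeholder prompt
]

completion = openai.ChatCompletion.create(model=model_id, messages=messages)
reply = completion.choices[0].message["content"]  # capture the assistant's reply as a string
print(reply)  # review its quality manually before accepting it

messages.append({"role": "assistant", "content": reply})
with open("train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps({"messages": messages}) + "\n")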

This notebook has too many lines for us to know where, in particular, you see the problem you mention. Could you let us know the cell where the issue appears to you?

The issue, to me, is the same one seen in every other OpenAI “fine-tune” demo: they are fine-tuning on nothing the AI doesn’t already know or can’t already do. The test set, from 2016, before transformer AI was invented, is supposed to be reading comprehension of a paragraph from Wikipedia, with little context as to where it was pulled from. Congrats: by training on that poor data, they made an AI that refuses to answer 40% of the time.

Example #96 produced by the code for untrained 3.5, ID “5a8937553b2508001a72a52b” (which I picked simply because it is among the four results, of the 100, printed by the pandas markdown output into VSCode):

  • Question: “What type of value would the Basel function have if there were finite primes?”

The test set says a plausible answer is “a finite value”, yet also says the question is “impossible”. However, the answer is right there in the brief context: “if there were only finitely many primes then ζ(1) would have a finite value”.
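
To spell out the step that context is paraphrasing (my summary of Euler’s argument, not text from the dataset): by Euler’s product formula,

$$\zeta(1) = \sum_{n=1}^{\infty} \frac{1}{n} = \prod_{p\,\text{prime}} \left(1 - \frac{1}{p}\right)^{-1},$$

so if there were only finitely many primes, the right-hand side would be a finite product of finite factors, hence a finite value; but the harmonic series on the left diverges, which is Euler’s proof that the primes are infinite.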

Let’s ask plain ChatGPT 3.5 the question alone (then tell it to rewrite the answer with LaTeX notation in MathJax formatting for the forum).

The Basel problem is a famous mathematical question that asks for the exact value of the infinite series:

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \ldots$$

This series is known as the Basel series. It’s interesting because it converges to a finite value despite having an infinite number of terms. The problem was first posed by Pietro Mengoli in 1650, and it remained unsolved for many years. In 1734, the famous Swiss mathematician Leonhard Euler made a significant breakthrough by finding the exact value of this series, which is

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$$

where \(\pi\) is the mathematical constant representing the ratio of a circle’s circumference to its diameter.

Now, let’s address your question regarding the Basel function and finite primes. The Basel function itself relies on the sum of the reciprocals of the squares of natural numbers, as shown above. It is essential to understand that the convergence of this series doesn’t depend on the availability of finite or infinite primes; it depends on the nature of the series itself.

The Basel series converges to \(\frac{\pi^2}{6}\) because of its mathematical properties; specifically, it is a well-known example of a convergent p-series with \(p = 2\). The convergence of the series is independent of whether there are a finite or infinite number of prime numbers. Prime numbers influence various areas of number theory and mathematics, but they are not directly related to the convergence of the Basel series.

In summary, the value of the Basel function remains \(\frac{\pi^2}{6}\) regardless of the number of prime numbers, finite or infinite, because it depends on the mathematical properties of the series itself and not on the availability of prime numbers. The relationship between prime numbers and the Basel function is indirect, primarily through the study of number theory and the properties of the Riemann zeta function, which Euler used to solve the Basel problem.

I understand that training and test sets for RAG or fine-tuning demonstrations, containing examples similar to private corporate knowledge outside the AI’s corpus, may be hard to come by.

If, though, you made the AI unable to answer Wikipedia questions through a “hallucination improvement” of following the test set guidance, as this cookbook example does, or trained the AI to answer in only three words, you made it much dumber.

So actual applications that demonstrate how you can and would need to make the AI answer better, and why you’d need to fine-tune simply to use RAG in chat roles, remain elusive.

Paying for 10 or 100 or 1000 tokens of “here is documentation to enhance your knowledge” prompt context for the normal model is certainly a better value than paying eight times as much for every RAG injection and every response.
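
As a rough check on that “eight times” figure, using the gpt-3.5-turbo list prices at the time of this thread (2023, per 1K input tokens; verify against current pricing before relying on this):

# Back-of-the-envelope cost comparison; prices are 2023 USD list prices
# per 1K input tokens and may be out of date.
BASE_INPUT = 0.0015   # base gpt-3.5-turbo, $/1K input tokens
TUNED_INPUT = 0.0120  # fine-tuned gpt-3.5-turbo, $/1K input tokens (8x base)

tokens_per_call = 1000 + 200  # injected RAG documentation + the rest of the prompt

base_cost = tokens_per_call / 1000 * BASE_INPUT
tuned_cost = tokens_per_call / 1000 * TUNED_INPUT  # RAG is still injected on every call

print(f"base model + RAG:  ${base_cost:.4f} per call")
print(f"fine-tuned + RAG:  ${tuned_cost:.4f} per call")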