We created a fine-tuned model for internal use, built on top of our own documents. Our expectation is that the fine-tuned model will answer employee questions based on the content we trained it on. But the model gives most of its responses based on publicly available content. How can we train or restrict the model to respond based only on the trained content?
Your fine-tuning is now essentially indistinguishable from the model's other knowledge, and has the same limited effect on which tokens might be generated.
Follow this link to see the problems with your approach. In its recent announcements and documentation, OpenAI has encouraged, rather than warned away from, this mis-application of fine-tuning.
It explains fine-tuning versus database-driven retrieval of relevant information for answering questions within a specific new knowledge domain.
A model can leverage external sources of information if provided as part of its input. This can help the model to generate more informed and up-to-date responses. For example, if a user asks a question about a specific movie, it may be useful to add high quality information about the movie (e.g. actors, director, etc…) to the model’s input. Embeddings can be used to implement efficient knowledge retrieval, so that relevant information can be added to the model input dynamically at run-time.
A text embedding is a vector that can measure the relatedness between text strings. Similar or relevant strings will be closer together than unrelated strings. This fact, along with the existence of fast vector search algorithms means that embeddings can be used to implement efficient knowledge retrieval. In particular, a text corpus can be split up into chunks, and each chunk can be embedded and stored. Then a given query can be embedded and vector search can be performed to find the embedded chunks of text from the corpus that are most related to the query (i.e. closest together in the embedding space).
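To make that pipeline concrete, here is a minimal sketch of chunked retrieval in Python. The `embed` function is a toy stand-in (a hashed bag-of-words vector) for a real embedding model such as OpenAI's `text-embedding-3-small`; the chunking, cosine similarity, and top-k search follow the approach described above. The corpus strings are invented examples.

```python
import math
import zlib
from collections import Counter

def embed(text: str, dim: int = 64) -> list[float]:
    # Toy stand-in for a real embedding model: a hashed bag-of-words
    # vector, unit-normalized. In practice you would call an
    # embeddings API here instead.
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(word.encode()) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Dot product suffices because the vectors are unit-normalized.
    return sum(x * y for x, y in zip(a, b))

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Embed the query, rank the stored chunks by similarity,
    # and return the k closest ones.
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return [c for _, c in sorted(scored, reverse=True)[:k]]

corpus = [
    "Employees accrue 20 days of paid leave per year.",
    "The VPN client must be updated every quarter.",
    "Expense reports are due by the fifth of each month.",
]
print(top_k("How many vacation days do I get?", corpus, k=1))
```

In a real system you would embed and store the chunks once (typically in a vector database), then embed only the query at run-time; the retrieved chunks are what you prepend to the model's input.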
Thanks for the information. It is really helpful and will be used while evaluating possible solutions.
The general rule of thumb has been, and continues to be:
- Fine-tuning: How to respond
- Embedding: What to respond
It’s useful to remember that models are stochastic representations of the relationships between big groups of words.
So, when you fine-tune a model you’re changing how the model understands those relationships.
This is useful if, let’s say, we’re using a large language model from 1985. All of its training data understands the word “window” as a hole in the side of a house, so asking it how to “drag a window to the side of the screen with a mouse” is going to confuse the hell out of it.
Because it has a very concrete understanding of the words “window” and “screen” and their relationship, it will probably do its best to explain ways in which you might try to coax a mouse into closing a window.
Fine-tuning on a sufficiently large corpus of training data will give it additional understanding about what “window,” “screen,” and “mouse” mean in this context by changing the internal weights associated with the relationships of those words along with words like “computer.”
But, no amount of fine-tuning is going to get the model to reliably spit out, verbatim, the text of your Microsoft Windows manual.
To do that you would want to use vector embeddings to find relevant passages in the documentation and load that content into context, so the information is immediately available at the front of the LLM’s “mind.”
That said, combining a fine-tuned model with Retrieval Augmented Generation (RAG, with vector embeddings) can be a very potent combination, especially if care is taken to meticulously craft a great system message.
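As a rough sketch of that combination, here is how retrieved chunks and a carefully written system message might be assembled into a single chat request. The system-message wording, the helper name, and the model name in the comment are all illustrative assumptions, not a prescribed recipe; the actual API call is left commented out.

```python
def build_rag_messages(question: str, retrieved_chunks: list[str]) -> list[dict]:
    # The system message instructs the model to answer ONLY from the
    # retrieved context -- this is the "meticulously crafted" part.
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieved_chunks))
    system = (
        "You are an internal helpdesk assistant. Answer using ONLY the "
        "context below. If the answer is not in the context, say "
        "\"I don't know based on our documents.\"\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_rag_messages(
    "How many vacation days do I get?",
    ["Employees accrue 20 days of paid leave per year."],
)
# This messages list is what you would pass to a chat-completions API,
# optionally with your fine-tuned model, e.g.:
# client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(messages[0]["content"])
```

The fine-tuned model then handles *how* to respond (tone, format, refusals), while the retrieved context supplies *what* to respond with.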