How to limit question results to a proprietary dataset?

I am exploring integrating GPT into our own chat sessions based on our own custom knowledge base.

I’m using LlamaIndex to generate an index of my company’s custom knowledge base and then asking questions about the material. This works well, but it still answers unrelated questions. For example, if I query the index with “What is the diameter of the moon?”, it will still answer it.

Obviously I don’t want these types of questions to be answered in our business chat sessions. How exactly do I limit the scope of the conversation? It should instead say something like “Based on my research I’m not able to find an answer to your question”, or something along those lines that I can program.

Several proposals. In my experience, you get the best behavior when you actually combine all of them:

  • Clearly specify, via prompt engineering, the questions that should not be answered. Instructions such as “You should always refuse to answer questions that are not related to this specific domain” help a lot.
  • Include a binary classifier that determines whether a question is “on-topic” or “off-topic” for your particular use case. You can use cheap fine-tuned OpenAI models for this, or open-source models (Hugging Face).
  • Require a minimum similarity threshold when retrieving documents to answer questions. If no document surpasses this threshold, decline to answer politely, with a pre-specified formula (see the sketch after this list).
  • Use content moderation (OpenAI’s free moderation endpoint) to filter out inappropriate requests.
  • Add regex filtering as an extra security layer against things like prompt injection (especially if you’re exposing your app to external customers).
  • Probably many others :slight_smile:
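
To make the combination more concrete, here is a minimal sketch in Python using the OpenAI SDK, covering the moderation check and the similarity threshold. The retrieve_with_scores helper and the 0.75 cutoff are placeholders: plug in your own retriever (e.g. a LlamaIndex query engine) and tune the threshold on your data.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

REFUSAL = "Based on my research I'm not able to find an answer to your question."
MIN_SIMILARITY = 0.75  # placeholder cutoff; tune it on your own data

def answer(question: str, retrieve_with_scores) -> str:
    # 1. Filter out inappropriate requests with the free moderation endpoint.
    moderation = client.moderations.create(input=question)
    if moderation.results[0].flagged:
        return REFUSAL

    # 2. Retrieve candidate documents with similarity scores
    #    (retrieve_with_scores is a stand-in for your own retriever).
    hits = retrieve_with_scores(question)  # -> list of (text, score) pairs
    relevant = [text for text, score in hits if score >= MIN_SIMILARITY]
    if not relevant:
        # 3. Nothing in the knowledge base is close enough: decline politely.
        return REFUSAL

    # 4. Answer only from the retrieved context, with an explicit refusal rule.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": (
                "Answer only using the provided context. If the context does not "
                "contain the answer, reply exactly: " + REFUSAL)},
            {"role": "user", "content": "Context:\n" + "\n\n".join(relevant)
                + "\n\nQuestion: " + question},
        ],
        temperature=0,
    )
    return response.choices[0].message.content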

Hope that helps!!

Yes, thank you very much, I will explore all of these options!

One thing I can mention here is the concept of embeddings. You take the data that is available for your business or content, create embeddings from it, and store them in a vector database. Embeddings are basically vectors that capture the meaning of a piece of text in the model’s embedding space. There is a lot to it, but the flow is: you search your database for context relevant to the user’s question, and if something is found you pass that context along with the question to the OpenAI chat model in the same prompt. That gives you granular control over the responses. If the question goes off topic, because nothing matching exists in your context, you respond with a canned answer that says so. So basically you convert your content into OpenAI embeddings, save them, and then run every prompt through that subset of knowledge. You will need to do more research to understand this and learn how to do it. Here is a link: How to prevent ChatGPT from answering questions that are outside the scope of the provided context in the SYSTEM role message? - #3 by caos30
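
As a rough sketch of that flow (not a production setup): the in-memory KNOWLEDGE_BASE list stands in for a real vector database, the text-embedding-3-small model and the 0.4 cutoff are just illustrative choices, and anything below the cutoff gets the canned off-topic answer.

import numpy as np
from openai import OpenAI

client = OpenAI()

# Stand-in for a real vector database: (text, embedding) pairs kept in memory.
KNOWLEDGE_BASE: list[tuple[str, np.ndarray]] = []

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def add_document(text: str) -> None:
    KNOWLEDGE_BASE.append((text, embed(text)))

def ask(question: str, threshold: float = 0.4) -> str:
    off_topic = "That topic is not covered in my knowledge base."
    if not KNOWLEDGE_BASE:
        return off_topic
    q_vec = embed(question)
    # Cosine similarity between the question and every stored document.
    scored = [
        (text, float(np.dot(q_vec, vec) / (np.linalg.norm(q_vec) * np.linalg.norm(vec))))
        for text, vec in KNOWLEDGE_BASE
    ]
    best_text, best_score = max(scored, key=lambda pair: pair[1])
    if best_score < threshold:
        return off_topic
    # Pass the matched context to the chat model together with the question.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer only from the context provided."},
            {"role": "user", "content": f"Context:\n{best_text}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content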

How could I achieve that? In the chat prompt or in the training document (messages ->role)?
Thanks

@Sbgal - It would be the system prompt in this case, and you could set the params while creating a chat completion or an assistant.

Thank you. Can that configuration be changed at any time after launching the chatbot?

@Sbgal - If you would like to change config params at runtime, you can write logic to modify your assistant or your completion requests whenever needed.
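
For illustration only (the assistant ID is a placeholder and this isn’t tied to your exact setup): with chat completions the system prompt and sampling parameters are sent with every request, so passing new values on the next call is enough; with an Assistant, the stored configuration can be updated through the API.

from openai import OpenAI

client = OpenAI()

# Chat Completions: the config travels with every request, so changing these
# variables changes behaviour on the very next call.
system_prompt = "Only answer questions about our internal knowledge base."
temperature = 0.2

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What is the diameter of the moon?"},
    ],
    temperature=temperature,
)

# Assistants: the configuration is stored server-side and can be updated later.
client.beta.assistants.update(
    "asst_placeholder_id",        # placeholder assistant ID
    instructions=system_prompt,   # new system-level instructions
)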

I have configured the parameters in Python as follows, but I am not sure whether I can change them later in the prompt when configuring the front-end.

import time
from openai import OpenAI

# Set your OpenAI API key
client = OpenAI(api_key='sxxx')

# Define the file to upload
file_path = 'formatted_training_data.jsonl'

# Upload the file to OpenAI using the new API
with open(file_path, 'rb') as f:
    response = client.files.create(file=f, purpose='fine-tune')

file_id = response.id
print(f'File uploaded successfully. File ID: {file_id}')

# Start the fine-tuning job using the new API
fine_tune_response = client.fine_tuning.jobs.create(training_file=file_id, model="gpt-3.5-turbo")

fine_tune_id = fine_tune_response.id
print(f'Fine-tuning job started: {fine_tune_id}')

# Function to check the status of the fine-tuning job
def check_fine_tune_status(fine_tune_id):
    response = client.fine_tuning.jobs.retrieve(fine_tune_id)
    status = response.status
    return status, response

# Loop to poll the fine-tuning status
while True:
    status, response = check_fine_tune_status(fine_tune_id)
    print(f'Fine-tuning status: {status}')
    if status in ['succeeded', 'failed']:
        break
    time.sleep(60)  # Wait 60 seconds before checking again

# If fine-tuning failed, print the error details
if status == 'failed':
    error_message = response.error.message if response.error else "No error message was provided."
    print(f'Fine-tuning failed. Error details: {error_message}')
    fine_tuned_model = None
else:
    print(f'Fine-tuning completed successfully. Model ID: {response.fine_tuned_model}')
    fine_tuned_model = response.fine_tuned_model

# Example interaction with the fine-tuned model, with a token limit
def interactuar_con_modelo(prompt):
    if fine_tuned_model is None:
        return "Fine-tuning failed; the model cannot be used."

    response = client.chat.completions.create(
        model=fine_tuned_model,
        messages=[
            {"role": "system", "content": "Lorem Ipsum."},
            {"role": "user", "content": prompt}
        ],
        max_tokens=120,
        temperature=0.6,
        top_p=0.8
    )
    return response.choices[0].message.content

# Example usage
prompt = "What is artificial intelligence?"
respuesta = interactuar_con_modelo(prompt)
print(f'Model response: {respuesta}')

You could assign them to variables and dynamically update these on the front-end as per your use case.
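
Something like this sketch, built around your interactuar_con_modelo function and reusing the client and fine_tuned_model from your script (the config keys and the admin-style update_config helper are just illustrative):

# Mutable configuration that the front-end could update at any time.
config = {
    "system_prompt": "Lorem Ipsum.",
    "max_tokens": 120,
    "temperature": 0.6,
    "top_p": 0.8,
}

def update_config(**changes):
    # Called from the front-end (e.g. an admin endpoint) to change behaviour.
    config.update(changes)

def interactuar_con_modelo(prompt):
    # Reads the current config on every call, so updates take effect immediately.
    response = client.chat.completions.create(
        model=fine_tuned_model,
        messages=[
            {"role": "system", "content": config["system_prompt"]},
            {"role": "user", "content": prompt},
        ],
        max_tokens=config["max_tokens"],
        temperature=config["temperature"],
        top_p=config["top_p"],
    )
    return response.choices[0].message.content

# Example: tighten the behaviour later without touching the training data.
update_config(system_prompt="Answer only from the company knowledge base.",
              temperature=0.3)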

Thank you, I understand it now. P.S. Could you please edit out the descriptive text in the “content” field that was mistakenly included in the original message?

When converting the JSON format into JSONL, some instructions are added to the “content” of each document, and they are the same instructions that the front-end adds when sending the request to the API. Is this duplication necessary? Are these instructions needed both for training and on the front-end?

import json

# Read training data from JSON file
with open('training_data.json', 'r', encoding='utf-8') as f:
    training_data = json.load(f)

# Format the data for the training request in JSONL
with open('formatted_training_data.jsonl', 'w', encoding='utf-8') as f:
    for entry in training_data:
        formatted_entry = {
            "messages": [
                {"role": "system", "content": "Instructions. Do I go here for training or on the front-end?"},
                {"role": "user", "content": entry['prompt']},
                {"role": "assistant", "content": entry['completion']}
            ]
        }
        f.write(json.dumps(formatted_entry, ensure_ascii=False) + '\n')
print("File saved as 'formatted_training_data.jsonl'.")