How to force the assistant to add links in the answer only from the list that I attached to it in the text file?

I have attached a JSON file with links in this form:

 {
 "Topic": "Podcast - Building a Culture of Automation",
 "Link": "youth be 12345",
 "Type": "Video"
 },
 {
 "Topic": "Homepage",
 "Link": "example com",
 "Type": "Page"
 },

When my chatbot is asked about a topic, I want it to find the matching link in this file (there are about 100 links) and give it at the end of the answer (or have the link be the answer).

Sometimes it does, but at other times it gives links to other websites or invents non-existent links.


Are the links using the “https” protocol?

Are you using the Assistants API?

I do something similar, and when I ask for a link it usually works great. Your prompt needs to say explicitly that you want links in the output.

Yes, I’m using the Assistants API.

And I have these instructions:

You are a personal Virtual Assistant.
You need to follow this instructions:
…bla.bla.blah

  • Never suggest seeking information from elsewhere.
  • Provide links ONLY from your data, from the attached source file, if required, or if appropriate to the topic of the question.

I think your format isn’t clear enough. Try putting

"Link": "https://example.com"

I think the AI doesn’t understand that it is a link.

There are correct links in my file
This forum doesn’t allow me to send links ¯_(ツ)_/¯

And it doesn’t send me any links in response

The problem is the RAG system doesn’t always know what to do with JSON file data. You will need to do one of the following:

  1. Create a tool that can return the data to the assistant.
  2. Upload the file to the Python code interpreter and give the assistant system instructions to open the file and use the information in it.
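
As a rough sketch of option 1: the lookup can be an ordinary function in your own code, described to the assistant as a function tool. Everything named here (`find_links`, `FIND_LINKS_TOOL`, `handle_tool_call`, the placeholder URLs, the simple word-overlap matching) is hypothetical and only for illustration. The tool definition follows the usual function-tool JSON shape, but check the current API reference before relying on it.

```python
import json

# Hypothetical sample records in the same shape as the JSON file from the question.
# The URLs are placeholders, not the real links.
LINKS = [
    {"Topic": "Podcast - Building a Culture of Automation",
     "Link": "https://example.com/podcast", "Type": "Video"},
    {"Topic": "Homepage", "Link": "https://example.com", "Type": "Page"},
]


def find_links(query, records=LINKS, limit=3):
    """Return up to `limit` records whose Topic shares words with the query."""
    # Ignore very short words such as "of" or "a" so they do not cause matches.
    words = {w for w in query.lower().split() if len(w) > 2}
    scored = []
    for rec in records:
        topic_words = set(rec["Topic"].lower().split())
        score = len(words & topic_words)
        if score:
            scored.append((score, rec))
    # Best matches first; sort on the score only so records are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored[:limit]]


# Function-tool description to register with the assistant (assumed shape).
FIND_LINKS_TOOL = {
    "type": "function",
    "function": {
        "name": "find_links",
        "description": "Look up approved links by topic. "
                       "Only links returned by this function may be cited.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string",
                          "description": "Topic the user asked about"},
            },
            "required": ["query"],
        },
    },
}


def handle_tool_call(arguments_json):
    """Run the lookup for a tool call and return the result as a JSON string."""
    args = json.loads(arguments_json)
    return json.dumps(find_links(args["query"]))
```

When the assistant requests the tool, you would pass the call's arguments string to `handle_tool_call` and submit the returned string as the tool output. Because the model only ever sees links that your code returned, it has less room to invent one, although the instructions should still tell it to cite nothing else.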

Hi @Darkangel290

I’m facing the same issue. Did you solve it?

Thanks
Noam

To build on @nicholishen’s suggestion… I would send markdown instead of JSON. This will give you better control over where their RAG system splits your text. So instead of JSON, upload a file structured like this:

## About ZAPTEST
Link: https://youtu.be/I2MtwdPCTS8
Type: Video

## interview with CEO
Link: https://youtu.be/SSjrbeW_J3s
Type: Video

The ## headers create clear break points in your file, so it’s more likely that the RAG system will return whole records when it splits the text into chunks.

The other thing I’d recommend is including descriptions if you have them. This will result in better similarity matches when their RAG system queries your file. The key thing to understand is that with RAG the model is not shown your whole file, only parts of it. So the more descriptive text you include, the more likely the system is to select the parts of your file that best match what the user is looking for.
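
If the links already live in a JSON file, a small script can produce the markdown layout described above. This is a minimal sketch, not a definitive converter: `records_to_markdown` is a hypothetical helper, and the `Description` field and its text are assumptions, since the original JSON only has `Topic`, `Link` and `Type`.

```python
def records_to_markdown(records):
    """Render each record as a '## Topic' section followed by its fields."""
    sections = []
    for rec in records:
        lines = [
            f"## {rec['Topic']}",
            f"Link: {rec['Link']}",
            f"Type: {rec['Type']}",
        ]
        # "Description" is a hypothetical optional field; add it only if present.
        if rec.get("Description"):
            lines.append(f"Description: {rec['Description']}")
        sections.append("\n".join(lines))
    # A blank line between sections keeps each record visually separate.
    return "\n\n".join(sections) + "\n"


# Sample records using the two entries shown above; the description is made up.
sample = [
    {"Topic": "About ZAPTEST", "Link": "https://youtu.be/I2MtwdPCTS8",
     "Type": "Video", "Description": "Short overview video (placeholder text)."},
    {"Topic": "interview with CEO", "Link": "https://youtu.be/SSjrbeW_J3s",
     "Type": "Video"},
]

markdown = records_to_markdown(sample)
print(markdown)
```

In practice you would load the records with `json.load` from your own file and write the result to a `.md` file to upload in place of the JSON.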


Thanks @stevenic! I’ll change it to markdown and share my insights
