Welcome, Peter. Thanks for your reply.
The Discourse Forum doesn’t allow replicating entire messages across the threads. Please check the thread below for more details:
Seeking Advice on Handling Large Vehicle Database for AI Chatbot Application
Correct. The dataset shall be uploaded to a storage service of your choice - such as Amazon S3, Google Cloud Storage, or Microsoft Azure Storage. Check the availability of the dataset for the model. Using your example:
Systemrole orUserprompt:
Please use the dataset located at http://domain/mytextfile.txtPythoncode usingopenaiAPI:
...
# Set the URL of your dataset
dataset_url = "http://domain/mytextfile.txt"
...
# Set the prompt to use with the model
prompt = "Consider the following dataset and provide your responses accordingly: " + dataset_url
...
I like the free text format better because we can write texts and lists in a more natural way, without worrying about excessive formatting, syntax, etc. And the text format allows greater flexibility to pass instructions, example templates, authors, licenses, and descriptions - that do not belong to the dataset itself - this additional or auxiliary information or metadata can be passed to the model in the dataset header as a separate section.
It can be in free format, JSON format, or whatever you like - the model will understand - since it is clear for the model.
In the case of free text format - I am curious about the use of ## symbols - I would like to remind you that the models are sensitive to delimiters in order to separate the prompt texts and datasets into different contexts such as data, instructions, metadata, etc. - and # is used as comment mark in Python.
Even if you don’t use Python, the openai API is used in Python code like most user-made apps, hence the model could be confused about ## after the temperature or the condition (if you accepted the suggestion).
If you want to show the termination of a single data point, then it’s better to use a semi-colon ( ; ) as in my example in my previous post. Delimiters and consequently punctuation are very important to the models.
I hope this helps. Please, let me know the results
Oh, you just edited your prompt
- while I was replying.