Thank you. I had initially read that I should convert my .txt files into .jsonl files, but this also failed for me.
Representing the data as .json has allowed me to upload my data successfully.
Here is the format I am using:
[
{
"prompt": "a string representing a prompt",
"completion": ["response data 1", "response data 2"]
},
{
"prompt": "etc",
"completion": "you can also pass a single completion string as opposed to an array"
},
...
]
I have no idea if this format is good, but the models do appear to be able to pull specific data from it, which makes it look as though they have learned it.
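For anyone who wants to generate a file in this shape rather than write it by hand, here is a minimal Python sketch. The helper names (`build_records`, `to_json_text`, `write_json`) and the sample pairs are made up for illustration, and nothing here is confirmed as the format the Assistants backend officially expects; it only reproduces the layout shown above.

```python
import json


def build_records(pairs):
    """Turn (prompt, completion) pairs into the list-of-objects layout above.

    A completion may be a single string or a list of strings.
    """
    records = []
    for prompt, completion in pairs:
        if not isinstance(completion, (str, list)):
            raise TypeError("completion must be a string or a list of strings")
        records.append({"prompt": prompt, "completion": completion})
    return records


def to_json_text(records):
    """Serialize the records as one JSON array (not JSONL)."""
    return json.dumps(records, ensure_ascii=False)


def write_json(records, path):
    """Write the JSON array to a .json file, ready to upload."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_json_text(records))


# Hypothetical sample data, only to show both completion shapes.
pairs = [
    ("a string representing a prompt", ["response data 1", "response data 2"]),
    ("etc", "a single completion string"),
]
records = build_records(pairs)
json_text = to_json_text(records)
```

Calling `write_json(records, "data.json")` would then produce the file to upload; the file name is just an example.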
From experimenting for only a few minutes so far, it appears that the GPT Assistants are not even learning from the data, but essentially just searching the data and then supplementing prompts with this extra bit of text. Not to undermine how cool this technology is, but if you’re trying to actually fine-tune a model and account for custom data to yield a new model with unique weights, I don’t believe this is what you’re looking for; however, the results may still be better than what you can get with other available models out there, so it’s all very interesting regardless! I have no idea if what I am saying here is true, just my first thoughts from playing with it just now.
Users on the forum are discussing a recurring issue related to file uploads when creating or updating assistants via the Playground or the API. rayed_studded, srajan.garg, nucks, hongfei.dong.v, masterjohnjedi2321, and dev29 report errors related to unsupported file types, including CSVs, PNGs, and JPGs, suggesting it may be an issue on OpenAI’s side.
bsquires14 indicates the problem may be that the system expects the JSONL file format, and even that does not process successfully. There’s also a problem with deleting troublesome files. However, bsquires14 provided a temporary workaround that involves uploading files through the API, along with a Python script to do so.
Updating the findings, bsquires14 stated that the problem seems to be tied specifically to the ‘Retrieval’ functionality. Disabling ‘Retrieval’ and using the Code Interpreter instead lets uploads work, although it is not as effective.
mstockwell and afedechkin voice concerns about the issue and its implications, subscribing to the thread for future updates.
shaunak.tulshi mentions the same error while trying to upload either CSV or TXT files, even though the error message states that these formats should be supported. A workaround proposed includes converting CSV files to JSON.
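The CSV-to-JSON workaround mentioned here could look roughly like the Python sketch below. It is not the code posted in the thread; the function name `csv_text_to_json` and the sample data are assumptions for illustration, and it uses only the standard `csv` and `json` modules.

```python
import csv
import io
import json


def csv_text_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects.

    Every value stays a string, because csv.DictReader does not infer types.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, ensure_ascii=False)


# Hypothetical sample data.
sample = "name,score\nalice,10\nbob,7\n"
converted = csv_text_to_json(sample)
```

The resulting string can be saved with a .json extension and uploaded in place of the original CSV.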
mikebell180 provides a working solution in Python Flask code to handle file uploads and convert them into conformable CSV format based on a database schema.
Finally, ryan30 shares their success by converting .txt files to .json and raises questions about how the models are working with data - whether they’re learning from or merely searching it.
You have to turn on Code Interpreter, load the file in the file panel (not the prompt clip) or in Retrieval, and indicate the file ID in the prompt for the assistant to read it.
Today json files stopped working for my assistants. Yesterday everything was fine, but now I can’t even attach files to the assistant. The error is: Failed to update assistant: UserError: Failed to index file: Unsupported file file-DTuBcdZDvjlzcTMPBUCDgPBD type: application/octet-stream error_code: unhandled_mimetype.
If I enable the Code Interpreter function, the file is attached without error, but it seems that the assistant does not see these files and cannot read them.
Is this a glitch on OpenAI’s part? I haven’t changed anything since yesterday. (Yesterday, the assistant ONLY had the Retrieval function enabled and I could easily add new JSON files.)
UPD:
I found what the problem is -
If the following array exists in the json file, then the error that I wrote about above occurs:
"example": [
"e1",
"e2",
"e3"
]
To avoid errors, this structure should be written like this: "example": ["e1", "e2", "e3"]
This is very strange, because, as far as I know, both options are valid JSON. This behavior appeared quite recently, since until yesterday all my JSON files were written as usual (the first, multi-line form).
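If you have a lot of files written the multi-line way, one option is to re-serialize them without indentation so every array ends up on a single line. This is only a sketch of that idea using Python's standard `json` module; the function name `compact_json` is made up, and I can't confirm it avoids the indexing error in every case.

```python
import json


def compact_json(text):
    """Re-serialize JSON text without indentation, so arrays sit on one line."""
    return json.dumps(json.loads(text), ensure_ascii=False)


multiline = """{
  "example": [
    "e1",
    "e2",
    "e3"
  ]
}"""

compact = compact_json(multiline)
print(compact)  # {"example": ["e1", "e2", "e3"]}
```

Note that this puts the whole file on one line, not just the arrays, which is still valid JSON.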
I’m having similar issues. I’m using the API and the model is chat gpt 4 turbo preview. I enabled the Code Interpreter and disabled Retrieval. I can upload .txt files just fine and the AI is able to read and analyze them. But when I upload .csv or .xlsx files, the AI can’t interpret them. It always says that it’s having technical issues that prevent it from analyzing the contents. Is there an official bug report about this issue? I’d appreciate it so much if someone could give me the link. I’m fairly new to this AI thing, so any kind of help is much appreciated. Thank you!