How to upload files? When I using 'create assistant API'?

When I call the API of Create assistant, what format does the file corresponding to the parameter file_ids need to be in? Or when Upload file API uploads files with purpose set to Assistants, what format should the content of the file be in?

The current situation is that I know the fine-tuned file types and formats:
File type: jsonl
Content data structure:
{“messages”: [{“role”: “system”, “content”: “Marv is a factual chatbot that is also sarcastic.”}, {“role”: “user”, “content”: “What’s the capital of France?”}, {“role”: “assistant”, “content”: “Paris, as if everyone doesn’t know that already.”}]}
{“messages”: [{“role”: “system”, “content”: “Marv is a factual chatbot that is also sarcastic.”}, {“role”: “user”, “content”: “Who wrote ‘Romeo and Juliet’?”}, {“role”: “assistant”, “content”: “Oh, just some guy named William Shakespeare. Ever heard of him?”}]}
{“messages”: [{“role”: “system”, “content”: “Marv is a factual chatbot that is also sarcastic.”}, {“role”: “user”, “content”: “How far is the Moon from Earth?”}, {“role”: “assistant”, “content”: “Around 384,400 kilometers. Give or take a few, like that really matters.”}]}

Are the formats and data structures of all uploaded files the same?Although the purpose of the document is different.

Use the files API: https://platform.openai.com/docs/api-reference/files/create

It returns the file_id that you then use to create the Assistant.

Yes, but what I want to know is the document types and how the content is organized?Is it also a jsonl file?

The Assistant is a black box. You can upload files of many different types (they are listed in the docs). The API scans and indexes the files, but it’s not clear how. The results are generally very good. I’ve given it about 2Gb of PDFs and it mostly responds correctly. A couple of people have notices that it seems to misread columns of data from PDFs, but on the whole it’s pretty good!

File Size Limits Docs: https://platform.openai.com/docs/assistants/how-it-works/creating-assistants
File Types Docs: https://platform.openai.com/docs/assistants/tools/supported-files

Note: I haven’t been able to get zips to work, but this may be because I violated the 2,000,000 token limit!

1 Like

Thank you for your reply, it has helped me a lot. I was confused because I didn’t see the description of the file type in the API. Your answer made me think clearly.

How do you get the file_ids parameter in [Create assistant]? Is the id returned by [Upload file] (like this ‘file-')?
What is [Create assistant file] again? Bind the id returned by Upload file (like this 'file-
’) to the created assistant (like this ‘asst_*****’)?
What is the relationship between the file_ids of [Create assistant] and the files in [Create assistant file]?

I want to do the steps of retrieving private knowledge

  1. Upload files and generate file IDs
  2. Create an assistant (file_ids will contain the file id)
  3. Create a helper file? (I’m not sure what this step does, or if it’s necessary)
  4. Create a thread to run the assistant?

I was facing the same issue. what I have done is

I first received the binary data at the backend, saved it locally into a pdf file, and then uploaded that pdf file to the assistant.

const filePath = path.join(__dirname, "../uploads", file.originalname);
    fs.writeFileSync(filePath, file.buffer);
    const fileStream = fs.createReadStream(filePath);
    const uploadedFile = await openai.files.create({
      file: fileStream,
      purpose: "assistants",
    });

Using Node.

the documentation sais it should be file of the file ‘object’ then if i look at that object i do not see the data type there or any reference to binary (stream) or whatever, so the file upload documentation to me is also confusing. i use a different platform where i have to make the raw request by hand, (not by any fuction library in the documentation) so any advice how you got it working or why they reference the file object definition ?

1 Like