Help with 413 The data value transmitted exceeds the capacity limit Error in OpenAI Vector Store Upload

I’m currently working on a project where I need to upload a PDF file to a vector store using OpenAI’s API. However, I’m encountering a 413 The data value transmitted exceeds the capacity limit error when trying to upload the file as a buffer. Here’s a summary of my issue:

Context:

  • Environment: Node.js
  • File: PDF
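  • Error: 413 The data value transmitted exceeds the capacity limit

const fs = require('fs');
const OpenAI = require('openai');

// The client reads the API key from the OPENAI_API_KEY environment variable
const openai = new OpenAI();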

async function createVectorStore(file, vectorStoreName) {
  try {
    const vectorStore = await openai.beta.vectorStores.create({
      name: vectorStoreName,
    });

    // Read the file as a buffer
    const fileContent = fs.readFileSync(file.path);
    console.log('fileContent', fileContent);

    // Upload the file content to the OpenAI vector store
    const fileBatch = await openai.beta.vectorStores.fileBatches.uploadAndPoll(vectorStore.id, {
      files: [fileContent], // Pass the buffer directly
    });

    console.log(`Vector store created with ID: ${vectorStore.id}`);
    console.log(`file batch status: ${fileBatch.status}`);
    return vectorStore;
  } catch (error) {
    console.error("Error during document upload:", error.response ? error.response.data : error);
    throw error;
  }
}

APIError: 413 The data value transmitted exceeds the capacity limit.
    at APIError.generate (/path/to/node_modules/openai/error.js:68:16)
    ...
{
  status: 413,
  headers: {...},
  error: {
    message: 'The data value transmitted exceeds the capacity limit.',
    type: 'server_error',
    code: null,
    param: null
  }
}

  • I’m reading the PDF file as a Buffer using fs.readFileSync().
  • I then attempt to upload the buffer to OpenAI’s vector store using the fileBatches.uploadAndPoll method.
  • The file size is roughly 570KB.

Any guidance or suggestions would be greatly appreciated!

It sounds like there is a misunderstanding of how files get added to a vector store. You don't send the file contents to the vector store directly. The process is:

  1. Upload the file to your account's storage through the files endpoint, with purpose "assistants", and receive a file ID in the response.
  2. Create a vector store through its own endpoint, and receive a vector store ID.
  3. Add the file IDs to the vector store.
  4. Use the vector store ID with the file_search parameter of an assistant.

Each of these has usage documented in the API reference, linked in the sidebar of this forum.