OpenAI vector stores -- is chunking still required, or does OpenAI do it for me?

I’ve read conflicting documentation on whether chunking is required. Some sources say it used to be necessary, but that documents are now chunked automatically when loaded into vector stores.

Is that true only when loading via the API, only when loading through the dashboard, or both?

I ask because answering questions against a 4MB PDF of insurance underwriting rules via the /responses API is taking up to 2 minutes. I’m unsure what the obvious next steps might be to speed it up (if that’s possible).

The chunking is automatic and necessary. Chunks are the units returned to the AI model by similarity search, and each chunk can be at most 4,096 tokens.

You have an optional chunking parameter in two places: when you connect an individual file ID to a vector store, or on the vector store itself when you provide it an initial batch of file IDs.

For create (pulled from a function):

chunking_strategy={
    "type": "static",
    "static": {
        "max_chunk_size_tokens": 600,
        "chunk_overlap_tokens": 200,
    }
}

This is optional; the default is 800 max chunk tokens with a 400-token overlap (each chunk shares 400 tokens with the previous one).

Ingesting the file requires extracting the text from the PDF, creating vectors via calls to the embeddings API model, and preparing the database. Text extraction itself should be fast, since 4MB is a blip these days, but if the PDF is text-rich, ingestion could involve 500+ embedding calls before the store is ready.

Vector stores have had a slew of API problems recently, though.

Move the processing up as early as possible: for example, upload the file and attach it to a chat-session vector store you create the second the user adds a file in your chat UI.

Additional tip: when using your own fixed vector store with the file search tool, tell the AI in your developer message what it will find from file_search. Also use the Responses API parameter “max_tool_calls” so the AI doesn’t keep issuing more search queries when it is unhappy with the results returned.