Vector indexing takes forever and vector_stores.files.list returns incorrect results

Hello,

I have two issues with vector stores (these issues are potentially related).

First, on the Dashboard, my vector store shows 0 files.

Yet when querying using the API, it has one file (`in_progress`):

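import asyncio
import logging

from openai import AsyncOpenAI

# Assumed logging setup, chosen to match the log format shown in the output below;
# without a handler at INFO level, logging.info() prints nothing.
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment


async def test_vs_file_list():
    vector_store_id = "vs_690392abfb6881919b48c074872cb5d6"
    # list() returns an async paginator that fetches further pages on demand
    paginator = client.vector_stores.files.list(vector_store_id=vector_store_id)
    async for item in paginator:
        logging.info(f"Vector store file: {item}")


async def main():
    await test_vs_file_list()


asyncio.run(main())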

The above code gives:

2025-11-06 18:22:43,089 - INFO - HTTP Request: GET https://api.openai.com/v1/vector_stores/vs_690392abfb6881919b48c074872cb5d6/files "HTTP/1.1 200 OK"
2025-11-06 18:22:43,102 - INFO - Vector store file: VectorStoreFile(id='file-JLj86gi4qwW29BDyWUc2U9', created_at=1761841905, last_error=None, object='vector_store.file', status='in_progress', usage_bytes=8667749, vector_store_id='vs_690392abfb6881919b48c074872cb5d6', attributes={}, chunking_strategy=StaticFileChunkingStrategyObject(static=StaticFileChunkingStrategy(chunk_overlap_tokens=400, max_chunk_size_tokens=800), type='static'))

Second issue: As you can see from the above logs, the file was uploaded on October 30th (i.e., a week ago), yet it’s still “in_progress”. Is this a bug?

Thank you in advance for your help!

I forgot to mention that the file is quite big, 39.8 MB.

From the docs (this is only stated under “file search and retrieval” → “retrieval”):

The maximum file size is 512 MB. Each file should contain no more than 5,000,000 tokens per file (computed automatically when you attach a file).

If the extracted text is English, expect roughly 4 bytes of text per token, so 5,000,000 tokens corresponds to about 20 MB of text, and you may exceed the token limit at 15 MB to 20 MB. Non-Latin scripts fare worse.

The ratio, and with it the tolerated file size, may be better for a PDF or other binary file with lots of images that contain no searchable text.
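As a rough sanity check of the estimate above, here is a minimal sketch. It assumes about 4 bytes of extracted English text per token (a rule of thumb, not a tokenizer count) and decimal megabytes; the function names are made up for illustration.

```python
# Rough estimate only: assumes ~4 bytes of extracted English text per token.
MAX_TOKENS_PER_FILE = 5_000_000  # limit quoted from the docs above
BYTES_PER_TOKEN = 4              # assumed average, not an exact tokenizer count


def estimated_tokens(text_bytes: int, bytes_per_token: float = BYTES_PER_TOKEN) -> int:
    """Estimate the token count of extracted text from its size in bytes."""
    return int(text_bytes / bytes_per_token)


def exceeds_token_limit(text_bytes: int, bytes_per_token: float = BYTES_PER_TOKEN) -> bool:
    """Return True if the estimate is above the documented per-file token limit."""
    return estimated_tokens(text_bytes, bytes_per_token) > MAX_TOKENS_PER_FILE


if __name__ == "__main__":
    # 39.8 MB is the file size mentioned earlier in the thread, treated here
    # as if the whole file were extractable text (the worst case).
    size = 39_800_000
    print(estimated_tokens(size), exceeds_token_limit(size))
```

If most of the file is images or other non-text data, the extracted text will be much smaller than the file, so this only gives an upper bound.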

The platform site may display only files with a completed status. It uses a list method similar to the API's, so you can monitor its network requests with the browser developer tools to see whether OpenAI’s code is getting a different file listing.
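To compare against what the Dashboard shows, one option is to tally the API listing by status. This is a minimal sketch, assuming the same `AsyncOpenAI` client as in the first post; `count_by_status` and `list_statuses` are hypothetical helper names.

```python
from collections import Counter
from typing import Any, Iterable


def count_by_status(files: Iterable[Any]) -> Counter:
    """Count vector store files per status (e.g. 'in_progress', 'completed')."""
    return Counter(f.status for f in files)


async def list_statuses(vector_store_id: str) -> Counter:
    """Fetch every file in the vector store and tally the statuses."""
    # Imported lazily so count_by_status can be used without the SDK installed.
    from openai import AsyncOpenAI

    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
    files = [f async for f in client.vector_stores.files.list(vector_store_id=vector_store_id)]
    return count_by_status(files)
```

If the Dashboard really lists only completed files, the `completed` count here should match it. The list endpoint also documents a `filter` parameter for the file status, which could be used instead of counting client-side.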

“Last active” was two minutes after creation. You can either delete the file attachment, or delete the vector store itself, and try again.
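Deleting the attachment and trying again could look roughly like this. It is a sketch under assumptions: `reattach_file` is a hypothetical helper, and `client` is expected to be an `AsyncOpenAI` instance like the one in the first post.

```python
from typing import Any


async def reattach_file(client: Any, vector_store_id: str, file_id: str) -> str:
    """Detach a stuck file from the vector store, attach it again, return the final status."""
    # Removes the file from the vector store only; the uploaded file itself is kept.
    await client.vector_stores.files.delete(file_id, vector_store_id=vector_store_id)
    # create_and_poll attaches the file and polls until it is no longer 'in_progress'.
    vs_file = await client.vector_stores.files.create_and_poll(
        file_id=file_id, vector_store_id=vector_store_id
    )
    return vs_file.status
```

Since the polling helper waits for as long as the file stays `in_progress`, it is probably worth wrapping the call in `asyncio.wait_for(...)` with a timeout, given that the original attachment never finished.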

Is this the endpoint that you’re referring to?

https://platform.openai.com/docs/api-reference/vector-stores-files/listFiles

Overall, this endpoint seems very buggy. I posted a related issue:
Vector Store List Files Bug - Deleted files show as In Progress - #3 by olivier.cuny

What do you get when you query this endpoint:

https://platform.openai.com/docs/api-reference/vector-stores-files/getFile

Are you still getting a status of “in_progress”, or is it simply not finding your file?
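A quick way to tell those two cases apart is to retrieve the single file and treat a 404 as "not found". This is a minimal sketch: `get_file_status` is a hypothetical helper, `client` is assumed to be an `AsyncOpenAI` instance, and the 404 check relies on the SDK's status errors carrying a `status_code` attribute.

```python
from typing import Any, Optional


async def get_file_status(client: Any, vector_store_id: str, file_id: str) -> Optional[str]:
    """Return the vector store file's status, or None if the API answers 404."""
    try:
        vs_file = await client.vector_stores.files.retrieve(
            file_id, vector_store_id=vector_store_id
        )
    except Exception as exc:
        # Assumption: HTTP errors raised by the SDK expose the status as `status_code`.
        if getattr(exc, "status_code", None) == 404:
            return None
        raise  # anything else (auth, rate limit, network) is not "file missing"
    return vs_file.status
```

With the IDs from the first post, this would be called as `await get_file_status(client, "vs_690392abfb6881919b48c074872cb5d6", "file-JLj86gi4qwW29BDyWUc2U9")`.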
