I am using the API via cURL to ask questions about some files in a vector store.
- I create the vector store, and add two files using the file_batches endpoint.
- I create a new assistant, then create a new thread with the vector store attached to it.
- I am using the default chunking strategy (800 tokens max, with 400 token overlap).
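For anyone trying to reproduce this, the steps above can be sketched roughly as the following four requests. This is a minimal sketch in Python rather than cURL, and it only builds the requests without sending them. The store name, the model name `gpt-4o`, the placeholder API key, and the helper names `build_request` and `setup_requests` are assumptions for illustration, not what I actually ran. In a real run the vector store ID comes back from the first call.

```python
import json
import urllib.request

API_BASE = "https://api.openai.com/v1"


def build_request(path, payload, api_key="sk-PLACEHOLDER"):
    """Build (but do not send) a POST request with the Assistants v2 headers."""
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "OpenAI-Beta": "assistants=v2",
        },
    )


def setup_requests(vector_store_id, file_ids, model="gpt-4o"):
    """Return the four setup requests in the order described in the post.

    vector_store_id is a stand-in here; normally it is read from the
    response to the first request. The model name is an assumption.
    """
    return [
        # 1. Create the vector store (the name is a hypothetical example).
        build_request("/vector_stores", {"name": "example-store"}),
        # 2. Add both uploaded files in one batch.
        build_request(
            f"/vector_stores/{vector_store_id}/file_batches",
            {"file_ids": list(file_ids)},
        ),
        # 3. Create an assistant with the file_search tool enabled.
        build_request(
            "/assistants",
            {"model": model, "tools": [{"type": "file_search"}]},
        ),
        # 4. Create a thread with the vector store attached to it.
        build_request(
            "/threads",
            {
                "tool_resources": {
                    "file_search": {"vector_store_ids": [vector_store_id]}
                }
            },
        ),
    ]


if __name__ == "__main__":
    for req in setup_requests("vs_example", ["file-a", "file-b"]):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Printing the requests like this makes it easy to compare the payloads against the cURL commands, in particular whether both file IDs appear in the batch and whether the thread references the same vector store ID.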
When I ask a question, it is clear that the assistant is answering using the information from only one file. If I ask a question that requires information from the other file, the response is “I don’t know”, or similar.
If I retrieve information on the vector store contents, I can see that both files have (apparently) been added successfully.
The files are simple UTF-8 text files, each containing a few thousand words in English.
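To give a sense of scale, here is a rough sketch of how many chunks a file of that size might produce with the 800/400 setting. It assumes a plain sliding window whose stride is the maximum chunk size minus the overlap, and it works on token counts only. The real splitter may behave differently, and the function name `chunk_spans` is just for illustration.

```python
def chunk_spans(n_tokens, max_chunk=800, overlap=400):
    """Return (start, end) token offsets for a fixed-stride sliding window.

    Assumption: each new chunk starts (max_chunk - overlap) tokens after
    the previous one. This is an illustration, not the service's algorithm.
    """
    if overlap >= max_chunk:
        raise ValueError("overlap must be smaller than max_chunk")
    if n_tokens <= 0:
        return []
    stride = max_chunk - overlap
    spans = []
    start = 0
    while True:
        end = min(start + max_chunk, n_tokens)
        spans.append((start, end))
        if end >= n_tokens:
            break
        start += stride
    return spans


if __name__ == "__main__":
    # A hypothetical 2000-token file, chosen only as an example size.
    for span in chunk_spans(2000):
        print(span)
```

Under these assumptions each file would end up as only a handful of overlapping chunks, so the two files together are a small search space.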
If I repeat the process but add only one or the other of the files, the assistant (set up with the same arguments as before) gives cogent answers to questions about that one file. This shows that each file is perfectly readable and that the vectorising process is OK.
Can anyone suggest what I’m doing wrong? The vector store details and its file listing are below.
{
  "files": {
    "data": [
      {
        "chunking_strategy": {
          "static": {
            "chunk_overlap_tokens": 400,
            "max_chunk_size_tokens": 800
          },
          "type": "static"
        },
        "created_at": 1726003029,
        "id": "file-F6bVnGrVlB8CgEfkIUOrQGIt",
        "last_error": null,
        "object": "vector_store.file",
        "status": "completed",
        "usage_bytes": 1073,
        "vector_store_id": "vs_FQYDspit4lQaKuJDetPd5HGt"
      },
      {
        "chunking_strategy": {
          "static": {
            "chunk_overlap_tokens": 400,
            "max_chunk_size_tokens": 800
          },
          "type": "static"
        },
        "created_at": 1726003029,
        "id": "file-GLPcOLKGf1XMmt5fAntzuIdC",
        "last_error": null,
        "object": "vector_store.file",
        "status": "completed",
        "usage_bytes": 1076,
        "vector_store_id": "vs_FQYDspit4lQaKuJDetPd5HGt"
      }
    ],
    "first_id": "file-F6bVnGrVlB8CgEfkIUOrQGIt",
    "has_more": false,
    "last_id": "file-GLPcOLKGf1XMmt5fAntzuIdC",
    "object": "list"
  },
  "store": {
    "created_at": 1726003028,
    "expires_after": {
      "anchor": "last_active_at",
      "days": 60
    },
    "expires_at": 1731188861,
    "file_counts": {
      "cancelled": 0,
      "completed": 2,
      "failed": 0,
      "in_progress": 0,
      "total": 2
    },
    "id": "vs_FQYDspit4lQaKuJDetPd5HGt",
    "last_active_at": 1726004861,
    "metadata": {},
    "name": "B0013B2D724E0EBAB67AE58BC34A939F",
    "object": "vector_store",
    "status": "completed",
    "usage_bytes": 2149
  }
}
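
In case it helps anyone checking the same thing, this is a small sketch of how the output above can be checked programmatically. It assumes the same combined shape as my dump, with the file listing under `files` and the store object under `store`. The sample data is trimmed to the fields used, and the helper names `summarize_files` and `all_completed` are made up for this example.

```python
import json

# Trimmed copy of the output above, keeping only the fields that are checked.
SAMPLE = json.loads("""
{
  "files": {"data": [
    {"id": "file-F6bVnGrVlB8CgEfkIUOrQGIt", "status": "completed",
     "last_error": null, "usage_bytes": 1073},
    {"id": "file-GLPcOLKGf1XMmt5fAntzuIdC", "status": "completed",
     "last_error": null, "usage_bytes": 1076}
  ]},
  "store": {"file_counts": {"cancelled": 0, "completed": 2, "failed": 0,
                            "in_progress": 0, "total": 2}}
}
""")


def summarize_files(listing):
    """Map each vector store file ID to (status, last_error, usage_bytes)."""
    return {
        f["id"]: (f["status"], f["last_error"], f["usage_bytes"])
        for f in listing["files"]["data"]
    }


def all_completed(listing):
    """True when every file in the store is completed and none has failed."""
    counts = listing["store"]["file_counts"]
    return counts["completed"] == counts["total"] and counts["failed"] == 0


if __name__ == "__main__":
    for file_id, info in summarize_files(SAMPLE).items():
        print(file_id, info)
    print("all completed:", all_completed(SAMPLE))
```

By this check both files report `completed` with no error, which is why I say they have apparently been added successfully.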