From the docs: https://platform.openai.com/docs/assistants/tools/file-search
By default, the `file_search` tool uses the following settings but these can be configured to suit your needs:
- Chunk size: 800 tokens
- Chunk overlap: 400 tokens
- Embedding model: `text-embedding-3-large` at 256 dimensions
- Maximum number of chunks added to context: 20 (could be fewer)
- Ranker: `auto` (OpenAI will choose which ranker to use)
- Score threshold: 0 minimum ranking score
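Just to check my understanding of those defaults, this is roughly how I picture the chunking. It's a local mock with tiktoken, purely for illustration; I'm not claiming this is what the API does internally, and the tokenizer choice is my guess:

```python
# Purely illustrative: a local mock of "800-token chunks, 400-token overlap",
# not what the API actually does server-side. Tokenizer choice is my guess.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def preview_chunks(text, chunk_size=800, overlap=400):
    """Yield overlapping token windows matching the documented defaults."""
    tokens = enc.encode(text)
    step = chunk_size - overlap          # each new chunk starts 400 tokens later
    for start in range(0, max(len(tokens) - overlap, 1), step):
        yield enc.decode(tokens[start:start + chunk_size])

text = open("example.txt").read()        # hypothetical local file
print(sum(1 for _ in preview_chunks(text)), "chunks of up to 800 tokens")
```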
I’m currently creating a thread like below:

```python
import openai

thread = openai.beta.threads.create(
    messages=[
        {
            "role": "user",
            "content": final_query,
            "attachments": [
                # attach every uploaded file to the file_search tool
                {"file_id": file_id, "tools": [{"type": "file_search"}]}
                for file_id in file_ids
            ],
        }
    ]
)
```
Can you help me change parameters like the chunk size, embedding model, etc.?
I can’t find any docs on that.
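What I was hoping for is something along these lines. I'm guessing at the parameter names (`chunking_strategy`, `max_num_results`, `ranking_options`) and at where vector stores live in the Python SDK, so please correct me if any of this isn't real:

```python
import openai

# Guesswork: chunking_strategy when building the vector store, and
# max_num_results / ranking_options on the file_search tool definition.
vector_store = openai.beta.vector_stores.create(
    name="my-docs",                          # hypothetical name
    file_ids=file_ids,
    chunking_strategy={
        "type": "static",
        "static": {
            "max_chunk_size_tokens": 1200,   # instead of the default 800?
            "chunk_overlap_tokens": 300,     # instead of the default 400?
        },
    },
)

assistant = openai.beta.assistants.create(
    model="gpt-4o",                          # placeholder model
    tools=[
        {
            "type": "file_search",
            "file_search": {
                "max_num_results": 10,       # instead of the default 20?
                "ranking_options": {
                    "ranker": "auto",
                    "score_threshold": 0.5,  # instead of the default 0?
                },
            },
        }
    ],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
```

Is something like this supported? And is there any way to change the embedding model or its dimensions, or are those fixed?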