I’m working on implementing a feature in my real-world application where I need to extract answers from 2-3 uploaded documents using around 20-30 predefined questions. I’ve been using the `file_search` tool in the OpenAI Assistant API, and so far it’s been working well: I’m able to retrieve relevant answers.
Here’s what I’ve done so far:
- Created an assistant.
- Looped through the list of questions (not using async).
- Created a separate thread for each question using `createAndRun`, attaching the relevant `fileId` and `vectorStoreId`.
- When `message.event === 'thread.run.completed'`, I save the answer.
- When `message.event !== 'thread.run.completed'`, I handle it as “unable to fetch at this moment.”
My questions are:
- Can I use a single thread for all the questions, considering that OpenAI will automatically truncate if the context window is full? Would using a single thread be beneficial in this case?
- If a vector store is assigned to a thread, it will search across all the files within that store to find relevant answers. Is it better to maintain different vector stores for different sets of files, or should I explicitly attach specific `fileIds` to the threads?
- Are there any other improvements or best practices I could apply to my current approach?
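To make the first question concrete, here is a rough sketch of what the single-thread variant might look like. It is not tested against the real API: the `AssistantsClient` interface is a hypothetical, trimmed-down stand-in modeled on the openai Node SDK's beta Assistants methods (`threads.create`, `threads.messages.create`, `threads.runs.createAndPoll`, `threads.messages.list`), and `askAllInOneThread` is a name made up for illustration. With the real SDK you would use its own types.

```typescript
// Hypothetical minimal types for the client calls this sketch uses.
// They are assumptions modeled on the openai Node SDK, not its real typings.
interface TextBlock { type: string; text?: { value: string } }
interface ThreadMessage { role: string; content: TextBlock[] }
interface AssistantsClient {
  beta: {
    threads: {
      create(body: {
        tool_resources?: { file_search?: { vector_store_ids: string[] } };
      }): Promise<{ id: string }>;
      messages: {
        create(threadId: string, body: { role: 'user'; content: string }): Promise<unknown>;
        list(threadId: string, query: { run_id: string }): Promise<{ data: ThreadMessage[] }>;
      };
      runs: {
        createAndPoll(
          threadId: string,
          body: { assistant_id: string },
        ): Promise<{ id: string; status: string }>;
      };
    };
  };
}

interface QuestionResult { question: string; answer: string | null }

// Asks every question in ONE thread, one run per question.
async function askAllInOneThread(
  client: AssistantsClient,
  assistantId: string,
  vectorStoreId: string,
  questions: string[],
): Promise<QuestionResult[]> {
  // The vector store is attached once, when the thread is created.
  const thread = await client.beta.threads.create({
    tool_resources: { file_search: { vector_store_ids: [vectorStoreId] } },
  });

  const results: QuestionResult[] = [];
  for (const question of questions) {
    await client.beta.threads.messages.create(thread.id, { role: 'user', content: question });
    const run = await client.beta.threads.runs.createAndPoll(thread.id, {
      assistant_id: assistantId,
    });
    if (run.status !== 'completed') {
      results.push({ question, answer: null }); // "unable to fetch at this moment"
      continue;
    }
    // Fetch only the messages produced by this run, not the whole history.
    const page = await client.beta.threads.messages.list(thread.id, { run_id: run.id });
    let text = '';
    for (const message of page.data) {
      if (message.role !== 'assistant') continue;
      for (const block of message.content) {
        if (block.type === 'text' && block.text) text += block.text.value;
      }
    }
    results.push({ question, answer: text });
  }
  return results;
}
```

The trade-off this makes visible: in one thread, every earlier question and answer stays in the context of later runs, so each run carries more tokens and an earlier answer can influence a later one. Separate threads keep the questions independent, which seems to matter when the 20-30 questions are unrelated.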
I’m new to this—does my approach sound reasonable, or is there anything I should consider refining?
for (let i = 0; i < questions.length; i++) {
  await this.fetchAnswer(
    fileId,
    vectorStoreId,
    questions[i].question,
    assistant,
    results
  );
}
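
The loop above awaits one question at a time, so the total time is the sum of all runs. Since each question gets its own thread, the calls are independent and could overlap. Below is a minimal sketch of a bounded-concurrency helper. `mapWithLimit` is a hypothetical name, not part of any SDK, and the right limit depends on your account's rate limits.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`; a rejected call is stored as `null`
// so one failed question does not abort the rest.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<(R | null)[]> {
  const results: (R | null)[] = new Array(items.length).fill(null);
  let next = 0; // index of the next unclaimed item

  // Each lane repeatedly claims the next item until none are left.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claimed synchronously, before any await
      try {
        results[i] = await worker(items[i], i);
      } catch {
        results[i] = null;
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

With this, the loop could become something like `await mapWithLimit(questions, 4, q => this.fetchAnswer(fileId, vectorStoreId, q.question, assistant, results))`, where 4 is an arbitrary starting value.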
private async processQuestion(...) {
  return this.openai.beta.threads
    .createAndRun({
      assistant_id: assistant.id,
      thread: {
        messages: [
          {
            role: 'user',
            content: question,
            attachments: [
              { file_id: fileId, tools: [{ type: 'file_search' }] },
            ],
          },
        ],
        tool_resources: {
          file_search: {
            vector_store_ids: [vectorStoreId],
          },
        },
      },
      // `stream` is an option of the run request itself, not of the thread.
      stream: true,
    })
    .then(async run => {
      for await (const message of run) {
        if (message.event === 'thread.run.failed') {
          // Save "unable to retrieve at this moment" for this question
          // (the save call is omitted here).
          return true;
        }
        if (message.event === 'thread.run.completed') {
          // Fetch only the messages produced by this run.
          const messages = await this.openai.beta.threads.messages.list(
            message.data.thread_id,
            {
              run_id: message.data.id,
            },
          );
          // Save the answer from `messages` (omitted here).
        }
      }
    });
}
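
For the "save it" step, the answer text has to be pulled out of the listed messages. Here is a minimal sketch, assuming each message carries an array of content blocks where text blocks look like `{ type: 'text', text: { value, annotations } }` and each annotation's `text` field holds the citation marker that `file_search` inserted into the answer. The interfaces and the name `extractAnswerText` are hypothetical simplifications, not the SDK's own types.

```typescript
// Simplified, assumed shapes for listed messages (not the SDK's typings).
interface Annotation { text: string }
interface ContentBlock {
  type: string;
  text?: { value: string; annotations?: Annotation[] };
}
interface AssistantMessage { role: string; content: ContentBlock[] }

// Joins the text of all assistant messages and strips citation markers.
function extractAnswerText(messages: AssistantMessage[]): string {
  const parts: string[] = [];
  for (const message of messages) {
    if (message.role !== 'assistant') continue; // skip the user's question
    for (const block of message.content) {
      if (block.type !== 'text' || !block.text) continue; // skip non-text blocks
      let value = block.text.value;
      for (const annotation of block.text.annotations ?? []) {
        // Remove the inline marker; keep it instead if you want to show sources.
        value = value.split(annotation.text).join('');
      }
      parts.push(value);
    }
  }
  return parts.join('\n').trim();
}
```

In the handler above this would be called as `extractAnswerText(messages.data)`.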