Hello everyone,
I’m developing an API with FastAPI and OpenAI’s Code Interpreter to build a chatbot that provides personalized responses and performs data analysis. However, my costs keep climbing, and I’d like some clarity on where tokens and other OpenAI resources are being consumed and how to optimize their usage.
Problem Summary:
I’m using a function to upload documents, analyze them, and generate personalized responses.
Details:
- Functionality: The user uploads a document (or selects a previously uploaded one) and sends a message for analysis. I’m using OpenAI’s Code Interpreter to analyze the file’s content and generate personalized responses.
- Problem: My OpenAI costs are rising significantly, and I’m not sure where the resources are going (tokens, processing, etc.). I’m only running tests, yet my TPM (tokens per minute) metrics for both input and output are already high.
- Objective: I would like to understand:
- Why am I seeing such high costs, even though I’m only running tests?
- How can I optimize the usage of the Code Interpreter and file handling in my API to reduce costs?
- What is the correct way to use calls like `files.create` and `threads.runs.create` in this context to avoid excessive token usage? (See the sketch right after this list.)
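For context on that last point, my current understanding from the Assistants API docs is that a run accepts per-run token caps and a truncation strategy. This is only a sketch of what I’m considering, not something I’ve validated; the limits are arbitrary placeholders, and `cliente`, `thread_id`, and `assistant_id` are the same names as in my code below:

```python
# Sketch: capping what a single run may consume (values are placeholders).
run = cliente.beta.threads.runs.create(
    thread_id=thread_id,
    assistant_id=assistant_id,
    max_prompt_tokens=4000,       # cap on input tokens for this run
    max_completion_tokens=1000,   # cap on output tokens for this run
    # Send only the most recent messages instead of the whole thread history
    truncation_strategy={"type": "last_messages", "last_messages": 5},
)
```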
The main flow involves uploading files to OpenAI and executing runs to process the file with Code Interpreter. After obtaining the result, I delete the file from OpenAI. Here is the code I’m working with:
My Code:
```python
import asyncio
import os
import tempfile

from bson import ObjectId
from fastapi import File, Form, HTTPException, UploadFile

# `app`, `cliente` (the OpenAI client), `fs` (GridFS), `AssistantResponse`,
# and `process_messages` are defined elsewhere in my project.

@app.post(
    "/analyze_document/",
    tags=["Code Interpreter"],
    summary="Upload and analyze a document",
    response_model=AssistantResponse,
    response_description="The result of the document analysis."
)
async def analyze_document(
    thread_id: str = Form(...),
    assistant_id: str = Form(...),
    file: UploadFile = File(None),  # Now optional
    file_id: str = Form(None),      # Added the file_id parameter
    message: str = Form(...)
):
    temp_file_path = None

    # Check whether a file or a file_id was provided
    if file:
        # Create a temporary file from the uploaded file
        # (keep only the extension as the temp-file suffix)
        suffix = os.path.splitext(file.filename)[1]
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as temp_file:
            temp_file.write(await file.read())
            temp_file_path = temp_file.name
    elif file_id:
        # Retrieve the file from GridFS
        grid_out = fs.get(ObjectId(file_id))
        # Create a temporary file from the GridFS stream
        suffix = os.path.splitext(grid_out.filename)[1]
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as temp_file:
            temp_file.write(grid_out.read())
            temp_file_path = temp_file.name
    else:
        raise HTTPException(status_code=400, detail="Either file or file_id must be provided.")

    # Upload the file to OpenAI from the temporary file on disk
    with open(temp_file_path, "rb") as file_obj:
        file_response = cliente.files.create(
            file=file_obj,
            purpose='assistants'
        )
    openai_file_id = file_response.id
    os.remove(temp_file_path)  # delete=False above, so clean up explicitly

    # Attach the file to the thread's Code Interpreter resources
    cliente.beta.threads.update(
        thread_id=thread_id,
        tool_resources={"code_interpreter": {"file_ids": [openai_file_id]}}
    )

    # No assignment here: reusing the name `message` would shadow the form field
    cliente.beta.threads.messages.create(
        thread_id=thread_id,
        role='user',
        content=message
    )
    run = cliente.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
    )

    # Wait for the run to finish and collect its messages
    text_content = ""
    while True:
        run_status = cliente.beta.threads.runs.retrieve(
            thread_id=thread_id,
            run_id=run.id
        )
        if run_status.status == 'completed':
            messages = cliente.beta.threads.messages.list(
                thread_id=thread_id
            )
            # Process only the messages associated with this run_id
            text_content = process_messages(messages, run.id)
            break
        elif run_status.status in ('failed', 'cancelled', 'expired'):
            # Bail out instead of polling forever on a dead run
            cliente.files.delete(openai_file_id)
            raise HTTPException(status_code=502, detail=f"Run ended with status {run_status.status}")
        else:
            # asyncio.sleep instead of time.sleep so the event loop isn't blocked
            await asyncio.sleep(2)

    cliente.files.delete(openai_file_id)
    return AssistantResponse(message=text_content)
```
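One thing I’ve started doing to trace where the tokens go: if I’m reading the docs right, the Run object reports its own usage once it finishes, so I log it right after the polling loop (a sketch, untested):

```python
# Assumption: completed runs expose a `usage` object with token counts.
if run_status.usage:
    print(
        f"run {run.id}: prompt={run_status.usage.prompt_tokens}, "
        f"completion={run_status.usage.completion_tokens}, "
        f"total={run_status.usage.total_tokens}"
    )
```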
My Specific Questions:
- Is uploading and then deleting a file on every request efficient in terms of cost? Should I reuse files in some way instead? (I sketched one idea after this list.)
- Does the Code Interpreter tokenize the entire file, or just the portion that is actually processed?
- Is there any way to optimize this flow so that I don’t burn through so many tokens during testing?
- How can I correctly use OpenAI’s tools to keep costs under control while designing my chatbot?
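On question 1, here is the reuse idea I’ve been toying with: cache the OpenAI file id per GridFS document so repeat requests skip the upload entirely. `_openai_file_cache` and `get_or_upload` are hypothetical names I made up for this sketch; with this approach I would also stop deleting the file on every request and expire cache entries separately:

```python
# Hypothetical sketch of file reuse: map my GridFS id -> OpenAI file id.
_openai_file_cache: dict[str, str] = {}

def get_or_upload(gridfs_id: str, temp_file_path: str) -> str:
    cached = _openai_file_cache.get(gridfs_id)
    if cached:
        return cached  # reuse the file already uploaded to OpenAI
    with open(temp_file_path, "rb") as f:
        file_response = cliente.files.create(file=f, purpose='assistants')
    _openai_file_cache[gridfs_id] = file_response.id
    return file_response.id
```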
I would greatly appreciate any guidance or advice from the community.
Thanks!