Hi everyone,
I’m working on a Python script that uploads a spreadsheet (.csv
, .xlsx
, or .xls
), uses OpenAI’s code_interpreter
tool to make modifications based on a prompt, and then retrieves the updated file. My goal is to automate this process so that the spreadsheet can be modified dynamically and the updated version can be saved locally.
Here’s a summary of what I’m trying to do:
- Upload a spreadsheet file to OpenAI (in this case, I’ve converted it to CSV for consistency).
- Use OpenAI’s API to create an Assistant with the
code_interpreter
tool and provide instructions to modify the spreadsheet (e.g., add a new column with specific values). - Retrieve the modified file or content and save it locally as a new file.
The issues I’m facing:
- Although my script seems to run without major errors, the updated spreadsheet content is not being correctly modified or retrieved.
- The Assistant often returns text responses instead of actual file content.
- I receive messages indicating file parsing issues, such as delimiter problems in the CSV.
- When attempting to fetch the updated content, the file either does not get updated as expected or does not get saved locally.
Specific Questions:
- Am I using the correct endpoints for this task (
files.create
,assistants.create
, etc.)? - Is the
code_interpreter
tool the best approach to use for spreadsheet manipulation, or is there another tool within OpenAI’s API that might work better? - What is the recommended way to structure prompts for spreadsheet modifications to ensure the Assistant performs the requested tasks correctly (e.g., adding a column with specific values)?
- How can I ensure the modified spreadsheet is properly retrieved and saved locally?
Here is the full script I’m using:
python
Copy code
import os
import time
import pandas as pd
from openai import OpenAI
# Initialize the OpenAI client with your API key
client = OpenAI(api_key='YOUR_OPENAI_API_KEY')
# Define the directory where the updated file will be saved
output_dir = "/path/to/your/output/directory"
output_file = os.path.join(output_dir, "updated_spreadsheet.csv")
# Ensure the output directory exists
if not os.path.exists(output_dir):
os.makedirs(output_dir)
# Step 1: Convert the Excel file to CSV
input_excel = "/path/to/your/input/spreadsheet.xlsx"
csv_file_path = os.path.join(output_dir, "test.csv")
# Convert Excel to CSV
df = pd.read_excel(input_excel)
df.to_csv(csv_file_path, index=False)
# Step 2: Upload the CSV file to OpenAI
try:
with open(csv_file_path, "rb") as file:
file_response = client.files.create(file=file, purpose='assistants')
file_id = file_response.id
print(f"First file uploaded: {file_id}")
except Exception as e:
print(f"An error occurred while uploading the file: {e}")
exit()
# Step 3: Create an Assistant with the code_interpreter tool
try:
assistant = client.beta.assistants.create(
name="Spreadsheet Assistant",
description="An Assistant for editing and overwriting CSV files.",
model="gpt-4",
tools=[{"type": "code_interpreter"}],
tool_resources={
"code_interpreter": {
"file_ids": [file_id]
}
}
)
assistant_id = assistant.id
print(f"Assistant created: {assistant_id}")
except Exception as e:
print(f"An error occurred while creating the Assistant: {e}")
exit()
# Step 4: Start a thread with instructions to modify the spreadsheet
try:
thread = client.beta.threads.create(
messages=[
{
"role": "user",
"content": (
"Please load the CSV file, add a new column called 'Test Column' with the value 'test' "
"for all rows, and return the updated CSV file as a download."
),
"attachments": [
{
"file_id": file_id,
"tools": [{"type": "code_interpreter"}]
}
]
}
]
)
thread_id = thread.id
print(f"Thread created: {thread_id}")
except Exception as e:
print(f"An error occurred while creating the thread: {e}")
exit()
# Step 5: Run the thread and wait for the Assistant to complete the task
try:
run = client.beta.threads.runs.create(
thread_id=thread_id,
assistant_id=assistant_id
)
run_id = run.id
# Poll for the run to complete
run_status = run.status
while run.status == "queued" or run.status == "in_progress":
run = client.beta.threads.runs.retrieve(
thread_id=thread_id,
run_id=run_id
)
time.sleep(0.5)
run_status = run.status
print(f"Run status: {run_status}")
if run_status == "completed":
print("Run completed successfully!")
else:
print(f"Run failed with status: {run_status}")
exit()
except Exception as e:
print(f"An error occurred during the run: {e}")
exit()
# Step 6: Retrieve the updated CSV content from the Assistant's response
try:
print("Step 6: Fetching the list of messages in the thread...")
# Retrieve the list of messages
messages_response = client.beta.threads.messages.list(thread_id=thread_id)
messages = messages_response.data
print(f"Total messages retrieved: {len(messages)}")
csv_content = None
# Inspect the content of each message
for idx, message in enumerate(messages):
print(f"Message {idx + 1}/{len(messages)}: {message}")
print(f"Message role: {message.role}, run_id: {message.run_id}")
# Check for text content that looks like CSV content
for content_block in message.content:
if content_block.type == "text" and ',' in content_block.text.value:
csv_content = content_block.text.value
print(f"Found potential CSV content in message {idx + 1}")
break
if csv_content:
break
# If CSV content was found, save it locally
if csv_content:
print("Saving the updated CSV content locally...")
with open(output_file, "w") as f:
f.write(csv_content)
print(f"Updated CSV saved successfully as '{output_file}'.")
# Optionally, save as Excel
df = pd.read_csv(output_file)
df.to_excel(os.path.join(output_dir, "updated_spreadsheet.xlsx"), index=False)
print(f"Updated Excel file saved as 'updated_spreadsheet.xlsx'.")
else:
print("No CSV content found in the Assistant's response.")
except Exception as e:
print(f"An error occurred while retrieving the updated content: {e}")
Any suggestions or guidance would be greatly appreciated! If someone has successfully used the OpenAI API to modify a spreadsheet and retrieve the updated file via Python, I’d love to hear how you approached it. Thanks in advance!