Answering questions about excel-sheet - random response?

Hello - I am trying to answer a question about an Excel sheet using the following code. My problem is that the answer (YES or NO) seems to be output at random - when I run the program several times I sometimes get YES as the answer and sometimes NO…

How can this be?
Is my prompt not optimal, or why doesn't this always work?

import os
import sys
import time
from dotenv import load_dotenv
from openai import OpenAI

path = os.path.abspath(os.path.dirname(sys.argv[0]))

fn = os.path.join(path, "gpp", "FILES", "env.env")
load_dotenv(fn)
CHATGPT_API_KEY = os.environ.get("CHATGPT_API_KEY")
client = OpenAI(api_key=CHATGPT_API_KEY)

# Upload the spreadsheet so the assistant's code interpreter can read it
fn = os.path.join(path, "gpp", "FILES", "test.xlsx")
file = client.files.create(
  file=open(fn, "rb"),
  purpose="assistants"
)

assistant = client.beta.assistants.create(
  instructions="You are a machine learning researcher, answer questions about the data in the excel-sheet. Answer only with YES or NO and nothing else",
  model="gpt-4o",
  tools=[{"type": "code_interpreter"}],
  tool_resources={
    "code_interpreter": {
      "file_ids": [file.id]
    }
  }
)

print("Preparing thread")
thread = client.beta.threads.create()
print("Preparing question")
client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  content="Is Methylene Chlorid in the excel-sheet?"
)

print("Running for answer")
run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id
)
# Poll until the run reaches a terminal state; sleep so we don't hammer the API
while run.status not in ["completed", "failed", "cancelled", "expired"]:
  time.sleep(1)
  run = client.beta.threads.runs.retrieve(
    thread_id=thread.id,
    run_id=run.id
  )

if run.status == "completed":
  results = client.beta.threads.messages.list(thread_id=thread.id)
  print(results.data[0].content[0].text.value)
else:
  print(f"Run ended with status: {run.status}")

Hrm. Maybe ask in the system prompt to explain its answer and run a few times to see what it’s doing?
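To build on the run-it-a-few-times idea: you can collect the model's answer across several runs and tally them. A minimal sketch (the `answers` list here is made-up sample data standing in for real API responses):

```python
from collections import Counter

def majority_answer(answers):
    """Tally normalized YES/NO answers and return the most common one."""
    counts = Counter(a.strip().upper() for a in answers)
    answer, _ = counts.most_common(1)[0]
    return answer, counts

# Made-up sample: what five repeated runs of the assistant might have returned
answers = ["YES", "no", "YES", "Yes", "NO"]
best, counts = majority_answer(answers)
print(best)  # YES (3 of 5 runs)
print(dict(counts))
```

If the votes are split anywhere near evenly, the prompt or the data format is the problem, not bad luck on one run.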

How big is the Excel sheet? Spreadsheet tables aren’t the easiest for an LLM to “read”… See more here…


You have to reformat how you present the spreadsheet's data to the LLM. Using the same basic anchoring techniques I outline in that other post, I've successfully gotten the model to extract facts from spreadsheets with upwards of a million rows.

You obviously need other techniques to reason over a million of anything, but it's doable, and the core principles are touched on in that post. I'm planning to create a blog post that dives deeper into the subject, but I have a dozen projects I'm working on right now.


Given that these models are just pattern matchers, they're actually excellent at reasoning over spreadsheets if you properly format the data as a large sequence of patterns for them to work with… I'll give a quick example to help build this intuition…

| Item              | In Stock | Price |
| :---------------- | :------: | ----: |
| Python Hat        |   True   | 23.99 |
| SQL Hat           |   True   | 23.99 |
| Codecademy Tee    |  False   | 19.99 |
| Codecademy Hoodie |  False   | 42.99 |

The model sees that table in 1D as this:

| Item | In Stock | Price || :---------------- | :------: | ----: || Python Hat | True | 23.99 || SQL Hat | True | 23.99 || Codecademy Tee | False | 19.99 || Codecademy Hoodie | False | 42.99 |

If we ask the model "what's the price of the codecademy hoodie?", the model has to pick out the handful of relevant tokens while ignoring all of the surrounding tokens that might confuse it.

The biggest issue is that the Price label/anchor is a long distance from the other tokens needed to answer the question. As you add rows and columns, that distance keeps growing, and the reliability of the model's reasoning is a direct function of the distance between the tokens needed to answer a question.

We can help the model by moving all of these tokens closer together:

Item: Python Hat
In Stock: True
Price: 23.99

Item: SQL Hat
In Stock: True
Price: 23.99

Item: Codecademy Tee
In Stock: False
Price: 19.99

Item: Codecademy Hoodie
In Stock: False
Price: 42.99

Now all of the information needed to answer our query is within a few tokens of each other. We've dramatically increased the reliability with which we can reason over this data… But there's more going on here than just moving all these tokens closer together…
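A crude way to see the distance effect concretely: measure character offsets (a rough stand-in for token distance) between the `Price` label and the hoodie's price in each layout:

```python
# Flattened markdown table, as the model sees it in 1D
flat = ("| Item | In Stock | Price || Python Hat | True | 23.99 |"
        "| SQL Hat | True | 23.99 || Codecademy Tee | False | 19.99 |"
        "| Codecademy Hoodie | False | 42.99 |")
# The same row in the reshaped "field: value" format
record = "Item: Codecademy Hoodie\nIn Stock: False\nPrice: 42.99"

def label_to_value_distance(text, label, value):
    """Characters between the first occurrence of the label and of the value."""
    return text.index(value) - text.index(label)

print(label_to_value_distance(flat, "Price", "42.99"))    # large gap
print(label_to_value_distance(record, "Price", "42.99"))  # label right next to value
```

In the flat table the label and value are over a hundred characters apart; in the record format they're a few characters apart.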

The model flattens this sequence out to a 1D line and, in the process, it's also mapping all of the patterns it sees to higher-dimensional concepts that intersect with its world knowledge. So with this reshaped table, what the model sees concept-wise is this:

That clustering of concepts is where the magic really starts to happen…

Note: the bounding boxes I drew are a simplification of the concepts and patterns the model actually sees. It also sees that the first 2 items are things you wear on your head and the second 2 are things you'd wear to cover your chest. The last 2 items are related to a website where you can learn coding, and all 4 are apparel items programmers would wear, etc…

The important point is that everything is just a pattern to the model, and the more you can shape the information you show the model as a series of repeating patterns, the better the responses you'll get from the model.
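Producing that repeating "Field: value" pattern from tabular rows can be done mechanically. A minimal sketch, using the example table above as in-memory rows:

```python
# Example rows, matching the table above
rows = [
    {"Item": "Python Hat", "In Stock": True, "Price": 23.99},
    {"Item": "SQL Hat", "In Stock": True, "Price": 23.99},
    {"Item": "Codecademy Tee", "In Stock": False, "Price": 19.99},
    {"Item": "Codecademy Hoodie", "In Stock": False, "Price": 42.99},
]

def rows_to_records(rows):
    """Render each row as a block of 'Field: value' lines, blank line between rows."""
    blocks = ["\n".join(f"{key}: {value}" for key, value in row.items())
              for row in rows]
    return "\n\n".join(blocks)

print(rows_to_records(rows))
```

The output is exactly the reshaped format shown above, ready to paste into a prompt.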


Hello - the Excel sheet is very small - it has only 217 rows.
The sheet is also very simple - it just looks like that.
But it's still not working.
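With only 217 rows, a plain deterministic string search over the same data makes a useful cross-check against the model's answer. A minimal sketch (the substance names below are made up; in practice they'd be read from test.xlsx, e.g. with pandas' `read_excel`):

```python
# Made-up stand-in for the single column of the 217-row sheet
substances = ["Acetone", "Methylene Chloride", "Toluene"]

query = "methylene chlorid"  # the spelling used in the original question

# Case-insensitive substring match, like the model is being asked to do
found = any(query.lower() in name.lower() for name in substances)
print("YES" if found else "NO")  # always the same answer for the same data
```

If this disagrees with the assistant's answer on some runs, the problem is the model's sampling or its reading of the sheet, not the data itself.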