Upload of a JSON file: same question, same model, but different answers

Hello,

I am currently working on converting Excel files to the JSON format. Unlike ChatGPT, OpenAI’s API does not accept direct uploads of Excel files. It seems that you need to use a format recognized by the model, such as JSON.
I therefore created a JSON file and submitted it to the GPT-4o model in three different ways. The same question gives me three different answers.

Use case:
Upload of a JSON file with a list of subscribers (persons).
The JSON file contains 5000 subscriber entries.

Here is a sample of the file with two records.

Question to the model:
How many times do you find the first name Judie in the subscribers?

  1. From platform.openai.com (API)
  • Assistants Playground (Assistant V2, model gpt-4o).
  • File is attached to the thread.

Model reply: The first name “Judie” appears 20 times in the subscribers list[1].

  2. From Microsoft Azure OpenAI Studio
  • Assistants Playground (Assistant V2, gpt-4o version: 2024-05-13)
  • File is attached to the thread.

Model reply: The first name “Judie” appears once in the subscribers list【4:0†source】.

  3. From chatgpt.com
  • File uploaded in the prompt.
  • model gpt-4o

Model reply: The first name “Judie” appears 100 times in the subscribers list.

The right answer is given by ChatGPT (100 times).

Any idea what is going on, and why the model does not give the right answer through the OpenAI API or the Azure OpenAI service?

Thank you for any feedback.

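LLMs are currently not particularly good at counting. Most likely, the ChatGPT interface uses Code Interpreter and counts the instances with a Python script in the background. The citation markers in the two Assistants replies suggest that those runs went through file search instead, in which case the model likely saw only retrieved chunks of the file, not all 5000 entries.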
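For illustration, the kind of script Code Interpreter might run for this question could look like the sketch below. The layout of the file is not shown in the thread, so the flat list of objects, the `first_name` key, and the helper name `count_first_name` are all assumptions.

```python
import json


def count_first_name(records, name, key="first_name"):
    """Count records whose `key` field matches `name` (case-insensitive, trimmed)."""
    target = name.strip().casefold()
    return sum(
        1
        for record in records
        if str(record.get(key, "")).strip().casefold() == target
    )


# Small in-memory stand-in for the 5000-entry file (field names are assumed).
subscribers = json.loads("""
[
  {"first_name": "Judie",  "last_name": "Smith"},
  {"first_name": "Marc",   "last_name": "Dupont"},
  {"first_name": "judie ", "last_name": "Brown"}
]
""")

print(count_first_name(subscribers, "Judie"))
```

Because the count comes from a plain loop over every record, it does not depend on how much of the file fits in the model's context.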

Thank you for your feedback.

If I understand correctly, in the development of my chatbot, if I want this chatbot to be capable of answering (more or less complex) questions about the content of an Excel document, I will need to:

1] Convert the Excel file to JSON (done by my chatbot application).
2] Send the JSON file along with the user’s question in the prompt and ask the model to generate code (Python or C#) on the fly that processes the file and computes the result; the model returns this code to the chatbot application (done by OpenAI).
3] Execute the code with the JSON file as input (done by my chatbot application) and retrieve the results.
4] Send the results back to the model to translate them into natural language.
5] Return the model’s response to the user.
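Steps 2 to 5 could be sketched roughly as follows. This is only an outline under assumptions: the two model calls are passed in as plain functions (`generate_code` and `summarize`, both hypothetical) and are replaced here by stubs so the sketch runs without any API key, and the `first_name` field is assumed. Running generated code in a subprocess with a timeout is not a real sandbox.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code, records, timeout=10.0):
    """Step 3: run generated Python in a separate process and return its stdout.

    The generated script receives the path of the JSON file as its first argument.
    """
    with tempfile.TemporaryDirectory() as tmp:
        data_path = Path(tmp) / "data.json"
        script_path = Path(tmp) / "generated.py"
        data_path.write_text(json.dumps(records), encoding="utf-8")
        script_path.write_text(code, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, str(script_path), str(data_path)],
            capture_output=True, text=True, timeout=timeout, check=True,
        )
    return result.stdout.strip()


def answer_question(question, records, generate_code, summarize):
    code = generate_code(question, records[:2])  # step 2: the model sees a small sample only
    raw = run_generated_code(code, records)      # step 3: the application runs the code
    return summarize(question, raw)              # steps 4 and 5: phrase the result


# Stubs standing in for the two model calls (hypothetical, no network access).
def fake_generate_code(question, sample):
    return (
        "import json, sys\n"
        "with open(sys.argv[1], encoding='utf-8') as f:\n"
        "    rows = json.load(f)\n"
        "print(sum(1 for r in rows if r.get('first_name') == 'Judie'))\n"
    )


def fake_summarize(question, raw):
    return f"The first name Judie appears {raw} time(s)."


records = [{"first_name": "Judie"}, {"first_name": "Marc"}, {"first_name": "Judie"}]
print(answer_question("How many Judie?", records, fake_generate_code, fake_summarize))
```

One design choice in this sketch differs from step 2 as written: only the first two records are sent to the model, since the field names are probably all it needs to write the code, and the full file stays on the application side.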

I think I will try using OpenAI’s function calling mechanism for steps 3, 4, and 5.
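With function calling, the model would not write code at all. It would pick a function that the application already implements and supply the arguments. Below is a minimal sketch of the application side, assuming a single hypothetical tool named `count_by_field`. The tool description follows the JSON Schema layout used for function tools, and no request is sent here.

```python
import json

# Tool description that would be passed to the model (the name and fields are assumptions).
COUNT_TOOL = {
    "type": "function",
    "function": {
        "name": "count_by_field",
        "description": "Count subscriber records whose field equals a given value.",
        "parameters": {
            "type": "object",
            "properties": {
                "field": {"type": "string", "description": "Record key, e.g. first_name"},
                "value": {"type": "string", "description": "Value to look for"},
            },
            "required": ["field", "value"],
        },
    },
}


def count_by_field(records, field, value):
    """Deterministic local implementation behind the tool."""
    target = value.strip().casefold()
    return sum(
        1 for r in records if str(r.get(field, "")).strip().casefold() == target
    )


LOCAL_TOOLS = {"count_by_field": count_by_field}


def dispatch_tool_call(name, arguments_json, records):
    """Run the function the model asked for; its arguments arrive as a JSON string."""
    args = json.loads(arguments_json)
    result = LOCAL_TOOLS[name](records, **args)
    return json.dumps({"result": result})


records = [{"first_name": "Judie"}, {"first_name": "Marc"}, {"first_name": "Judie"}]
print(dispatch_tool_call("count_by_field", '{"field": "first_name", "value": "Judie"}', records))
```

The JSON string returned by `dispatch_tool_call` is what the application would send back to the model as the tool result, so that the model can phrase the final answer (steps 4 and 5).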

Any comments or advice are welcome.
Thank you.