Does OpenAI not want us to send structured data in the prompt?

Something is really strange here, again.

Using the normal API for GPT-3.5 (or the web version), I wasted 4 days trying to feed a JSON string into the API for simple analysis (it's a table of data).
When everything failed (the chat responses were nonsense), I decided to just feed it simple lines of data and a query instead, for example:

#index = 36, date = 2023-07-24 00:00:00, value = 192.75,
#index = 37, date = 2023-07-25 00:00:00, value = 193.62,
#index = 38, date = 2023-07-26 00:00:00, value = 194.5,
#index = 39, date = 2023-07-27 00:00:00, value = 193.22,
#index = 40, date = 2023-07-28 00:00:00, value = 195.83,
#index = 41, date = 2023-07-31 00:00:00, value = 196.45,
#index = 42, date = 2023-08-01 00:00:00, value = 195.605,
#index = 43, date = 2023-08-02 00:00:00, value = 192.58,
#index = 44, date = 2023-08-03 00:00:00, value = 191.17,
#index = 45, date = 2023-08-04 00:00:00, value = 181.99,
#index = 46, date = 2023-08-07 00:00:00, value = 178.85,
#index = 47, date = 2023-08-08 00:00:00, value = 179.8,
#index = 48, date = 2023-08-09 00:00:00, value = 178.19,
#index = 49, date = 2023-08-10 00:00:00, value = 177.97,
#index = 50, date = 2023-08-11 00:00:00, value = 177.79,
#index = 51, date = 2023-08-14 00:00:00, value = 179.46,
#index = 52, date = 2023-08-15 00:00:00, value = 177.45,

Now ask it: when was the value higher than 195? (both web and API.)
In the best-case scenario it will ignore the floating point and say only index 41.

In the bad cases, it will simply give you many indexes such as 36, 37, etc., where you can clearly see that the value is lower than 195. These mistakes are brutal; nothing sophisticated was asked.
Nothing is consistent.

I tried EVERYTHING. It is random, it is wrong all the time. You can also ask when it was lower than a certain value and get wrong answers as well.

  1. Is it on purpose, to stop people from doing something before their plugins do? (They trick their devs all the time.)
  2. Is there any official way to feed JSON/strings/tables into the API? There is nothing about this anywhere in their documentation, FOR SOME REASON.

Any help would be appreciated on how professionals feed a table into the API.

GPT is a large language model. Large language models are not yet very good at analyzing quantitative data. For quantitative data analysis you should use Code Interpreter or your own version of a similar tool.

Create a function and call it whenever you need such a query. You will probably be getting your data from some endpoint anyway, but even if you are loading it from, say, a JSON file, you can still invoke a function to do your analysis and just use the AI to hand the result to you in a conversational way.
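A minimal sketch of that idea, using a few of the rows from the question already parsed into Python dicts. The parsing step and the final chat call are my own assumptions for illustration (the API call needs a key, so it is only shown as a comment): the comparison is done deterministically in code, and the model only gets the precomputed result to phrase.

```python
# Answer "when was the value higher than X?" in code, then let the
# model phrase the precomputed result conversationally.
rows = [
    {"index": 40, "date": "2023-07-28", "value": 195.83},
    {"index": 41, "date": "2023-07-31", "value": 196.45},
    {"index": 42, "date": "2023-08-01", "value": 195.605},
    {"index": 43, "date": "2023-08-02", "value": 192.58},
]

def indices_above(rows, threshold):
    """Return the indices of all rows whose value exceeds threshold."""
    return [r["index"] for r in rows if r["value"] > threshold]

result = indices_above(rows, 195)  # computed exactly, not by the LLM
prompt = (
    f"The value was above 195 at indices {result}. "
    "Summarize this for the user in one sentence."
)
# Hypothetical final step (requires the openai package and an API key):
# client.chat.completions.create(model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": prompt}])
```

The key point is that the arithmetic never touches the model; the LLM only turns an already-correct result into prose.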

Wait, but can't GPT find a maximum point in a list? :slight_smile:
Like, it can analyze full Excel sheets but can't find which number is larger than another?

Something strange is going on here: whenever you try to do something advanced enough, they stop you. Six months ago they did the same thing by slowing down responses.
They can recognize all sorts of amazing stuff, but they can't analyze a list…

I can't use functions for this because it is not clear what the user will ask; you can't build 1,000 functions, one for each scenario.

To insert JSON into a message I simply do that as a string, so no magic here. However, I noticed that you need to structure it in an efficient way and also provide the model with some guidance about the data structure. You can have a look at how I implemented it on GitHub: nicolaszimmer/json-compact. I'm processing health and activity data for analysis and it works pretty well, although GPT-4 is much better at complex queries and Code Interpreter is a big plus.
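To illustrate the "structure it efficiently" point (this is my own sketch of the general idea, not the json-compact implementation itself): collapsing a list of records into a header-plus-rows layout states the keys once instead of repeating them per record, which shortens the string the model has to read.

```python
import json

records = [
    {"index": 36, "date": "2023-07-24", "value": 192.75},
    {"index": 37, "date": "2023-07-25", "value": 193.62},
]

def compact(records):
    """Turn a list of dicts into a columns/rows layout (keys stated once)."""
    columns = list(records[0])
    return {"columns": columns,
            "rows": [[r[c] for c in columns] for r in records]}

packed = compact(records)
# The compact form serializes shorter than the repeated-key original.
assert len(json.dumps(packed)) < len(json.dumps(records))
```

Alongside the compact payload, a one-line schema hint in the prompt ("each row is [index, date, value]") is the kind of guidance about the data structure mentioned above.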

If you do that, it will interpret it as a text stream, which means it reads left to right and top to bottom. However, structured data has no inherent left-to-right order, and each record is independent, so no top-to-bottom inference should exist either; but LLMs will make one up, and the results will be inconsistent. It's a language model, not a query/analytics model.