In our application, very often we want the API to just return a table without any comments.
For instance, we need to make a query to OpenAI by the chat completion: “please return a table of all the countries in the world”. We expect to receive only the table without any comments. But sometimes, we still receive comments in the response.
We can try the GPT-4 Turbo. Does anyone know how to ensure receiving just a table? Could we reply on the newly introduced json format or function calling?
There’s a new JSON-mode capability (that I haven’t tried yet) that was just announced yesterday. JSON should be good enough. You can easily convert JSON objects to a table, as long as each JSON object has the same fields.
I’m not sure if you’re wanting CSV text, HTML, or an array of arrays, but regardless converting JSON to that stuff takes like 4 lines of code. It’s not a task for an LLM when you can do it in 4 lines of your own code, and do it more perfectly also.
Could you be more specific? JSON format is quite new, I don’t how to ask OpenAI to return a JSON of a table (rather than a JSON of a text having some tables and some comments).
If you are looking for an AI that won’t chat back at you, your go-to would be gpt-3.5-turbo-instruct. Being the only competent model that will continue after January 2024 that is not encouraged to chat or be ChatGPT, it’s purely an API developer’s processor.
Here the prompt followed by two carriage returns gets the green playground completion reply:
I know there is a JSON mode like response_format= { "type":"json_object" }. But returning a JSON does not mean it’s the JSON of a table, it could well be a JSON of a text having some tables and some comments.
wclayf, it’s kind of you to reply to my thread. But could you be more precise and give a comment which directly addresses my concern (i.e., table)?
In @_j 's approach above, replace the word “table” with “JSON array”, and replace the word “columns” with the word “properties”, and that will resolve your concerns if you indeed decide to go with JSON output, to build your table from.
It’s easier to “machine parse” the output if it’s JSON, imo. But if you only want to display it in Markdown then @_j 's post generates that.
gpt-3.5-turbo-instruct only supports a tiny 4K of context, and a cutoff date of training data from 3 yrs ago.
GPT-4-turbo has data from 2023, massive context length, is a better AI, and has been specifically tuned to return structured data, if you set the type to JSON.