Is there a better approach to getting JSON from GPT?

I’m using the API to analyze conversations between people and extract different factors like:
- Was the last message a question?
- Was any media requested?
- Did the last user to speak insult the other one?
- Etc.

To extract this data, I use the completion endpoint and ask it to output JSON with these parameters (with a key name specified for each one).

It kind of works 90% of the time, but there are times when the output is malformed or the answer is simply wrong.

Is there any better approach?

You could force it to do a function call, which always (?) returns properly structured JSON. Well, it doesn’t really DO the call; it just generates arguments that satisfy your function’s parameter requirements.

So you could pass the last message and tell it to call was_last_message_a_question(bool wasQuestion), and I think it’ll return:

{"wasQuestion": true}

You can receive multiple arguments in one call, but I think forcing it to consider too many aspects at once would degrade reliability. Then again, each separate call costs more tokens, so…
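Here is a rough sketch of what the forced function call could look like as a Chat Completions request body, assuming the `tools` / `tool_choice` request fields. The model name is a placeholder, the function and parameter names are the hypothetical ones from this post, and the response object is a hand-written stand-in rather than real API output.

```javascript
// Sketch only: builds a request body that forces exactly one function call.
// "your-model-name" is a placeholder; the function name and its parameter
// are the hypothetical ones from the post above.
function buildQuestionCheckRequest(lastMessage) {
  return {
    model: "your-model-name",
    messages: [
      { role: "system", content: "Classify the last message of a conversation." },
      { role: "user", content: lastMessage },
    ],
    tools: [
      {
        type: "function",
        function: {
          name: "was_last_message_a_question",
          description: "Report whether the last message was a question.",
          parameters: {
            type: "object",
            properties: { wasQuestion: { type: "boolean" } },
            required: ["wasQuestion"],
          },
        },
      },
    ],
    // Ask for this specific function instead of a prose reply.
    tool_choice: {
      type: "function",
      function: { name: "was_last_message_a_question" },
    },
  };
}

// The arguments arrive as a JSON string, so they still need to be parsed.
function readFunctionArguments(response) {
  const call = response.choices[0].message.tool_calls[0];
  return JSON.parse(call.function.arguments);
}

// Hand-written stand-in for an API response, for illustration only.
const fakeResponse = {
  choices: [
    {
      message: {
        tool_calls: [
          {
            function: {
              name: "was_last_message_a_question",
              arguments: '{"wasQuestion": true}',
            },
          },
        ],
      },
    },
  ],
};

console.log(readFunctionArguments(fakeResponse).wasQuestion);
```

Adding more properties to `parameters` would cover the other factors in one call, at the cost of a larger schema in every request.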

Although, with function use, you’re adding quite a lot of tokens to supply the function schema. On the other hand, you can drop any instructions about “reply in JSON”.

Or this might be a good candidate for fine-tuning.

Try to use:
response_format={"type": "json_object"}

https://platform.openai.com/docs/guides/text-generation/json-mode
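For reference, a minimal sketch of a request body with JSON mode switched on. The model name is a placeholder and the key names are made up for the original poster's use case. As far as I know, JSON mode expects the word "JSON" to appear somewhere in the messages, and it only promises syntactically valid JSON, not that the keys you asked for are present.

```javascript
// Sketch only: request body with JSON mode enabled.
// "your-model-name" and the three key names are hypothetical placeholders.
function buildJsonModeRequest(conversationText) {
  return {
    model: "your-model-name",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "Analyze the conversation and reply with a JSON object that has " +
          'the boolean keys "lastMessageIsQuestion", "mediaRequested" and ' +
          '"lastUserInsulted".',
      },
      { role: "user", content: conversationText },
    ],
  };
}

const request = buildJsonModeRequest("A: can you send me the photo?\nB: no.");
console.log(JSON.stringify(request, null, 2));
```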

In my experience refining the prompt often works best. You can also use json mode as mentioned before.

I’ve actually created a dev platform for this very specific use-case: helping with developing and testing structured prompts on whole sets of queries and expected outputs. You’re more than welcome to check it out: https://www.promptotype.io

I’ve tried a few things suggested in many other places (e.g. use [no prose], be explicit about wanting a JSON response, etc).

For my use case (generating a cron expression from plain English), I’ve found that non-JSON responses always happened when there was some error (e.g. no cron expression could be constructed), so I simply wrapped JSON.parse(response) in a try...catch and it’s been working well.
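Roughly, the try...catch idea looks like this. It is a sketch, not my exact code; the function name and the shape of the returned object are made up for illustration.

```javascript
// Sketch only: treat any response that is not valid JSON as an error case.
// parseModelJson and its return shape are hypothetical.
function parseModelJson(responseText) {
  try {
    return { ok: true, value: JSON.parse(responseText) };
  } catch (err) {
    // The model answered in prose (or malformed JSON), so keep the raw text.
    return { ok: false, error: err.message, raw: responseText };
  }
}

const good = parseModelJson('{"cron": "0 9 * * 1", "error": ""}');
const bad = parseModelJson("Sorry, that text is too ambiguous to convert.");

console.log(good.ok, bad.ok);
```

Keeping the raw text in the failure case makes it easy to log what the model actually said or to retry with a stricter prompt.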

I’m using the Assistants API to do something similar. My approach is to use the system prompt to tell it I want the output in JSON, give it an example of what I want, and tell it not to provide anything else. The exact prompt phrasing I use is:

Provide your response as a JSON structure in the form:
{
  "item1": {
    "foo": "bar"
  },
  "item2": {
    "bar": "foo"
  }
}

Include no other commentary.
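If the example structure changes often, one option is to generate that instruction text from an object instead of typing it by hand. A small sketch, with a made-up helper name:

```javascript
// Sketch only: build the "reply in this JSON form" instruction from an
// example object. buildJsonInstruction is a hypothetical helper.
function buildJsonInstruction(exampleShape) {
  return [
    "Provide your response as a JSON structure in the form:",
    JSON.stringify(exampleShape, null, 2),
    "",
    "Include no other commentary.",
  ].join("\n");
}

const instruction = buildJsonInstruction({
  item1: { foo: "bar" },
  item2: { bar: "foo" },
});

console.log(instruction);
```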

Here’s my exact prompt:

openai.callOpenAIPrompt([
    {role: "system", content: "You are a skilled developer and understand how to configure cron jobs."},
    {role: "user", content: `[no prose]\nTranslate this text into a cron expression:\n\n${cronText}.`},
    {role: "user", content: `Output the result as a JSON object with 2 properties:\n"cron" with the resulting expression\n"error" with the error message if any, including any ambiguous results.`},
    {role: "user", content: `If you are unable to create a straightforward result, return "error": "Text cannot be turned into a cron expression"`},
  ], (e, r) => {
...

It works almost always, but sometimes it doesn’t and instead produces a longer narrative explaining why. I might try including an example as you did. That has helped in other contexts.