Json responses in gpt-3.5-turbo-1106 much shorter than without json?

Hey Champs

Has anyone noticed that responses in JSON mode are much shorter than without it?

Here are 2 examples of completions:

Plain answer
Thank you for sharing your CV and the job description. To optimize your CV for the vacancy, I'd like to ask a few questions to gather the missing information:

1. **Financial Reporting and Analysis:** Can you provide specific examples of how you've optimized data analysis and financial reporting to support strategic, operational, and financial goals in your current or previous roles?

2. **Budget Management and Development:** Have you overseen and managed budgets for divisions or projects in your current or previous roles? Any specific achievements related to budget management that you can highlight?

3. **Leadership and Management Experience:** Have you directly managed or supervised financial analysts or other operations staff in your current or previous roles? If so, what were your key responsibilities in managing and leading a team?

4. **Financial Systems/Applications:** Do you have experience with specific financial systems or applications that are relevant to the job description, such as Salesforce, Cognos, or Adaptive reporting and budgeting software?

5. **Project Leadership and Collaboration:** Can you provide examples of how you've collaborated with cross-functional teams or led financial and budgetary initiatives to ensure successful implementation and compliance across departments and programs?

6. **Professional Development and Certifications:** Have you pursued any additional certifications, professional development programs, or training related to financial analysis, reporting, or budget management since completing your MBA in 2012?

7. **Soft Skills and Interpersonal Abilities:** Can you highlight any specific instances where your strong interpersonal skills and ability to work in a diverse work environment have contributed to successful financial initiatives or projects?

Once I have this information, I can help tailor your CV to better align with the job description and make it a perfect fit for the vacancy.
JSON answer

{
  "ai_message": "Great! To optimize your CV for the vacancy, I need some additional information. Could you please provide details about any experience you have with financial databases, accounting, and financial systems/applications? Additionally, do you have any experience with Salesforce, Cognos, or Adaptive reporting and budgeting software? This information will help tailor your CV to match the qualifications required for the position at WestEd.",
  "status": "collecting_missing_information"
}

These completions were received from executions of the same model (gpt-3.5-turbo-1106) with the same parameters ("temperature": 0, "response_format": {"type": "json_object"}, "timeout": 5) and exactly the same prompts, with just 1 difference: at the end of the system prompt for the 'JSON answer' case I added the instruction below (a rough sketch of both calls follows it):

Respond only in JSON format with the following fields:
- "ai_message", rules:
    - your next message to the Human
    - ai_message must always be in plain text or Markdown where applicable and never - json
- "status", the only possible values (you must go through all of the statuses in this order):
    - "collecting_missing_information"
    - "optimizing_cv"

Anyone have a creative workaround?

I'd guess this might be expected: you have a different system message and the model responds to it as it does.
If you want a more detailed response, I’d try to elaborate in the system message on stuff you’d want it to include.

btw, have you used "response_format": {"type": "json_object"} for both of them? It's supposed to return JSON always. Do you mind providing the rest of your prompt?

1 Like

One more interesting thing I noticed on another pretty lengthy prompt.

See the consistent difference (did 20 tests on each side) on 2 examples:

Response with Prompt 1

Based on the job description provided, here are some recommendations to optimize your CV:

1. Professional Summary: Your professional summary is strong, but it could be tailored more specifically to the job description. Highlight your experience in optimizing data analysis and financial reporting to support strategic, operational, and financial goals.

2. Work Experience: Your work experience is relevant, but it would be beneficial to emphasize your experience in managing financial databases, budget management, and overseeing project budgets. Quantify your achievements where possible, such as the impact of your financial analysis and reporting on business decisions.

3. Skills Section: Ensure that your skills section includes a mix of hard and soft skills that align with the qualifications listed in the job description. Highlight your expertise in financial analysis, financial modeling, budgeting, and proficiency in Microsoft Office Suite.

4. Additional Sections: Consider adding a section for any relevant professional development, certifications, or specific projects that demonstrate your expertise in financial analysis and reporting.

5. Tailored Content: Make sure that your CV is tailored specifically to the job description, emphasizing the skills and experiences that align with the responsibilities and qualifications listed.

6. Contact Information: Double-check that your contact information is up-to-date and professional.

By optimizing your CV to better align with the job description, you can increase your chances of standing out as a strong candidate for the position. Good luck with your application!

Response with Prompt 1 + additional instruction: "Once the Human answers all of your questions, respond with 1 word and 1 word only: COMPLETED"

What specific financial analysis, financial modeling, and budget management experience do you have that aligns with the responsibilities of the job description?

There is a clear pattern here. I cannot yet understand exactly what it is, but I'll continue digging and share the results.

I'd guess this might be expected: you have a different system message and the model responds to it as it does.

Not exactly. You can see another example in my follow-up message right above.

btw, have you used "response_format": {"type": "json_object"} for both of them? It's supposed to return JSON always. Do you mind providing the rest of your prompt?

Good catch, let me double check!

I think it’s possible there’s a clear pattern, with some reason behind it.

I still think the better solution would be to have the system prompt specifically describe the things you're expecting there (I'm assuming you don't just expect lengthy garbage text, but even if you do, you can probably "ask" for that too :))

2 Likes

Did the test on 1106 without response_format: {"type": "json_object"}.

With the ending instruction 'Respond only in JSON format with the following field: "ai_message": your response to the Human': on average 120 tokens of output (including the JSON formatting).

Same prompt, but without that instruction: on average 300 tokens of output.
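
A minimal sketch of how such an average could be measured, assuming the OpenAI Python v1 client (response.usage.completion_tokens reports the number of generated tokens); this is not the exact test harness used:

    from openai import OpenAI

    client = OpenAI()

    # Sketch: average completion length over n runs for a given prompt setup
    def average_completion_tokens(system_prompt, user_message, n=20, **extra):
        totals = []
        for _ in range(n):
            response = client.chat.completions.create(
                model="gpt-3.5-turbo-1106",
                temperature=0,
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_message},
                ],
                **extra,  # e.g. response_format={"type": "json_object"}
            )
            totals.append(response.usage.completion_tokens)
        return sum(totals) / len(totals)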

Did the test on 3.5-turbo-1106 without response_format: {"type": "json_object"}

That seems to be the problem with JSON output. The model seems to start ignoring many of the instructions when I ask it to output JSON. For example, it consistently outputs very short messages with promises of something ("here's your detailed plan: " and nothing after the colon), and no instruction makes it consistently change the behaviour.

Ok, that’s interesting, I actually haven’t noticed this kind of behavior before.

Would be great if you could provide the entire prompt; it would be interesting to check this out.

1 Like

I want to do more extensive testing, but my guess is that it will reproduce with any prompt. My working hypothesis is that trying to change the format of the output significantly affects how well the model follows the other instructions.

1 Like

Ok, I haven't noticed it myself; keep me updated on what you come up with.
Just something to consider: length isn’t necessarily quality :sweat_smile:

1 Like

In my opinion it's expected, given that most JSON objects don't have lengthy values.

2 Likes

That is a valid point, and I would consider it an explanation if not for this: Json responses in gpt-3.5-turbo-1106 much shorter than without json? - #3 by TonyAIChamp

Obviously it is not, generally speaking, but in my specific case length is one of the factors of quality.

1 Like

Hi @TonyAIChamp - the way I solved this is to have a two-step conversation with the model. I first ask for the result as a natural-language response and then send a new message with just the text "as JSON". You may give an example of the JSON in your system prompt to make sure that the output conforms.

Code looks something like this:

    # First call: get a normal natural-language answer, then follow up with
    # "as JSON" in the same conversation and extract the JSON object from it.
    json_obj = None
    prompt = system_prompt
    if prompt:
        additional_messages = [{"role": "user", "content": original_message}]
        response = get_chat_completion_response(prompt, additional_messages)
        if response is not None:
            content = response.choices[0].message.content
            if content:
                # Append the assistant's plain-text answer to the conversation
                additional_messages.append({"role": "assistant", "content": content})
                # Ask the model to repeat the same answer, but as JSON
                additional_messages.append({"role": "user", "content": "as JSON"})
                # config_prompt: the prompt used for the JSON-extraction call
                json_obj = extract_json_obj_from_prompt(config_prompt, additional_messages)
        else:
            return None

    return json_obj
1 Like

The model may also be preconditioned by its training data, where JSON field values tend not to be very long. That was my first guess.

If you really really need the JSON with a big response chunk, then maybe call it normally without JSON, and then just add the big response to your JSON manually.
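
A rough sketch of that idea (the field names follow the earlier example; the prompts and the hard-coded status below are just illustrations):

    import json

    from openai import OpenAI

    client = OpenAI()

    SYSTEM_PROMPT = "..."  # your real system prompt, without any JSON instruction
    USER_MESSAGE = "..."   # the user's message

    # Step 1: get the long answer as plain text, with no JSON constraints
    plain_answer = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        temperature=0,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": USER_MESSAGE},
        ],
    ).choices[0].message.content

    # Step 2: assemble the JSON yourself, dropping the long answer in manually
    payload = json.dumps({
        "ai_message": plain_answer,
        "status": "collecting_missing_information",  # set by your own app logic
    })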

3 Likes

Thank you for the idea, Greg!

I ended up also doing a 2-step approach, but a bit different: I first get a normal response from the model, and then ask the model in a separate conversation to give me the status of the conversation in JSON format (feeding it the main conversation and providing criteria for setting the statuses).
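
Roughly, the status step could look like this (a sketch only; the status prompt and criteria are illustrative placeholders, not my exact prompts):

    import json

    from openai import OpenAI

    client = OpenAI()

    # Illustrative placeholder for the criteria used to set the status
    STATUS_PROMPT = (
        "You will be given a conversation between an Assistant and a Human. "
        'Respond only in JSON with one field, "status", whose value must be '
        '"collecting_missing_information" or "optimizing_cv".'
    )

    def get_conversation_status(conversation):
        # Separate, short call: the model only has to produce a tiny JSON object,
        # so the main answer's length and instructions are unaffected.
        response = client.chat.completions.create(
            model="gpt-3.5-turbo-1106",
            temperature=0,
            response_format={"type": "json_object"},
            messages=[
                {"role": "system", "content": STATUS_PROMPT},
                {"role": "user", "content": json.dumps(conversation)},
            ],
        )
        return json.loads(response.choices[0].message.content)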

Thanks, Curt! As mentioned above, this is a really great guess.

The problem with it as a hypothesis for this specific behaviour is that the effect extends beyond JSON output: Json responses in gpt-3.5-turbo-1106 much shorter than without json? - #3 by TonyAIChamp

What is this “timeout” parameter?

Wondering if that is constraining something.

It just constrains the execution time; if a request takes longer than that, it throws a timeout error.
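
For context, in the OpenAI Python v1 client the timeout can be set on the client or per request (the values here are just examples):

    from openai import OpenAI

    # Client-wide request timeout in seconds...
    client = OpenAI(timeout=5)

    # ...or overridden per request
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[{"role": "user", "content": "Hello"}],
        timeout=5,
    )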

Hmm, what happens if you drop this or make it really big?