Different results: ChatGPT 3.5 vs. the API (gpt-3.5-turbo)

I am trying to parse job descriptions into a structured format. Here is my JD:

GPT-3.5 web user interface output (runtime: instant). A pretty well-parsed output with minor mistakes:

{
  "job_title": "Senior Software Engineer - Python",
  "required_skills": ["Python", "Javascript", "AWS", "CI/CD", "Terraform"],
  "additional_skills": ["Typescript", "React", "Redux/MobX", "Kafka"],
  "required_competencies": ["Unit Testing", "Test-Driven Development", "Design Patterns", "Asynchronous Programming"],
  "additional_competencies": ["Event-Driven Architectures"],
  "required_experiences": ["Bachelor's degree in Computer Science or related field", "Solid experience with Python frameworks (e.g., Flask, FastAPI)", "Hands-on experience with AWS services"],
  "additional_experiences": ["Experience with serverless functions", "Experience in setting up CI/CD pipelines using GitLab"],
  "job_location": "Berlin, Germany",
  "work_setting": ["Hybrid", "Remote"],
  "languages": ["Fluent English (required)", "Additional languages a plus"],
  "education": ["Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience)"],
  "base_compensation": "Yearly gross salary (not specified in JD)",
  "additional_compensation": ["Equity", "ESOP", "VSOP", "Performance Bonuses"],
  "benefits": ["Study Budget", "Home Office Budget", "Paid Leave Days"],
  "company_differentiator": ["Innovative technology for easy travel", "Commitment to green technologies and sustainability"]
}

GPT-3.5-Turbo 16k Assistants API output (runtime: 3.5 minutes!). A very disappointing result:

"job_title": "Senior Software Engineer (m/f/d) - Python",
  "required_skills": ["Python and Javascript", "configuring Datadog monitoring and observability"],
  "additional_skills": [],
  "required_competencies": [],
  "additional_competencies": [],
  "required_experiences": ["Python frameworks such as Flask or FastAPI"],
  "additional_experiences": ["AWS cloud services and resource management (SNS/SQS/S3/ECS/EC2/lambdas)"],
  "job_location": "Berlin",
  "work_setting": "Organize your own schedule",
  "base_compensation": "up to 60 days of working from",
  "company_differentiator": ["We celebrate diversity"]

The same prompts were used, of course.

Any hints on how to fix this?
Should I use GPT-3.5-Turbo-1106 or plain GPT-3.5-Turbo?
The discrepancy is very misleading.

Hi @kychasticus
To gain a better understanding of different AI models' capabilities, you might consider experimenting with tools like the OpenAI Playground or LLM Spark. These platforms offer a hands-on experience that could help you determine the most suitable solution for your specific requirements.

Hey! I am using the Assistants functionality through the Python SDK. The Playground is just a visual representation of the Assistants API in my case.

Do you have any practical suggestions I can use to make the Assistants API responses match the ChatGPT ones?

Using the Assistants API we cannot control much to get responses like ChatGPT's; we have to rely on clear instructions and the right model.
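For example, here is a minimal sketch using the openai Python SDK (v1.x). The model snapshot (gpt-3.5-turbo-1106), the instruction wording, and the jd_text placeholder are illustrative assumptions, not a confirmed fix:

# Minimal sketch, openai Python SDK v1.x. Model snapshot, instruction
# text, and jd_text are assumptions for illustration only.
import time
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="JD Parser",
    model="gpt-3.5-turbo-1106",  # pin an explicit snapshot instead of an alias
    instructions=(
        "Parse the job description into a single JSON object with exactly "
        "these keys: job_title, required_skills, additional_skills, "
        "required_competencies, additional_competencies, required_experiences, "
        "additional_experiences, job_location, work_setting, languages, "
        "education, base_compensation, additional_compensation, benefits, "
        "company_differentiator. Return only the JSON object, no commentary."
    ),
)

jd_text = "<your job description here>"  # hypothetical placeholder

thread = client.beta.threads.create()
client.beta.threads.messages.create(thread_id=thread.id, role="user", content=jd_text)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)

# Poll until the run finishes, then read the newest (assistant) message.
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)

Pinning an explicit snapshot and enumerating the exact keys you expect tends to narrow the gap between Assistants API output and what you see in ChatGPT.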

Thank you so much! This seems to have worked for me. I had the same problems as you. It has become much better; all that remains is to keep monitoring it from a distance.

It’s a shame that there are a hundred topics like this and no adequate response from the administration.

Tomorrow I will post your answer in the other similar topics on the forum.

Update, the next day: I still get various incorrect results, but there are fewer of them. I am continuing to search for a solution.

Same issue with me.

Below is the query I’m sending:

{
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "system",
            "content": "This is a field in excel. So don't return explanation unless asked specifically. Return results only. Please check all the condition carefully"
        },
        {
            "role": "user",
            "content": "value is $100000. If value equals to 15000 then multiply by 3"
        }
    ],
    "temperature": 0.7
}

The problem is that it also fails to evaluate the condition; it gives wrong results.

Can someone let me know what I’m doing wrong here?

  1. Spreadsheets don’t have “fields”. They have rows, columns, and cells.
  2. Cells have values, or they have formulas.
  3. Formulas usually make reference to other cells. Formulas that don’t calculate from other data are fixed until modified and not that useful.

So:
1. What you show as the user message cannot be anything other than a text cell.
2. The AI is unlikely to differentiate between returning explanations, returning results, or checking.
3. A temperature of 0.7 leaves a chance of unlikely “wrong” tokens being sampled, and thus wrong responses.
4. AI don’t math good (a quick sketch of the deterministic alternative follows this list).
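As for point 4: a condition like this is deterministic, so plain code is the reliable way to evaluate it. A trivial sketch using the values from the post above:

# Deterministic check in Python; values taken from the quoted prompt.
value = 100000  # the "$100000" from the prompt, read as a number
result = value * 3 if value == 15000 else value
print(result)  # 100000 -- the condition is false, so the value is unchanged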

We can still rectify this, even though even a human can’t determine the truly desired answer from that prompt.

Here is a system message:

You are ExcelAI. You perform spreadsheet calculations, providing resulting answers for all formula cells supplied in a list of cells. Your output must be the value of cells evaluated out-of-order by precedence of which calculation must be performed first in the spreadsheet cell list provided by the user. No chat, just calculated cells.

Here is a user input:

A1: 100000
B1: =IF(A1=15000, A1*3, A1)
C1: A1/4

gpt-4-turbo response at top_p=0.01:

B1: 100000
C1: 25000
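For completeness, here is a hedged sketch of the same exchange as a direct Chat Completions call (openai Python SDK v1.x); the exact gpt-4-turbo snapshot name is an assumption, since the post only says “gpt-4-turbo”:

# Sketch: reproduce the ExcelAI example via Chat Completions.
from openai import OpenAI

client = OpenAI()

system_message = (
    "You are ExcelAI. You perform spreadsheet calculations, providing "
    "resulting answers for all formula cells supplied in a list of cells. "
    "Your output must be the value of cells evaluated out-of-order by "
    "precedence of which calculation must be performed first in the "
    "spreadsheet cell list provided by the user. No chat, just calculated cells."
)
user_input = "A1: 100000\nB1: =IF(A1=15000, A1*3, A1)\nC1: A1/4"

response = client.chat.completions.create(
    model="gpt-4-turbo-preview",  # assumed snapshot name
    top_p=0.01,                   # near-deterministic sampling, as above
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_input},
    ],
)
print(response.choices[0].message.content)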


Came here to research why the web interface (for 4-Turbo) gives pretty good results while the API is much less helpful. I guess there’s no solution, but “me too” here for posterity.