Does ChatGPT make more mistakes when working with numbers?

Disclaimer: I am not a native English speaker, so please forgive any English that sounds strange to native speakers :man_bowing:

Hi. I am currently building a GPT that can judge people’s resumes.

According to OpenAI’s official statements, ChatGPT (I mean GPT-4 or the latest models) makes mistakes about 7-8% of the time. However, when I have ChatGPT judge the length or period of people’s experience, it makes mistakes 14-20% of the time.
E.g.:
When a resume says “I have experience from 8.2021 to 1.2024”, ChatGPT says this person has three years of experience. But it is actually 29 months, so it is not three years of experience.
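For reference, the 29-month figure can be checked with a few lines of Python. This is only a minimal sketch: the function name `month_gap` is made up for illustration, and it assumes the end month itself is not counted (counting it would give 30).

```python
def month_gap(start_year, start_month, end_year, end_month):
    """Months from the start month up to, but not including, the end month."""
    return (end_year - start_year) * 12 + (end_month - start_month)


# 8.2021 to 1.2024, the example from the resume
months = month_gap(2021, 8, 2024, 1)
print(months)  # 29, which is less than 36 (three years)
```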

Is this because of ChatGPT’s math ability or because of my prompting?

The math is indeed wrong here, but it can be fixed via prompting. You can tell ChatGPT to count the months accurately instead of rounding up to whole years. Another possible cause is the date format: the resume could be using formats that are not standard.

Also, how did you provide the resume? If it is by image then maybe it was not able to read the month and could only see the year - possibly due to image quality or OCR issues.

Thank you @mendoza.migz

I am currently working for a company in Japan and resumes tend to be written either in Japanese or English.

  • The date format
    There are several date formats in use, such as yyyy/mm/dd, dd/mm/yyyy, MM, DD, YYYY, and yyyy年mm月, to name a few.
    So I am thinking of making two GPTs, one for Japanese resumes and one for English resumes, to simplify this problem, because Japanese candidates never use the dd/mm/yyyy format in a Japanese resume.

  • How should I tell ChatGPT to count the months
    I already told ChatGPT that a year is equal to 12 months and to calculate experience in months.

  • The format of resume
    I provide the resume to ChatGPT as text. I do not use images.
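To illustrate the date format point, here is a rough sketch of how three of the formats listed above could be normalized to a (year, month) pair before any counting happens. The pattern list and the name `parse_year_month` are my own assumptions, not something from the actual GPT, and real resumes will need more patterns than this.

```python
import re

# One pattern per format; the day part is matched but deliberately not captured.
PATTERNS = [
    re.compile(r"^(?P<year>\d{4})/(?P<month>\d{1,2})/\d{1,2}$"),  # yyyy/mm/dd
    re.compile(r"^\d{1,2}/(?P<month>\d{1,2})/(?P<year>\d{4})$"),  # dd/mm/yyyy
    re.compile(r"^(?P<year>\d{4})年(?P<month>\d{1,2})月$"),        # yyyy年mm月
]


def parse_year_month(text):
    """Return (year, month) for a date string in one of the known formats."""
    text = text.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            year, month = int(match["year"]), int(match["month"])
            if 1 <= month <= 12:
                return year, month
    raise ValueError(f"Unrecognized date format: {text!r}")


print(parse_year_month("2021/08/15"))  # (2021, 8)
print(parse_year_month("15/08/2021"))  # (2021, 8)
print(parse_year_month("2021年8月"))    # (2021, 8)
```

Raising an error on anything unrecognized is a deliberate choice here: a silent guess at the format is exactly the kind of mistake that is hard to notice later.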

You have mostly documented the distinct date formats already. You could add an instruction such as “if language==Japanese: use date extraction patterns #2”.

Then, once the AI has the dates in MM-DD-YYYY form, the easiest way to do the calculation is to tell the AI what kind of Python code it should write and execute in Code Interpreter to obtain the distance between dates, instead of just answering directly.
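As an illustration of the kind of code you might ask it to write, here is a minimal sketch using only the standard `datetime` module. The helper names `parse_mdy` and `whole_months` are hypothetical, and the dates are the ones from the original example with the day assumed to be the 1st.

```python
from datetime import datetime


def parse_mdy(text):
    """Parse an MM-DD-YYYY string into a date object."""
    return datetime.strptime(text, "%m-%d-%Y").date()


def whole_months(start, end):
    """Number of completed months between two dates."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not yet complete
    return months


start = parse_mdy("08-01-2021")
end = parse_mdy("01-01-2024")
print(whole_months(start, end))  # 29
print((end - start).days)        # 883
```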

I understand. I think there is no issue with the date format, as I have quickly tested it on ChatGPT-3.5, and the problem seems to be with the calculation. One quick solution I found was telling it to calculate the experience in months first and then convert it to years and months afterward.
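The "months first, convert afterward" idea looks roughly like this in code. It is only a sketch: both function names are made up, and the three-year requirement is just the example from this thread.

```python
def format_experience(total_months):
    """Convert a month count into a 'years and months' string."""
    years, months = divmod(total_months, 12)
    return f"{years} years {months} months"


def meets_requirement(total_months, required_years):
    """Compare in months so that 29 months is never rounded up to 3 years."""
    return total_months >= required_years * 12


print(format_experience(29))     # 2 years 5 months
print(meets_requirement(29, 3))  # False
```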

Thanks for posting this question and how you worked through it. One thing I learned, and see here too, is the need to ensure your prompt includes the necessary steps that the AI should follow. I have also seen better results when telling it to work through each step in order and not skip any.

This ensures the AI processes each item in sequential order, which seems to improve results on certain tasks.

I’ve done hundreds of tests of the ability of LLMs to solve basic algebra and math problems. I’ve seen a significant improvement in accuracy if the prompt includes an instruction to test EACH step of the calculations. And, with the LLMs’ probabilistic approach to algebra, the most common mistake is moving terms from one side of an equation to the other.
For 100% accuracy on math you can instruct the LLM to use the code interpreter, which usually invokes Python and SymPy.

Most of this is due to your ambiguous requests.

When asking how much experience someone has in a field, the vast majority of usage documented on the internet will simply look at the years and give the answer of 3. You then expect the answer to be in months. Please tell the AI to only use months to calculate this number.

When I look at those dates, one from 2021 and one from 2024, I would say 3 years. I have never been in an interview, or conducted one, where months were counted. Perhaps your use case requires this; if so, you must specify the granularity required.

Thank you to everyone who replied.
I figured out how to improve the results by changing the following for both English and Japanese resumes:

  • Calculate everything in months: I told ChatGPT that a year is equal to 12 months.
  • Make all instructions clearer: e.g., if the date format is DD/MM/YYYY, follow #1; if the date format is YYYY/MM/DD, follow #2.
  • Ignore the DD part.
  • Show examples for every date format.
  • If the word 現在 (current) appears in the resume, use today’s date.

Thank you all again.
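For anyone who wants to do the same thing in code rather than in the prompt, the 現在 rule and the "ignore DD" rule could be sketched like this. The function name `end_year_month` is hypothetical, and the regex assumes a year-first format such as yyyy/mm/dd or yyyy年mm月.

```python
import re
from datetime import date


def end_year_month(end_text, today=None):
    """Return (year, month) for the end of a period.

    現在 (current) maps to today's date; any day part is ignored.
    """
    if "現在" in end_text:
        today = today or date.today()
        return today.year, today.month
    # Four-digit year, some separator, then a one- or two-digit month.
    match = re.search(r"(\d{4})\D+(\d{1,2})", end_text)
    if not match:
        raise ValueError(f"Unrecognized end date: {end_text!r}")
    return int(match.group(1)), int(match.group(2))


print(end_year_month("2024年1月"))    # (2024, 1)
print(end_year_month("2024/01/15"))  # (2024, 1), the day is dropped
print(end_year_month("現在", today=date(2024, 3, 10)))  # (2024, 3)
```

Passing `today` in as a parameter keeps the result reproducible, which helps when comparing the code's answer against what ChatGPT said on a given day.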

Yet this does not mean my GPT’s answer is always correct in certain situations, such as a person who held two job titles or positions in the same period. I am going to keep looking for ways to improve and reach 100%.
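One possible way to handle the overlapping-positions case, sketched in Python, is to merge the periods before counting so that a month covered by two titles is counted once. This is my assumption about what "correct" means here; the function names and the sample periods are made up, and each period is (start year, start month, end year, end month) with the end month not counted.

```python
def to_index(year, month):
    """Map a (year, month) pair to a single running month number."""
    return year * 12 + (month - 1)


def total_months(periods):
    """Count distinct months covered by possibly overlapping periods."""
    intervals = sorted(
        (to_index(sy, sm), to_index(ey, em)) for sy, sm, ey, em in periods
    )
    total = 0
    current_start = current_end = None
    for start, end in intervals:
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Two titles held in overlapping periods at the same company
print(total_months([(2021, 8, 2022, 8), (2022, 6, 2024, 1)]))  # 29
```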

Hello @katakuma4625 . I would be very interested in trying out your GPT, as I am also currently engaged in recruiting and this topic is very relevant to me. Are you considering the possibility of sharing a link to your GPT?

Hi @mmashigarami . My deep apologies for the late reply.
My prompt contains a lot of confidential information, so I am not allowed to share the prompt or a link to my GPT.
But here is the brief structure of my prompt.

  • Brief introduction of the GPT’s role
  • Task list
  • Important information
  • Background information
  • How to structure the result
  • Criteria of resume
    This part contains the rules for measuring the experience period
  • Resume

I hope this helps you.

@katakuma4625 I hope this message finds you well. I am reaching out to seek advice and insights from you and the forum members regarding a delicate situation I am currently facing:

As part of a project I am involved in, I have taken upon myself the dual responsibilities of not only creating content, which is my primary role, but also ensuring its successful promotion, which includes marketing efforts. This decision was made to guarantee the project’s success.

However, I am encountering a challenging scenario. In my experience, a significant percentage of individuals who possess impressive credentials, such as extensive experience, degrees, and recommendations, regrettably turn out to be, for lack of a better term, untrustworthy. I apologize if this terminology seems harsh; it is not intended to be derogatory but rather a colloquial expression in my native language.

In response to this challenge, the hiring process I have implemented (custom GPTs) focuses solely on the candidate’s understanding and knowledge of the field. This includes their theoretical grasp, ability to solve practical problems in real-time, eagerness for learning, and active participation in relevant forums. The background of the candidate is of secondary importance to me. Furthermore, these interviews are recorded and analyzed by a highly recognized expert in the field. This expert is only involved as a consultant due to budget constraints.

I am seeking advice on how to navigate this situation effectively, especially considering the prevalent issue of encountering untrustworthy candidates with seemingly impeccable credentials. Any guidance or shared experiences in handling similar challenges would be greatly appreciated.

Thank you for your time and consideration.

@mmashigarami Hi. Sorry for the late reply.
I was facing exactly the same challenge before solving the math issue. I am not sure of your exact role from the information given, but I assume that you are checking candidates’ resumes and conducting job interviews with them. Is that correct?
If so, I can give you some tips for getting better results from ChatGPT.

Here are the things I did to have ChatGPT judge candidates.

  • Try to define things in detail
    E.g.: If your position requires three years of engineering experience, try to define it precisely, like:
    Three years of experience using Ruby and Go.
    Three years of experience using cloud services such as GCP, AWS, or Azure.
    Working as a freelancer does not count as work experience.

Like that. Ambiguity always causes unwanted results.

  • Make a point system
    I am currently using a point system to judge candidates with ChatGPT. I set four categories, Minimum, High, Low, and NG, for each criterion.

Minimum: Requirements that candidates must meet.
NG: If a candidate meets at least one of these, we do not hire them.

Candidates who meet all Minimum requirements are ranked against each other according to the number of Highs they have earned. However, points are deducted for each Low they meet. By establishing such a point system, it is possible to distinguish between candidates who have passed the selection process.

  • Give as much information as you can

You can give ChatGPT the job description, your role, instructions on how to judge candidates, and any other information that might be helpful.
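To make the point system described above a bit more concrete, here is a rough sketch of the ranking logic in Python. It is not my actual prompt or code: the class and function names are made up, the candidates are dummy data, and it assumes each High adds one point and each Low deducts one point.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    name: str
    minimum_met: list  # one bool per Minimum criterion
    highs: int = 0     # number of High criteria met
    lows: int = 0      # number of Low criteria met
    ng_hits: int = 0   # number of NG criteria met


def passes_screening(e):
    """A candidate passes only with no NG hits and every Minimum met."""
    return e.ng_hits == 0 and all(e.minimum_met)


def score(e):
    """Assumed scoring: +1 per High, -1 per Low."""
    return e.highs - e.lows


def rank_candidates(evaluations):
    """Drop candidates who fail screening, then sort by score, best first."""
    passed = [e for e in evaluations if passes_screening(e)]
    return sorted(passed, key=score, reverse=True)


candidates = [
    Evaluation("A", [True, True], highs=3, lows=1),     # score 2
    Evaluation("B", [True, False], highs=5),            # fails a Minimum
    Evaluation("C", [True, True], highs=4),             # score 4
    Evaluation("D", [True, True], highs=6, ng_hits=1),  # hits an NG
]
ranking = [e.name for e in rank_candidates(candidates)]
print(ranking)  # ['C', 'A']
```

In practice ChatGPT would fill in the per-criterion judgments and the arithmetic could be done outside the model, which avoids the counting mistakes discussed earlier in this thread.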

I hope this helps you.
