How to handle different ChatGPT answers to the same question?

simonchatgpt3 · November 29, 2023, 12:10pm

Hello,
I am a computerlinguist and I want to compare the grammar abilities of ChatGPT with the grammar abilities of my rule-based parser. For this, I will simply ask ChatGPT some grammar questions to a given sentence, which are in principle questions that can only be answered right or wrong.

But asking the same question several times with a fresh context window, I get answers, which are sometimes wrong and sometimes true.

I think, I am not the first person with that problem. What is the standard to handle this? The simpliest solution would be to ask 20 times the same question and count how often it is wrong or true.

Can someone tell me a paper with the same problem?

vb · November 29, 2023, 12:47pm

Have you tried the Playground where you can control the randomness of the model output via the temperature and top-p settings? It should help you control the variability in your responses.

https://platform.openai.com/playground

Ps. If you want to check the responses to the same question 20 or even a thousand times you can set up an eval for this case:

simonchatgpt3 · November 29, 2023, 1:09pm

Thanks, that helps a lot. I will play with the temperatur value.

Gets ChatGPT deterministic if the temperature is zero?

vb · November 29, 2023, 1:22pm

It will never be fully deterministic until we make some exciting progress on the hardware side. Until then this is all the control we have but it already does help a lot.
There used to be a little pop-up bubble with more information when moving the sliders but I think this has been removed. So, even though it sounds like it can be deterministic, it won’t be.

Also, if the answer is not in your required format you can write a simple script to discard the reply and resend the request. With the API, and the Playground is just a user friendly implementation of the API, there are tons of options available.

EricGT · November 29, 2023, 1:31pm

Out of curiosity care to share which programming language or tool?

I ask because I use Prolog daily and regularly create parsers for data, programming languages and Turing complete languages but not human text.

simonchatgpt3 · November 29, 2023, 1:38pm

Hello EricGT,
the rule-based parser is an own creation, which is not fully published yet. And it is solely for the German language. The parser works with many interpretations of a single sentence and sorts out the false ones by German grammar.

The upcoming paper describes the whole parse process. That is the reason for my question, I want to compare the grammar abilities of LLMs against the parser.

EricGT · November 29, 2023, 2:10pm

This may help you in getting to your need.

The original post noted

In the reply to me is noted a slightly different need

While ChatGPT is based on an LLM it is a chatbot with training beyond just the foundational LLM.

So to me these are two different questions.

With regards to your question:

Can someone tell me a paper with the same problem?

I passed your post to ChatGPT to see if it would list some papers and it did not.

ChatGPT use to have plugins that could be used to search thousands of research papers and give a short list of relevant papers but that is no longer working for me.

My suggestion would be to post a new question asking for just the papers. I use to try to keep up with papers related to LLMs but there are hundreds being released daily and most are not worth the trouble to even read the title.

You should also note your level of expertise for this when asking. Don’t take this wrong but I get the impression you are writing a thesis paper for a degree.

HTH

Topic		Replies	Views
Why does GPT 3.5 do this? Is it following any pattern Prompting chatgpt , prompt , prompt-engineering , prompting	1	638	January 23, 2024
Confusing output for a given prompt Prompting chatgpt	14	1763	December 16, 2023
Why is ChatGPT so moody, refusing to answer the exact same question it answered previously? API	14	4608	March 27, 2023
ChatGPT-like Search functionality - Reliability on question and answers Community gpt-4 , chatgpt , api	5	1441	December 17, 2023
Expand AI Context beyound local documentation GPT builders gpt-4 , chatgpt , chatgpt-plugin	5	747	January 19, 2024

How to handle different ChatGPT answers to the same question?

Related topics