What precise parameter values does ChatGPT 3.5 use?

Hi there,
I’m experimenting with the ChatGPT API and trying to replicate it in my development
(JavaScript) environment. I’m getting responses, but the answers are irrelevant, random, or not making much sense.
I have tried different combinations of parameter values, and while some work better than others, none so far has come close to the consistency of ChatGPT itself.
I’m using the Davinci model (/v1/engines/davinci/completions) with the following parameters:
prompt: prompt (a variable that is set by the user on the frontend form)
max_tokens: 150
temperature: 0.6,
top_p: 0.9,
n: 1,
frequency_penalty: 0.5,
presence_penalty: 0.3

I’m also filtering out short lines and cleaning responses with the .trim() function.
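For reference, ChatGPT’s exact sampling parameters aren’t published, so there’s no official set of values to copy. Below is a minimal sketch of the call described above in JavaScript (Node 18+ with global fetch). The `OPENAI_API_KEY` environment variable is an assumption, the parameter values are just the ones from this post, and note that the modern route is `/v1/completions` with a `model` field rather than the deprecated `/v1/engines/...` path:

```javascript
// Sketch only -- parameter values are the ones from the post above,
// not ChatGPT's (unpublished) settings.
function buildCompletionBody(prompt) {
  return {
    model: "text-davinci-003", // instruct model; plain "davinci" is the raw completion model
    prompt,
    max_tokens: 150,
    temperature: 0.6,
    top_p: 0.9,
    n: 1,
    frequency_penalty: 0.5,
    presence_penalty: 0.3,
  };
}

async function complete(prompt) {
  const res = await fetch("https://api.openai.com/v1/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`, // assumed env var
    },
    body: JSON.stringify(buildCompletionBody(prompt)),
  });
  const data = await res.json();
  return data.choices[0].text.trim();
}
```

You would call `complete(prompt)` with the prompt collected from the frontend form.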

Any hint on what the precise values are for those parameters? Also, let me know if I’m missing something.


Davinci is an older model. ChatGPT runs on GPT3.5/4. See the GPT Guide in API docs for how to get started with GPT3.5


There are three types of popular models here.
Chat: gpt-3.5-turbo and gpt-4. These are iterative and have a system message where you can steer the tone of the conversation. They’re built on top of the instruct types.
Completion-instruct: text-davinci-n and the like. These are also similar to ChatGPT in that you ask for something and they return it.
Completion: davinci, curie, and such. These are amazingly powerful but difficult to use. You have a lot more control over them, but they act in the form of autocomplete. Instead of prompts like “Give me some ideas for a baby girl name”, you’d say something like “Here’s a list of baby girl names: 1. Ayesha 2. Natasha 3.” and then it autocompletes.
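The autocomplete framing above can be sketched with a small (hypothetical) helper that turns seed items into the numbered-list prompt a raw completion model continues from; the baby-name list is just the example from this post:

```javascript
// Hypothetical helper: build an autocomplete-style prompt for a raw
// completion model by seeding a numbered list and leaving the next
// item number dangling for the model to continue.
function asListPrompt(intro, seedItems) {
  const seeded = seedItems.map((item, i) => `${i + 1}. ${item}`).join(" ");
  return `${intro}: ${seeded} ${seedItems.length + 1}.`;
}

asListPrompt("Here's a list of baby girl names", ["Ayesha", "Natasha"]);
// → "Here's a list of baby girl names: 1. Ayesha 2. Natasha 3."
```

A raw model like davinci then continues the text after “3.”, emitting more names in the same pattern.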

You can use text-davinci-003 instead of gpt-3.5-turbo, but it’s more expensive, slightly slower, and slightly higher quality. gpt-3.5-turbo is the closer match to ChatGPT.

I appreciate your reply. I will look into that and give it a try.
thank you


Hi, thank you so much. Your tips and guides helped me build the app successfully.
There is one concern, though:
the content generation process takes around 20 seconds.
I know max_tokens can affect generation speed, but in my case I’m using max_tokens = 400 to generate about 260 words, which is not much at all.
Using a lower value cuts the answer off before it’s complete.

How can I improve the generation time, please?

Thank you

Hi all,
regarding my last post:

I’m using the chat gpt-3.5-turbo model.
Is it generally slow?
Will changing the model speed up the result?

GPT is fairly “slow” compared to other APIs you may be used to.

You can see community benchmarks here:

20s for GPT3.5 is not abnormal, and GPT4 is even slower. Plan your application accordingly.


Interesting data. gpt-3.5-turbo used to be faster than davinci when it first came out.

Does that mean GPT-4 is the slowest of all?

Can someone tell me the difference in response time between the 3.5-turbo versions, please?

Yes GPT-4 is the slowest. Response times vary from day to day, from prompt to prompt, and from request to request. Best to benchmark each model with your specific prompt on your hardware, both for speed and for quality of response for your use-case.
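A minimal way to benchmark as suggested above is to time a few requests per model and average them. This is a sketch, assuming Node 18+ (global fetch) and an `OPENAI_API_KEY` environment variable:

```javascript
// Rough latency benchmark: time one chat completion end to end.
// Sketch only -- assumes global fetch (Node 18+) and OPENAI_API_KEY.
async function timeChatRequest(model, prompt) {
  const start = Date.now();
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
      max_tokens: 256,
    }),
  });
  await res.json(); // wait for the full response body
  return (Date.now() - start) / 1000; // elapsed seconds
}

// Response times vary request to request, so average several samples.
function meanSeconds(samples) {
  return samples.reduce((sum, s) => sum + s, 0) / samples.length;
}
```

Run `timeChatRequest` a handful of times per model with your real prompt and compare `meanSeconds` of the samples for both speed and answer quality.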

Here are additional benchmarks:


Don’t use davinci
Use this

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(cat /etc/keys/key.txt)" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "system", "content": "Talkeetna is in Alaska"},
      {"role": "user", "content": "Where is Talkeetna"},
      {"role": "assistant", "content": "Talkeetna is a small town located in Alaska"}
    ],
    "temperature": 1,
    "max_tokens": 256,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0
  }'
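For an equivalent in JavaScript (the OP’s environment), here is a sketch assuming Node 18+ with global fetch and an `OPENAI_API_KEY` environment variable instead of a key file; the middle message reads as the user’s question, so it’s given the user role here:

```javascript
// Chat-completions request body matching the example above.
// The middle message is the user's question, hence role "user".
function buildChatBody() {
  return {
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: "Talkeetna is in Alaska" },
      { role: "user", content: "Where is Talkeetna" },
      { role: "assistant", content: "Talkeetna is a small town located in Alaska" },
    ],
    temperature: 1,
    max_tokens: 256,
    top_p: 1,
    frequency_penalty: 0,
    presence_penalty: 0,
  };
}

async function sendChat() {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`, // assumed env var
    },
    body: JSON.stringify(buildChatBody()),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```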