Hi there,
I’m experimenting with the ChatGPT API and trying to replicate it in my development
(JavaScript) environment. I’m getting responses, but the answers are irrelevant, random, or don’t make much sense.
I have tried different combinations of parameter values, and while some work better than others, none so far has come close to the consistency of ChatGPT itself.
I’m using the Davinci model (/v1/engines/davinci/completions) with the following parameters:
prompt: prompt (a variable set by the user on the frontend form)
max_tokens: 150
temperature: 0.6
top_p: 0.9
n: 1
frequency_penalty: 0.5
presence_penalty: 0.3
I’m also filtering out short lines and trimming whitespace with the .trim() function.
Any hint on what the right values are for those parameters? Also, let me know if I’m missing something.
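For reference, here's a minimal sketch of the call as described above, assuming a Node 18+ runtime with global fetch and the /v1/completions endpoint (the /v1/engines/... path is deprecated). OPENAI_API_KEY is a placeholder for your own key.

```javascript
// Pure helper so the parameters from the post live in one place.
function buildRequestBody(prompt) {
  return {
    model: "davinci",
    prompt,
    max_tokens: 150,
    temperature: 0.6,
    top_p: 0.9,
    n: 1,
    frequency_penalty: 0.5,
    presence_penalty: 0.3,
  };
}

async function getCompletion(prompt) {
  const res = await fetch("https://api.openai.com/v1/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify(buildRequestBody(prompt)),
  });
  const data = await res.json();
  // For this endpoint, the completion text lives in choices[0].text.
  return data.choices[0].text.trim();
}
```

Note, though, that the parameters aren't the real problem here; as the answer below explains, it's the choice of model.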
There are three popular types of models here.
Chat: gpt-3.5-turbo and gpt-4. It’s iterative, and it has a system message where you can steer the tone of the conversation. It’s built on top of the instruct types.
Completion-instruct: text-davinci-n and the like. These are also similar to ChatGPT in that you ask for something and it returns it.
Completion: davinci, curie, and such. These are amazingly powerful but difficult to use. You have a lot more control over them, but they act in the form of autocomplete. Instead of prompts like “Give me some ideas for a baby girl name”, you’d say something like “Here’s a list of baby girl names: 1. Ayesha 2. Natasha 3.” and then it autocompletes.
You can use text-davinci-003 instead of gpt-3.5-turbo, but it’s more expensive and slightly slower, with somewhat better quality. gpt-3.5-turbo is the most similar to ChatGPT.
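To make the autocomplete framing concrete, here's a small sketch (the helper name and structure are mine, not from any library) of turning an instruct-style request into a raw-completion prompt like the one described above:

```javascript
// Hypothetical helper illustrating the reframing: a base model like
// davinci continues text, so you give it the beginning of the answer
// instead of an instruction.
function toCompletionPrompt(topic, examples) {
  // Seed the list with a few examples and stop right after the next
  // item number, so the model autocompletes from there.
  const seeded = examples.map((name, i) => `${i + 1}. ${name}`).join(" ");
  return `Here's a list of ${topic}: ${seeded} ${examples.length + 1}.`;
}

const prompt = toCompletionPrompt("baby girl names", ["Ayesha", "Natasha"]);
// prompt === "Here's a list of baby girl names: 1. Ayesha 2. Natasha 3."
```

You would then send that string as the prompt to the completion endpoint, rather than the user's question verbatim.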
Hi, thank you so much. Your tips and guides helped me build the app successfully.
There is one concern, though…
The content generation process takes around 20 seconds.
I know max_tokens can affect the speed of the generation, but in my case I’m using max_tokens = 400 to generate about 260 words, which is not much at all.
Using a lower value cuts the answer off.
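One option that may help (an assumption on my part, not something confirmed in this thread): the completion endpoints accept stream: true, so you can show tokens as they arrive instead of waiting 20 seconds for the full response. Total time is similar, but perceived latency drops a lot. A rough sketch for Node 18+:

```javascript
// Pure helper: pull the JSON payloads out of SSE "data: ..." lines.
function parseSseLines(chunkText) {
  return chunkText
    .split("\n")
    .filter((line) => line.startsWith("data: ") && !line.includes("[DONE]"))
    .map((line) => JSON.parse(line.slice(6)));
}

async function streamCompletion(prompt, onToken) {
  const res = await fetch("https://api.openai.com/v1/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "text-davinci-003",
      prompt,
      max_tokens: 400,
      stream: true, // server sends tokens incrementally as SSE events
    }),
  });
  const decoder = new TextDecoder();
  for await (const chunk of res.body) {
    for (const event of parseSseLines(decoder.decode(chunk))) {
      onToken(event.choices[0].text); // hand each token to the UI as it arrives
    }
  }
}
```

In a frontend, onToken would append each fragment to the displayed answer.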
Can someone tell me the difference in response time between these 3.5-turbo versions, please?
gpt-3.5-turbo
gpt-3.5-turbo-0301
gpt-3.5-turbo-0613
gpt-3.5-turbo-16k
gpt-3.5-turbo-16k-0613
Yes, GPT-4 is the slowest. Response times vary from day to day, from prompt to prompt, and from request to request. It’s best to benchmark each model with your specific prompt, both for speed and for quality of response in your use case.
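A minimal benchmarking sketch along those lines (the structure, runs count, and use of the median are my choices, not anything prescribed by the API):

```javascript
// Median is less noisy than the mean when individual requests
// occasionally stall, which happens often with API latency.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Time a single async call in milliseconds.
async function timeCall(fn) {
  const start = Date.now();
  await fn();
  return Date.now() - start;
}

// Call each model several times with the same prompt and compare medians.
// runOnce(model) should perform one API request for that model.
async function benchmark(models, runOnce, runs = 5) {
  const results = {};
  for (const model of models) {
    const samples = [];
    for (let i = 0; i < runs; i++) {
      samples.push(await timeCall(() => runOnce(model)));
    }
    results[model] = median(samples); // model name -> median latency in ms
  }
  return results;
}
```

You would pass the model names from the list above and a runOnce that sends your real prompt, then compare the resulting medians.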