How can ChatGPT do both text and code at the same time?

Hi there, I’m trying to build an app that can handle both text and code in the same way ChatGPT does. But text-davinci and code-davinci are two separate models. How does ChatGPT know the nature of the user query so it can fire up the relevant engine? And how can I detect the nature of the query in the same way?

2 Likes

The short answer is that you can’t.

Davinci is OK at doing some tasks related to code though.

Maybe you could offer the user a “mode” switch to allow them to pick the best engine?
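Something like this, as a minimal sketch (the mapping and fallback here are just illustrative; both model names were available in the API at the time):

```python
# Hypothetical "mode" switch: the user picks the engine explicitly,
# so no extra classification call is needed.
ENGINES = {
    "text": "text-davinci-003",
    "code": "code-davinci-002",
}

def engine_for(mode: str) -> str:
    # Fall back to the text model if the mode is unrecognised.
    return ENGINES.get(mode, "text-davinci-003")
```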

2 Likes

@raymonddavey but how is it that ChatGPT can do it? Any ideas?

ChatGPT uses a later version of davinci. There is no API for it yet.

It doesn’t switch models on the fly (as far as I know), so it gets all its answers from davinci. This includes the code ones.

We don’t have access to the same engine, but davinci-003 does a fairly good job anyway.

No, according to the documentation on OpenAI’s website, it uses the same text-davinci-003, but fine-tuned a bit. text-davinci-003 and code-davinci-002 are part of the GPT-3.5 collection. We already have access to the models powering ChatGPT. I’m just not sure how they’re synchronising the different models so seamlessly.

If anyone has ideas, I’d really appreciate any insight.

1 Like

Hi @rami10000

Could you please share this exact reference?

According to what I have read about ChatGPT, the architecture is more in line with what @raymonddavey mentioned: ChatGPT uses multiple models, including both a davinci model and a codex model.

It would be interesting to see the reference that backs up your “No” reply.

Thanks!

:slight_smile:

ChatGPT is based on a davinci model from the 3.5 series.

It’s a model we don’t have access to in the API.

It combines several models into one. Refer to the link above for a detailed breakdown of the models.

3.5 is a single model, so there is no need for ChatGPT to swap models on the fly.

1 Like

Excellent reference!

This about sums it up from the reference kindly provided by @raymonddavey:

GPT-3.5 series is a series of models that was trained on a blend of text and code from before Q4 2021. The following models are in the GPT-3.5 series:

  1. code-davinci-002 is a base model, so good for pure code-completion tasks
  2. text-davinci-002 is an InstructGPT model based on code-davinci-002
  3. text-davinci-003 is an improvement on text-davinci-002

:+1:

1 Like

“ChatGPT is fine-tuned from a model in the GPT-3.5 series, which finished training in early 2022. You can learn more about the 3.5 series [here]”

And here are the models included in GPT-3.5:

As you can see, it includes text-davinci-003 and code-davinci-002, both of which are accessible via the API.

So in theory, we should be able to build our own ChatGPT via these existing models if we can fine-tune them in the same way.
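For reference, fine-tuning through the API at that time meant supplying JSONL prompt/completion pairs to one of the base models (note that text-davinci-003 itself was not fine-tunable then, only the base models were). A made-up record, just to show the shape:

```python
import json

# Hypothetical training record in the JSONL prompt/completion format the
# fine-tuning endpoint expected at the time (one JSON object per line).
record = {
    "prompt": "User: How do I reverse a string in Python?\nAssistant:",
    "completion": " Use slicing: my_string[::-1]\n",
}

with open("training_data.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```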

Yes, but it is highly unlikely you can “fine-tune them in the same way”, especially since you have no idea how it was fine-tuned by OpenAI and that information is proprietary.

:slight_smile:

Hence my original question in this thread to see if anyone in the community has figured it out yet.

Understood; and I think @raymonddavey answered correctly:

In my view, you would need a huge amount of computing resources, a small fortune, and a team of top data scientists to build ChatGPT as you describe; so from an engineering perspective, you are probably “better off” waiting until the “coming soon” ChatGPT API is released if you want to build apps that mimic ChatGPT functionality.

I agree with you, @rami10000, that ChatGPT more than likely does not “switch” between models; surely they integrate the models somehow, but how they perform this integration is anyone’s guess.

If you find out, please post back and enlighten all of us.

Personally, I’m waiting for the ChatGPT API to be released.

Thanks

:slight_smile:

You can build smaller versions of ChatGPT on your own now, here from the father of AI at Tesla, no less! :grin:

3 Likes

@curt.kennedy BEST Link EVER !!

1 Like

Hey @curt.kennedy

I went through the video, and I think it’s a very long (impossible) stretch to say that this “baby steps, learn to crawl” GPT video is anywhere close to ChatGPT, which cost millions and millions of USD to develop.

Don’t get me wrong, it’s a nice tutorial; but it is very, very far from even being a “smaller version of ChatGPT”. It’s just a “first steps with GPT” tutorial, and I am quite sure @rami10000 will not be able to get the ChatGPT functionality he is looking for by starting GPT from scratch like that!

It would take many years and many millions of dollars to get close to where OpenAI is with ChatGPT, starting from scratch like that. Otherwise, there would be no need for OpenAI.

:slight_smile:

1 Like

To address your initial problem, a common way to achieve this is to:

  1. Use text-davinci-003 with a prompt designed to analyze the semantics of the user query
  2. Generate a completion with the appropriate model and parameters, based on the response (a minimal sketch follows the prompt example below)

Prompt example:

Is the query below about generating code? Answer with yes/no.

Query: ${query}

Answer:

If the answer is yes, generate the completion for the query with code-davinci-002, otherwise with text-davinci-003.
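Put together, a minimal sketch of that two-step routing, using the v0.x openai Python library and Completion endpoint as they existed at the time (key handling and the sampling parameters here are assumptions):

```python
import openai  # pip install openai (the v0.x library of that era)

openai.api_key = "sk-..."  # your API key

ROUTER_PROMPT = (
    "Is the query below about generating code? Answer with yes/no.\n\n"
    "Query: {query}\n\n"
    "Answer:"
)

def is_code_query(query: str) -> bool:
    # Step 1: a short, deterministic classification call.
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=ROUTER_PROMPT.format(query=query),
        max_tokens=3,
        temperature=0,
    )
    return resp["choices"][0]["text"].strip().lower().startswith("yes")

def complete(query: str) -> str:
    # Step 2: route the query to the model that fits it.
    model = "code-davinci-002" if is_code_query(query) else "text-davinci-003"
    resp = openai.Completion.create(model=model, prompt=query, max_tokens=512)
    return resp["choices"][0]["text"]
```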

Thanks Patrick, I had thought of this solution already. But it means duplicating tokens/cost with every query, so I wondered if there’s a different way ChatGPT is doing it.

Do you think ChatGPT makes a semantic assessment of every query and then fires it off to the relevant engine for completions?

But please remove the locks on the amount of code (and the number of characters) ChatGPT can respond with. It is impossible to debug, or to consider using it as an AI tool, with that limitation. It is impressive how it can assist in error checking and help write code, but when one asks for a response and it cuts off halfway, it then cannot follow the context. Sometimes it does wonderfully and sometimes it does not; it is a tool with very big flaws for helping us coders.

Gotcha! Indeed, it does mean duplicating tokens.

I’m not sure what ChatGPT does, although I remember Sam Altman saying in a recent interview that building it was fairly simple and that he was surprised no one had built something like ChatGPT since the release of the API and models. That being said, I have no other insights on this, and the consensus from the replies above seems to be that it’s quite complex and costly.

In my application, I use a much shorter prompt for the semantic analysis compared to the answer generation, which means the added cost is negligible.
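As a back-of-the-envelope check (the token counts are made up, and the $0.02/1K davinci rate is the published pricing at the time, so treat both as assumptions):

```python
# Rough overhead estimate for the extra classification call.
DAVINCI_PER_1K = 0.02  # USD per 1K tokens for text-davinci-003 (assumed)

routing_tokens = 40    # short yes/no prompt plus a one-token answer (assumed)
answer_tokens = 1500   # typical full completion for the user query (assumed)

routing_cost = routing_tokens / 1000 * DAVINCI_PER_1K   # ~$0.0008 per query
overhead = routing_tokens / answer_tokens
print(f"routing adds ~{overhead:.1%} to the cost of each query")  # ~2.7%
```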

But if you are trying to optimize for cost, I’d consider performing your semantic assessment with:

  • Cheaper models (e.g. ada/curie, though these require more prompt engineering for good results)
  • OpenAI competitors, or self-hosted open-source pre-trained models (plenty on Hugging Face; see the zero-shot sketch below)
  • An NLP API, such as Google Natural Language, Amazon Lex, or Wit.ai (free)
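For the self-hosted route, one option is zero-shot classification with a Hugging Face NLI model; the model choice and labels below are just one possibility, not a recommendation:

```python
from transformers import pipeline

# Zero-shot classification: no fine-tuning needed, labels are chosen at call time.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "Write a Python function that reverses a linked list",
    candidate_labels=["code generation", "general conversation"],
)
print(result["labels"][0])  # highest-scoring label, e.g. "code generation"
```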

That’s great food for thought, thanks Patrick! I’ll check out the other models and see how well they can semantically identify a query.

1 Like