You could do it… but you'd have to write routing code that detects what kind of prompt it is: is this code? If yes, use model A; if no, use model B.
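To make that concrete, here's a toy sketch of the detect-and-route idea, assuming two Hugging Face checkpoints you can run locally. The regex "is this code?" check and the model choices (gpt2 standing in for a code model, bigscience/bloom-560m for chat) are placeholder assumptions, not a real classifier or a recommendation:

```python
# Toy sketch: route prompts to one of two models based on a crude heuristic.
import re

from transformers import pipeline

# Hypothetical picks: gpt2 stands in for a code model, BLOOM for chat.
code_model = pipeline("text-generation", model="gpt2")
chat_model = pipeline("text-generation", model="bigscience/bloom-560m")

# Crude heuristic: prompts containing code-ish tokens go to the code model.
CODE_HINTS = re.compile(r"(def |class |import |;|\{|\})")

def route(prompt: str) -> str:
    model = code_model if CODE_HINTS.search(prompt) else chat_model
    return model(prompt, max_new_tokens=50)[0]["generated_text"]

print(route("def fibonacci(n):"))                 # goes to code_model
print(route("What's the weather like on Mars?"))  # goes to chat_model
```

A real version would swap the regex for a small classifier, but the shape of the dispatch is the same.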
The easier way to do it is to gather whatever models you want (BLOOM, GPT-2, etc.) and combine them behind a router like that, so the combined system does what you want.

Keep in mind, though, that these models take a LOT of GPU power to run that fast. Even if you had 64 GB of CPU memory, it's going to be worse than a GPU with 16 GB of VRAM, because a GPU's memory bandwidth and parallelism matter more than raw capacity. So if you want a local ChatGPT for your own use, you're better off waiting for Stability AI to release theirs, unless you enjoy coding and tinkering. (I do, but I'm a programmer, not a data scientist…) So even now, I'm still wondering whether a localized version of ChatGPT would even be worth it hardware-wise.
But if you figure it out, let us know.