Maintaining memory of previous messages in threads and running jobs natively on OpenAI servers is extremely useful. With this, I can run batches of jobs that assemble a dynamic ‘chain-of-X’ in the background without re-sending the same text on every call.
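For reference, this is roughly the pattern I mean (a minimal sketch with the openai Python SDK v1; the instructions and message content are placeholders):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One assistant, reused across many runs.
assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions="You are one link in a chain-of-X pipeline.",  # placeholder
)

# The thread holds conversation state server-side.
thread = client.beta.threads.create()

# Append a message; prior messages in the thread never need to be re-sent.
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Step 1 of the chain goes here.",  # placeholder
)

# Kick off a run against the thread in the background.
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
```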
GPT-4 models seem to get in the way of the prompt engineering I’m using to assemble precise chain-of-X context prompts. In other words, GPT-4 interferes by doing too much of the work that I’m trying to do myself in a specific way.
I’d like to downshift to GPT-3.5 while keeping the Assistants API functional for just Threads, Messages, and Runs. I don’t even need functions or files; if I do, I can route only those use cases to GPT-4 or whatever endpoint fits.
But as it currently stands, it seems I cannot use GPT-3.5 with the Assistants API. Is this true? I’m not sure, because my runs never complete under GPT-3.5 models. I assume it’s because I didn’t use “gpt-4-1106-preview” and the model isn’t compatible with the new API? Or maybe I have a bug in my code?
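For what it’s worth, this is the polling loop I use to check; it at least distinguishes a hung run from a failed one (sketch; the ids are placeholders for the ones created above):

```python
import time

from openai import OpenAI

client = OpenAI()
thread_id = "thread_abc123"  # placeholder id from the setup above
run_id = "run_abc123"        # placeholder id from the setup above

# Poll until the run leaves its non-terminal states.
while True:
    run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
    if run.status in ("queued", "in_progress"):
        time.sleep(1)
        continue
    break

print(run.status)          # e.g. "completed", "failed", "expired"
if run.status == "failed":
    print(run.last_error)  # should say whether the model was rejected or something else broke
```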
So, is the Assistants API going to be compatible with GPT-3.5 going forward? Alternatively, is there a way to “downshift” out of GPT-4 so that I get more “raw” output with more consistent completions on one-shot/few-shot prompts?
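The “downshift” I have in mind for the one-shot/few-shot cases is just the plain, stateless Chat Completions endpoint (sketch; the few-shot messages are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Stateless one-shot/few-shot call outside the Assistants API.
completion = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    temperature=0,  # keep completions as consistent as possible
    messages=[
        {"role": "system", "content": "Answer in the exact format shown."},  # placeholder
        {"role": "user", "content": "Example input -> example output"},      # few-shot placeholder
        {"role": "user", "content": "Real input goes here."},                # placeholder
    ],
)
print(completion.choices[0].message.content)
```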
It’s difficult to engineer a solution that relies on heavily nested prompt engineering if the outputs are muddied by GPT-4’s good intentions of jumping straight to the user’s answer in one shot. I’ll get there myself, eventually. But I may need RAG or other tools in the middle to verify intermediate results or branch into other threads along the way.
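The branching I mean would look something like this: pull an intermediate result out of one thread and seed a fresh thread with it (sketch; it assumes the latest message is a plain text reply):

```python
from openai import OpenAI

client = OpenAI()
thread_id = "thread_abc123"  # placeholder id of the source thread

# Grab the newest message from the source thread (the list is newest-first by default).
latest = client.beta.threads.messages.list(thread_id=thread_id).data[0]
intermediate = latest.content[0].text.value  # assumes a plain text content block

# Branch: seed a new thread with that intermediate result for a separate chain.
branch = client.beta.threads.create(
    messages=[{"role": "user", "content": intermediate}],
)
```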