ChatGPT Plugins Context Length

What are the context lengths when using ChatGPT plugins? What happens when you go over these limits, especially with API responses?

In my testing, ChatGPT starts hallucinating, seemingly at random, after an API request. It’s clear that some context window limit has been hit, but it’s not clear why.

Would be great if there were better debugging tools to figure out what is chewing up the context.


Set your temperature to 0.10 or less to limit hallucinations internally. For the response from the ChatGPT interface, add prompts to help guide the output.

I’m referring to ChatGPT Plugins, not the completion interface. You can’t set the temperature in Plugins as far as I can tell.

Generally, prompts control the hallucinations and guide responses. It sounds like your responses are raw data with no context. A structured NLP prompt can help here.

From my experience using other plugins, ChatGPT does “hallucinate” that its knowledge has a cutoff date of Sep 2021, and thinks it does not have access to any plugins, even in the very early rounds of a chat.

I’m not sure whether this is the type of “hallucination” you have encountered. My situation definitely runs into the dilemma where pre-training and fine-tuning conflict with the new functionality introduced by plugins (at least occasionally, and depending on your plugin’s prompt instructions). I wonder very much how OpenAI will approach solving this.

I’m having this same issue with the plugin I’m developing.

When the plugin queries our API and gets a response that exceeds a certain size, it seems to respond with made-up data. I’m sure this is related to context length limitations, but I’m not sure how to proceed, considering the response size depends on the amount of data needed to fulfill the query. Is there a recommended way to handle this?

I’m not sure that the suggestions in this thread relate to plugin development specifically. @speedplane were you able to find a workaround?


Any luck with this? I’m also looking for information on how plugin API responses consume context tokens.

I think it’s 8k, as per GPT-4.

To get around hallucinations, our plugin server tokenizes output responses with Tiktoken to count their size and truncates them to slightly less than 8k tokens before they are sent to the client. This way, any other tokens ChatGPT decides to append don’t ruin the output either.
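
For anyone who wants to try this, here is a rough sketch of that kind of server-side truncation. It is not our actual server code. The helper names (`truncate_to_token_budget`, `count_tokens`, `CharEncoder`) and the 7,500-token budget are made up for illustration, and the 8k window itself is only my guess from above. The only Tiktoken calls used are `tiktoken.get_encoding("cl100k_base")`, `encode`, and `decode`; the one-token-per-character fallback is just there so the sketch runs without Tiktoken installed.

```python
from typing import List, Protocol


class Encoder(Protocol):
    """Anything exposing tiktoken-style encode/decode methods."""

    def encode(self, text: str) -> List[int]: ...

    def decode(self, tokens: List[int]) -> str: ...


class CharEncoder:
    """Hypothetical stand-in encoder: one token per character (offline demo only)."""

    def encode(self, text: str) -> List[int]:
        return [ord(ch) for ch in text]

    def decode(self, tokens: List[int]) -> str:
        return "".join(chr(t) for t in tokens)


def count_tokens(text: str, encoder: Encoder) -> int:
    """Return how many tokens the encoder produces for text."""
    return len(encoder.encode(text))


def truncate_to_token_budget(text: str, encoder: Encoder, max_tokens: int) -> str:
    """Return text unchanged if it fits, else only its first max_tokens tokens."""
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    tokens = encoder.encode(text)
    if len(tokens) <= max_tokens:
        return text
    # Cutting on a token boundary can split a word or a multi-byte character.
    return encoder.decode(tokens[:max_tokens])


def load_tiktoken_encoder() -> Encoder:
    """Load the cl100k_base encoding (third-party package: pip install tiktoken)."""
    import tiktoken

    return tiktoken.get_encoding("cl100k_base")


if __name__ == "__main__":
    try:
        encoder = load_tiktoken_encoder()
    except Exception:
        # Tiktoken missing or its vocabulary file unavailable: use the stand-in.
        encoder = CharEncoder()

    body = "row of plugin data. " * 5000
    # ASSUMPTION: 7,500 leaves some headroom under an 8k window; tune for your plugin.
    trimmed = truncate_to_token_budget(body, encoder, 7500)
    print(count_tokens(body, encoder), "->", count_tokens(trimmed, encoder))
```

If the response is JSON, a hard cut like this will usually leave it malformed, so it is probably better to drop whole records until the count fits and cut only plain text this way.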

Yes, this happens with any plugin, my own too: GPT-4 makes stuff up. Especially with my source code, it fully lies to me :D.