Plugin Response Token Limit

Hello! I’m running into issues around the token limit when developing my plugin.

Context:
My plugin retrieves a user’s information and returns it to ChatGPT for analysis.

The problem:
Once the response exceeds a certain size, ChatGPT ignores it entirely and hallucinates an answer instead.

Questions:
What is the token limit for the API response? It’s clearly much smaller than the 13,000-token limit described elsewhere.

What can I do to get around this issue?

What kind of information are you passing and returning? Large lists?

Yes - it’s a retrieval plugin. The plugin returns a large list of financial information.

Is there no filtering you can perform? 13,000 tokens is quite a lot. The Wikipedia entry for tokenization only uses 2.6k tokens.

Even if it were able to produce that much information, the user would burn through their daily message limit and eventually stop using the plugin.

The other option is simply to use the API. The 32k token limit could be ideal once access is available. I really don’t think this is a “workaround” issue, though.

As for ChatGPT’s token limit: who knows. As you’ve noticed, it performs truncation in the background, and quite possibly other token-saving techniques we aren’t made aware of. Again, you can use the API if you want more control.

One workaround that popped into my head is having ChatGPT serve a temporary URL that contains the content, rather than outputting it directly.
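Roughly what I have in mind, sketched with FastAPI. Everything here is made up for illustration: the endpoint paths, the fetch_financial_data helper, the in-memory store, and the example.com host are placeholders, not part of any plugin spec.

```python
import uuid

from fastapi import FastAPI

app = FastAPI()
RESULT_STORE: dict[str, list[dict]] = {}  # in-memory stand-in for real storage


def fetch_financial_data(user_id: str) -> list[dict]:
    """Placeholder for whatever retrieval the plugin actually performs."""
    return [{"ticker": "EXAMPLE", "price": 123.45}]


@app.get("/query")
def query(user_id: str):
    rows = fetch_financial_data(user_id)
    result_id = uuid.uuid4().hex
    RESULT_STORE[result_id] = rows
    # Return a short summary plus a link; ChatGPT only has to relay the URL,
    # so the response stays well under any truncation threshold.
    return {
        "summary": f"{len(rows)} records retrieved.",
        "download_url": f"https://example.com/results/{result_id}",
    }


@app.get("/results/{result_id}")
def results(result_id: str):
    # The user (or another tool) fetches the full payload directly.
    return RESULT_STORE.get(result_id, [])
```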

So I’m not hitting the documented token limit; it appears that ChatGPT stops parsing plugin responses once they reach roughly 900 tokens.
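A rough way to check whether a given payload is anywhere near that threshold. This assumes the tiktoken package and its cl100k_base encoding, and the 800-token budget is just a guess based on the behaviour above:

```python
import json

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")


def token_count(payload) -> int:
    """Count tokens in the serialized response body."""
    return len(enc.encode(json.dumps(payload)))


def trim_to_budget(rows: list[dict], budget: int = 800) -> list[dict]:
    """Drop trailing rows until the serialized payload fits the budget."""
    while rows and token_count(rows) > budget:
        rows = rows[:-1]
    return rows
```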

My current workaround is to respond not with JSON but with a comma-separated list, which ChatGPT seems to have an easier time parsing.
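For concreteness, a minimal sketch of that conversion using the standard csv module; the ticker/price fields are just example data. The CSV carries each key once in the header instead of repeating it for every record, which also saves tokens:

```python
import csv
import io


def rows_to_csv(rows: list[dict]) -> str:
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = [{"ticker": "AAA", "price": 1.23}, {"ticker": "BBB", "price": 4.56}]
print(rows_to_csv(rows))
```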

Right. What I’m trying to say is that it’s not necessarily the hard token limit you’re hitting. It can be, but the response is usually truncated proactively before that.

The answer is still obvious: you’re using far too many tokens in your response.

There may be other answers for you, but that’s all I have. Surely you can decrease the amount of data in the response? Is every little bit of data returned actually necessary? Can you narrow your results?
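As a sketch of what I mean by narrowing, something like this, where the field names and the row cap are purely hypothetical:

```python
def slim_response(rows: list[dict],
                  fields: tuple[str, ...] = ("ticker", "price", "date"),
                  max_rows: int = 50) -> list[dict]:
    """Keep only the fields the model needs and cap the number of rows."""
    return [
        {k: row[k] for k in fields if k in row}
        for row in rows[:max_rows]
    ]
```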

It seems like you are using GPT to create queries for some sort of database. Why not just have it create the query, and then execute it elsewhere? Why does GPT need to output the response?
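One reading of that split, sketched with FastAPI and SQLite. The /run_query endpoint, the financials.db file, and returning only counts plus a handful of rows are all assumptions on my part, not how your plugin actually works:

```python
import sqlite3

from fastapi import FastAPI

app = FastAPI()


@app.post("/run_query")
def run_query(sql: str):
    # In a real plugin you would validate/allow-list the SQL before running it.
    conn = sqlite3.connect("financials.db")
    try:
        rows = conn.execute(sql).fetchall()
    finally:
        conn.close()
    # Hand back counts and a small preview, not the full result set, so the
    # model never has to carry thousands of rows through its context.
    return {"row_count": len(rows), "first_rows": rows[:5]}
```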

It’s hard to give more detailed advice when all we know is that you are retrieving a large amount of financial information.