Has anyone run into this with the Assistants API? After uploading a 34-page document, the initial context for the first question alone consumes tens of thousands of tokens. This can be quite costly for customers, especially since every subsequent round of questioning adds tens of thousands of tokens of context again. Has anyone else faced this issue?
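For context on the numbers involved, here is a rough back-of-the-envelope sketch of why a 34-page document can cost tens of thousands of input tokens per question if the full text is injected into the context each round. The words-per-page figure and the tokens-per-word ratio are assumptions (common heuristics, not taken from this thread), and real counts depend on the tokenizer:

```python
# Assumptions (not from the thread): ~500 words per text-heavy page,
# and the common heuristic of ~1.33 tokens per English word.
PAGES = 34
WORDS_PER_PAGE = 500      # assumed average page density
TOKENS_PER_WORD = 1.33    # rough heuristic; actual counts vary by tokenizer

# Approximate token footprint of the document itself.
doc_tokens = int(PAGES * WORDS_PER_PAGE * TOKENS_PER_WORD)

def context_tokens(rounds, question_tokens=50):
    """Total input tokens billed if the full document is re-sent every round,
    with the growing question history added on top (hypothetical sizes)."""
    total = 0
    for r in range(1, rounds + 1):
        total += doc_tokens + r * question_tokens
    return total

print(f"document alone: ~{doc_tokens:,} tokens")
for rounds in (1, 3, 5):
    print(f"after {rounds} round(s): ~{context_tokens(rounds):,} input tokens")
```

Even with these conservative assumptions the document alone lands around 22k tokens, so re-sending it as context every round quickly multiplies the bill, which matches the behavior described above.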
This is a known consequence of the Assistants system in its current quality-focused mode; more options are due to be added soon that will address the token usage levels you are seeing.