Our project works with US regulations.
We have 121k docs in total, and out of those 121k, 5k docs need immediate (urgent) processing.
Some of the regs are pretty long, well over the 8k tokens available in the GPT-4 API.
To handle the long regs we tried different algorithms to split the docs, but with every split the context spanning two or more reg sections gets lost and processing accuracy degrades (rough sketch of the splitting approach below).
We badly need the 128k-token GPT-4.
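For reference, this is roughly the kind of section-aware splitting with overlap we tried; it's a minimal sketch, and the `tiktoken` usage, the section-heading regex, `MAX_TOKENS`, and `OVERLAP_SECTIONS` are illustrative assumptions rather than our exact pipeline:

```python
# Sketch: split a reg on its own section headings, keep the previous section
# as overlap so cross-section context is not lost entirely. Illustrative only.
import re
import tiktoken

MAX_TOKENS = 7000      # leave headroom under the 8k context window
OVERLAP_SECTIONS = 1   # carry the last section of a chunk into the next one

enc = tiktoken.encoding_for_model("gpt-4")

def split_reg(text: str) -> list[str]:
    # Split on "SEC." / "§" headings so chunks follow the reg's own structure.
    sections = re.split(r"\n(?=(?:SEC\.|§)\s*\d)", text)
    chunks, current = [], []
    for sec in sections:
        candidate = "\n".join(current + [sec])
        if len(enc.encode(candidate)) > MAX_TOKENS and current:
            chunks.append("\n".join(current))
            # keep the tail of the previous chunk as overlap
            current = current[-OVERLAP_SECTIONS:] + [sec]
        else:
            current.append(sec)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Even with the overlap, anything that refers back to a definition from a much earlier section of the reg still gets cut off, which is why chunking only gets us so far.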
Update 1hr later:
I’ve got it. Testing …
Just needed some patience. )
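In case it helps anyone, this is roughly what I'm testing now: passing a whole reg to the 128k-context model instead of chunking. A minimal sketch, assuming the v1 Python SDK and `gpt-4-1106-preview` as the 128k model; the prompt is just a placeholder:

```python
# Sketch: send an entire regulation to the 128k-context model in one call.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def process_reg(reg_text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4-1106-preview",  # 128k context window
        messages=[
            {"role": "system", "content": "You analyze US regulations."},
            {"role": "user", "content": reg_text},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content
```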
Update 2:
GPT-4 128k needs adjustments (see screenshot below);
Looks like the bug is already reported, so no need to bother the OpenAI Dev team.