Splitting / Chunking Large input text for Summarisation (greater than 4096 tokens....)

Amazing - checking it out now! Thank you.

Edit - it’s similar to the approach I’ve been taking. I think the main difference is that when summarising a novel you can sacrifice detail. When analysing certain documents, however - let’s say an RFP - you don’t want to lose details such as ‘Budget’ and ‘Deadlines’. I’m sure GPT-3 can handle this given the right approach. For any input under the token limit, one shot is enough.
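To make the idea concrete, here's a rough sketch of what I mean by not losing those details: instead of asking for a free-form summary, ask for named fields explicitly. The field list and prompt wording below are purely illustrative (my own assumptions, not anything from this thread):

```python
# Hypothetical sketch: prompt GPT-3 for specific RFP fields so details
# like Budget and Deadlines survive, rather than a free-form summary.
# The field names and wording are illustrative assumptions.
FIELDS = ["Budget", "Deadlines", "Scope", "Contact"]

def build_extraction_prompt(chunk: str, fields=FIELDS) -> str:
    # One labelled line per field nudges the model toward
    # structured, easy-to-parse output.
    field_lines = "\n".join(f"{f}:" for f in fields)
    return (
        "Extract the following fields from the RFP excerpt below. "
        "If a field is not mentioned, write 'not stated'.\n\n"
        f"Excerpt:\n{chunk}\n\n"
        f"{field_lines}"
    )

prompt = build_extraction_prompt("Total budget is $50k; proposals due 1 June.")
```

The resulting `prompt` string would then be sent as the completion input; per-chunk answers can be merged afterwards.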

@daveshapautomator have you experimented with more structured extraction of features?

Edit 2 - The textwrap lib seems like it can help with chunking and whitespace processing.

Edit 3 - @daveshapautomator I think some of what you talk about here will also help, in your example where you extract medical information and prognosis: How to prevent Open AI from making up an answer
