You’ll have to compress the prompt: split it into smaller pieces, ask the model to summarize each piece, and then send the combined summaries in as the prompt for the final inference.
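The summarize-then-combine approach can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `call_model` is a hypothetical stand-in for whatever function sends a prompt to your model and returns its text reply, the prompt wording is made up for the example, and the chunk size is measured in characters only for simplicity (a real limit would normally be counted in tokens).

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text on whitespace into pieces of at most max_chars characters.

    A single word longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def compress_and_answer(
    document: str,
    question: str,
    call_model: Callable[[str], str],  # hypothetical: prompt in, reply text out
    max_chars: int = 2000,             # assumed size limit, in characters
) -> str:
    """Summarize each piece, then run the final inference on the joined summaries."""
    # Step 1: one model call per piece, asking for a summary of that piece.
    summaries = [
        call_model("Summarize the following text concisely:\n\n" + chunk)
        for chunk in split_into_chunks(document, max_chars)
    ]
    # Step 2: a final call whose prompt contains only the summaries.
    combined = "\n".join(summaries)
    final_prompt = combined + "\n\nUsing the summaries above, " + question
    return call_model(final_prompt)
```

To try it, pass any function that takes a prompt string and returns a string as `call_model`. If the combined summaries are still too long for the model, the same procedure could be applied again to the summaries themselves.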