Splitting / Chunking Large Input Text for Summarisation (greater than 4096 tokens)

Thank you so much for providing these langchain links! Exactly what I needed.
I tried to explain a little, in layman's terms, how embeddings work and how they can be used.
I think summarizing everything before "needing it" might be expensive overkill, since summarization costs significantly more than embeddings.

I am thinking about creating "rolling" embeddings with a 2k-token overlap, so whenever I detect a "long but interesting" document part I can process only that section iteratively. I will test the approach over the next few days.
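A minimal sketch of the rolling-window idea I have in mind (the `chunk_size` and `overlap` values are just illustrative, and I'm assuming the text is already tokenized into a list):

```python
def rolling_chunks(tokens, chunk_size=4096, overlap=2048):
    """Split a token list into overlapping windows.

    Each window advances by (chunk_size - overlap) tokens, so
    consecutive chunks share `overlap` tokens of context.
    """
    stride = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last window already covers the end of the text
    return chunks

# toy example with small numbers: 10 tokens, window of 4, overlap of 2
print(rolling_chunks(list(range(10)), chunk_size=4, overlap=2))
# → [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

Each chunk would then get its own embedding, and the overlap means an "interesting" passage near a chunk boundary still appears whole in at least one window.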
