Aha! Semantic Chunking! https://youtu.be/w_veb816Asg?si=fKodYwnF1rupHr0V
Absolutely. Not only case by case, but sometimes document by document. I have multiple custom embedding schemes within one dataset because different documents sometimes require different embedding methods: sometimes I add a summary of the entire document to each chunk, sometimes I summarize each individual chunk, sometimes I just place a overall description in each chunk, and sometimes I don’t add anything because the title and other metadata suffices.