Files.list() limitations?

Two great ideas there, thanks.

I have been doing a RAPTOR-ish thing with long docs, where I include a ~800-token LLM-generated summary at the start of each doc so that the broader concepts get picked up as chunks. Throwing in at least the document title every 800 tokens is a great idea, though.
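For anyone who wants to try it, here is a rough sketch of how the two ideas could combine: one summary chunk up front, then body chunks that each repeat the document title. The function name `chunk_with_title` is made up for illustration, and it counts whitespace-separated words as a stand-in for real tokens, so swap in your actual tokenizer if the 800 limit needs to be exact.

```python
def chunk_with_title(title, summary, body, chunk_tokens=800):
    """Return a summary chunk followed by title-prefixed body chunks.

    Assumption: "tokens" are approximated by whitespace-separated words.
    """
    words = body.split()
    # The first chunk carries the LLM-generated summary, so broad concepts are retrievable.
    chunks = [f"{title}\n\n{summary}"]
    # Every body chunk starts with the title, so it keeps its document context.
    for start in range(0, len(words), chunk_tokens):
        piece = " ".join(words[start:start + chunk_tokens])
        chunks.append(f"{title}\n\n{piece}")
    return chunks


if __name__ == "__main__":
    demo = chunk_with_title("Quarterly Report", "Short summary here.", "word " * 2000)
    print(len(demo), "chunks")
```

Each returned string can then be embedded or uploaded as its own chunk.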

Regarding parsing PDFs into Markdown, try LlamaParse if you haven’t bo1. There are also some open-source libraries around that do this. If you are hitting OpenAI limits, then this could save you $300 a month in storage haha.