Based on my experience and research, I would recommend the following:
- Switch to Markdown formatting. (Note: when I last tried, .md files were not supported by the automatic retrieval, but .txt files with Markdown formatting worked fine.)
- Test your formatting. Sometimes headings are cut off, other times lists won’t work. I recommend creating an assistant in the Playground; there you can check the logs and see whether the expected parts are returned.
- “Annotate” your information. While some abstraction is possible (e.g. the retrieval might return “something red” when you search for “something colored”), it has its limits. Think about what users might ask and how they might phrase it, then ask yourself whether vector search can accomplish that level of abstraction. If not, consider adding specific tags to the relevant parts of your file.
- If you are really serious about this, think about sorting your knowledge files and putting frequently needed info at the beginning; this impacts retrieval time for large files.
I call this process Knowledge File Optimisation (KFO). A detailed write-up of my research on this can be found here
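
To make the tagging and sorting steps a bit more concrete, here is a minimal Python sketch of how such a file could be generated. Everything in it is my own assumption for illustration: the `Section` class, the `Tags:` line convention, the `priority` field and the sample content are all made up, and nothing here is part of any official retrieval API. It only builds Markdown-formatted text that you could save as a .txt file and then check in the Playground.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One chunk of knowledge (hypothetical structure, for illustration only)."""
    heading: str
    body: str
    tags: list = field(default_factory=list)  # extra terms users might search for
    priority: int = 0                         # higher = needed more often


def build_knowledge_file(sections):
    """Return Markdown-formatted text with the most-needed sections first."""
    # Sort so that frequently required info ends up at the beginning of the file.
    ordered = sorted(sections, key=lambda s: s.priority, reverse=True)
    parts = []
    for s in ordered:
        lines = [f"## {s.heading}"]
        if s.tags:
            # Assumed convention: a plain "Tags:" line right under the heading.
            lines.append("Tags: " + ", ".join(s.tags))
        lines.append("")
        lines.append(s.body.strip())
        parts.append("\n".join(lines))
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    # Made-up sample content.
    sections = [
        Section("Return policy", "Items can be returned within 30 days.",
                tags=["refund", "send back"], priority=1),
        Section("Available colors", "The mug comes in red, blue and green.",
                tags=["colored", "color options"], priority=5),
    ]
    # Save as .txt with Markdown formatting rather than as .md.
    with open("knowledge.txt", "w", encoding="utf-8") as f:
        f.write(build_knowledge_file(sections))
```

Whether a `Tags:` line actually improves retrieval for your data is something you would still need to verify in the logs; the sketch only keeps the tags and the ordering in one place so they are easy to adjust and re-test.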