As @anon22939549 points out, file size and token count are the main factors to consider. In this case it’s a bit tricky, because the standard way to count tokens expects a text string as input rather than a file.
In general, you can expect PDF files to cause additional issues with retrieval. If you can provide the knowledge input as a text file, your results will likely be better, and you can check the token counts yourself using tiktoken.
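
For reference, here is a minimal sketch of how that token check could look for a plain text file. The helper names `count_tokens` and `count_file_tokens` are just illustrative, and the `cl100k_base` encoding is an assumption on my part, so swap in whichever encoding matches the model you are using.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence


def count_tokens(
    text: str,
    encode: Optional[Callable[[str], Sequence[int]]] = None,
) -> int:
    """Return the number of tokens in `text`.

    `encode` is any callable that turns a string into a sequence of tokens.
    If it is omitted, tiktoken is used.
    """
    if encode is None:
        # Imported lazily so the helper also works with another tokenizer.
        import tiktoken  # pip install tiktoken

        # Assumption: cl100k_base is the right encoding for your model.
        encode = tiktoken.get_encoding("cl100k_base").encode
    return len(encode(text))


def count_file_tokens(path: str, encode=None) -> int:
    """Read a UTF-8 text file and return its token count."""
    text = Path(path).read_text(encoding="utf-8")
    return count_tokens(text, encode)
```

You would then call something like `count_file_tokens("knowledge.txt")`, where the file name is only a placeholder for your own text export. Note that this only works once the content is plain text; a PDF would need to be converted first, which is part of why text input tends to be easier to work with.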