Counting is iffy with LLMs, for reasons that I won’t go into in this post.
In my use case (a multi-agent framework), I absolutely need exact counts (see Assistants API - Access to multiple assistants - #36 by icdev2dev).
IF you REALLY need the exact count, I suspect you would be best served by being very deterministic about it: compute the count yourself rather than asking the model to count. (The above post mentions how I “occasionally” post the count onto the thread.) In your case, you might want to post the count beforehand, IF you need it.
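To make the "be deterministic" idea concrete, here is a minimal sketch: count in plain code and build a message stating the number, which you would then post onto the thread yourself. The function names, the `[count]` prefix, and the sample data are all made up for illustration, and the actual posting step is left out because it depends on how your client talks to the thread.

```python
def count_items(items):
    """Count in ordinary code instead of asking the model to count."""
    return len(items)


def build_count_message(label, items):
    """Build the text of a message that states the exact count.

    The "[count]" prefix is an arbitrary, hypothetical convention; use
    whatever wording your assistant instructions expect.
    """
    return f"[count] {label}: {count_items(items)}"


# Hypothetical data standing in for whatever you actually need counted.
records = ["alpha", "beta", "gamma"]

# This string is what you would post onto the thread before asking your question.
message = build_count_message("records", records)
print(message)
```

The point is that the model only ever reads a number that your code computed; it is never asked to do the counting itself.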
IMO there is no particular need to “slow it down”. There are other ways of confirming that it is actually reading the file and extracting the meaning, besides “slowing it down”.