It seems that when we submit PDFs to ChatGPT, it has two modes:
- If the PDF is very large: the PDF gets chunked and the LLM receives a `file_search` tool for retrieval.
- If the PDF is small: the entire content gets stuffed into the message context.
I want to achieve the same result using the Responses API. However, it is hard for us to estimate how large (in tokens) a PDF is, because that would require extracting the text and the images first.
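For context, these are roughly the two paths I'd like to choose between (a sketch; the file name, prompt, and model are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Path 1: small PDF -- attach the file directly to the message context.
uploaded = client.files.create(file=open("doc.pdf", "rb"), purpose="user_data")
response = client.responses.create(
    model="gpt-4.1",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_file", "file_id": uploaded.id},
            {"type": "input_text", "text": "Summarize this PDF."},
        ],
    }],
)

# Path 2: large PDF -- index it in a vector store and expose file_search.
store = client.vector_stores.create(name="pdf-store")
client.vector_stores.files.upload_and_poll(
    vector_store_id=store.id, file=open("doc.pdf", "rb")
)
response = client.responses.create(
    model="gpt-4.1",
    input="Summarize this PDF.",
    tools=[{"type": "file_search", "vector_store_ids": [store.id]}],
)
```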
Has anyone implemented this? Any recommendations?
https://platform.openai.com/docs/guides/pdf-files
Note: the Supported models section needs updating.
I read that before posting, obviously.
You can try using the input token count endpoint.
It will return something like:
`InputTokenCountResponse(input_tokens=402183, object='response.input_tokens')`
It will give you an estimate of how many input tokens will be consumed for the given prompt.
Then, if the count is above a certain threshold, you can decide to put the PDF in a vector store (or handle it some other way).
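A sketch of how that decision could look, assuming the SDK exposes the endpoint as `client.responses.input_tokens.count(...)` (the threshold, model, and file name are placeholders):

```python
from openai import OpenAI

client = OpenAI()
THRESHOLD = 100_000  # placeholder cutoff; tune for your model's context window

uploaded = client.files.create(file=open("doc.pdf", "rb"), purpose="user_data")
pdf_input = [{
    "role": "user",
    "content": [
        {"type": "input_file", "file_id": uploaded.id},
        {"type": "input_text", "text": "Summarize this PDF."},
    ],
}]

# Dry-run count: no response is created, nothing is added to any conversation.
count = client.responses.input_tokens.count(model="gpt-4.1", input=pdf_input)
print(count)  # e.g. InputTokenCountResponse(input_tokens=402183, object='response.input_tokens')

if count.input_tokens > THRESHOLD:
    # Index the PDF in a vector store and attach a file_search tool
    # (see the sketch in the first post).
    ...
else:
    # Small enough: pass pdf_input directly to client.responses.create.
    ...
```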
That seems like a promising path, but I have a couple of questions about how it works:
- If I pass the `conversation` argument, does the input get added to the conversation, so that I can no longer decide not to add the PDF to it?
- If I don't pass the `conversation` argument, does that endpoint charge me for the input tokens, so I pay again when I do decide to send the PDF? Or is that endpoint free?
It is free, as far as I know (as long as there is no abuse).
It won’t affect the conversation; it is basically a request simulation. The parameters are essentially the same as for a response request, and you can measure other types of inputs too.
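In other words (a sketch, assuming the count endpoint accepts the same `conversation` parameter as `responses.create`, as described above; the IDs are hypothetical placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Read-only simulation: the PDF is only measured, never appended to the
# conversation, so you can still decide not to send it afterwards.
count = client.responses.input_tokens.count(
    model="gpt-4.1",
    conversation="conv_123",  # hypothetical conversation ID
    input=[{
        "role": "user",
        "content": [{"type": "input_file", "file_id": "file_abc"}],  # hypothetical file ID
    }],
)
print(count.input_tokens)
```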
Awesome. Will use this. Thanks!