Document Ingestion EndPoint

keith9 · April 10, 2024, 6:42pm

A use case requires uploading a Word document that is related to a Chat Completion query. Can I upload office-format documents to gpt-4 model using the API? If so, what is the endpoint?

_j · April 10, 2024, 6:48pm

The only input to AI models on the chat completions endpoint is tokens contained in messages format. Other than plain text, the only other “tokens” understood are images on multimodal models.

You must perform your own document extraction to plain text, and then if necessary, reduce the input through document chunking and indexed or semantic search, which can be automatic based on user input or application, or can be a manual search done by AI.

keith9 · April 10, 2024, 6:57pm

Thanks for your reply. Are there other AI models that support uploading an MS word document directly to an endpoint?

_j · April 10, 2024, 7:16pm

The Assistants framework, with which you must delegate control of conversations completely and in which an autonomous agent can make multiple inefficient calls to search for knowledge and has potentially irrelevant documents always placed into context, has file upload with document extraction.

I would look into your own document parsing engine or another API’s, or simply “save as” text if it is your own documents that must inform the AI.

Then you’ll need to use more clever or manual techniques if it is more input data than can be provided within an AI model’s context length.

keith9 · April 13, 2024, 12:51am

Hello, and thanks for the pointers. As a newbie this kind of help saves tons of time and is much appreciated.

Topic		Replies	Views
How can I upload documents API chatgpt	1	161	January 13, 2025
Gpt-4-vision-preview model for other document types not just images API	6	1686	January 13, 2024
CHAT-GPT Search API For Document Upload API	8	29824	December 12, 2023
Extracting Data from ChatGPT API Without Python – Alternatives for SAP Integration? API api	2	68	January 30, 2025
Uploading a documentation pdf API chatgpt	7	8604	September 12, 2024

Document Ingestion EndPoint

Related topics