For both the Chat API and the Assistants API, how does the input work when supplying a book to be analyzed later? Simply put: where do I paste the text of the book in each case?
I know that GPT-4o has a context window of 128,000 tokens.
I want to reduce cost, so it seems the Chat API would be the better choice?
When using the Chat API, where and how do I inject the full text of a book, which will always be under 100k tokens?
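For context, here is a minimal sketch of what I understand so far: with the Chat Completions API the book would simply be pasted into the `messages` array itself (the model string, prompt wording, and helper names below are my own assumptions, not an official recipe). It assumes the official `openai` Python package and an `OPENAI_API_KEY` environment variable.

```python
def build_messages(book_text: str, question: str) -> list[dict]:
    """Package the full book text and a question as chat messages.

    The entire book goes inline in the user message -- this only works
    because the book fits comfortably inside the 128k-token window.
    """
    return [
        {
            "role": "system",
            "content": "You are a literary analyst. Answer using only the book text provided.",
        },
        {
            "role": "user",
            "content": f"BOOK TEXT:\n{book_text}\n\nQUESTION: {question}",
        },
    ]


def ask_about_book(book_text: str, question: str) -> str:
    """Send the book plus question to the Chat Completions endpoint."""
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed model name; 128k-token context window
        messages=build_messages(book_text, question),
    )
    return resp.choices[0].message.content
```

If this is roughly right, the "where do I paste the book" answer for the Chat API is just the `content` field of a message; the cost is then paid on every request, since the whole book is re-sent each time.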
I read the following, but it’s confusing me:
“The Assistants API provides File Search (tool) capabilities. File Search parses and chunks your documents, creates and stores the embeddings, and uses both vector and keyword search to retrieve relevant content to answer user queries. The Chat Completions API does not.”
“However, the Chat API is more powerful if you are willing to implement some of the functionality yourself (or by using libraries like LangChain). You will have a lot more control over the whole process. For instance, you can control how you want your documents to be chunked and the chunk length for the embeddings part, and so on.”
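To make sure I understand the "control how your documents are chunked" part of that quote, here is a rough sketch of the kind of chunking one would apparently implement oneself before computing embeddings (the function name, sizes, and overlap strategy are my own illustrative choices, not anything prescribed by the API):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping fixed-size character windows.

    With the Chat API you choose these parameters yourself; the
    Assistants API's File Search picks its own chunking internally.
    """
    step = chunk_size - overlap  # advance by less than chunk_size so windows overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
    return chunks
```

Each chunk would then be embedded and stored, and at query time only the most relevant chunks would be pasted into the prompt instead of the whole book. Is that the workflow the quote is describing?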