API "chat" vs "assistant" to analyze books

For both the Chat and Assistants APIs, how does the input work when giving the model a book to analyze later? Simply put: where do I paste the text of the book in each case?

I know that GPT-4o has a context window of 128,000 tokens.

I want to reduce the cost, so it seems the Chat API would be better?
When using the Chat API, where and how do I inject the full text of a book, which will always be under 100k tokens?
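As a sanity check on that 100k-token figure, here is a rough pure-Python estimate. The ~4 characters per token ratio for English text is an assumption; for exact counts, use the `tiktoken` package with the model's encoding:

```python
# Rough pre-flight check that a book fits in GPT-4o's 128k-token context.
# Assumption: ~4 characters per token, a common heuristic for English text.
# For exact counts, install tiktoken and use its encoding for your model.

def rough_token_count(text: str) -> int:
    """Very rough token estimate: ~4 characters per token in English."""
    return len(text) // 4

def fits_context(text: str, context_window: int = 128_000, reserve: int = 4_000) -> bool:
    """Check the text fits, reserving room for the prompt and the reply."""
    return rough_token_count(text) <= context_window - reserve
```

If this check fails, the book won't fit in a single Chat Completions request and you'd need chunking or the Assistants API's File Search.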

I read the following, but it’s confusing me:
“The Assistants API provides File Search (tool) capabilities. File Search parses and chunks your documents, creates and stores the embeddings, and uses both vector and keyword search to retrieve relevant content to answer user queries. The Chat Completions API does not.”
“However, the Chat API is more powerful if you are willing to implement some of the functionality yourself (or by using libraries like LangChain). You will have a lot more control over the whole process. For instance, you can control how you want your documents to be chunked and the chunk length for the embeddings part, and so on.”

I will assume that by “Chat API” you mean the Chat Completions API. You can use either of these. If you use Chat Completions, you put your book directly in the prompt messages. If you use the Assistants API, you upload the book file to OpenAI, add that file to a vector store, and use it in threads.
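A minimal sketch of the Chat Completions route, assuming the `openai` Python package and an `OPENAI_API_KEY` environment variable; the model name and file path are placeholders:

```python
# Sketch: injecting a full book into a Chat Completions call.
# Assumes the `openai` package (pip install openai) and OPENAI_API_KEY set.

def build_messages(book_text: str, question: str) -> list[dict]:
    """Put the whole book in the prompt messages."""
    return [
        {"role": "system",
         "content": "You are a literary analyst. Analyze the book the user provides."},
        {"role": "user",
         "content": f"Here is the full text of the book:\n\n{book_text}\n\nTask: {question}"},
    ]

def analyze_book(book_path: str, question: str) -> str:
    # Deferred import so build_messages stays usable without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(book_path, encoding="utf-8") as f:
        book_text = f.read()
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any chat model with a large enough context
        messages=build_messages(book_text, question),
    )
    return response.choices[0].message.content
```

Note that every request re-sends the entire book, so for repeated questions you pay for those input tokens each time.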

If you just want to analyze the book in a one-off exchange — say, you give the model the book, it produces a summary, and that's it — use the Chat Completions API. If you want a back-and-forth conversation with the assistant about your book, use the Assistants API.
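For comparison, a sketch of the Assistants route: upload the book, index it in a vector store, then ask about it in a thread. The `beta` namespace and endpoint names below match ~1.x versions of the `openai` Python SDK and may differ in yours, so treat this as an outline rather than a definitive implementation:

```python
# Sketch: Assistants API with File Search over an uploaded book.
# Assumes the `openai` package; endpoint paths reflect the beta namespace
# of ~1.x SDK versions and may have moved in later releases.

def chat_about_book(book_path: str, question: str) -> str:
    from openai import OpenAI  # deferred import: only needed at call time

    client = OpenAI()

    # 1. Upload the book file.
    book_file = client.files.create(file=open(book_path, "rb"), purpose="assistants")

    # 2. Add it to a vector store so File Search can chunk and embed it.
    store = client.beta.vector_stores.create(name="book-store")
    client.beta.vector_stores.files.create(vector_store_id=store.id, file_id=book_file.id)

    # 3. Create an assistant wired to that store.
    assistant = client.beta.assistants.create(
        model="gpt-4o",
        instructions="Answer questions about the attached book.",
        tools=[{"type": "file_search"}],
        tool_resources={"file_search": {"vector_store_ids": [store.id]}},
    )

    # 4. Start a thread and run it; reuse the same thread for follow-ups.
    thread = client.beta.threads.create(messages=[{"role": "user", "content": question}])
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=assistant.id
    )
    messages = client.beta.threads.messages.list(thread_id=thread.id, run_id=run.id)
    return messages.data[0].content[0].text.value
```

Because File Search retrieves only the relevant chunks per question, follow-up questions don't re-send the whole book the way Chat Completions does.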