Mini or Nano? 4o or 4.1? RAG MODEL

Based on your experience with chatbots using RAG, which model has worked best for you?
Both when responding and when interpreting the prompt, because I see that if you want to add a new line to the prompt, such as “respond in the language of the user’s question,” it doesn’t take this new rule into account, and it also happens that if you tell it how to act, it takes some things into account, but not others.
The main problem I’m having is that I can’t get it to not refer to the information I give it when it answers me. It always says things like “in the information I have” or “in the documentation provided.”

What have been your conclusions? Thank you very much.

  1. manage your own chat
  2. use your own functions
  3. after a function return message add a system message
  4. describe the needed output behavior
  5. (be OpenAI, do this for web_search and file_search on developer APIs because models fail to follow directions: break tool iterations and developer applications)

If the AI is constantly told by OpenAI before every user message sent “the user has uploaded files”, then RAG applications will also break and the AI will talk to the users.

Solutions: Get off OpenAI internal tools. Get off Responses. Build portable apps that can follow to any developer-friendly AI inference provider that has good AI models, turn-by-turn.