Agreed,
I think embeddings will still have their place for most companies because of the optimization they bring. But it’s still amazing for chatbot applications that use a sliding context window, especially for more complicated tasks. I often have to copy-paste context from earlier in the conversation because it slides out of the 8k context window, and when I do that, the pasted text pushes out even more context.
Totally agree here again. AI agents can also create large amounts of instruction and contextual data that don’t need to be shown to the user. But they tend to lose track of the original task if you allow them to consume too many tokens, again because the original context gets pushed out of the context window.
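
One workaround for the "original task slides out" problem is to pin the first message and only slide the rest. Here is a rough sketch of that idea in Python, not how any particular chatbot or agent framework does it. The names `build_window` and `approx_tokens` are made up for illustration, and the word-count "tokenizer" is an assumption standing in for whatever tokenizer your model really uses.

```python
def approx_tokens(text):
    # Hypothetical stand-in for a real tokenizer:
    # counts one token per whitespace-separated word.
    return len(text.split())


def build_window(messages, max_tokens, count_tokens=approx_tokens):
    """Return messages[0] (the original task) plus the newest messages that fit.

    Assumes messages is a list of strings, oldest first.
    """
    if not messages:
        return []

    task = messages[0]
    budget = max_tokens - count_tokens(task)
    if budget < 0:
        raise ValueError("the original task alone exceeds the token budget")

    # Walk backwards from the newest message, keeping each one that still fits.
    recent = []
    for msg in reversed(messages[1:]):
        cost = count_tokens(msg)
        if cost > budget:
            break  # everything older than this slides out of the window
        recent.append(msg)
        budget -= cost

    recent.reverse()  # restore chronological order
    return [task] + recent


history = ["task: write a parser", "a b c", "d e f", "g h"]
window = build_window(history, max_tokens=9)
```

With a budget of 9 "tokens", the task (4 words) stays pinned, the two newest messages fit, and only the oldest intermediate message is dropped. The trade-off is that the pinned task permanently eats part of the budget, so this only helps when the task description is short relative to the window.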