How does gpt-4.1-mini compare to gpt-4o-mini at selecting which function to call and setting that function’s parameters?
I’ve noticed that gpt-4o-mini sometimes confabulates an argument that does not exist in the tool’s schema, or sets an argument’s value with the wrong type. I’ve also noticed that it often picks the wrong tool when two tools are too similar. I’m curious how the current slate of API models ranks on tool handling.
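For anyone who wants to measure this rather than eyeball it, here is a minimal sketch of a checker for the three failure modes described above (wrong tool, nonexistent argument, wrong type). It uses only the standard library and does not call any API. The `GET_WEATHER` tool, the `check_tool_call` helper, and its signature are hypothetical names made up for illustration; it assumes the model's call arrives as a function name plus a JSON string of arguments, and it only covers flat schemas with basic types.

```python
import json

# Hypothetical tool definition, written in the JSON Schema style
# commonly used for function calling. Not taken from any real app.
GET_WEATHER = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "days": {"type": "integer"},
        },
        "required": ["city"],
    },
}

# Map JSON Schema type names to the Python types json.loads produces.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_tool_call(tool, call_name, raw_arguments):
    """Return a list of problems with one tool call (empty list = OK)."""
    if call_name != tool["name"]:
        return [f"wrong tool: expected {tool['name']}, got {call_name}"]

    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return ["arguments are not valid JSON"]
    if not isinstance(args, dict):
        return ["arguments are not a JSON object"]

    errors = []
    props = tool["parameters"].get("properties", {})
    for key, value in args.items():
        if key not in props:
            # The model invented an argument the schema does not define.
            errors.append(f"unknown argument: {key}")
            continue
        expected = _JSON_TYPES.get(props[key].get("type"))
        if expected is None:
            continue  # type not declared or not handled by this sketch
        # bool is a subclass of int in Python, so reject it explicitly
        # when the schema asks for a non-boolean type.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            errors.append(f"wrong type for argument: {key}")

    for key in tool["parameters"].get("required", []):
        if key not in args:
            errors.append(f"missing required argument: {key}")
    return errors
```

Running the same set of prompts through each model and counting the non-empty results per category would give a rough, like-for-like comparison. It will not catch arguments that are well-typed but semantically wrong.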