I’ve also gotten a lot of unexpected returns from tool calls.
I am currently using GPT-4 (I can’t use 4o yet; we are still on the .NET Azure API until we can switch to the official OpenAI NuGet package).
Often it gets key names completely wrong compared to how they are defined, sometimes at seemingly random points partway through the tool call JSON.
It will also sometimes completely ignore enumerations defined in the tool definition and substitute its own similar values, for example outputting an enumeration value of “Medium” instead of “Neutral”.
Sometimes it will throw in random gibberish that has no place being there at all. It’s happening about half to two thirds of the time for me.
My tool calls do end up being pretty large, sometimes up to about 4k characters.
Here are some examples of the gibberish:
Below, note the gibberish in the key name for effort in the second object.
...
"features": [
  {
    "name": "Motion Detection",
    "description": "Evaluate the effectiveness of the motion detection across various distances and lighting conditions.",
    "value": "VeryHigh",
    "body": "Test motion detection by varying distances, angular placements, and lighting conditions.",
    "effort": "High"
  },
  {
    "name": "Night Vision",
    "description": "Assess the clarity and range of the night vision mode.",
    "value": "High",
    "body": "Evaluate the performance of night vision under different environmental light conditions.",
    "efforbJsonRegexFindReplaceEscape": "Medium"
  }
]
...
Below is another, more severe example, with gibberish showing up in what should be "Description": "Description goes here":
...
{
  "name": "Engagement Phase 2",
  "descrip... .KeyCodeDisplayCutout.Bounds.Right, 250, 250, 250, 150, 150, 150, 150, 150, 150, 150, 150, .logging.Level.SEVERE, logging.Level.SEVERE, logging.Level.SEVERE, logging.Level.SEVERE, logging.Level.SEVERE, logging.Level.SEVERE, logging.Level.SEVERE, logging.Level.SEVERE, logging.Level.SEVERE,": "2024-06-05T00:00:00+00:00",
  "endDate": "2024-06-12T00:00:00+00:00",
  "features": [
    {
      "name": "Motion Detection Customization & Alerts",
      "description": "Adjust settings to test alert functions.",
      "value": "High",
      "body": "Modify motion detection settings and verify alert functionalities.",
      "effort": "Medium"
    }
  ]
}
...
I am hoping that switching to 4o will reduce these inconsistencies, which are occurring very frequently (half the time or more).
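
In the meantime, one possible stopgap is to validate the tool call arguments before using them, so that mangled key names and out-of-enum values are caught and the call can be retried. Below is a minimal sketch in Python (rather than .NET, purely for brevity). The key names are taken from the examples above, but the enum members in FEATURE_ENUMS and the function names are assumptions for illustration, since the real tool definition is not shown here.

```python
import json

# Assumed schema: key names come from the examples in this post; the enum
# members are placeholders, because the real tool definition is not shown.
FEATURE_KEYS = {"name", "description", "value", "body", "effort"}
FEATURE_ENUMS = {
    "value": ("Low", "Medium", "High", "VeryHigh"),
    "effort": ("Low", "Medium", "High"),
}


def validate_feature(feature) -> list[str]:
    """Return a sorted list of problems found in one feature object."""
    if not isinstance(feature, dict):
        return ["feature is not a JSON object"]
    problems = []
    for key in feature.keys() - FEATURE_KEYS:
        problems.append(f"unexpected key: {key!r}")
    for key in FEATURE_KEYS - feature.keys():
        problems.append(f"missing key: {key!r}")
    for key, allowed in FEATURE_ENUMS.items():
        if key in feature and feature[key] not in allowed:
            problems.append(f"invalid value for {key!r}: {feature[key]!r}")
    return sorted(problems)


def validate_tool_arguments(raw_arguments: str) -> list[str]:
    """Parse the tool call's JSON argument string and check every feature."""
    try:
        payload = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return [f"arguments are not valid JSON: {exc}"]
    if not isinstance(payload, dict):
        return ["arguments are not a JSON object"]
    problems = []
    for index, feature in enumerate(payload.get("features", [])):
        problems += [f"features[{index}]: {p}" for p in validate_feature(feature)]
    return problems
```

If validate_tool_arguments returns a non-empty list, the caller could retry the request, or send the list of problems back to the model and ask it to correct the call. This only detects the bad output; it does not make the model produce it any less often.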