API design doesn't respect Type-Safe Languages

Using APIs from type-safe programming languages is becoming increasingly difficult. As an example, here are two error responses from the Chat API. In one, the message field is a string; in the other, it is an array of strings. Handling such cases in type-safe languages can be challenging, yet the issue could easily be resolved on the OpenAI side. Type-safe languages do exist and should be considered.

Furthermore, there is currently no documentation for the response models. This can lead to unexpected issues, such as discovering in production that messages cannot be mapped because the field can also be a string array.

{
  "error": {
    "message": [
      "Invalid schema for function 'get_n_day_weather_forecast': In context=('properties', 'format'), array schema missing items"
    ],
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}
{
  "error": {
    "message": "The model `Models.` does not exist",
    "type": "invalid_request_error",
    "param": null,
    "code": "model_not_found"
  }
}

Was the request the same in both cases? Just curious to try it and see what might be causing the variance in generation.

Tip 1: temperature=0.001

If you are just asking the chat-tuned AI to reply, you are going to have a hard time getting it to produce only programmatic backend structures; that takes robust language. Otherwise:

“Respond only with a properly structured strongly-typed json…”
“Sure! Here’s what that might look…”

(And do you propose that the AI’s natural-language output be monitored, to check that the language looks roughly like what an arbitrary user wants?)

Stronger and more conforming outputs will be those that are written as required function return parameters.

And still, with function parameters, you can tell the API endpoint a type of integer, boolean, or number and have it schema-validated (whereas “float” is rejected), and you still get an AI-generated float if that’s what you asked for in the description (just not wrapped in quotes, as it would be if you had specified “string”).

With functions, you can also return a descriptive error, and the AI will try to rewrite its call and try again.
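To make that last point concrete, here is a minimal C++17 sketch of checking a numeric argument the model supplied and producing an error message that could be handed back to it. The function name check_integer_argument is hypothetical, and the sketch assumes the argument has already been decoded to a double by whatever JSON library you use; how the error is returned to the model depends on your client code and is not shown.

```cpp
#include <cmath>
#include <optional>
#include <string>

// Hypothetical helper: verify that a numeric function-call argument produced
// by the model is really an integer, even though JSON only has "number".
// Returns std::nullopt when the value is acceptable, otherwise a descriptive
// error string that can be sent back so the model can correct its call.
std::optional<std::string> check_integer_argument(const std::string& name,
                                                  double value) {
    // Reject NaN and infinities, and anything with a fractional part.
    if (!std::isfinite(value) || std::floor(value) != value) {
        return "Argument '" + name + "' must be an integer, got " +
               std::to_string(value);
    }
    return std::nullopt;
}
```

A caller would run this on each argument before executing the function and, if an error comes back, return that text as the function result instead of the real output.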

So it’s all about language. Powerful robust language instructions resistant to the distraction of large context data.

@_j It would be helpful to mention that this comment was generated by AI. I had to spend a considerable amount of time trying to understand if you were actually trying to convey a message, which was quite annoying.

@udm17 The endpoint is the same, but the request body was different.

My comment was not generated by AI. It was ticky-tacky typed in by me. I’m sorry if my robust lexicon managed to baffle you into disbelief with its impenetrable preciseness and accuracy, above, a staccato point-by-point recitation of the steps to consider in improving generated content conformant to specifications.

What’s hard about this in a strongly typed language?

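import Data.Aeson (FromJSON (..), Value (..))
import Data.Text (Text)

-- The message is either a single string or a list of strings.
data StringOrArray = One Text | Many [Text] deriving (Show)

instance FromJSON StringOrArray where
  parseJSON v@(Array _) = Many <$> parseJSON v  -- decode every element as Text
  parseJSON v = One <$> parseJSON v             -- otherwise expect a JSON string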

@_j I apologize. It was entirely my fault. Upon rereading your message, I believe I understand what caused the misunderstanding. The responses I shared were from the OpenAI API’s response, not the AI generated responses. It appears that you mistook them for a response generated by GPT-3 (function feature).

@jwatte, I don’t recall encountering a type named “StringOrArray” in C, C++, or C#. Additionally, even the naming convention of “String OR Array” implies to me that this is not type-safe. :sweat_smile:

You are conflating “type safe,” which means “for all possible input data, there is a well defined behavior,” with some particular type schema, such as “for all possible JSON payloads, the value of each attribute has only one possible type.”

I agree that sum types are uncomfortable to express in C++, which is why C++ is only a moderately strongly typed language. (That, and the fact that the type system is generally unsound, as well as easy to defeat.)
You end up with something like:

typedef std::variant<std::string, std::vector<JsonValue>> StringOrArray;

Which is fairly unergonomic, but is how C++ has chosen to express this kind of type definition.
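For what it's worth, consuming such a variant is not too bad once std::visit is involved. The following is only a sketch, assuming C++17 and assuming the array elements are plain strings (so std::vector<std::string> stands in for the JsonValue vector above). The names ErrorMessage and flatten_message are made up for illustration, and the JSON decoding step is left to whatever library you use.

```cpp
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical alias: the "message" field after JSON decoding, which may be
// either a single string or an array of strings.
using ErrorMessage = std::variant<std::string, std::vector<std::string>>;

// Collapse either shape into one string so the rest of the program only has
// to deal with a single type. Array entries are joined with "; ".
std::string flatten_message(const ErrorMessage& message) {
    struct Visitor {
        // Single-string case: return the text unchanged.
        std::string operator()(const std::string& text) const { return text; }

        // Array case: concatenate the entries with a separator between them.
        std::string operator()(const std::vector<std::string>& parts) const {
            std::string joined;
            for (std::size_t i = 0; i < parts.size(); ++i) {
                if (i > 0) joined += "; ";
                joined += parts[i];
            }
            return joined;
        }
    };
    // std::visit dispatches to the overload matching the active alternative.
    return std::visit(Visitor{}, message);
}
```

Because the visitor must provide an overload for every alternative, the compiler rejects the code if one of the two shapes is left unhandled, which is the "well defined behavior for all inputs" property mentioned earlier.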

I agree that a more thoroughly documented schema for each of the response types would be super handy!
