wclayf
16
The problem with making two API calls every time instead of one is the added latency; it degrades performance for no reason.
I can see use cases where developers might need to call the "moderation" endpoint standalone, with no intent to RUN the query if it passes moderation, but just because that's true doesn't mean that 100% of the most common usage patterns of the API should require two back-to-back HTTP requests.
The obvious solution is a "moderate=true" flag on the completions endpoint that defaults to true.
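As a sketch, here is what a request body with that proposed flag might look like. To be clear, the `moderate` flag does not exist in the actual OpenAI API; the function name and payload shape are assumptions used only to illustrate the proposal:

```python
def build_completion_request(prompt: str, moderate: bool = True) -> dict:
    """Build a hypothetical completions payload with built-in moderation.

    The "moderate" key is NOT part of the real API; it sketches the
    proposal in this thread: the server would moderate the prompt and
    reject flagged input before running the completion.
    """
    return {
        "model": "text-davinci-003",
        "prompt": prompt,
        "moderate": moderate,  # hypothetical flag, defaulting to true
    }
```

The point of defaulting it to true is that the common case (moderate, then complete) needs no extra work, while callers who moderate separately can opt out.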
wclayf
18
Yes indeed, the additional HTTP call adds latency.
I think one of the most common use cases is "chat bots," where apps take raw input from end users, so 100% of those calls would need to be sent through moderation (according to many people's interpretations of the vague policy docs). If that's the case, the rational choice for API design is to have moderation built into the "completions" call, as an option that defaults to true.
I'd like to say to the API, "block this if it violates OpenAI policy, and don't ding my API key." So maybe the more applicable parameter name is "ding=false". lol.
This seems like a lot of extra work and compromises just to prevent mere milliseconds of latency.
When you are building a tool for all programmers, the fewer assumptions & opinions the better.
1 Like
N2U
20
I can agree with that; I think it's worth delving a bit into this.
This is absolutely true. Let's remember that the moderation endpoint is an AI classifier. It's probably not as resource-intensive as GPT, but it will still require GPUs, and those are currently in short supply.
1 Like
sps
21
Not just that. At the scale at which OpenAI receives requests per sec, even a slight hold can induce a huge backlog.
As the saying goes, "the more moving parts, the more things can go wrong."
1 Like
N2U
22
Fair point.
Although I think it's worth mentioning that OpenAI is already passing the input, output, and generated title on ChatGPT through the moderation endpoint, so they do already know how to successfully put those cogs together.
1 Like
sps
23
ChatGPT is an end-user product and that uses the moderation endpoint really well.
3 Likes
wclayf
24
Don't you think cutting the number of requests per second in half is a good idea? lol. I do.
wclayf
25
When developing a tool for everyone, the most commonly occurring usage patterns become particularly relevant. If most of the time a call does moderation+completion, that's a pretty good tip that you need an endpoint that encapsulates exactly that.
N2U
26
I 100% agree with @sps
But I fail to see the logic here
Why would this cut the number of requests per second in half?
1 Like
wclayf
28
"API Endpoints with Integrated Content Moderation" take one HTTP request. Without integration it's two.
wclayf
29
You made a whole lot of assumptions I wasn't implying. I'm just saying: have a "moderate=true" flag, and if I set it the server can simply throw a bad-request error with the reason. It's not mixing responsibilities; it's good API design. If something takes 10 steps, you don't always tell the API consumers to make 10 HTTP requests. It's an art, not a science.
wclayf
31
HTTP APIs normally have all kinds of different "reasons" that any particular request can fail. I'm just saying "bad morality" is most certainly one of the reasons the "completions" call should be able to fail. And from my 23 years' experience as a dev, I can tell you it's about 3 lines of code to check this and throw an exception in OpenAI's implementation, aside from adding "dontDingMeBro" as an argument.
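That "few lines of code" server-side gate could look something like the sketch below. The error class and function names are mine, not OpenAI's; the input dict mirrors the documented shape of a `/v1/moderations` response:

```python
class ModerationError(Exception):
    """Hypothetical error a completions endpoint could raise for flagged input."""

def gate_on_moderation(moderation_result: dict) -> None:
    """Raise if the moderation classifier flagged the input.

    moderation_result mirrors the /v1/moderations response shape:
    {"results": [{"flagged": bool, "categories": {name: bool, ...}}]}
    """
    result = moderation_result["results"][0]
    if result["flagged"]:
        # Include the specific category names as the "reason" in the error.
        hits = [name for name, hit in result["categories"].items() if hit]
        raise ModerationError(f"Input violates policy: {', '.join(hits)}")
```

A server could run this check before the completion and surface the exception as a 400-series response with the reason in the body.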
N2U
32
The rate limits are counted per endpoint; using the moderation endpoint is not going to count toward your rate limits on completion requests.
Regardless of academically good API design, Azure already does this. They run filtering on all chat completion requests and return moderation errors from that endpoint.
wclayf
34
Good point. That would imply an "immoral query" attempt would cost nothing, because they refused to answer it. That's identical to submitting a moderation endpoint query that gets flagged. Currently they already offer the pure moderation endpoint for free. Correct.
To me that's just silly and would result in a lot of unhandled errors.
Well, I think that's the crux of the issue in these threads. Azure thinks it's important enough to strictly enforce that all requests be moderated before running. OpenAI doesn't, and while OpenAI may punish you if you don't moderate, the documentation could certainly be clearer about when to use it and the ramifications of not using it.
1 Like
Will take it as an action item to make it clearer that people should use moderation. Our best practices and safety best practices already suggest this, but we will look for more places to add it.
8 Likes
grimes
38
Yes, they should. My service is partly backed by GPT, and the actions of users of my service could lead to my API access being revoked. I am implementing a moderation verification before the API call now, but I hate that this is even a thing. The API could just respond with an HTTP-style status code like 403 Forbidden when input is against the guidelines, and the API user can implement a catch.
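A minimal version of that moderate-then-complete pattern, using the documented `/v1/moderations` endpoint over plain HTTP, might look like this sketch. The helper names are mine, error handling is trimmed, and the 403 behavior is simulated client-side since the API does not return it:

```python
import json
import os
import urllib.request

def flagged_from(moderation_json: dict) -> bool:
    """Extract the overall 'flagged' verdict from a /v1/moderations response."""
    return moderation_json["results"][0]["flagged"]

def _post(url: str, payload: dict) -> dict:
    """POST JSON to an OpenAI endpoint and return the parsed response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

def safe_completion(prompt: str) -> str:
    """Moderate first; only call chat completions if the input passes."""
    mod = _post("https://api.openai.com/v1/moderations", {"input": prompt})
    if flagged_from(mod):
        # Simulate the 403 the poster asks for: refuse before running.
        raise PermissionError("403 Forbidden: input rejected by moderation")
    out = _post(
        "https://api.openai.com/v1/chat/completions",
        {"model": "gpt-3.5-turbo",
         "messages": [{"role": "user", "content": prompt}]},
    )
    return out["choices"][0]["message"]["content"]
```

The caller wraps `safe_completion` in a try/except on `PermissionError`, which is exactly the "implement a catch" pattern described above, just done client-side instead of by the API.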