Any way to prove API returns to 3rd parties?

We are collaboratively using the GPT API for decision making purposes.

Right now our server is making calls and then rendering the responses in our UI.

3rd parties we’re working with have no way of knowing whether we generated those responses ourselves, and since these responses carry real implications, it’s important to remove the need for trust entirely.

Is there any way to expose API calls, including the submitted request text and the provided response text, in a way that is provably the response we received from the API?

Maybe provide a way to view logs, similar to what we see in the Playground.

I imagine any such logs would be untrustworthy as well.

I think what they’re looking for here is some type of system where a disinterested 3rd party (OpenAI) can validate that a response was generated from a particular input.

Unfortunately, this is not something OpenAI provides.

You gave me an idea. But this will only work if they are using Assistants API.

They would expose the thread_id to the 3rd party and give them an API key with read-only permission under the same project. The 3rd party can then call the OpenAI List Messages API themselves to retrieve the thread and examine it. They cannot use that API key for anything else, and this works even if the thread was created with a different API key.
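To make the idea concrete, here is a minimal sketch of how the 3rd party could fetch the thread on their own, assuming they were handed a thread_id and a read-only key. It hits the Assistants List Messages REST endpoint (`GET /v1/threads/{thread_id}/messages`) directly with the standard library, so there are no extra dependencies; the thread_id and key in the comments are placeholders.

```python
import json
import urllib.request

API_BASE = "https://api.openai.com/v1"

def build_list_messages_request(thread_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for the Assistants List Messages endpoint."""
    url = f"{API_BASE}/threads/{thread_id}/messages"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",
            # Required beta header for the Assistants API.
            "OpenAI-Beta": "assistants=v2",
        },
    )

def list_messages(thread_id: str, api_key: str) -> dict:
    """Fetch and decode the thread's messages (makes a network call)."""
    with urllib.request.urlopen(build_list_messages_request(thread_id, api_key)) as resp:
        return json.load(resp)

# Usage (requires a real thread_id and a read-only project key):
# data = list_messages("thread_abc123", "sk-...")
# for msg in data["data"]:
#     print(msg["role"], msg["content"])
```

Because the key is scoped read-only to that project, the verifier can list and inspect messages but cannot create, modify, or delete anything.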

Yeah, that was my initial thought too, though I didn’t voice it, since it would technically violate the policy against sharing API keys.

But… it is a possible solution.

The dev would need to sequester each decision process in a separate project, though, which doesn’t scale if the goal is to avoid manual intervention at run time, since there is no public interface for creating projects and keys.

I checked OpenAI’s best practices for API safety. While sharing API keys is prohibited (I’m guessing that means sharing with the public), you can invite members to your project. Based on the OP, they said they are collaborating with some 3rd party. I’m assuming this means the 3rd parties are clients who in turn provide the service to their users. So in this scenario, could they be included as team members?

Honestly, I think it’s only an issue after it becomes an issue, I just can’t in good conscience personally recommend something I know is technically against the rules.

There would still be an element of trust there though.

Who’s to say someone couldn’t run 10, 100, or 1,000 threads until they get the outcome they wanted and then share just that one?

Thank you both for your contributions to the discussion.

Absolutely we want to (and will) stay within the guidelines of OpenAI, that’s the first point to make.

Indeed, it wouldn’t be publicly shared, but only with a small group who are part of our project, and they would relay confirmation of the authenticity of the API input and response to the wider community. The key itself would not be shared outside that small group.

It’s really just a way to reassure people of the veracity of the responses we claim.

As a side note, if this were an added feature (optional transparency of the full request and response, but not the key, of course), it would open up some very interesting applications, which we’re already exploring now.

To your last point, an interesting problem as well.

The counterintuitive thing is that the most trustworthy approach might be doing it manually in the GPT UI on a live stream.

While nothing stops you from making many attempts if it were simply a recorded video, if it were live-streamed you couldn’t control the outcome of any one attempt with certainty.

What happens live would need to be shown at some set time: the input text can be verified in the stream, the output can be too, and the result can then be canonized.

And I wonder if there’s any way to achieve the same with the API.

Some set datetime would ensure that spamming requests and fishing for a desired outcome couldn’t be done, as only a request made at that exact datetime (within some reasonable threshold to account for latency, etc.) would be considered valid by the community.
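The datetime check described above could look something like this. The 30-second threshold and the function name are illustrative assumptions, not anything the API provides; the community would have to agree on both the announced time and the tolerance.

```python
from datetime import datetime, timedelta, timezone

def is_within_window(request_time: datetime,
                     agreed_time: datetime,
                     threshold: timedelta = timedelta(seconds=30)) -> bool:
    """True if the request was made close enough to the pre-announced datetime."""
    return abs(request_time - agreed_time) <= threshold

# Pre-announced decision time, agreed on in advance by the community.
agreed = datetime(2024, 6, 1, 12, 0, 0, tzinfo=timezone.utc)

on_time = agreed + timedelta(seconds=5)   # within the threshold: valid
fished = agreed + timedelta(minutes=10)   # a later retry: rejected

print(is_within_window(on_time, agreed))  # → True
print(is_within_window(fished, agreed))   # → False
```

Any request that fishes for a better outcome after the window closes would simply not count, no matter what its output says.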

If there is some account-specific index of requests, that could be used even more effectively: the first decision would need index 0, the next index 1. If the index is 233 on the 2nd request, people would know that over 200 attempts were spammed prior to it.
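The index idea reduces to simple arithmetic. This is a sketch of the check a verifier would run, assuming such a per-account request index existed; no such index is exposed by the API today, so everything here is hypothetical.

```python
def attempts_hidden(decision_number: int, request_index: int) -> int:
    """Extra requests made beyond the one expected for this decision.

    Decisions are numbered from 0, and each decision is expected to consume
    exactly one request, so decision N should carry request index N.
    """
    return request_index - decision_number

# 2nd decision (decision_number=1) arriving with the expected index: clean.
print(attempts_hidden(decision_number=1, request_index=1))    # → 0

# 2nd decision arriving with index 233: 232 undisclosed attempts beforehand.
print(attempts_hidden(decision_number=1, request_index=233))  # → 232
```

A nonzero result doesn’t prove which outputs were discarded, but it does prove *that* extra attempts happened, which is exactly the trust gap being discussed.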