Actions in GPTs (Customized version of ChatGPT)

Hi team,

Actions allow GPTs to use 3rd party services to retrieve information or take actions. We can add actions when creating new GPTs.

Currently, each Customized version of ChatGPT supports only one action set; we can’t add two different action sets. Still, you can add more than one API endpoint under one Root URL.

Example: You can get data from Salesforce only. If you would like to get data from both Salesforce and HubSpot, you wouldn’t be able to do that.

Workaround: Use an iPaaS solution that gives you one Root URL, with various endpoints to many applications and sources.
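To make the workaround more concrete, here is a minimal sketch of the idea in Python: one gateway exposes a single set of paths and forwards each one to a different backend. The handler names, paths, and return values are assumptions for illustration only; a real gateway would call the Salesforce and HubSpot APIs instead of returning stub data.

```python
from typing import Callable, Dict

# Hypothetical stub handlers. In a real gateway these would call the
# Salesforce and HubSpot APIs; here they only echo the request.
def salesforce_contacts(params: dict) -> dict:
    return {"source": "salesforce", "query": params}

def hubspot_deals(params: dict) -> dict:
    return {"source": "hubspot", "query": params}

# One root URL, many endpoints: every path below would live under the
# same server, so a GPT only needs a single action set.
ROUTES: Dict[str, Callable[[dict], dict]] = {
    "/salesforce/contacts": salesforce_contacts,
    "/hubspot/deals": hubspot_deals,
}

def dispatch(path: str, params: dict) -> dict:
    """Forward a request to the backend registered for this path."""
    handler = ROUTES.get(path)
    if handler is None:
        return {"error": f"unknown endpoint: {path}"}
    return handler(params)

if __name__ == "__main__":
    print(dispatch("/salesforce/contacts", {"name": "Ada"}))
    print(dispatch("/hubspot/deals", {"stage": "open"}))
```

The GPT then sees one server with two endpoints, even though the data comes from two systems.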

Based on experience, it would make sense to configure the Auth and Actions separately from configuring a particular customized ChatGPT.

For users, it would be easier to add an Integration (Authentication: API key, OAuth) and as many actions as needed. Once you have added Authentication and Actions, you can create a new Customized version of ChatGPT. Instead of adding actions manually in the configuration of the Customized version of ChatGPT, you can select from a pre-configured list of available actions. You will see the same screen as today, but it will be pre-populated with all the needed values. You can modify them if you need to, or leave them as they are.

What value would this give?

  1. The configuration of API/Auth can be done separately and, as a result, can be assigned to the tech people;
  2. Pre-configured APIs can be in the form of a Catalogue, where developers/product companies can publish the APIs that can later be used by creators of GPTs. For example, Salesforce can add all their APIs, with parameters etc., to that catalogue, so when I want to build a GPT that integrates with Salesforce I will not need to spend much time on adding APIs.
  3. Pre-configured APIs can be re-used across different GPTs, which would be much appreciated by enterprise customers. For example, one action can be used by 5 GPTs; that action could integrate with SAP to get some kind of data.
  4. Add analytics for APIs, because the main value for enterprises arises when they use GPTs in connection with their data/systems etc.

Thank you!

Is there any documentation on how to add chat data inside of an action when you want to, for example, make a POST request to a 3rd party and send some of the chat data? I am specifically trying to create an action inside of the GPTs editor and do not see any reference to grabbing specific data and adding it to the body of the request.

It is a good question because this is the biggest mind shift in designing conversational AI. Before the release of GPT and other products, to build a conversational AI app you needed to use Intents and Entities and to design all possible scenarios with nodes (e.g., IBM Watson, Google DialogFlow, Amazon Lex). It was like a step-flow function or IFTTT, where you needed to explicitly code how to store context variables and how to place them later into the API.

With GPT, you can do that by using “Instructions,” where you need to explain in your own words what to ask, how to store the context, and how to use it with the API. For example, you say: “You are a booking assistant. Users will ask you to help them find the best available destinations, and you should ask for the start date of the trip (store value of the date in parameter ‘start_date’ in action ‘get destinations’) and end date (store value of the date in parameter ‘end_date’ in action ‘get destinations’) of the trip…”

With an example like this, you can explain how to generate context variables, store them, and place them into the action (API).
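As a rough sketch of what the action side of that booking example could look like, assuming the action is described by an OpenAPI schema: the Python below builds a schema with one operation that takes `start_date` and `end_date` as query parameters. The `/destinations` path, the `getDestinations` operation ID, the API title, and the `https://example.com` server URL are all made up for illustration.

```python
import json
from typing import Any, Dict

def date_parameter(name: str, description: str) -> Dict[str, Any]:
    """Describe one required date query parameter."""
    return {
        "name": name,
        "in": "query",
        "required": True,
        "description": description,
        "schema": {"type": "string", "format": "date"},
    }

def build_action_schema(server_url: str) -> Dict[str, Any]:
    """Build a small OpenAPI document for a hypothetical 'get destinations' action."""
    return {
        "openapi": "3.1.0",
        "info": {"title": "Destinations API (hypothetical)", "version": "1.0.0"},
        "servers": [{"url": server_url}],
        "paths": {
            "/destinations": {
                "get": {
                    "operationId": "getDestinations",
                    "summary": "Find available destinations for a trip",
                    "parameters": [
                        date_parameter("start_date", "Start date of the trip"),
                        date_parameter("end_date", "End date of the trip"),
                    ],
                    "responses": {"200": {"description": "List of destinations"}},
                }
            }
        },
    }

if __name__ == "__main__":
    # Print the schema as JSON so it can be pasted into an action editor.
    print(json.dumps(build_action_schema("https://example.com"), indent=2))
```

The parameter names in the schema match the names used in the Instructions, which is what lets the model connect the values it collected in the chat to the API call.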

A nice feature for GPTs would be the ability to add other plugins from the plugin store to a GPT.

I was reading their documentation this morning, and admittedly I don’t fully grasp the meaning of everything yet. Does the following section mean that you can leverage plugins that you’ve built within GPTs actions?

“The design of actions builds upon insights from our plugins beta, granting developers greater control over the model and how their APIs are called. Migrating from the plugins beta is easy with the ability to use your existing plugin manifest to define actions for your GPT.” - OpenAI Platform

@DustinGood My impression from that quote is that they mean, if you have an existing plugin, then you already built the server infrastructure you need for GPTs Actions, and the migration will be easy. If I’m right, this means you can turn your plugin into a GPT easily, but you can’t add a plugin to a GPT and still have it perform other actions on a different server (unless there’s no auth anywhere).

@Jiva I’m already planning this workaround for my use case. I have chosen to use NodeRED as my platform since it’s easy to see the flow of thoughts, and it supports making custom blocks easily. However, there is also XpressAI/xai-gpt-agent-toolkit on GitHub (I can’t post links yet), which seems like a much more advanced and AI-specific tool for multiagent flows. It doesn’t have nearly the tenure and community that NodeRED does, though, and if you just need a passthrough API server, then NodeRED is probably a good choice.

One caveat to this is that I’m not sure how to handle user level authentication gracefully yet. The first thing I’m planning to try is having NodeRED send the user a link which they must open in order to authenticate.

I’m also planning to provide more than just an API gateway. I want GPTs to be able to decide their own actions on the fly, and send them to NodeRED for execution. This way, we don’t need to make a complex schema, just instruct the GPT to decide its own API calls as needed. I think authentication can be handled by just sending a chat message with a link for the user to click and log in, and the auth tokens can be stored on the passthrough server. (There might be privacy concerns with storing the user’s auth info though, so put some thought into security and make sure your privacy policy is up to date.)
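Here is a minimal sketch of that login-link idea, written in plain Python rather than as a NodeRED flow. Everything in it is an assumption for illustration: the `TokenStore` class, its method names, and the `https://example.com/login` URL are made up, and the in-memory dictionaries stand in for whatever secure storage a real passthrough server would use.

```python
import secrets
from typing import Dict, Optional

class TokenStore:
    """Keeps per-user auth tokens on the passthrough server (in memory only)."""

    def __init__(self, login_base_url: str) -> None:
        self._login_base_url = login_base_url
        self._pending: Dict[str, str] = {}  # one-time state -> user id
        self._tokens: Dict[str, str] = {}   # user id -> auth token

    def login_link(self, user_id: str) -> str:
        """Create a one-time link to send to the user in the chat."""
        state = secrets.token_urlsafe(16)
        self._pending[state] = user_id
        return f"{self._login_base_url}?state={state}"

    def complete_login(self, state: str, token: str) -> bool:
        """Called after the user logs in; each state value works only once."""
        user_id = self._pending.pop(state, None)
        if user_id is None:
            return False
        self._tokens[user_id] = token
        return True

    def token_for(self, user_id: str) -> Optional[str]:
        """Return the stored token, or None if the user has not logged in."""
        return self._tokens.get(user_id)

if __name__ == "__main__":
    store = TokenStore("https://example.com/login")
    link = store.login_link("user-1")
    print("Send this link in the chat:", link)
```

Because the token never appears in the conversation, it stays out of the chat context, but the server now holds user credentials and needs to protect them accordingly.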

One other workaround which I have not tried yet is to somehow use Code Interpreter (as it’s a fairly fully featured environment for running code) to perform the API calls as GPTs need them. I have not come up with a prompt that accomplishes this yet though. Authentication is hard to figure out since at the end of each code interpreter session, variables like the API tokens from logging in are destroyed, and keeping them in the conversation context isn’t ideal for security. If anyone makes progress on this approach, please tag me in that post.

I think there are several points:

  1. User authentication to use actions in GPTs. Most probably OpenAI will come up with a solution very soon (it is a beta version now). This question has been raised many times in other posts.

Context: OpenAI is building the marketplace of the Assistants (GPTs) that would be trained by knowledge workers/subject matter experts. Some of them would be paid, and some of them would integrate with 3rd party systems. To achieve their vision, they would need to enable user authentication with 3rd party systems (actions) to be able to use Assistants.

P.S. When planning the scope and effort of a workaround, keep in mind that this feature may be released soon.

  2. Decide actions on the fly - it would be beautiful and harmonious from the technical perspective, but to achieve that beauty would be challenging.

I come from the enterprise world. Please keep that in mind when reading the following text.

Context: Of course, each API has its own structure/parameters. You can also find different 3rd party systems with similar APIs, like getting employees’ data from HRIS or invoices from Accounting systems. There are hundreds of HRIS and accounting systems, and they have different databases. You can also find that the same parameter in those APIs has a different meaning in different systems.
Some companies are trying to unify the API structure, focusing on one market segment, for example, unifying APIs to the top 50 HRIS systems or the top 50 CRM systems, etc.

Why unification of APIs is needed: if you want to give the AI the power to decide which API to use, you must have unified APIs. Otherwise, it would be a mess, and the output accuracy would be near zero.
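A small sketch of what unification means in practice, in Python: two systems return the same employee data under different field names, and a mapping layer translates both into one shape. The system names `hris_a` and `hris_b` and all field names are invented for illustration and do not describe any real HRIS product.

```python
from typing import Any, Dict

# Hypothetical field maps: raw field name in each system -> unified field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "hris_a": {"emp_id": "employee_id", "fullName": "name", "dept": "department"},
    "hris_b": {"id": "employee_id", "display_name": "name", "org_unit": "department"},
}

def unify(system: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one raw record into the unified schema.

    Fields missing from the raw record are skipped, and raw fields that
    are not in the map are dropped.
    """
    mapping = FIELD_MAPS[system]
    return {unified: record[raw] for raw, unified in mapping.items() if raw in record}

if __name__ == "__main__":
    a = unify("hris_a", {"emp_id": 7, "fullName": "Ada Lovelace", "dept": "R&D"})
    b = unify("hris_b", {"id": 7, "display_name": "Ada Lovelace", "org_unit": "R&D"})
    print(a == b)
```

This only covers renaming fields. The harder part mentioned above, where the same parameter has a different business meaning in different systems, cannot be solved by a lookup table alone.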

P.S. The day when all the APIs in the world are unified, not only from the technical point of view but also from the business logic point of view, would be a significant leap in the technical world. This would enable AI to access anything :slight_smile:

To add, OpenAI has the power and influence to initiate the unification of APIs. This could bring challenges/blockers in the short-term strategy (e.g., user acquisition/adoption, monetization). Still, in the long-term strategy, this could be one of the significant competitive advantages (there are also many hidden values in this that could be fundamental in the future world).

  • Need a URL with an API key
  • Need the user's GPS location in the response so the action can call the API

ChatGPT seems to be so intelligent, but it misses the fact that the world is huge and not everything has to be so rigid. I mean, who at OpenAI thinks that every single API in the world must use OAuth to authenticate?! Why not allow developers to use ‘user based authentication’, entering a token via chat!? You guys are killing the opportunity for a zillion new GPTs just for your whim?! To me (at least) it seems a dumb decision! Sincerely, that does not make any sense! …and this came from Plugins. If plugins were not so well accepted among the community, maybe you could think about what you did wrong…

You can take a look at GPT Auth to understand how the workflow happens. We have added an action API to log messages to the server.