Using a third-party plugin in OpenAI API calls

Hi there,

I am developing an application that uses GPT-4 via the OpenAI API to categorize and analyze individual comments (unstructured text, with emojis and all), which are fed into my scripts through large CSVs. I have many pre-defined categories, and to perform these categorizations I make different API calls, each with its own system prompt and user prompt.

Although I can perform polarity analysis (that is, whether a comment is ‘Negative’, ‘Neutral’, or ‘Positive’) with good results, I know that the approved OpenAI plugin “Sentiment Analysis”, for example, is even better at polarity evaluation.

What I would like to do is mimic the online ChatGPT flow when it calls the plugin, processes the data, and generates the output. I would like to call the API and have it use the plugin, process the text, and return the ‘message’. Has anyone done this before? Maybe using LangChain?

My texts are all in Portuguese. The “Sentiment Analysis” plugin does have a GitHub repo; however, it would not work with my texts, and I would have to translate them all into English (probably losing their meaning along the way).


If you want to use that particular plugin, your best bet would probably be to contact the developer :thinking:

The API (platform.openai.com) has no access to any custom GPT or plugin features found in ChatGPT (chat.openai.com).

That said, sentiment analysis isn’t new or hard; there are plenty of repos out there. If you’re talking about this plugin in particular, Sentiment Analysis - New plugin for ChatGPT, the author has a GitHub repo, and it looks like they’re fond of the good old nltk.sentiment kit.


Thanks for the reply! Yeah, it does have a GitHub repo, and I contacted the developer the same week the plugin was approved (she made a post about it on the community). This is by far the best sentiment analysis tool I have tried.

At the time, OpenAI would not allow the endpoint to be distributed. I am not sure if that is still the case.

However, I will try other options.

Thanks again


Welcome @larissapecis

I don’t think you need to make it that complex for sentiment analysis.

All one has to do is make a single API call to the gpt-4 model with a proper system message, send the text to be analyzed as the user message, and the assistant will respond with the result.
You can even turn on JSON mode on the latest turbo models to get back only a JSON object as defined in the system message.

Sample code:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
  model="gpt-4",
  messages=[
    {
      "role": "system",
      "content": "Reply with a json object {\"sentiment\": value} where value is the sentiment of user's message."
    },
    {
      "role": "user",
      "content": "They went to the park without me"
    },
  ],
  temperature=0,
  max_tokens=256,
  top_p=1,
  frequency_penalty=0,
  presence_penalty=0
)

# The assistant's JSON reply is in the first choice's message content
print(response.choices[0].message.content)

Here’s the playground version
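If you want the JSON-mode variant mentioned above, a sketch of how it might look wrapped in a helper (the function name, prompt wording, and model choice are illustrative; JSON mode requires a model that supports `response_format`):

```python
import json

SYSTEM_PROMPT = (
    'Reply with a JSON object {"sentiment": value} where value is '
    "'Positive', 'Negative', or 'Neutral' for the user's message."
)

def classify_sentiment(client, text, model="gpt-4-1106-preview"):
    """Classify one comment; returns the sentiment label as a string."""
    response = client.chat.completions.create(
        model=model,
        response_format={"type": "json_object"},  # JSON mode: newer turbo models only
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        temperature=0,
    )
    # With JSON mode on, the content is guaranteed to parse as JSON
    return json.loads(response.choices[0].message.content)["sentiment"]

# Usage (requires an API key):
# from openai import OpenAI
# print(classify_sentiment(OpenAI(), "They went to the park without me"))
```

Passing the client in as a parameter keeps the helper easy to test and to reuse across batch jobs.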


Hi there, thanks for the reply. I’ve run lots of tests over the past 6 months. I am getting around 75% accuracy in very complex contexts (slang, regional expressions, jokes, irony, etc.) and around 98% on simple posts (like minor influencer campaigns, where the majority of “sentiment-related” posts tend to be positive).

In the online ChatGPT, together with the plugin I mentioned, I get close to 100% right (however, it is really token-consuming and very slow, not feasible for my large dataset).

What is funny is that I asked GPT-4 (via the API) to perform an analytical analysis of each individual comment (with adjusted temperature and penalties), and it is really impressive. However, when I make another call to assess the polarity of the analysis and/or the comment, it fails to give me the right answer.

Edit: my application is already developed; I am refining it now.

Edit 2: the turbo version of gpt-4 is not performing well in my case; gpt-4 is way more accurate.


Yes, IMO the quality of ‘gpt-4’ is unmatched.

The turbo models, however, offer a longer context window and more economical token pricing.


Hi!

Regarding this observation:

Could it be that including the objective analysis of the first call is influencing the result of the second sentiment analysis? I would expect that all evaluations move closer to ‘neutral’ in this scenario.

On second thought: maybe fine-tuning a 3.5 model could save inference costs with similar results.
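If the fine-tuning route is taken, the training data for a 3.5 model is a JSONL file of chat examples. A minimal sketch of preparing such a file from labelled comments (the prompt wording and the Portuguese examples are illustrative; the field layout follows the chat fine-tuning format):

```python
import json

def to_training_line(comment, label):
    # One JSONL line per labelled example, in the chat fine-tuning format
    return json.dumps({
        "messages": [
            {"role": "system", "content": "Classify the sentiment of the user's comment."},
            {"role": "user", "content": comment},
            {"role": "assistant", "content": label},
        ]
    }, ensure_ascii=False)

examples = [
    ("Adorei o parque!", "Positive"),
    ("Péssimo atendimento.", "Negative"),
]
jsonl = "\n".join(to_training_line(c, l) for c, l in examples)
print(jsonl)
```

The resulting file would then be uploaded and referenced when creating the fine-tuning job.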


I agree. That is absolutely happening if they are including the model’s response with the analysis of the original text in the next call, essentially turning the whole thing into a conversation.

In such scenarios, the chat completion model should be used just to get the result, and the next analysis should be a new pair of system and user messages, instead of continuing in conversation mode (retaining previous messages).
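The stateless pattern described above can be sketched like this (the prompt text and comments are illustrative): each analysis gets a fresh system/user pair, and no earlier messages are carried over.

```python
SYSTEM_PROMPT = (
    "Classify the sentiment of the user's message as "
    "Positive, Negative, or Neutral."
)

def build_messages(text):
    # A fresh system/user pair per call: no earlier analysis leaks into this one
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": text},
    ]

# Each comment gets its own independent request payload
comments = ["Adorei!", "Não gostei."]
payloads = [build_messages(c) for c in comments]
```

Each payload would be sent as the full `messages` list of its own chat completion call, so one comment's classification cannot bias the next.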


Hi,

Could it be that including the objective analysis of the first call is influencing the result of the second sentiment analysis?

I would not say influence, but there is something in the “flow” of critical analysis in GPT that performs better than simple answers like “yes/no” or “positive/negative/neutral”. In its statistical generation process, the right (most probable) answer just emerges within the context of a longer response; for very short answers, it looks more challenging. Let me give you one real situation:

Comment: “Beautiful, beautiful, beautiful, I have been there more than once. I visited that park. I landed at the airport which is in the ‘city’ of <name-of-the-client>.”

Polarity call: Negative (the client is not very “popular”, and that is probably influencing this response; I even asked the online version, and it confirmed this might be the case. In the prompt, I told it not to be biased by the client’s name, but that didn’t help.)
AI Critical Analysis: Shares his favorite experience at <name-of-the-client> park | Positive.

I would expect that all evaluations move closer to ‘neutral’ in this scenario.

That would be true in some scenarios. However, since I am analyzing social media comments, most often what you find is sentiment-related content, and that is precisely what I am looking for, so I made that clear in the system prompt. I had to increase the temperature a bit, run several system-prompt tests, and change other parameters, but GPT-4 is completely capable in this case.

Hi,

It is absolutely happening, if they are including the model’s response with analysis of the original text for the next call, essentially turning the whole thing into a conversation.

That was my first try. I fed the model both the comment AND the AI analysis in order to improve the quality of the polarity analysis (since the “key words” were always present in the AI analysis, like “Made a compliment […]”, “Shares happiness […]”, “Is grateful to […]”).

Result: the outcome improved a bit, but it would still disagree with the AI analysis at some points. I also got more ‘Neutral’ results.

New approach: I completely removed the comments from the call, so it does not have access to the original text, only to the AI analysis. It improved a lot, especially on the “false Negatives”. But yes, as @vb foresaw, the Neutrals increased, though in a very acceptable way.
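The two-step pipeline described in this thread can be sketched roughly as follows (prompts, model name, and function name are illustrative, not the poster's actual code): the first call produces a free-form critical analysis, and the second call classifies polarity from that analysis alone, with the original comment withheld.

```python
def analyze_then_classify(client, comment, model="gpt-4"):
    """Two-step, CoT-style classification; returns (analysis, polarity)."""
    # Step 1: free-form critical analysis of the raw comment
    analysis = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Write a short critical analysis of the user's comment."},
            {"role": "user", "content": comment},
        ],
        temperature=0,
    ).choices[0].message.content

    # Step 2: classify polarity from the analysis only (comment omitted)
    polarity = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Reply with one word: Positive, Negative, or Neutral."},
            {"role": "user", "content": analysis},
        ],
        temperature=0,
    ).choices[0].message.content

    return analysis, polarity
```

Note that each step is its own stateless call: step 2 sees only the analysis text, never the conversation history from step 1.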

From what I understand, I believe you are observing the positive effects of a Chain-of-Thought style of reasoning. My interpretation is that the “AI Critical Analysis” followed by a polarity call is essentially Chain-of-Thought.

On the flip side, providing the comment and the analysis together with a request to assess the sentiment in a single word is somewhat likely to lower the accuracy.