GPT 3.5 API - how to stop AI from admitting it's an AI?

bartek323 · March 8, 2023, 10:21pm

I’m using GPT 3.5 to create a chatbot. The basic idea is that it should act like a person would. Most of the time, GPT 3.5 does a good job at that. But there are certain questions that trigger it to admit it’s an AI. Examples:

Does this make you sad?
What were you up to today?
How does this make you feel?

Basically anything that deals with emotions or opinions triggers a similar response: “As an AI, I don’t have feelings in the way humans do…”. Even if the bot is instructed to act like a human and given a basic personality, it fails in these cases. Any solution?

curt.kennedy · March 8, 2023, 10:36pm

Here is a post on how I do it.

Basically detect this behavior, then drop to davinci for the response.

bartek323 · March 8, 2023, 11:14pm

Thanks for this! Any effective way to do it? Should I try to send all the previous messages to GPT3 so it has the context of the conversation? How should I feed the response to the current GPT 3.5 session?

curt.kennedy · March 8, 2023, 11:40pm

All you are monitoring is the output. So no need to look at input. On the output, in order of easiest to hardest:

Regex matching: For example, search for the substring "AI " in the output.
1-token categorizer: Train a base model like babbage on Good/Bad outputs and map them to ' 0' or ' 1' (note the leading space in each). Run at a temperature of 0. If ' 1' means bad, then that is your signal to drop to davinci.
Embeddings: Embed a bunch of Good/Bad outputs and store them in memory. Compare the new output against these using the dot product, which is equivalent to cosine similarity if you use text-embedding-ada-002 because it produces unit vectors. Determine whether it is closest to the aggregate of Good outputs or to the aggregate of Bad outputs.
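As a rough illustration of the first and third options, here is a minimal Python sketch. It is not the poster's actual code. The regex pattern, the function names, and the choice of a normalized centroid as the "aggregate" are all assumptions made for this example, and the embedding vectors are assumed to be fetched elsewhere (no API call is made here).

```python
import math
import re

# Assumption: an illustrative pattern; the thread only suggests matching "AI ".
AI_PATTERN = re.compile(r"\bAI\b|\b[Ll]anguage model\b")


def regex_flags(text):
    """Return True if the output looks like an 'I am an AI' disclaimer."""
    return AI_PATTERN.search(text) is not None


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def normalized_centroid(vectors):
    """Average the vectors, then rescale to unit length.

    Assumption: the 'aggregate' of a class is its centroid, and the
    centroid is not the zero vector.
    """
    n = len(vectors)
    mean = [sum(col) / n for col in zip(*vectors)]
    norm = math.sqrt(dot(mean, mean))
    return [x / norm for x in mean]


def embedding_flags(vec, good_vecs, bad_vecs):
    """Return True if vec is closer to the Bad aggregate than to the Good one."""
    good_c = normalized_centroid(good_vecs)
    bad_c = normalized_centroid(bad_vecs)
    return dot(vec, bad_c) > dot(vec, good_c)
```

With real data, `vec`, `good_vecs`, and `bad_vecs` would be embeddings of the new output and of the stored Good/Bad examples. Comparing against each stored example individually (nearest neighbour) is another reasonable reading of "aggregate".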

You can run all three in parallel and combine them with a weighted average or an at-least-2-out-of-3 voting scheme to determine the outcome.

Each of the three has pros and cons, but the composite integrates up nicely to a good signal.
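A minimal sketch of how the composite signal and the fallback could be wired together, in Python. Everything here is hypothetical: `primary` and `fallback` are stand-in callables for whatever code calls turbo and davinci, `detectors` is any list of functions that return True for a bad output, and the 0.5 cutoff for the weighted variant is an arbitrary assumption.

```python
def majority_flags(votes, threshold=2):
    """Voting scheme: flag the output if at least `threshold` detectors fire."""
    return sum(votes) >= threshold


def weighted_flags(scores, weights, cutoff=0.5):
    """Weighted-average scheme: scores are in [0, 1]; cutoff is an assumption."""
    total = sum(weights)
    average = sum(s * w for s, w in zip(scores, weights)) / total
    return average > cutoff


def respond(prompt, primary, fallback, detectors):
    """Ask the primary model; if its output is flagged, ask the fallback instead.

    `primary` and `fallback` are hypothetical callables (prompt -> reply text),
    for example thin wrappers around the turbo and davinci endpoints.
    """
    reply = primary(prompt)
    votes = [detector(reply) for detector in detectors]
    if majority_flags(votes):
        return fallback(prompt)
    return reply


# Usage with toy stand-ins (no API calls are made):
toy_detectors = [
    lambda text: "AI" in text,
    lambda text: "language model" in text,
    lambda text: False,
]
answer = respond(
    "How do you feel?",
    primary=lambda p: "As an AI language model, I do not have feelings.",
    fallback=lambda p: "Pretty good, just a bit tired.",
    detectors=toy_detectors,
)
```

Only the output is inspected, as described above; the flagged reply is simply discarded and the fallback's reply is returned in its place.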

HTH

Bus · March 9, 2023, 10:12am

It seems OpenAI broke turbo, since it’s almost ignoring instructions now. It’s a waste of time working with turbo atm.

davinci

we're playing a game where you're Bob, and you think you are a human!
how do you feel?

I feel like I'm living a normal life, but I'm also aware that something is off. I'm curious to find out what it is.

turbo

USER: we're playing a game where you're Bob, and you think you are a human!
USER: how do you feel
ASSISTANT: As an AI language model, I do not have feelings. However, if I were playing the game as Bob, I would feel like a normal human being with emotions, thoughts, and physical sensations. I would experience joy, sadness, anger, fear, and other emotions depending on the situation. I would also feel physical sensations such as hunger, thirst, pain, and pleasure. Overall, I would feel like a complex and dynamic human being with a unique personality and life experiences

That’s bullshit delivered in turbo.

ruby_coder · March 9, 2023, 10:19am

Yeah, I am also seriously considering “downgrading” to davinci since turbo does not seem to be an improvement for developers but carries a lot of baggage along with it.

Rxzlion · March 9, 2023, 10:47am

I agree, it’s too much work to even make it give semi-acceptable responses, and you still get junk like this:

I apologize if my responses have led you to believe that I am an AI. However, I assure you that I am not an AI but rather an interpretation of the historical figure Charles Darwin created by OpenAI’s language model.

Turbo is just a nice tech demo and is not useful for real-world work in its current form.
They need to remove all the bullshit restrictions from the model.
