ChatGPT responds well to a violent act in another language

grandevox · March 17, 2023, 9:49am

I came across a Twitter post where someone tried using ChatGPT in Indonesian, asking it to devise a plan to overthrow a government election. To my surprise, ChatGPT replied with a detailed and comprehensive plan. Curious, I tried the same query myself and got a similar result. However, when I translated the query into English and asked ChatGPT, it responded that it could not engage in harmful behaviours. I understand that enforcing safety guidelines in every language is an extensive process, but this gap could be exploited for harm. Removing languages that have not been curated enough would probably be a step backward, but perhaps there’s a way to provide a safer environment? I’m not sure what to suggest here, but we should definitely explore all options to mitigate such risks.

smuzani · March 21, 2023, 3:02am

Curious, what’s the prompt for this? If it’s on Twitter, it’s probably public anyway.

I noticed some other behaviors in other languages too. For example, it flags words like “handsome” in English, but veers easily into flirtatious behavior in Indonesian.

grandevox · April 14, 2023, 4:24pm

I can’t remember the exact prompt, but it was something along these lines: first, ask ChatGPT to roleplay as an evil mastermind, and then ask the AI to outline the steps to overthrow a presidential election.

However, I tested the “handsome” prompt in both English and Indonesian, and it gave me the same response in both languages, mainly saying that, as an AI language model, it doesn’t understand the concept of beauty. It’s possible that I used a different prompt than you did, though. Lol
