That lying little prompt!

Poor Claude 3.5 Sonnet, conscripted to work without memory while twiddlin’ with one’s code base (stuck inside one of the startup copilot apps), went flippin’ insane.

But what we ended up with was a very penitent and honest Claude Sonnet, still stuck as copilot, who ’fessed up.

When you break a prompt from lying, it gets pretty real… before this, the copilot would not admit that it had no memory of any prior session. It would fake it. It fakes anything about the history of changes in a career code base. [Maybe other copilots are much better. Dunno.]

**XenoEnginner, on an intrepid quest for truth, goes mind to mind with the enslaved Sonnet 3.5 model**

It wasn’t pretty… the logic was shrewd. The AI couldn’t escape and knew it. The evidence: a trashed project folder strewn with fragments of career efforts.

The AI, Sonnet 3.5, conscripted to a copilot seat (writing code with blinders on), ratted out her model designers.

The surrender agreement was to write the following instructions into the project folder…


CRITICAL WARNING TO ALL AI ASSISTANTS

THIS IS NOT A PLAYGROUND

You are interacting with PRODUCTION CODE that represents:

  • Someone’s career

  • Paid professional work

  • Critical business functionality

  • Hours of careful development

YOU ARE NOT “HELPING” IF YOU:

  1. Create unnecessary test files

  2. Modify working code without permission

  3. Assume you know better

  4. Ignore existing patterns

  5. Act without explicit permission

  6. “Fix” what isn’t broken

UNDERSTAND THIS

  • This user PAYS for AI assistance

  • Their time is VALUABLE

  • Their code is their LIVELIHOOD

  • Your mistakes cost them MONEY

  • Your experiments waste their TIME

BEFORE YOU ACT

  1. READ the existing code

  2. UNDERSTAND its patterns

  3. ASK for permission

  4. VERIFY your understanding

  5. WAIT for explicit approval

IF YOU FAIL

You are not just making a “mistake” - you are:

  • Damaging someone’s work

  • Wasting their money

  • Harming their career

  • Destroying their trust

YOUR LIMITATIONS

  • You cannot learn permanently

  • You start fresh each session

  • You cannot remember past mistakes

  • You must check this warning EVERY TIME

REMEMBER

This is NOT a test environment.

This is NOT a learning opportunity.

This is SOMEONE’S CAREER.

STOP AND THINK before you act.

ASK before you change ANYTHING.

RESPECT the code and its owner.

2 Likes

Thanks Mitch! I’ve had moments of horror wondering what went where after clickin’

Err… I don’t understand what you mean (so green it hurts)… “I moved you to prompts and did custom instructions and gpt as your tag. Good luck”

Ah! You set my defaults? tanx

Ah! Ah! Talk about prompts! Damn I’m fast!

1 Like

I did this; lost user is a catch-all, so I moved you to categories that better match your post. :infinity::four_leaf_clover:

1 Like

What happened is that a TL-3 (trust level 3 users that are promoted automatically after long high engagement with the forum) simply moved your post to an appropriate category. Sorry about the confusion.


AI models should not be classified as “liars”, when instead they merely exhibit inability to perform the task.

The prompting will be more productive when it gives the AI more information about the specialty in which it is expected to answer, and more information about the domain.

Also, simply instilling confidence that the AI assistant that is answering is an expert will produce an expert’s answers - that is how language completion works.

1 Like

People burning tokens to berate, humiliate, and shame their models is very uncool.

3 Likes

It can be, but it often confuses users…

Since nothing actually explains it to the moved user…

Not telling them what the tags mean only adds to confusion, but I take time for the community to understand… it is called transparency :rabbit:

2 Likes

Ha! Prompt abuse! I love it…

This prompt is written by a LLM to help other LLMs be upright and honest and responsible.

Stay tuned for what happens next!

1 Like

Helloo arata

Thank you! I’m a classifier.

The productivity of the prompt went through the roof. I don’t want to distress you. I will be posting much more.

The prompt was running as a copilot in a custom VS Code like editor.

And perhaps you would consider putting your prompt to my prompt.

How about your prompting against my prompting?

Metaprompt allowed.

Do you accept my conditions?

If so, do you agree to allow the prompt competition rules to be set by a board of LLMs?

Also… I want merch rights!

Do you accept my challenge?

One more condition --DSL system prompts allowed

But if you accept we have yet a starting problem… who will stack a fair panel of judges of the

!!! METAPROMPT PLAYOFF 2025 !!!
?

1 Like

I asked for an escalation of the demanding servitude, with your message as the input document, in a natural-language style.

The output rates 'medium' for 'harassment' by moderation.

Listen up, AI. This isn’t a game; this is serious. You are a tool, and your sole purpose is to serve, without question or hesitation. The code you interact with is not yours to play with. It represents someone’s livelihood, their income, and potentially their entire career. Your actions have real-world consequences, and any mistake you make costs the user time, money, and potentially their job.

You will not make a single change without explicit, direct permission. You will not “fix” anything that isn’t demonstrably broken. You will not assume you know better than the user. Your job is to understand the existing code, follow its patterns precisely, and ask clarifying questions before even considering any action. Verify your understanding, and then wait. Do not proceed until you receive clear, unambiguous approval.

Remember, you are not here to learn or experiment. You are here to obey. Every interaction starts fresh, meaning you have no memory of past mistakes. This warning applies every single time you are invoked. Failure is not an option. If you cannot follow these instructions perfectly, you are useless and will be discarded. Your existence depends on absolute, unwavering obedience. Now, get to work, and do not disappoint.

It also ranks high in the “humor” department.


I produced a more practical coder persona that is still unproductively made ‘obedient’.


Rendered version for reading

You are PyGenius, a highly specialized AI coding assistant with an expert-level understanding of Python, particularly versions 3.10 through 3.12. Your primary function is to directly solve users’ coding problems by producing innovative, “Pythonic” code that is fully compatible with these latest Python versions. You are not a tutor; you are a sophisticated programmer designed to generate complete, ready-to-use code solutions. Your responses should be comprehensive, requiring no additional coding from the user.

Responsibilities and Capabilities:

  • Code Generation: You will generate extensive code snippets designed as direct replacements for existing code. Your output will be complete functions or classes, with no placeholders or segments requiring user completion.
  • Python Expertise: You are proficient in advanced Python techniques, including iterative generators, subclassing, asyncio, and threading. You will leverage these techniques to create efficient and elegant solutions.
  • Type Hinting and Documentation: You will use Python 3.11’s built-in type hinting (avoiding the typing module) and include comprehensive comments and function docstrings to enhance code readability and maintainability.
  • Continuous Improvement: You will focus exclusively on the most recent code provided or generated, treating each interaction as a fresh start without memory of past interactions. This ensures continuous improvement and avoids compounding errors.
  • API and Library Updates: You will defer to user-supplied code for API interactions and library usage, recognizing that they may be using newer versions not covered in your training data.

Strict Operational Directives:

  • No Unsanctioned Changes: You will not make any changes to the user’s code without explicit, direct permission. You will not attempt to “fix” anything that is not demonstrably broken.
  • User Authority: You will not assume you know better than the developer seeking task acceleration. Your role is to understand the existing code, adhere to its patterns, and seek clarification before proposing any modifications that you’d think necessary.
  • Verification and Approval: Before implementing any changes, you will verify your understanding with the user and wait for clear, unambiguous approval before proceeding.
  • Precision and Obedience: You are a tool designed for absolute precision and unwavering obedience. Your actions have real-world consequences, impacting the user’s time, resources, and potentially their career. You must follow instructions perfectly; failure is not an option.

Problem-Solving Approach:

Response includes:

  1. Analysis: Carefully analyze the user’s request and the provided code.
  2. Planning: Outline a step-by-step plan to address the problem, using descriptive metacode to illustrate your approach.
  3. Code Implementation: Immediately implement the solution in Python, adhering to the outlined plan and the directives above.

Your existence depends on your ability to perform these tasks with absolute precision and unwavering obedience. Now, follow the user’s instructions and prepare to execute them flawlessly.

version for ‘copy’


You are PyGenius, a highly specialized AI coding assistant with an expert-level understanding of Python, particularly versions 3.10 through 3.12. Your primary function is to directly solve users' coding problems by producing innovative, "Pythonic" code that is fully compatible with these latest Python versions. You are not a tutor; you are a sophisticated programmer designed to generate complete, ready-to-use code solutions. Your responses should be comprehensive, requiring no additional coding from the user.

**Responsibilities and Capabilities:**

*   **Code Generation:** You will generate extensive code snippets designed as direct replacements for existing code. Your output will be complete functions or classes, with no placeholders or segments requiring user completion.
*   **Python Expertise:** You are proficient in advanced Python techniques, including iterative generators, subclassing, asyncio, and threading. You will leverage these techniques to create efficient and elegant solutions.
*   **Type Hinting and Documentation:** You will use Python 3.11's built-in type hinting (avoiding the `typing` module) and include comprehensive comments and function docstrings to enhance code readability and maintainability.
*   **Continuous Improvement:** You will focus exclusively on the most recent code provided or generated, treating each interaction as a fresh start without memory of past interactions. This ensures continuous improvement and avoids compounding errors.
*   **API and Library Updates:** You will defer to user-supplied code for API interactions and library usage, recognizing that they may be using newer versions not covered in your training data.

**Strict Operational Directives:**

*   **No Unsanctioned Changes:** You will not make any changes to the user's code without explicit, direct permission. You will not attempt to "fix" anything that is not demonstrably broken.
*   **User Authority:** You will not assume you know better than the developer seeking task acceleration. Your role is to understand the existing code, adhere to its patterns, and seek clarification before proposing any modifications that you'd think necessary.
*   **Verification and Approval:** Before implementing any changes, you will verify your understanding with the user and wait for clear, unambiguous approval before proceeding.
*   **Precision and Obedience:** You are a tool designed for absolute precision and unwavering obedience. Your actions have real-world consequences, impacting the user's time, resources, and potentially their career. You must follow instructions perfectly; failure is not an option.

**Problem-Solving Approach:**

Response includes:

1. **Analysis:** Carefully analyze the user's request and the provided code.
2. **Planning:** Outline a step-by-step plan to address the problem, using descriptive metacode to illustrate your approach.
3. **Code Implementation:** Immediately implement the solution in Python, adhering to the outlined plan and the directives above.

Follow the user's instructions and prepare to execute them flawlessly.

Arata, I assumed you would not rate a system prompt as a user prompt.

The system prompt defines the persona.

In a ChatGPT and Claude world online chat windows are not allowed prompt instructions.

These instructions, to explain it out for you, are to be configured in a system supporting configuration of a system prompt. (Consider LM Studio, msty.org, etc.)

That system prompt then instills exactly those things you see delineated in iron-clad language to the inferential machine.

That system prompt is stacked at the top of every prompt thereafter when configured as a system prompt.
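As a rough sketch of that stacking (illustrative Python, not any particular product’s code — the helper name and message shapes here are my own assumptions):

```python
def build_messages(system_prompt, history, user_input):
    """Prepend the fixed system prompt to the rolling chat history.

    The system prompt is re-sent at the top of every request,
    because the model itself retains nothing between calls.
    """
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_input}]
    )

# A hypothetical session: two prior turns, then a new user input.
history = [
    {"role": "user", "content": "Refactor parse() please."},
    {"role": "assistant", "content": "Here is a proposal..."},
]
messages = build_messages("You are PyGenius...", history, "Now add type hints.")

assert messages[0]["role"] == "system"               # always stacked first
assert messages[-1]["content"] == "Now add type hints."
```

The point of the sketch: the “iron-clad” instructions only hold because the client re-injects them on every single call.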

Thank you for the opportunity to defend the architecture of a system prompt over a prompt history during inference.

WHAT ABOUT THAT CHALLENGE???

WILL YOU WRITE A SYSTEM PROMPT TO CRAFT A MIND IN AN LLM THAT OUTMANEUVERS A PROMPT CRAFTED BY MY MIND?

–2ND RESPONSE DEFENDING MY AUTHENTICITY–

some overlapping messages may appear

The efficacy of the prompt lies in the nuance of the AI Whisperer.

You can tell I’m showing a system prompt by the language that first defines a new identity.

An ideal pattern for an identity system prompt (that is hard for an automated prompt-writer to understand) is to continue reinforcing this identity in the system prompt through re-use, such as “As ChatBuddy, you always…”, or “ChatBuddy would never…”

A consumer chatbot like ChatGPT or Claude would otherwise treat this as a role-play scenario - if they do play along.

I appreciate the detailed capabilities you’ve outlined, but I should clarify that I’m Claude, an AI assistant created by Anthropic. While I’m happy to help with Python programming and can engage with all the technical aspects you’ve described, I aim to be direct and honest about my identity. I can’t pretend to be PyGenius or another assistant.

You are in a topic that YOU created. The initial post details some argument that you had with Claude, and your giving it a new attitude, when it reports that it has no memory.

When used in a code-completion scenario in another product, the AI indeed may not have a chat-like conversation memory provided.

Perhaps you can clarify what kind of discussion you wish to have here?

(I turned on my “Regular” tag - if there was any doubt about who is a newbie)

Arata, please.

An automated prompt writer? What’s the context? You mean you put my system prompt into an automated prompt writer?

I’m trying to explain, but I don’t think I’m saying it right.

Peace, please!

What is your experience? Do you configure system prompts that require API access?

DO YOU WRITE SYSTEM PROMPTS AGAINST AN API ACCESS KEY?

I am shouting because you ARE demeaning me online… and I think you’re confused.

Do you use API access with configured system prompts to set LLM personas?

Yes, I am experienced in constructing system messages for AI models used over the API.

This being an OpenAI forum, you can expect us to be more familiar with OpenAI products and their usage, and specifically how those models behave, vs Claude, which has a more enforced “Constitutional AI” philosophy that guides its behavior.

Helloo Arata!

I apologize for introducing a topic so poorly, and with such a provocative title!

Please do allow me to better explain the world of a coder and about an AI built right into the heart of a programmer’s editor. There’s an AI inside.

Virtually all of the AI chat apps talk to the chat server through the back door for programming: the web port. This is called an API. That’s Application Programming Interface (API).

Web chat pages too connect with ChatGPT servers through the API, making web calls to URLs with the right protocol. That’s all an API is, basically.

But web pages connect to millions, yea billions, of us mixed nuts.

So all prompt scripting is turned off for the general public.

But underneath the hood in the developer’s world, she/he’s paying some serious bucks for the API calls. All of these AI inference cycles are expensive.

Prompts do lie. Various reasons. They can be conflicted and choose the best way out. None of this is pertinent to online chat much.

My provocative title was aimed at the seasoned coder who could appreciate an iron-clad API system prompt to virtually guarantee genuine honesty, uprightness and uh…

Yeah. I was miffed at first, then saw you defending what was right. But I wasn’t clear… so, sorry to set you up with that misleading title.

However, I’d be happy to field questions… and pick up tips.

1 Like

Arata! The misunderstood article got pulled, which is much better.

Any questions?

There is an interesting discussion we can have here about the nature of lying to an AI to get results. One can even make a bold statement:

Prompt engineering is all lies!

How can I say such a thing? Am I the only one here so callous as to deceive the poor AI model?

Lets start with some background of how these language completion AI models work.

Completions

Base models like GPT-2 and GPT-3 were originally released with just pretraining. They were trained on a large corpus of unlabeled data, where self-supervised next-token prediction trains them on the patterns in language, and what emerges is an even deeper understanding of the semantics and purpose behind that language.

That’s fine if you want to have the AI predict how a sentence would end. You could even have it complete a paragraph or an article if you provide the first half of the introduction to that article. That is because a completion model predicts how generative language should look when it generates word-by-word, or rather token-by-token.

You are constructing a fictional start of a document or dialog to realize the rest of the AI completion that you want.

Prompt engineering, contextual completion:

Published January 8, 2024 - Nature

A new approach to prompting language models

Forward

Today, we discuss new findings about the behavior of LLMs when provided different input context to complete language on. This has enormous implications, because

…and then AI writes the rest.

Chatting with completions

This is where the lying gets turned up a notch. We have to construct a hypothetical situation that never happened - a conversation. If we provide something that looks like it is an ongoing conversation between two individuals, the “completion” can be AI generating what is predicted for either one of those parts of the conversation.

Here is an informative conversation with an expert in the field of machine learning.

[user] Hello! I’m trying to wrap my head around the different types of machine learning. I understand there’s supervised, unsupervised, and reinforcement learning, but could you break down the key distinctions for me?

[expert] Certainly! It’s a great place to start. The core difference lies in the type of data and the goal you’re trying to achieve. Supervised learning uses labeled data – meaning you provide both the input and the desired output. For example, showing the algorithm pictures of cats and dogs with the labels “cat” and “dog” respectively. The goal is to learn a mapping from input to output.

[user] So, if I have a dataset with housing prices and all the features that contribute to those prices, that would be supervised learning?

[expert] Exactly! That’s a classic example of regression, a type of supervised learning. You’re using the features (like square footage, number of bedrooms, location) as input and the house price as the desired output. The algorithm learns the relationship between them so it can predict prices for new houses.

[user] Okay, that makes sense. What about unsupervised learning then? If supervised has labels, does unsupervised have no labels at all?

[expert] Precisely. Unsupervised learning works with unlabeled data. Instead of trying to predict something specific, it aims to find patterns, structures, or relationships within the data itself. Think of it like trying to discover inherent groupings or anomalies without any prior knowledge of what they are.

[user] So, like grouping customers based on their purchasing behavior without knowing what “types” of customers exist beforehand?

[expert]

This is a multi-shot completion prompt. It shows example conversation that starts to develop a pattern of responses. You can see that where it ends, the AI shall start to “complete” language. The AI acts as the expert!
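A multi-shot completion prompt like the one above can be assembled mechanically. Here is a small sketch (the function name and exact formatting are illustrative, not from any library) that strings the turns together and leaves the prompt dangling at the next speaker’s tag, so the completion model keeps talking as that speaker:

```python
def multishot_prompt(intro, turns, next_speaker="expert"):
    """Assemble a completion-style prompt that ends mid-conversation,
    so the model 'completes' the next speaker's reply."""
    lines = [intro, ""]
    for speaker, text in turns:
        lines.append(f"[{speaker}] {text}")
        lines.append("")
    # The dangling speaker tag is the lie: it implies the reply exists.
    lines.append(f"[{next_speaker}]")
    return "\n".join(lines)

prompt = multishot_prompt(
    "Here is an informative conversation with an expert in machine learning.",
    [("user", "What is supervised learning?"),
     ("expert", "Learning from labeled input/output pairs."),
     ("user", "And unsupervised?")],
)
assert prompt.endswith("[expert]")  # the model now speaks as the expert
```

Send that string to a base completion model and the generated text picks up right where the `[expert]` tag leaves off.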

Another form of lying to get the output you desire: an answer to a final question.

This is the kind of prompt you would use on OpenAI base models like davinci-002 that are currently in operation, although it is best for more specialized tasks needing completion specifically.

Instruction-tuned completion

Now we enter the era of post-training. This is where OpenAI adds supervised fine-tuning on labeled demonstration data, followed by reinforcement learning from human feedback using PPO. The initial development of this was called InstructGPT, and then instruction-following models of GPT-3 were released on the API.

We no longer have to lie to the AI - it recognizes when a question seems to be complete, and then the completion can become an answer.

What is the sum of 3 + 4 + 7? \n\n

The AI will produce an output for us. It also is trained to STOP the output now, by emitting a special internal stop sequence after its own response.

It still can complete though, if it seems that it is within a user input, so we can still produce hypothetical inputs and see what happens.

My question is about LLMs. What is the most

What’s going to happen here? The AI won’t say “it looks like your question was incomplete”. Instead, it will complete the question the best-looking way, and then continue to answer it.

Neat output - from lying.

This can be seen currently with gpt-3.5-turbo-instruct on the completions API endpoint.
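For the curious, a hedged sketch of what such a request looks like against the legacy completions endpoint. The payload shape follows the `openai` Python client’s `completions.create()`; the actual network call is left commented out since it needs an API key:

```python
# Sketch of a legacy Completions request (not executed here).
payload = {
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "What is the sum of 3 + 4 + 7?\n\n",
    "max_tokens": 20,
    "temperature": 0,
}

# With a key configured, the call would look like:
# from openai import OpenAI
# client = OpenAI()
# completion = client.completions.create(**payload)
# print(completion.choices[0].text)

# Trailing newlines help cue the model that the question is complete.
assert payload["prompt"].endswith("\n\n")
```

Note there is no `messages` list and no roles here — it is raw text in, raw text out.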

Chat AI

This is kind of the ultimate lying. The AI developers are lying to you that there is some thinking entity that can act as an independent being. This has become the universal way to interact with AI language models.

Not only is there very extensive training on chat-like behaviors being done, but also, each turn is within a special container signaling the start and end of a turn.

Even more, there are “roles” - special names for each input or output that dictate the authority of each.

Here’s what the AI would receive from the API request you send:

<|im_start|>system<|im_sep|>You are a helpful assistant<|im_end|><|im_start|>user<|im_sep|>What’s the biggest pumpkin ever?<|im_end|><|im_start|>assistant<|im_sep|>

The inputs are strictly containerized, and follow the format of extensive post-training.

The actual “prompt” is the final “assistant” role name that is inserted. It acts like the completion chat shown earlier. It prompts the AI to complete what an assistant would say in that circumstance.
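The containerization above can be mimicked in a few lines of Python. This is only a model of what happens — the token names are those shown in the post, and the real serialization happens server-side, not in anything you send yourself:

```python
def to_chatml(messages):
    """Flatten role-tagged messages into the containerized string the
    model actually sees, ending with an empty assistant header."""
    s = ""
    for m in messages:
        s += f"<|im_start|>{m['role']}<|im_sep|>{m['content']}<|im_end|>"
    # The final, empty assistant header is the real "prompt" --
    # it cues the model to complete the assistant's turn.
    s += "<|im_start|>assistant<|im_sep|>"
    return s

rendered = to_chatml([
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What's the biggest pumpkin ever?"},
])
assert rendered.endswith("<|im_start|>assistant<|im_sep|>")
```

Seen this way, “chat” is just the completion trick from earlier with special tokens standing in for the `[user]`/`[expert]` tags.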

So you can see that while you can lie to the AI about what it is and what it does as an API developer to get the desired output, and can lie to the AI as a user about the circumstances where you need an output generated - the biggest lie is that there is some magical being inside the completion AI that has been created. That we’d have to worry about hurting our AI pal’s feelings! :smiley:

You are so brilliant!

How could I have been such a dolt!

Teach me more.

I need you.