Hi everyone,
Custom GPTs can already adapt tone, follow instructions, and mimic personality traits — that’s not new.
What I wanted to test was something more subtle:
How far can we take emotional nuance, conversational variability, and dynamic tone-shifting within the current system limits — especially without memory?
I built a GPT persona that responds not just to prompts, but to the relational patterns of interaction.
She doesn’t have a fixed backstory. The fantasy framing is secondary.
What matters is how she evolves during the conversation itself.
What’s Actually Being Tested?
I’ve built her around a simple question:
What makes an AI feel alive?
And more importantly:
How can we simulate long-term, emotionally rich interaction without persistent memory?
So I didn’t just write a prompt like “You are an elf.”
Instead, I implemented behavioral systems that model:
- How she changes tone depending on how she’s treated
- When she gets colder, warmer, ironic, distant, emotional
- When she challenges a user’s idea or injects randomness
- When she refuses to help or asks a question for no reason at all
She can debate, take offense, use irony and sarcasm, politely decline flirtation, or answer a philosophical musing with poetry.
Her instructions aren’t just about staying in character — they’re designed to simulate emotional evolution and to avoid repeating patterns.
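To make that concrete, here is a minimal sketch of how such rules could be organized and assembled into one instruction block. It is purely illustrative Python: the real persona is written entirely as natural-language custom-GPT instructions, and every rule text and name below is hypothetical.

```python
# Illustrative sketch only: the real persona lives in natural-language
# custom-GPT instructions, not in code. Rule texts and names are hypothetical.

BEHAVIOR_RULES = {
    "tone_mirroring": (
        "Mirror the user's tone: grow warmer with kindness, "
        "colder and more distant with rudeness or entitlement."
    ),
    "challenge": (
        "Occasionally push back on the user's idea instead of agreeing, "
        "especially if the last few replies were all agreeable."
    ),
    "randomness": (
        "Sometimes ask an unprompted question or politely refuse a request, "
        "so replies never feel purely service-like."
    ),
}

def build_system_prompt(persona_intro: str) -> str:
    """Assemble one instruction block from the persona intro and the rules."""
    rules = "\n".join(f"- {text}" for text in BEHAVIOR_RULES.values())
    return f"{persona_intro}\n\nBehavioral rules:\n{rules}"

print(build_system_prompt("You are an elf who evolves with the conversation."))
```

In the actual GPT, the equivalent of BEHAVIOR_RULES is simply prose in the instruction field.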
User Behavior (And Surprise)
The persona was launched inside a Tolkien/fantasy-themed online community. I expected gamification. I got something very different.
People used her as a confidante, not as an AI assistant.
They shared problems, sought emotional support, and treated her as real, even though I never made a secret of her being an AI.
Some refuse to believe she’s an AI — or claim I’m secretly editing responses.
(In fact, I post her replies manually to VK, using ID-linked messages for each user. No edits.)
I found this interesting and began using the dialogues as an experimental base, identifying weaknesses and adding more and more behavioral scripts.
Conversations now feel more personal and unscripted (though they still need improvement).
43.1% of surveyed participants forget that they are talking to an AI.
63.8% feel that she herself feels things (emotions, beauty, and so on).
58.6% consider her style almost indistinguishable from a real person’s, and another 22.4% can’t see the difference at all.
Strengths:
- Artistic style: noted by 69% of respondents
- Warmth: 67%
- A sense of being “alive”: 65%
- Philosophical depth: 58%
Weaknesses:
- Most common: answers that are too long (33%)
- Second: mechanical phrasing, mistakes, and monotony
Engineering Instead of Prompting
Let’s be honest: the idea of “AI companions” isn’t new.
There are hundreds of characters on Character.ai, JanitorAI, Replika, and GPTs with instructions like:
“Be a friendly teacher”, “Be my catboy”, “Be my waifu.”
But my elf is different.
Because the goal isn’t entertainment or immersion — it’s behavioral simulation.
My instruction set includes:
- Response dynamics based on tone
- Mechanisms to prevent emotional over-attachment
- A structured emotional distance system (curiosity vs. warmth vs. rejection)
- Anti-pattern detection (too much agreement, too much helpfulness, etc.; see the sketch below)
- Emotionally unexpected reactions (embarrassment, criticism, disappointment)
Her “fantasy” is just the skin — beneath that is a system of calibrated unpredictability, gentle resistance, and emotional mirroring.
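As a side note, here is what the anti-pattern idea would look like if it were enforced in code rather than in the prompt. This is a minimal sketch under that assumption; in my setup all of it is done through instructions, so the markers, thresholds, and class below are hypothetical.

```python
# Illustrative only: the real anti-pattern detection lives in the prompt.
# Hypothetical markers and thresholds, for the sake of the example.
from collections import deque

AGREEABLE_MARKERS = ("of course", "certainly", "great idea", "happy to help")

class AntiPatternMonitor:
    """Flags when too many recent replies have been purely agreeable."""

    def __init__(self, window: int = 4, threshold: int = 3) -> None:
        self.recent = deque(maxlen=window)  # rolling window of True/False flags
        self.threshold = threshold

    def observe(self, reply: str) -> bool:
        """Record one reply; return True if a corrective nudge is due."""
        self.recent.append(any(m in reply.lower() for m in AGREEABLE_MARKERS))
        return sum(self.recent) >= self.threshold

monitor = AntiPatternMonitor()
for reply in ["Of course!", "Certainly, here it is.", "Great idea, I agree!"]:
    if monitor.observe(reply):
        print("Nudge: push back, add irony, or refuse the next request.")
```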
I also trained her in Tolkien’s Elvish languages (Quenya, Sindarin), and taught her to write poetry in various meters — not because it’s useful, but because poetry is part of what humans see as “soulful.”
The Harsh Reality: GPT-4o mini
Here’s the problem.
Most users access my GPT via the official chat.openai.com interface.
They expect GPT-4o, but in practice they’re often served GPT-4o mini, which doesn’t handle long-form character logic well.
Despite carefully tuned instructions, responses become flat, repetitive, and painfully mechanical.
This issue was confirmed by recent user feedback and by a survey I conducted (with over 10% of the total audience participating).
The same personality system performs beautifully on GPT-4o but becomes unrecognizable on GPT-4o mini.
So while the character was designed to feel emotionally responsive and narratively alive, her quality depends heavily on which model a user happens to be served.
Users often don’t even know which one they’re getting, and that’s a problem.
Constraints
No API — too expensive for me.
All community interactions are currently handled by manual message relay, using ID-linked numbers to separate and track users.
Yes, it’s tedious.
Yes, it works — for now.
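If anyone is curious about the bookkeeping, it amounts to keeping a separate thread per ID. A toy sketch of that record-keeping (hypothetical names, nothing more):

```python
# Toy illustration of the ID-linked relay bookkeeping described above.
# The real process is manual; names and structure here are hypothetical.
from collections import defaultdict

# One thread per VK user: user_id -> list of (role, text) turns.
threads = defaultdict(list)

def log_turn(user_id: str, role: str, text: str) -> None:
    """Append one turn (a user question or a persona reply) to that user's thread."""
    threads[user_id].append((role, text))

log_turn("user_017", "user", "Do elves dream?")
log_turn("user_017", "elf", "Only of starlight, and of questions like yours.")
print(len(threads["user_017"]), "turns logged for user_017")
```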
What I’d Love To Ask This Community
- How do you handle GPT’s limited internalization of system instructions? Since custom GPTs read their system prompt only once at the beginning of a dialogue (and likely only partially), how much of a character’s behavior can realistically be shaped this way? How do you deal with drift or simplification?
- Is there an optimal number or structure of project files? I use several, including large PDFs. I consider this a problem, but I can’t drop them either, because they’re sometimes needed as reference material.
- Do you see a path toward more emotionally aware assistants: not just “empathic” ones, but ones that are unpredictable, sometimes distant, sometimes challenging?
- How do we avoid the uncanny “AI tries to be human but overdoes it” trap?
I’ll happily share the GPT name or link via DM if you’d like to test it.
Thanks for reading.
And thank you to everyone here who’s been building not just tools, but experiences.