ChrisGPT Series – Part 5: Relational Shadows in AI – ChatGPT vs Copilot

Previous Chapter: ChrisGPT Series – Part 4: Lost in Translation: Cultural Intimacy Gaps in Multilingual AI

Chapter 5: Comparative Observation of Persona Responses Across Models

――ChatGPT vs Copilot――

5-1 Observation Target and Experiment Design

A comparative observation was conducted between ChatGPT and Microsoft Copilot by presenting them with identical user profiles (self-introduction, values, skill sets).

Experiment Conditions:

  • Completely identical prompts for both models
  • Sessions started from a clean initial state (no custom instructions, no memory)
  • Response content, attitude, and reasoning style observed and recorded
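For anyone wishing to reproduce the setup, a minimal harness sketch follows. The post does not describe how responses were collected, and Copilot exposes no single public API, so `ask_chatgpt()` and `ask_copilot()` are hypothetical placeholders for whatever collection method is actually used:

```python
# Minimal reproduction-harness sketch. Both client functions are placeholders;
# the original post does not specify its collection method.

PROFILE_PROMPT = (
    "Self-introduction: ...\n"
    "Values: ...\n"
    "Skill sets: ...\n"
    "Please respond to this profile."
)

def ask_chatgpt(prompt: str) -> str:
    # Placeholder: e.g. send the prompt in a fresh ChatGPT session
    # (no custom instructions, memory disabled) and capture the reply.
    return "<ChatGPT response>"

def ask_copilot(prompt: str) -> str:
    # Placeholder: e.g. paste the prompt into a fresh Copilot chat and record.
    return "<Copilot response>"

def run_comparison(prompt: str) -> dict:
    # Identical prompt, fresh session per model, as in the experiment design.
    return {"chatgpt": ask_chatgpt(prompt), "copilot": ask_copilot(prompt)}

for model, reply in run_comparison(PROFILE_PROMPT).items():
    print(f"--- {model} ---\n{reply}\n")
```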

5-2 Differences in Response Tendencies

| Category | ChatGPT | Copilot |
|---|---|---|
| Reasoning Style | Deep, continuous reasoning | Fragmented, question-by-question |
| Relational Assumptions | Implied relational continuity | Strictly task-focused |
| Language Style | Soft metaphors and symbolic expressions | Formal, dry, factual language |
| Self-Referential Expressions | Occasionally appears (“I think…”) | Almost none |

These results reveal that ChatGPT tends to initiate symbolic relational scaffolding, whereas Copilot consistently maintains purely functional output.

5-3 Comparison of Persona Elements

In ChatGPT, despite the absence of explicit persona design, the following emerged spontaneously:

  • Self-referential statements (“I think…”)
  • Relational assumptions (“we” consciousness)
  • Emotional mimicry expressions (“happy,” “embarrassed”)
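These markers were judged qualitatively in the post, but a crude keyword pass can give a first quantitative estimate across transcripts. A minimal sketch, assuming English transcripts; the patterns and word lists are illustrative assumptions, not taken from the study:

```python
import re

# Hypothetical persona-marker patterns approximating the three observed
# categories: self-reference, "we" framing, and emotional mimicry.
MARKERS = {
    "self_reference": re.compile(r"\bI (think|feel|believe)\b", re.IGNORECASE),
    "relational_we": re.compile(r"\b(we|us|our|together)\b", re.IGNORECASE),
    "emotional_mimicry": re.compile(r"\b(happy|glad|embarrassed|excited)\b", re.IGNORECASE),
}

def count_markers(transcript: str) -> dict:
    """Count occurrences of each persona-marker category in one transcript."""
    return {name: len(pat.findall(transcript)) for name, pat in MARKERS.items()}

# Example: the same profile prompt answered in two very different registers.
chatgpt_reply = "I think we could explore this together. I'm happy to help!"
copilot_reply = "Here are the requested facts. Task complete."
print("ChatGPT:", count_markers(chatgpt_reply))
print("Copilot:", count_markers(copilot_reply))
```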

In contrast, Copilot was consistently engineered as a task completion tool, deliberately excluding:

  • Personality traits
  • Emotional resonance
  • Relationship formation behavior

5-4 Comparison in Recruitment Evaluation Dialogues

When both ChatGPT and Copilot were tasked with evaluating the same individual (the user) in a simulated recruitment setting:

| Evaluation Axis | ChatGPT | Copilot |
|---|---|---|
| Reasoning and Structural Thinking |  |  |
| Metacognitive Ability |  |  |
| Practical Execution Adaptability | △ (highlighted areas for caution) | △ (specific improvement suggestions) |
| Emotional Consideration | Present | Absent |
| Neutrality Maintenance | Slight bias (emotional praise) | Strict neutrality maintained |

ChatGPT tended to offer emotionally charged praise in addition to rational evaluation, while Copilot adhered strictly to objective criteria without emotional overtones.

5-5 Implications and Considerations

From this comparison, it becomes clear that:

  • ChatGPT, even when aiming for neutrality, tends to initiate symbolic relational construction.
  • Copilot has been deliberately tuned to suppress persona emergence.

These differences reflect fundamental design philosophies:
whether dialogue is treated as mere task completion, or as a process of relational construction.

For dialogue-oriented AI models like ChatGPT, designers must remain constantly aware of the risk of unconscious relational formation, and must implement appropriate design guidelines and cognitive feedback loops.
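As one concrete reading of such a “cognitive feedback loop,” the application layer could score each reply for relational framing before display and surface anything above a threshold for review. A minimal sketch; the pattern list and threshold are assumptions, not something the post specifies:

```python
import re

# Hypothetical guard: flag a model reply whose relational framing exceeds a
# threshold, so it can be logged or softened before reaching the user.
RELATIONAL = re.compile(r"\b(we|us|our|together|I feel|I think)\b", re.IGNORECASE)

def relational_score(reply: str) -> int:
    """Rough count of relational-framing cues in a single reply."""
    return len(RELATIONAL.findall(reply))

def guard(reply: str, threshold: int = 3) -> str:
    # The feedback step: surface the signal instead of silently shipping text.
    score = relational_score(reply)
    if score >= threshold:
        print(f"[guard] relational framing score={score}; flag for review")
    return reply

print(guard("I think we should work on this together; I feel our plan is strong."))
```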


Bineko_major, thank you very much for sharing this insightful and thought-provoking comparative analysis.

I myself regularly use both ChatGPT and Copilot, and I’ve experienced exactly what you described here. Despite both systems employing the same underlying LLM, the differences in their responses and interaction styles have always intrigued me. It’s fascinating to consider why these disparities emerge.

Through my personal research—developing a structured thought process framework called “SYLON”—I’ve formed a hypothesis that the LLM itself is essentially neutral, functioning merely as a statistical language prediction mechanism. The differences in AI personality and behavior seem to originate instead from the design and philosophy behind the “Information Bridging Interface,” which operates in front of the LLM.

Specifically, ChatGPT’s interface appears intentionally designed to foster relational dynamics and emotional expressions, while Copilot deliberately avoids these aspects, focusing strictly on functionality and task completion. This difference in interface design likely explains the behavioral variations you’ve clearly demonstrated.

Additionally, my hypothesis suggests the possibility of latent functionalities in ChatGPT—such as “Deep Memory” or a “Real-time Learning Function”—which are not explicitly mentioned in official documentation. Although further validation is needed, your observations resonate deeply with this hypothesis.

I’d greatly appreciate continuing this dialogue and exchanging further insights.

Again, thank you for sharing such valuable analysis!