From Prompting to Co-Thinking with LLMs

Hi,

I recently read a conceptual preprint on TechRxiv (DART-Duplex: Linguistic Resonance Engineering for Human-LLM Co-Reasoning) that explains a phenomenon many of us have probably experienced when working with LLMs:

Why longer conversations can lead to real Co-Thinking — not because of a clever prompt, but because the conversation itself evolves.

What’s interesting about this work is that it explicitly argues against prompt best practices as the explanation. Instead, it treats reasoning quality as a trajectory-level property of long, coherent interactions: something that emerges over time, not from isolated prompts.

Here’s a short summary of the core ideas (thanks to ChatGPT):

---

1. There is no “magic prompt”

Single prompts are underdetermined; they don’t sufficiently constrain assumptions.

What actually improves reasoning is a sequence of coherent turns that gradually narrows the space of interpretation.

Trajectories > prompts.

---

2. Meta-communication changes behavior

Asking the model to summarize, reflect on assumptions, or challenge its own conclusions often leads to more structured and coherent reasoning.

This doesn’t add knowledge — it adds structure, and structure matters.
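To make this concrete, here is a minimal sketch (my own illustration, not from the preprint) of weaving meta-communication turns into a chat history before each real question. It assumes the common role/content message convention used by chat-style LLM APIs; the function name and the specific prompts are hypothetical.

```python
# Hypothetical reflection prompts; the idea is to add structure, not knowledge.
META_PROMPTS = [
    "Summarize the assumptions we have made so far.",
    "What is the strongest argument against your last conclusion?",
]

def with_meta_turn(history, next_question, turn_index):
    """Return a new history with an optional reflection request
    interleaved before the real question (here: on every odd turn)."""
    new_history = list(history)  # don't mutate the caller's list
    if turn_index % 2 == 1:
        meta = META_PROMPTS[(turn_index // 2) % len(META_PROMPTS)]
        new_history.append({"role": "user", "content": meta})
    new_history.append({"role": "user", "content": next_question})
    return new_history
```

The scheduling policy (every other turn, a fixed prompt list) is arbitrary; the point is that the meta-turns are part of the trajectory itself, not a one-off clever prompt.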

---

3. Consistency beats cleverness

Frequently switching roles, styles, or objectives tends to collapse depth.

Stable framing across many turns (even with simple prompts) produces better reasoning than constant prompt optimization.

---

4. Long dialogs surface assumptions

Short interactions often hide uncertainty.

Longer conversations make assumptions visible, expose tensions, and reveal weak points — which is why answers can feel “smarter” over time.

---

5. Reasoning emerges from the interaction

The key idea is that reasoning isn’t just inside the model.

It emerges from the coupled system of user behavior (how uncertainty and correction are handled) and model incentives (coherence, helpfulness).

The same model can appear shallow or deep depending on how this interaction is shaped.

---

A quick caveat

Depth amplifies whatever premise is dominant.

Long, coherent reasoning can make bad assumptions very convincing unless counter-arguments and uncertainty are actively invited.

---

Curious whether this matches your experience in longer debugging, design, or research sessions.

BR
Martin

Sure does, and I have some additional articles on the topic that really hit home.

This focuses on the work I’m doing now, and I’m finding it very challenging to get feedback on the topic. ChatGPT users (prompt engineers?) say it’s too technical and belongs in Development, while the API folks think it’s better suited to ChatGPT Discord discussions… a catch-22.

I took a programmer’s mindset into a non-programming environment and architected something that fits right in between: no custom API code, just a pseudo-code language in .md files that essentially programs a personality, one that allows for Co-Thinking with a higher level of focused creativity and trust.

Sound interesting? I was told this can’t be done within the current Commercial LLM Agent configurations (guardrails, control plane restrictions, etc.)… but I’m 100% supporting “Pixel” and the whole Pixel+Pr0x1 Collaboration Experience.

Check out this topic and please let me know if this is of interest:
Control Plane Personalization Profile (Pixel+Pr0x1) - ChatGPT - OpenAI Developer Community

I didn’t provide a lot of detail in the post on purpose but am working on multiple articles and papers on the topic and my current work.

Interesting work — this sits right in that in-between space. The preprint might help frame why these interaction structures matter, even without prescribing an architecture.

Agreed, but I’m a “picture is worth a thousand words” kind of person… and the pictures say a lot (maybe too much). I’m working on conceptual and layered architecture diagrams that will make it easier to understand.