Some simple (and obvious) best practices for the Prompt

I keep a few Presets in the Playground that I often use to ask the LLM about its prompting preferences.

The answers are very instructive. Here is my humble summary of some of those many conversations.

Key point: Always use clear-concise-unambiguous-explicit.

Note: Some of the differences below are only visible in the raw markdown behind the rendered post, so the examples show the enclosing fences explicitly.


Summary (my opinion): Every part that is fully clear, unambiguous, and explicit saves cognitive load spent on ‘ingesting the prompt’, which leaves more budget for ‘performing the task’. Another way to think of it: if you were to write a program to parse the prompt, how complex would it be and how much time would you need? If the answer is ‘almost nothing’, then you have a good prompt.
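To make the parsing analogy concrete, here is a minimal sketch of what such a ‘parser’ could look like for a prompt that uses `#` section headers. The header names and example content are my own illustration, not part of the recipe itself.

```python
import re

def parse_prompt(prompt: str) -> dict:
    """Split a prompt into sections keyed by their '# Header' lines.

    If the prompt is clear, explicit, and consistently delimited,
    this is roughly all the 'parser' you need.
    """
    sections = {}
    current = None
    for line in prompt.splitlines():
        header = re.match(r"^#\s+(.*)", line)
        if header:
            current = header.group(1).strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}

prompt = "# Task\nTranslate to French.\n\n# Input\nGood morning.\n\n# Output and Format\ntranslation: str"
print(parse_prompt(prompt))  # {'Task': 'Translate to French.', 'Input': 'Good morning.', ...}
```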



Enclosed better than non-enclosed

Better

Task:
```
blah blah blah
blah blah blah
```

Worse

Task:

blah blah blah
blah blah blah


Keep the title out of the enclosed part

Better

Task:
```
blah blah blah
blah blah blah
```

Worse

```
#### Task:
blah blah blah
blah blah blah
```


Explicit format better than unknown format (EVEN if it is markdown/text)

Better

Task:
```markdown
blah blah blah
blah blah blah
```

Input:
```text
blah blah blah
blah blah blah
```

Worse

Task:
```
blah blah blah
blah blah blah
```

Input:
```
blah blah blah
blah blah blah
```
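When prompts are assembled programmatically, these three habits are easy to enforce with a small helper: enclose the body, keep the title outside the fence, and tag the fence with an explicit format. This is only a sketch of how one could do it; the helper name, the format tags, and the example content are my own assumptions.

```python
FENCE = "`" * 3  # avoids clashing with this post's own code fences

def section(title: str, body: str, fmt: str = "markdown") -> str:
    """Render one prompt section: title outside the fence, body enclosed,
    and the fence explicitly tagged with its format."""
    return f"{title}:\n{FENCE}{fmt}\n{body.strip()}\n{FENCE}\n"

user_message = (
    section("Task", "Summarize the text in one sentence.")
    + section("Input", "blah blah blah\nblah blah blah", fmt="text")
)
print(user_message)
```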

There is an additional recipe I have been refining over time.

This is placed at the end of the user_message, just after the # Input


# Output and Format

(the content is enclosed in a ```yaml fence, shown here explicitly)

```yaml
varI1: str  # Verbatim from the Input
varI2: str  # Verbatim from the Input

[varS0: bool  # Optional. A step towards the solution. Provide reasoning for it as well.]

varR1: str   # Reasoning behind the choice of varS1.
varS1: bool  # The final solution you want (let's call it the Solution).

validateR1: str  # A question that, when answered, confirms that the solution meets your requirements.
```

Explanations:

The variable names must be clear, concise, and relevant to the task. This is essential. You can iterate on this by generating variations and scoring them across metrics such as clarity, conciseness, and task relevance.

GROUNDING

  • varI1 and varI2: These represent the “GROUNDING” or “ANCHORING” variables. While optional, they help the LLM reconnect with the Input, especially in scenarios with multiple examples (e.g., 5, 25, or more). These variables ensure the Output remains focused on the relevant Input. Use only as many as necessary—keep them minimal.

REASONING (CoT: Chain-of-Thought)

  • varR1: This forces the LLM to explain its reasoning for the solution (varS1). Reading this reasoning is critical for identifying potential weaknesses in your prompt.

If the reasoning is sound but you disagree with the solution, then the problem lies in your prompt, not the LLM. This variable encourages the LLM to justify its choices, which leads to a more thoughtful response and helps it “chew” the solution thoroughly before “spitting it out.”

SOLUTION

  • varS1: The final solution you want. This can be a boolean, string, or any format needed for the task.

  • varS0: Optional. Use this for intermediate responses that bring the LLM closer to generating a better varS1.

Example:

second_person_addressed: bool  # Is a second person (singular or plural) being addressed directly or indirectly?
T_V_applicable: bool           # Does the translation allow for formal/informal register distinctions?

Here, formality usually arises when “you” (directly or indirectly) appears in the phrase. Answering the first question (second_person_addressed) helps the LLM get closer to answering the second (T_V_applicable).

VALIDATION

validateR1: str

Just a str is fine, but you will probably want to verify that the answer is ~ "yes, the solution complies with the requirements of … because … and … "

To automate the check, we transform it into

validateR1:  # choose a good name
  valid: bool # Does the chosen .... comply with .... 
  reason: str 

Now, when processing the output, you can assert that validateR1.valid is True and, if it is not, act accordingly.
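A minimal sketch of that processing step, assuming the model returns the block as valid YAML and that PyYAML is available; the field values below are invented for illustration.

```python
import yaml  # PyYAML

raw_output = """
varI1: "Good morning"
varR1: "The phrase addresses a second person directly, so a register choice applies."
varS1: true
validateR1:
  valid: true
  reason: "The chosen register matches the requirements stated in the Task."
"""

data = yaml.safe_load(raw_output)

# Act on the model's own validation: retry, flag, or fall back if it fails.
if not data["validateR1"]["valid"]:
    raise ValueError(f"Self-check failed: {data['validateR1']['reason']}")

print(data["varS1"])  # True
```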


Examples in Your Prompt (k-shot / n-shot method)

Theoretical Approach

In theory, examples in your prompt should closely match the final Input and Output. However, this can lead to certain issues.

Practical Considerations

1. Long or Repeated Input Parameters

If one parameter in the Input is excessively long and repeated across every example, it can overshadow the rest of the content. This dilutes the relevance and clarity of the example.

2. Reasoning in Examples

While reasoning is useful for generating outputs, including it in examples can cause unnecessary distraction and increase cognitive load. This may reduce the overall effectiveness of the prompt. The same goes for the GROUNDING parameters (they are useless once they have played their role).

Adjusted Approach

To address these issues, I’ve started omitting GROUNDING and REASONING variables from the examples, focusing solely on the SOLUTION. Consider the problem as A-B-C: show A and C, while leaving out B if it doesn’t serve as an optimal intermediate step.

Handling Large or Irrelevant Input Parameters

For Input parameters that are irrelevant or too large for the examples, you can replace them with placeholders such as None or null. This maintains the structure of the Input while simplifying the example, making it easier to relate the examples to one another.


Example

Original Example:

Input:

paragraph: str
q1: str
q2: str

Output:

a1: str
a2: str

If the paragraph is too large, the relationship between q1/q2 and a1/a2 becomes diluted.


Revised Example:

Input:

paragraph: None
summary: str  # A short text that acts as a replacement for the long `paragraph`.
q1: str
q2: str

Output:

a1: str
a2: str

This adjustment preserves the Input’s structure while avoiding the distraction of overly lengthy parameters.
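A sketch of how the placeholder substitution could be automated when preparing example blocks; the field names, the length threshold, and the truncation used as a stand-in summary are my own assumptions.

```python
def shrink_example(example: dict, long_fields=("paragraph",), max_len=200) -> dict:
    """Replace overly long fields with None and add a short stand-in summary,
    so every example keeps the same Input structure."""
    slim = dict(example)
    for field in long_fields:
        text = slim.get(field)
        if isinstance(text, str) and len(text) > max_len:
            slim[field] = None
            # Stand-in only: in practice, write the summary by hand or with a separate call.
            slim["summary"] = text[:120].rstrip() + "..."
    return slim

example = {
    "paragraph": "a very long source paragraph " * 50,
    "q1": "Who is the narrator?",
    "q2": "Where does the scene take place?",
}
print(shrink_example(example))
```

In practice you would write the summary by hand (or with a separate LLM call) rather than truncating, but the point is that the structure of every example stays identical.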


Key Principle

It’s crucial that all Inputs share the same structure, even if some variables are replaced with None, null, or NaN. This approach is common in machine learning (ML) and deep learning (DL) systems and ensures consistency while reducing noise.

I know the final Output has a few more parameters than the Examples do, but your interest is in HOW the variables that matter are generated, and that is what the Examples are there for: to show that Output = function(Input), whatever that function is.



Another approach

If your k-shots have a common parameter, you could include it in the system_message

system:
 - role
 - task
 - common_info
 - examples of input / output (the common_info applies to all)

user:
 - input
 - output and format
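
Assuming the standard Chat Completions message layout, here is a sketch of how that split could be assembled; the role, task, field names, and example content are placeholders of mine.

```python
# Hypothetical content; the point is the split between system and user messages.
common_info = "All invoices are in EUR and follow the 2024 VAT rules."

system_message = "\n\n".join([
    "# Role\nYou are a meticulous invoice-parsing assistant.",
    "# Task\nExtract the total amount and the VAT rate from the invoice text.",
    f"# Common Info\n{common_info}",  # shared by every example below
    "# Examples\nInput: ...\nOutput:\ntotal: 120.00\nvat_rate: 0.21",
])

user_message = "\n\n".join([
    "# Input\n<the new invoice text goes here>",
    "# Output and Format\ntotal: float\nvat_rate: float",
])

messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_message},
]
print(messages)
```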

As you build more and more verified examples, each new one is surrounded by so many deeply in-context examples that doing the task wrong becomes difficult.

Some may say this feels like overfitting. In a way it is, but the fitting is done through careful selection of examples close to the solution.


I’ll throw you a bone, because at least you’re thinking… Suppose your goal is a true NLP control language with proper directives that optimizes compute and, most of the time (there are exceptions), the tokenization of a system prompt. The key is universal, context-free understanding. What I mean by that is: if I took one of your directives (what I call them) and gave it independently to claude, PI, Gemini… could they clearly, with very little reasoning, tell you what that directive meant and how to process it?

For example, if you were to take r1= on its own and feed it to claude, or even back to a new chat session with chatgpt… would it be able to define its meaning and tell you how to execute it without much reasoning? If the answer is no, what you are saving in tokenization is costing more in processing and compute. This leads to less predictable outputs based on guessing what it might mean…

Now, sometimes optimization means you use more tokens than the original prompt. The key here is making sure that the consistency of output your needs demand justifies that additional use of tokens. If you are writing reddit posts, probably not. If you are parsing highly technical documents into structured data for research into new drugs to cure Ebola… pile on the tokens, we care about execution, right?

So, what is your purpose? I am not saying you’re wrong or don’t have any valid points; nothing like that, just throwing in some insight on a topic I enjoy. Message me anytime, glad to chat!

A few more bones since we are on the same wavelength… Just things to think about…

  • Potential for Over-Constraint

  • You strongly advocate “enclosing” tasks/outputs, labeling variables, and specifying optional fields (e.g., varI1, varI2, varR1, varS1).

  • Why It’s Negative: In more open-ended, exploratory tasks, forcing everything into an enclosed structure might reduce the model’s creativity or nuance. Users might lose valuable content if they strictly conform to these structures for every scenario.

  • Confusion Over Reasoning Visibility

  • You encourage explicitly capturing “Reasoning” (like varR1) to help see the chain-of-thought behind a solution.

  • Why It’s Negative: Many real-world deployments don’t want chain-of-thought fully exposed to end users (for IP, privacy, or clarity reasons). If you follow these guidelines verbatim, you risk oversharing or cluttering the final output with internal reasoning.

  • Token Overhead & Prompt Inflation

  • You suggest a multi-field approach, complete with optional placeholders, “GROUNDING” variables, and example blocks.

  • Why It’s Negative: All this structure can quickly inflate your prompt length, especially for multi-shot examples or large tasks. Hitting token limits or overshadowing the main content is easier if you’re not careful.

  • No Discussion of Multi-Turn / Conversation Flow

  • Much of what you said prescribes a single-block, self-contained approach—like “Better to keep the title out of the enclosed part” and “Output and Format.”

  • Why It’s Negative: Real usage often involves multi-turn Q&A. The text doesn’t address how to pause, ask clarifying questions mid-flow, or handle partial updates. If you assume everything happens in one shot, you could hamper tasks that need real back-and-forth.

  • Heavy Reliance on Rigid Variable Naming

  • You use varI1, varI2 for input, varR1 for reasoning, etc.

  • Why It’s Negative: This naming scheme might feel arbitrary or get confusing for large projects with more fields or domain-specific data. There’s also no direct mention of how to adapt these variable names if you have more or different fields to capture.

  • Limited Mention of Error / Edge Case Handling

  • You focus on systematically structuring inputs, but place less emphasis on how to handle ambiguous user input, missing data, or contradictory instructions.

  • Why It’s Negative: Without a clear fallback or error-handling mechanism, the LLM might produce incomplete or confusing results (like ignoring a variable when it’s missing or uncertain).

  • Implied One-Size-Fits-All Approach

  • Your examples and general statements read like the correct or “better” way to build prompts.

  • Why It’s Negative: Not every use case benefits from the same level of enclosure. Some tasks are simpler, some are more complex, and an overly formulaic approach might lead to suboptimal outcomes if the nature of the LLM query is highly flexible or creative.

It is stated in the first post.

In other words, I am sharing what I have been finding and distilling, and what works well for me.

As with everything, it must be taken with a ‘pinch of salt’, as it depends on where/how it is going to be used.

I appreciate your thoughts / comments.

I broadly agree with them.

Depending on the problem at hand, some strategies/approaches have to be brought in or left out, like not exposing the reasoning, or verifying that the output conforms to expectations (using response_format), etc.

My observations/findings are just that, not THE template for how prompts should be done.

Hi there!

Does this also work for prompting the Realtime API to follow the prompt better on outbound and inbound calls?

Do you have any tips?

Thanks!

Building the solution, stay tuned…