Some simple (and obvious) best practices for the Prompt

I keep a few Presets in the Playground that I often use to ask the models about their own prompt preferences.

The answers are very instructive. Here is my humble summary of some of those many conversations.

Key point: always be clear, concise, unambiguous, and explicit.

Note: Some of the differences below are not fully visible unless you look at the markdown source behind the rendered text, so in each case I spell out what the hidden part is.


Summary (my opinion): every part that is fully clear, unambiguous, and explicit saves cognitive load in ‘ingesting the prompt’, which leaves more budget for ‘performing the task’. Another way to think of it: if you had to write a program to parse the prompt, how complex would it be, and how much time would you need? If the answer is ‘almost nothing’, you have a good prompt.



Enclosed better than non-enclosed

Better (the content after the label is enclosed in a code fence, so its boundaries are explicit):

Task:

blah blah blah
blah blah blah

Worse (the content is left bare, with nothing marking where it starts and ends):

Task:

blah blah blah
blah blah blah
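
Since the rendered view hides the fences, here is roughly what the markdown behind the two versions looks like (my reconstruction, not the literal text from the original post):

````markdown
<!-- Better: the content is enclosed in a fence -->
Task:
```
blah blah blah
blah blah blah
```

<!-- Worse: the content is not enclosed -->
Task:

blah blah blah
blah blah blah
````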



Keep the title out of the enclosed part

Better (the title stays outside the fence; only the content is enclosed):

Task:

blah blah blah
blah blah blah

Worse (the title is pulled inside the fence together with the content, so the markdown heading shows up as literal text):

#### Task:

blah blah blah
blah blah blah
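
Again, the reconstructed markdown makes the difference visible:

````markdown
<!-- Better: the title is outside, only the content is fenced -->
#### Task:
```
blah blah blah
blah blah blah
```

<!-- Worse: the title ends up inside the fence -->
```
#### Task:
blah blah blah
blah blah blah
```
````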


Explicit format better than unknown format (EVEN if it is markdown/text)

Better (each opening fence declares the format of its content, e.g. `text` or `yaml`):

Task:

blah blah blah
blah blah blah

Input:

blah blah blah
blah blah blah

Worse (bare fences; the format of the enclosed content is left for the model to guess):

Task:

blah blah blah
blah blah blah

Input:

blah blah blah
blah blah blah
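
The reconstructed markdown for this case:

````markdown
<!-- Better: the format of each block is explicit -->
Task:
```text
blah blah blah
```

Input:
```text
blah blah blah
```

<!-- Worse: the blocks are fenced, but the format is unspecified -->
Task:
```
blah blah blah
```

Input:
```
blah blah blah
```
````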

There is an additional recipe I have been refining over time.


Output and Format

(you can’t see it in the rendered text, but the block below is enclosed in a `yaml` code fence)

Output:
  varI1: str  # Verbatim from the Input
  varI2: str  # Verbatim from the Input
  [varS0: bool  # Optional. A step towards the solution. Provide reasoning for it as well.]
  varS1: bool  # The final solution you want (let's call it the Solution).
  varR1: str  # Reasoning behind the choice of varS1.

Explanations:

The variable names must be clear, concise, and relevant to the task. This is essential. You can iterate on this by generating variations and scoring them across metrics such as clarity, conciseness, and task relevance.

GROUNDING

  • varI1 and varI2: These represent the “GROUNDING” or “ANCHORING” variables. While optional, they help the LLM reconnect with the Input, especially in scenarios with multiple examples (e.g., 5, 25, or more). These variables ensure the Output remains focused on the relevant Input. Use only as many as necessary—keep them minimal.

SOLUTION

  • varS1: The final solution you want. This can be a boolean, string, or any format needed for the task.

  • varS0: Optional. Use this for intermediate responses that bring the LLM closer to generating a better varS1.

Example:

second_person_addressed: bool  # Is a second person (singular or plural) being addressed directly or indirectly?
T_V_applicable: bool           # Does the translation allow for formal/informal register distinctions?

Here, formality usually arises when “you” (directly or indirectly) appears in the phrase. Answering the first question (second_person_addressed) helps the LLM get closer to answering the second (T_V_applicable).

REASONING (CoT: Chain-of-Thought)

  • varR1: This forces the LLM to explain its reasoning for the solution (varS1). Reading this reasoning is critical for identifying potential weaknesses in your prompt.

If the reasoning is sound but you disagree with the solution, then the problem lies in your prompt, not the LLM. This variable encourages the LLM to justify its choices, which leads to a more thoughtful response and helps it “chew” the solution thoroughly before “spitting it out.”
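
To make the recipe concrete, here is a hypothetical filled-in Output for the translation-formality example above (the variable names and values are illustrative; they are not from the original prompt):

```yaml
Output:
  source_phrase: "Could you send me the report tomorrow?"  # GROUNDING: verbatim from the Input
  second_person_addressed: true  # varS0: the reader is addressed directly ("you")
  T_V_applicable: true           # varS1: the translation must choose a formal/informal register
  reasoning: "The phrase addresses the reader directly, so a French translation must pick between 'tu' and 'vous'; a business context points to 'vous'."  # varR1
```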


Examples in Your Prompt (k-shot / n-shot method)

Theoretical Approach

In theory, examples in your prompt should closely match the final Input and Output. However, this can lead to certain issues.

Practical Considerations

1. Long or Repeated Input Parameters

If one parameter in the Input is excessively long and repeated across every example, it can overshadow the rest of the content. This dilutes the relevance and clarity of the example.

2. Reasoning in Examples

While reasoning is useful for generating outputs, including it in the examples can cause unnecessary distraction and increase cognitive load, which may reduce the overall effectiveness of the prompt. The same goes for the GROUNDING parameters: they are useless once they have played their role.

Adjusted Approach

To address these issues, I’ve started omitting GROUNDING and REASONING variables from the examples, focusing solely on the SOLUTION. Consider the problem as A-B-C: show A and C, while leaving out B if it doesn’t serve as an optimal intermediate step.
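
In practice this means the final Output spec keeps all the variables, while each k-shot example shows only the SOLUTION part. A hypothetical sketch (the names reuse the translation example above and are illustrative only):

```yaml
# Output spec -- what the model must produce for the real Input:
Output:
  source_phrase: str     # GROUNDING
  T_V_applicable: bool   # SOLUTION
  reasoning: str         # REASONING

# Example 1 (k-shot) -- only the SOLUTION is shown:
Input:
  phrase: "Bring me the keys."
Output:
  T_V_applicable: true
```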

Handling Large or Irrelevant Input Parameters

For Input parameters that are irrelevant or too large for the examples, you can replace them with placeholders such as None or null. This maintains the structure of the Input while simplifying the example, making it easier to relate the examples to the real Input.


Example

Original Example:

Input:

paragraph: str
q1: str
q2: str

Output:

a1: str
a2: str

If the paragraph is too large, the relationship between q1/q2 and a1/a2 becomes diluted.


Revised Example:

Input:

paragraph: None
summary: str # Here you provide a short text that acts as a replacement for the long `paragraph`.
q1: str
q2: str

Output:

a1: str
a2: str

This adjustment preserves the Input’s structure while avoiding the distraction of overly lengthy parameters.


Key Principle

It’s crucial that all Inputs share the same structure, even if some variables are replaced with None, null, or NaN. This approach is common in machine learning (ML) and deep learning (DL) systems and ensures consistency while reducing noise.

I know the final Output has a few more parameters than the Examples do, but what you care about is HOW the variables that matter are generated, and that is what the Examples are there for: to show that Output = function(Input), whatever that function is.
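
A hypothetical end-to-end layout under this principle, reusing the field names from the example above (the content is illustrative, not from the original post):

```yaml
# Example 1: the long field is nulled out; a short stand-in carries the meaning
Input:
  paragraph: null
  summary: "A short paragraph about the 2019 Paris heatwave."
  q1: "In which city did the event take place?"
  q2: "In which year?"
Output:
  a1: "Paris"
  a2: "2019"

# Real Input: same structure, now the long field holds the real content
Input:
  paragraph: "<the full source text goes here>"
  summary: null
  q1: "..."
  q2: "..."
```

Every Input, example or real, has the same four fields; only the values change, which is exactly the consistency the ML/DL analogy points at.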