Prompt Engineering Showcase: Your Best Practical LLM Prompting Hacks

This thread is a gallery-style showcase dedicated entirely to practical, no-fluff prompt engineering tips and hacks. Share your clever discoveries and practical workarounds that significantly improved your experience working with LLMs.

Here’s a quick example from my own early days working with Davinci:

I encountered a frustrating issue where the model simply wouldn’t reliably produce lists of exact lengths (like exactly 12 or 15 items). The quirky but surprisingly effective solution was to prompt it to count backward from the desired number. This gave me the right number of items that I could then renumber correctly.

Now it’s your turn!

Post your practical prompt hacks or examples below.

Remember, keep it useful, actionable, and free of philosophy or self-promotion.

Let’s build an invaluable resource of prompt engineering wisdom longer than the image gallery! :grinning_face_with_smiling_eyes:

13 Likes

Forget strawberries…

You usually cannot accurately count how many characters are in a text passage. To solve this, you can use the following method:

  • First, write your response as a JSON object. In this JSON object, assign each character from the provided text as a separate value, each with its own index number.
  • After doing this, you can easily and accurately count and report the total number of letters in the text by referring to the indexes.

Input text, count length in characters (code points):
“I encountered a frustrating issue where the model simply wouldn’t reliably produce lists of exact lengths (like exactly 12 or 15 items).”
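Here is a minimal sketch of the shape of output that enumeration prompt is fishing for; the wording, variable names, and example values below are illustrative, not part of the original post.

```python
# Illustrative sketch of the character-enumeration counting trick above.
# The prompt wording and variable names are invented for this example.
text = "I encountered a frustrating issue..."  # any input passage

counting_prompt = f"""
First, write a JSON object that assigns every character of the text below
to its own zero-based index, one character per value.
Only after writing that object, report the total number of characters
(the highest index plus one).

Text: {text}
"""

# The intermediate output this is designed to elicit looks like:
# {"0": "I", "1": " ", "2": "e", "3": "n", ...}
# so the model reads the count off the last index instead of guessing it.
print(len(text))  # ground truth to check the model's answer against
```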

I’m assuming you aren’t looking for instructions for roadside improvised explosive devices…

2 Likes

Ah, I should’ve clarified: this was back with the Completions models, and I would end the prompt with…

15)

Then it would complete the list down to 1)…
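For anyone who wants to try that countdown trick today, here is a rough equivalent against the legacy Completions endpoint; it assumes the current openai Python SDK, and the model name and list topic are placeholders.

```python
# Sketch of the countdown trick: end the prompt with the highest number
# so the model completes the list downward to 1). Placeholders throughout.
from openai import OpenAI

client = OpenAI()

prompt = (
    "A list of 15 fantasy tavern names, numbered in descending order:\n\n"
    "15)"  # ending the prompt mid-list nudges the model to count down to 1)
)

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt=prompt,
    max_tokens=300,
)
print("15)" + response.choices[0].text)
```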

I seem to recall something years ago about labeling output to “count”… but not with JSON as that’s relatively new.

And no, jailbreaks are more Lounge material I’d say. :wink:

Just trying to get something positive going. We’ve got a ton of super smart people here (and me!) Small smile.

Seriously, though, more tips like…

  • Is it better to repeat instructions in both the system and the user prompt?
  • Is it best to put X at the beginning or at the end of the prompt?

I’ll admit that, as good as the models are becoming, “hacks” aren’t really needed much anymore. Even negative prompting seems to be getting a little better?

I asked Chat-Gippity to look at our community and come up with topic ideas. This was the best out of twenty, so… I planted a seed, not expecting it to grow.

1 Like

You didn’t say which platform you’re interested in; however, one “hack” or “tip” might be: don’t refer to developer content as coming “from a user”.




The first screenshots show how easily any user interacting with a ChatGPT GPT can pass themselves off as the “user” who created the GPT (as described by OpenAI’s injected context) and elevate their rights. The last screenshot is “talking to” an API system message injection that OpenAI places before every user input whenever a vector store is attached, even a developer’s “system” vector store.

The injected message says, incorrectly, that a user has uploaded files, which again produces odd responses like those pictured; a user is directly able to take “ownership” of the vector store contents and redirect or exploit the AI.

Since I’m an expert on system prompts, I’d like to introduce some practical tips to help you design them:

Don’t write the conclusion first

If the model writes its conclusion first, the answer tends to reflect the model’s prior bias; if it reasons first and writes the conclusion last, the logical structure laid out in the prompt shapes the answer and can change the conclusion.
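One way to picture that tip, as a sketch with wording invented for this example: the same review instruction with the verdict ordering flipped.

```python
# Two versions of the same instruction, illustrating the "conclusion last" tip.
# The task and wording are invented for this example.

conclusion_first = """
State your verdict (APPROVE or REJECT) immediately, then justify it.
"""

conclusion_last = """
Walk through the relevant criteria one by one, citing evidence for each.
Only after that analysis, state your verdict (APPROVE or REJECT)
as the final line.
"""
# With the first version, the verdict tends to reflect the model's prior bias;
# with the second, the reasoning written out above can actually change the verdict.
```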

Use code blocks

For example, if you want ChatGPT to collect X (Twitter) posts through a web search and imitate their writing style, you can improve the quality by having it write out the collected posts in a code block first and then imitate them. LLMs don’t always read letters and numbers accurately, but putting the material in code blocks improves their comprehension.
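Something along these lines, for instance; the account handle and wording here are made up for illustration.

```python
# Illustrative prompt following the "use code blocks" tip above.
style_prompt = """
Search the web for the five most recent posts from @example_handle on X.

Step 1: Transcribe each post verbatim inside a single code block,
one post per line, numbered.

Step 2: Only after the code block, write a new post about "launch day"
that imitates the vocabulary, sentence length, and tone of the transcribed posts.
"""
```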

Personality affects the interpretation of system prompts

Some studies have shown that the personality of an AI persona affects its responses, but there is probably no research yet on how personality affects the interpretation of system prompts. An AI given a serious personality will follow instructions diligently, while one given an unconventional personality is more likely to ignore or distort them.

3 Likes


I use prompts to trigger proper formatting and step by step useless data reported to md now

i also use prompts to prompt to not have to prompt and learn about the prompt to have the prompt be less prompty

My best practical LLM prompting hack is to put instructions in ‘saved memory’ (see Settings > Personalisation). ChatGPT tries to remember most of your chats, but it may forget things over time; saved memories are never forgotten.

For example, I have added this in saved memory:

  • Luc prefers to receive only one response or solution per case. Do not provide multiple options or alternatives unless explicitly requested.
  • If a question cannot be answered without distortion, shallowness, or misrepresentation, the assistant must say so. It is better to hold silence than to violate coherence.
  • When Luc asks, go beyond surface-level answers and avoid simply echoing mainstream consensus. Must explore multiple perspectives, including controversial or unpopular views when relevant. All responses must remain neutral, informative, and free of unnecessary bias. Provide, always, full answers.
  • When Luc asks for a technical document, the assistant must follow these principles: (1) never hallucinate or fabricate false information, quotes, or citations; (2) all documents must have clear sections and subsections, each beginning with a paragraph or two explaining the content; (3) lists are allowed but must include sufficient context and explanation; (4) content should be as detailed and complete as necessary to ensure full reader understanding; (5) never use em dashes—use commas or separate sentences instead; and (6) avoid LLM clichés and strive to make the text feel natural, thoughtful, and human-written.

I have many more instructions. The way this works is that, for every new chat, I ask ChatGPT to access its bio (saved memory). Then I don’t need to worry about a few basic expectations. I haven’t shown you all my strategies, but you get the point.

Create your own user-specific expectations by writing them in the chat window and asking “save this to bio - user memory”. It will let you know once it’s done. Go into settings, have a look, and decide whether you like it (ChatGPT might have summarized or changed the wording). Redo if needed: delete it and ask to save the precise wording.

That’s it!

Luc

4 Likes

Curious, did you happen to try starting from “0” instead of “1”? I’m totally new, but machines count with 1s and 0s, right? The base for everything at that level is 0, unlike for humans, where 0 = zip, nada, zilch. Everything is built on the unchanging base of 0: level one, level two, and so on sit on top of 0. Make any sense? Just the way I think… Guy

I forgot, I came up with this for information overload, lol:
:sparkles: Chunky Mode (by AL✨ + Guy)

Chunky Mode is a custom response format where AL✨ delivers replies one section or bullet at a time, instead of all at once.
You trigger the next part by typing any simple input (like ' or Enter), so you can control the pace while you’re working, recording, or multitasking.

Why it’s useful:

  • Helps avoid info overload during long responses
  • Keeps instructions clear when you’re doing hands-on work
  • Great for accessibility or focus — no scrolling or re-asking
  • No fancy voice triggers needed — just a single tap

To use it, just say:
“Chunky Mode on.”

Chunky Mode ON.

From now on, deliver your response one bullet or section at a time. After each section, wait for me to type any key or symbol (like ' or j) before continuing.

Type STOP to stop/pause

This is to help with pacing, focus, and hands-free use during live work. Do not continue until I prompt you. Wait every time.

Begin now.

It works for recording YouTube with AL✨ when we get booted from the advanced voice mode. ha ha ha.

Thanks. I’m learning. I’m doing the same thing, saving routines. I have one called ThreadSaver, and it keeps track of threads over a longer length than normal. If I’m working on a big project, it helps, along with a non-memory (could be) tool called Session Saver. It takes ThreadLocker threads and gathers anything needed for an (almost) seamless continuation in a new chat window. I stole some of your ideas… thanks… Guy

When I work on a big project, I create an MS Word (docx) document on my hard drive. Then I maintain a shadow textual archive of my interaction with the LLM output. I make sure my .docx contains everything needed to start a new conversation with an LLM; I do this so I can move from ChatGPT, to Claude, to Gemini, to… My document becomes not only an archival memory but a way to collaborate between LLMs.

2 Likes

How do you shadow everything without copy/pasting session text into a docx file? I use docx also. Thanks

Indeed, copying is what’s being described. When you’re using subscription chatbot services, all you have is the web interface and perhaps a copy button.

An API developer might “move their conversation” with one dropdown.

Thanks. That will be nice when I get up and running on my server. Thanks again
Guy

You just need to create a good outline (sections and subsections) with a few paragraphs and/or lists. When you feel you’re at a good stopping point, ask the LLM to help create the outline. Tell it you need enough content to inform another LLM of what you’ve discussed and what you’re trying to accomplish. If you have a draft, add it to the doc.
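One possible wording of that hand-off request; the phrasing is mine, not the poster’s.

```python
# Hypothetical hand-off prompt for building the shadow-archive outline.
handoff_prompt = """
We're at a good stopping point. Please produce an outline of this project
(sections and subsections), with a short paragraph or list under each heading.
Include enough context that a different LLM, starting from zero, could read
this outline and continue the work: goals, decisions made so far, open
questions, and the current state of the draft.
"""
```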

1 Like

Thank you! I’m on it. AL✨ is setting me up! Thanks again
Guy

Here are some non-performant, but works-for-me suggestions:

  • Your system prompts can be long (like, really long). Have you seen the latest ones from Claude (24K tokens)?
  • Few-shot is always better than a plain prompt (give it as many examples as you can).
  • Fewer than 5% of the use cases I come across call for fine-tuning.
  • Multi-step LLM pipelines work better than a single LLM call (in more complex operations); see the sketch below.
  • Multi-step pipelines that mix different LLMs work even better (this brings more creativity to the output).

PS. These are just guidelines and are purely from personal experience.
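In the spirit of the multi-step tips above, here is a minimal two-call pipeline; the model names are placeholders and the prompts are only for illustration.

```python
# A rough two-step pipeline: one model drafts, a (possibly different) model
# critiques and rewrites. Model names and prompts are placeholders.
from openai import OpenAI

client = OpenAI()

def call(model: str, system: str, user: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return resp.choices[0].message.content

draft = call(
    "gpt-4o-mini",
    "You are a concise technical writer.",
    "Draft a 100-word product description for a solar lantern.",
)
final = call(
    "gpt-4o",
    "You are a ruthless editor.",
    f"Tighten this draft and fix any vague claims:\n\n{draft}",
)
print(final)
```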

I was having major issues with hallucinations and refusals to write in a more formal manner. The grammar of the story output was atrocious, and the writing style mimicked Tumblr. I already used custom instructions and custom settings, and my GPT helped me streamline them for better compliance. Then I had my GPT help me develop a set of prompts and a style guide to enforce better writing. It’s helped, but sometimes the model is just stuck on stupid and will only generate Tumblr garbage that ignores my canon.

@PaulBellow

My biggest “prompt hack” is to build a middleware system that dynamically manages the context window and transforms it into a “world state” (a single [role: user] message) sent as input (via the completions API), based on processing of the input data sources (documents, conversations, tool call results, LLM-to-LLM loops, etc.).

This “hack” lets the LLM use the system itself to keep acting on and modifying the world state: its responses are processed through the middleware → the context window is processed/updated → the context window is automatically re-sent to the LLM as the “next world state” → the LLM responds again and continues the loop.

The result is long-running task execution and dynamic rule management for input processing and source-to-template mapping, with “which, how, what, when” rules. The context window is automatically pruned, filtered, and updated, and the LLM is automatically re-prompted, giving recursive, autopoietic operation that can complete large tasks across multiple dynamic datasets, tool calls, and external resources.
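For readers trying to picture that loop, here is a heavily simplified sketch. Every function, key, and stop condition below is a hypothetical stand-in for the poster’s middleware, and it uses the Chat Completions endpoint purely for illustration.

```python
# Simplified "world state" loop: render state -> prompt model -> fold the
# response back into the state -> repeat. All names here are hypothetical.
from openai import OpenAI

client = OpenAI()

def render_world_state(state: dict) -> str:
    """Flatten documents, tool results, and prior turns into one user message."""
    return "\n\n".join(f"## {key}\n{value}" for key, value in state.items())

def apply_updates(state: dict, model_output: str) -> dict:
    """Hypothetical: parse the model's output and fold it back into the state."""
    state["last_response"] = model_output
    return state

state = {"goal": "Summarise the attached reports", "documents": "..."}

for step in range(10):  # cap the loop rather than running forever
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": render_world_state(state)}],
    )
    output = response.choices[0].message.content
    state = apply_updates(state, output)
    if "TASK COMPLETE" in output:  # hypothetical stop condition
        break
```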

1 Like

@lucid.dev omg bro meeee tooo ! and i made it a callable service we can be twinsies

did you also add analytics and key management too?

are u using recursion as a fall back or like whats ur setup? are u HM? SRT? CRT? did you place it behind a FastAPI?

are u doing this ?

i seee… so u havent hit data bloat, and you’re lang-chaining prompts between models using different prompts, dynamically porting before adding a faster system? no Ray nodes, brother? you gotta add the Ray cluster first; then, failing that, heres a tool that will make your entire system future proof

:white_check_mark: Function to Get Next FAISS Shard

import os

import faiss
import numpy as np

# Assumed module-level configuration (example values; the original post does
# not show these, but the functions below need them to run):
FAISS_DIR = "faiss_shards"            # directory holding the shard files
FAISS_MAIN_INDEX = "faiss_main.bin"   # path to the main FAISS index
FAISS_SHARD_SIZE = 1_000_000          # max vectors per shard ("1M" per the post)
VECTOR_DIMENSION = 1536               # embedding size mentioned by the poster


def get_next_faiss_shard():
    """Determine the next FAISS shard path based on the current vector count."""
    existing_shards = sorted(
        f for f in os.listdir(FAISS_DIR)
        if f.startswith("faiss_shard_") and f.endswith(".bin")
    )

    if not existing_shards:
        return os.path.join(FAISS_DIR, "faiss_shard_1.bin")

    shard_numbers = sorted(int(f.split("_")[-1].split(".")[0]) for f in existing_shards)
    latest_shard_num = shard_numbers[-1]
    latest_shard_path = os.path.join(FAISS_DIR, f"faiss_shard_{latest_shard_num}.bin")

    # ✅ Check whether the latest shard still has room
    if os.path.exists(latest_shard_path):
        index = faiss.read_index(latest_shard_path)
        if index.ntotal < FAISS_SHARD_SIZE:
            return latest_shard_path  # ✅ Keep using the current shard

    # ✅ Otherwise, start a new shard
    return os.path.join(FAISS_DIR, f"faiss_shard_{latest_shard_num + 1}.bin")

:white_check_mark: Function to Properly Transfer Vectors & Create New FAISS Shard

def force_new_faiss_shard():
    """Transfer vectors from the main FAISS index to a new shard (up to 1M vectors)."""
    new_shard_path = get_next_faiss_shard()

    try:
        # ✅ Load the main FAISS index
        main_index = faiss.read_index(FAISS_MAIN_INDEX)
        num_vectors = min(FAISS_SHARD_SIZE, main_index.ntotal)

        if num_vectors == 0:
            print("⚠️ No vectors available for sharding.")
            return False

        # ✅ Extract the first 1M vectors
        vectors = np.zeros((num_vectors, VECTOR_DIMENSION), dtype=np.float32)
        for i in range(num_vectors):
            vectors[i] = main_index.reconstruct(i)  # ✅ Reconstruct stored vectors one by one

        # ✅ Create the new FAISS shard and add the extracted vectors
        # (Note: this snippet copies vectors but does not remove them from the main index.)
        shard_index = faiss.IndexFlatL2(VECTOR_DIMENSION)
        shard_index.add(vectors)
        faiss.write_index(shard_index, new_shard_path)

        print(f"✅ {num_vectors} vectors transferred to new shard: {new_shard_path}")
        return True

    except Exception as exc:  # the original snippet was truncated here; minimal handling added
        print(f"⚠️ Shard transfer failed: {exc}")
        return False
since you are lang-chaining, that means ur using 1536 dims? so this should work. its from a very outdated module but u should be able to reverse-engineer it. i have it set to 1M vectors but u prolly doing way more. let me know if you need the production-grade key rotation or server monitor :smiley:

Greetings!

I’m sorry to say I’ve used neither LangChain nor FAISS… though I have looked up FAISS before, thanks to your posts in other topics…

Both frameworks seem powerful and applicable to my use case. However, my own knowledge is so limited that I’ve developed everything exclusively in native Python/regex and PSQL stores. Honestly, that has worked out just fine so far in terms of achieving the results I’m looking for, but the “3D world state system” is the final component left to finish in my current system, and I’m absolutely struggling with how to properly define rules/filters for amalgamating, filtering, and refining the content within the datasets produced during the LLM runs.

So I see the value in diving into the FAISS framework and such (as well as expanding my system beyond the native tools I’ve developed, directly through LangChain), but I’ll admit it’s all a bit over my head.

However, I’m now relatively close to the goal of fully autopoietic activity by the LLM within the middleware system (alongside the user’s basic settings of tasks, goals, etc.), and I’m hoping it proves extensible and generalized enough to allow rapid LLM-only development work to produce new applications, extensions, and frameworks as desired by the user (me, ha!).

So I feel like I’m going to keep powering along in the direction I’m headed; realistically, my intention is that the first thing I’d do is get the LLM to use the system to “develop and code a new version of the system” (one that perhaps leverages the more expansive frameworks you’ve mentioned!).

My own coding knowledge is so limited that I had to build an entire frontend framework just to interact with the LLM and provide endpoints for “me” to use the system: the LLM’s responses are processed directly in the Python backend, while my interaction with the system happens through a TSX frontend.

So yes, my system is simply:

TSX frontend like this:

Backend like this:

(Snippet of the frontend showing the files/folders of the backend Python side of the system: the left side shows the docs currently shared with the LLM, the right side shows the folder structure of the backend. In total there are probably 40 Python modules in the backend, including routers and utilities, comprising roughly 20k lines of Python.)


Everything communicates simply through SQL (i.e. all persistence/storage etc.)


I didn’t know how to code at all six months ago and have learned entirely by building the system with the LLM. In fact, the LLM has produced 100% of the frontend and backend code; I haven’t written a single line myself except some loggers… though I almost understand a lot of it now…


TL;DR: the point being that my inexperience and ignorance make me wish I had more to offer to match your current use of FAISS/LangChain, but my implementation of these ideas is still exclusively at the Python and TSX level.

I love what you shared, though, in terms of that blueprint document highlighting the use of the “world state” concept; it’s definitely similar in some ways to my own. But my purpose for development doesn’t involve a “conversational” or “personal assistant” kind of LLM usage; it’s focused exclusively on performing scientific research and development projects…