Tools for testing prompts

Hello everyone,

I have been working on prompts for some time now. We are building a summarization feature, and to reach the required accuracy we have tried multiple variations of the same prompt. However, the testing process has been cumbersome: we have been calling the API directly to test each variation and sharing the prompts via a text file on GitHub.

Is there any tool that can help us maintain the versions of the prompts we use, like a “CodePen for prompts”?
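For context, the ad-hoc loop described above usually looks something like this (a minimal Python sketch; `call_model` is a stub standing in for whatever API client you actually use, and the helper names are made up):

```python
import hashlib
import json

def call_model(prompt: str, text: str) -> str:
    """Placeholder for a real API call (e.g. a chat completion request)."""
    return f"summary of {len(text)} chars"

def run_variants(variants: dict[str, str], sample: str) -> dict:
    """Run every prompt variant against the same sample and record results."""
    results = {}
    for name, prompt in variants.items():
        results[name] = {
            # Hash the prompt so you can tell later exactly which text
            # produced a given result, even after the file is edited.
            "prompt_sha1": hashlib.sha1(prompt.encode()).hexdigest()[:8],
            "output": call_model(prompt, sample),
        }
    return results

variants = {
    "v1-terse": "Summarize the text in one sentence.",
    "v2-bullets": "Summarize the text as three bullet points.",
}
print(json.dumps(run_variants(variants, "some long document..."), indent=2))
```

Dumping the results dict to a JSON file next to the prompts at least makes the runs diffable in git, which is roughly what a dedicated tool would automate.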



I asked ChatGPT (GPT-4) about this:


Indeed, there are tools available that could help streamline your process for maintaining and versioning your prompts. Here are a few suggestions:

  1. Git and GitHub: You’re already using GitHub, which is an excellent start. Git is perfect for version control of text files. You could create a more organized repository structure for different prompt versions and use branches for developing new prompts before merging them into the main codebase. Combine this with a markdown file or use GitHub’s wiki feature for documentation to track changes over time.

  2. Google Colab: Google Colab is a great tool for this purpose, especially if you’re doing a lot of Python programming. You can save your prompts and the code you’re using to test them all in one place, and it’s easy to roll back to previous versions. Plus, Colab notebooks are easy to share and collaborate on.

  3. Jupyter Notebook: Similar to Google Colab but more local-based, Jupyter notebooks are great for prototyping and testing code. You can save your prompts as markdown cells and have your code in code cells.

  4. DVC (Data Version Control): This tool is designed specifically for versioning data and machine learning models, but it could also be used for your prompts. It integrates well with Git.

  5. Prompt Management Tool: If none of these solutions meet your needs, you might consider developing a custom “Prompt Management Tool”. This could be a simple web application that allows you to input, version, and test prompts. You might also add features for rating prompt performance or grouping prompts by project or theme.

Remember to keep the prompts and any data confidential and secure, especially when using public tools and repositories.

I hope you find these suggestions helpful for your project.


Here’s a tool that might be good for various prompts, but it really doesn’t have good code-like version control or forking abilities built in.

Pros:

- Open source
- Native application
- Cross-platform
- Can use a local DB, a client-server setup, or a shared DB
- Different note view types, with graphical relationships of trees and links
- Can launch documents in external apps

Cons:

- Poor WYSIWYG editing; WordPad beats it
- No plain folders (just give us some friggin’ folders); everything, even the program options, is a note
- Everything in the tree is a “note” of various types, not a folder
- Child objects get added to the parent note’s view as a sort of in-note panel
- Doesn’t use system fonts
- It’s going to take at least an hour to orient yourself and discover what’s going on in this thing

Bad for code:

- “Snapshots” are only taken automatically, so they’re only good for reverting
- You can clone/copy notes, but you have to build your own tree
- Barely better than saving file versions

Might be good for the “paste different things into ChatGPT” crowd.


Hey @pranavr, that’s exactly why I built Knit. It’s a management and development tool for prompt designers and teams, with all the functions you mentioned, and it’s free for everyone. Let me know if you need any help.


@pranavr You could also check out LangBear! It keeps versions of your prompts and tracks and tests their performance.


There are many prompt engineering IDEs available.


Check out Weave

It’s part of the Weights & Biases stable.

Hey, Promptotype should help with this.
I’m the creator so feel free to ask questions/ send feedback!


Try promptfoo: it’s a CLI and library for evaluating LLM output quality.

Have you seen Spellchain? It should be useful.

I’m just keeping my stuff organized in Obsidian for now, because it’s just markdown files in a physical folder structure under the hood. That means I can migrate to another tool later, use git to track my changes, and easily add my own scripting on top of what I have now.

I feel the state of the art is going to evolve quickly, so for the time being I’m staying as far away from tool/vendor lock-in as I can; that way I can adjust easily as things change.
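A scripting layer over a plain-markdown vault like that can stay tiny. For example (a sketch; the helper names are made up, and it only assumes the prompts live in `.md` files):

```python
from pathlib import Path

def index_prompts(vault: str) -> dict[str, str]:
    """Map each markdown file's vault-relative path to its contents."""
    root = Path(vault)
    return {
        str(p.relative_to(root)): p.read_text(encoding="utf-8")
        for p in root.rglob("*.md")
    }

def search(index: dict[str, str], needle: str) -> list[str]:
    """Return the files whose text mentions the term (case-insensitive)."""
    return sorted(
        path for path, text in index.items()
        if needle.lower() in text.lower()
    )
```

Because the vault is just files, `git log -p -- <path>` already gives you per-prompt history for free; the script only adds indexing and search on top.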

This application lets you evaluate how different large language models (LLMs) perform on your prompt.


Take a look at PROMPTMETHEUS. It’s a full-fledged IDE for prompt design and testing, including composability, versioning, statistics, and collaboration for teams. It also connects to all other major LLM providers for cross-platform testing. If there’s anything missing that you need, please let me know so that I can add it.

Check out Shiro, a dev platform for prompt engineering. You can test prompt variations against all the major LLM providers, pass variables into your prompts, and include complex logic like conditionals (if/else), for loops, and even operations like capitalize.

Once you’re ready to use a prompt in production, you can deploy it and access it via API.