Promptrix: a library for building prompts

This is still a bit of a work in progress, but I thought I’d share a new prompt-building library I’m working on called Promptrix.

Promptrix is a prompt layout engine for Large Language Models. It approaches laying out a fixed-length prompt the same way a UI engine would approach laying out a fixed-width set of columns for a UI. Replace token with character and the exact same concepts and algorithms apply. Promptrix breaks a prompt into sections, and each section can be given a token budget that’s either a fixed number of tokens or proportional to the overall remaining tokens.

All prompt sections are potentially asynchronous and rendered in parallel. Fixed-length sections are rendered first, then proportional sections are rendered so they can divide up the remaining token budget proportionally. Sections can also be marked as optional and will be automatically dropped should the token budget get constrained.
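
Here’s a minimal sketch of that layout pass, assuming a simplified section shape; this is illustrative pseudocode, not Promptrix’s actual internals:

// Illustrative sketch only -- not Promptrix's implementation.
interface SectionSketch {
  tokens: number;      // >= 1.0 is a fixed token budget, < 1.0 is a proportion of what's left
  optional: boolean;   // optional sections get dropped when the budget runs out
  render(budget: number): Promise<string>;
}

async function layout(sections: SectionSketch[], maxTokens: number): Promise<string[]> {
  const fixed = sections.filter(s => s.tokens >= 1.0);
  const proportional = sections.filter(s => s.tokens < 1.0);

  // Fixed sections render first (in parallel) and reserve their budget off the top.
  const fixedOutput = await Promise.all(fixed.map(s => s.render(s.tokens)));
  const remaining = Math.max(0, maxTokens - fixed.reduce((sum, s) => sum + s.tokens, 0));

  // Proportional sections then split whatever is left; optional ones are
  // dropped if nothing remains. (Original section ordering is ignored here.)
  const kept = proportional.filter(s => remaining > 0 || !s.optional);
  const proportionalOutput = await Promise.all(
    kept.map(s => s.render(Math.floor(remaining * s.tokens)))
  );
  return [...fixedOutput, ...proportionalOutput];
}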

Promptrix also supports generating prompts for both Text Completion and Chat Completion style APIs. It will automatically convert from one prompt style to the other while maintaining accurate token counting.
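
As a rough illustration of the kind of conversion involved (the names and token counting here are simplified assumptions, not the library’s API):

interface ChatMessage { role: 'system' | 'user' | 'assistant'; content: string; }

// Flatten chat-style messages into a single text prompt. A real implementation
// would also count the tokens added by the role prefixes and separators so the
// text version stays within the same budget as the message version.
function toTextPrompt(messages: ChatMessage[]): string {
  return messages.map(m => `${m.role}: ${m.content}`).join('\n') + '\nassistant:';
}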

I have a number of examples in the readme of my GitHub repo, but here’s what a standard prompt might look like:

const prompt = new Prompt([
  new SystemMessage(`The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly.`, 500),
  new ConversationHistory('history', 1.0),
  new UserMessage(`{{$input}}`, 100)
]);

This prompt has a SystemMessage section, a ConversationHistory section, and a UserMessage section. The SystemMessage has a fixed token budget of 500 tokens and the UserMessage has a fixed token budget of 100 tokens. These sections will be rendered first, and any remaining token budget will be given to the ConversationHistory because it’s a stretch section with a span of 1.0 (100%).

Here’s another example that uses a custom PineconeMemory section to add semantic memories to the prompt:

const prompt = new Prompt([
  new PineconeMemory(<pinecone settings>, 0.8),
  new ConversationHistory('history', 0.2),
  new SystemMessage(`Answer the users question only if you can find it in the memory above.`, 100),
  new UserMessage(`{{$input}}`, 100)
]);

Again, the fixed sections will get rendered first but this time the PineconeMemory and ConversationHistory sections will share the remaining token budget with an 80/20 split…
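
For example, assuming a hypothetical 4,096-token context window: the two fixed sections reserve 200 tokens, leaving 3,896 to be split roughly 3,116 / 779 between the memory and history sections:

// Hypothetical numbers, just to show the math behind the 80/20 split.
const maxTokens = 4096;
const fixedTokens = 100 + 100;                      // SystemMessage + UserMessage budgets
const remaining = maxTokens - fixedTokens;          // 3896
const memoryBudget = Math.floor(remaining * 0.8);   // 3116 tokens for PineconeMemory
const historyBudget = Math.floor(remaining * 0.2);  // 779 tokens for ConversationHistory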

I have a bunch of the basics implemented and just need to bang out a few unit tests. Hope to have something published this weekend. I also think that my friend who ported Vectra to Python is planning to work up a Python port of Promptrix, so hopefully that will land soon as well.

I’m building an AI persona framework with prompt chaining and advanced concepts like ReAct, MRKL, and goal creation. I’m definitely going to steal some of your design patterns. I really like the level of abstraction you have going on.

Good stuff.

Thanks… I didn’t really showcase it, but a Prompt is itself a PromptSection so you can actually compose prompts within prompts… I don’t know how generally useful that is, but I always try to design my SDKs with composability in mind.

I’m planning to add a GroupSection class though which should be super useful… This will let you take a span of sections and give the overall group a token budget they have to share. The group sections will be rendered as text and then returned as a single outer message… This is basically how you could take a bunch of separate individual sections and render them as a single “system” message…

The ability for any section to be rendered either as text or an array of messages really lets you do some super powerful composability.
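
Conceptually, every section exposes both render paths, something along these lines (the method names are an assumption for illustration and may not match Promptrix’s exact interface):

// Conceptual shape of a section that can render either way -- illustrative only.
interface Message { role: string; content: string; }

interface SectionLike {
  renderAsText(maxTokens: number): Promise<string>;
  renderAsMessages(maxTokens: number): Promise<Message[]>;
}

// Because a Prompt can itself satisfy this shape, prompts nest inside prompts.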

I should also add that Promptrix was born from my frustration with constantly having to create two different code paths for rendering a prompt; one for Chat Completions and a separate one for Text Completions.

Next step is to create an OpenAI client that simply takes a PromptSection + config as input and I never have to think about two code paths again 🙂

I got all my unit tests written and Promptrix is published to NPM. For some reason the readme isn’t showing up on NPM (I opened a support issue) but you can find detailed usage instructions on my repo page.

I have 100% unit test coverage so it should be fairly close to production-ready.

UPDATE: the readme is showing up now. Must have been a caching issue.

I have a new “highly opinionated” client library I’m working on called AlphaWave that builds on Promptrix. An AlphaWave is a prompt that’s bound to an LLM and is self-repairing. You can pass a validator plugin into your AlphaWave, and should the response from the LLM fail validation, it’ll fork the conversation history and automatically attempt to repair the model’s output using a technique I call “feedback”.
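
As a rough idea of what a validator plugin could look like (a hypothetical shape, not AlphaWave’s actual interface):

// Hypothetical validator shape -- illustrative only.
interface Validation {
  valid: boolean;
  feedback?: string;   // sent back to the model on a forked history when validation fails
  value?: unknown;     // the parsed result when validation succeeds
}

interface ResponseValidator {
  validateResponse(response: string): Promise<Validation>;
}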

My goal for AlphaWave is to make it the most reliable LLM client on the planet. And by reliable, I mean that it only returns output that has passed validation, and it will do its best to auto-repair the LLM’s output should validation fail.

AlphaWave will also have Auto-GPT style Agent support using everything I’ve learned from my Self-INSTRUCT prototype. It’s a long weekend here in the States, so I hope to have something runnable by Monday 🙂

Nice. I did the “feedback” method a couple months back, but with no library. I just hardcoded it. Never thought to have a module or generic method built around it. Good stuff, I’ll definitely check it out.

Yeah, I was doing “feedback” without a library as well… And actually, one of the primary reasons I made AlphaWave was to have a cleaner mechanism for forking the conversation so I could give the model feedback in isolation from the main thread.

I’ve found that if you give the model feedback and it works, great, but if it doesn’t work the model will actually develop a complex of sorts that you’re never going to get it to shake. Forking the conversation keeps the main conversation history free of any hallucinations, and if for some reason feedback doesn’t work, you can simply try your call again; given the stochastic nature of the model, it will probably just work.

You actually need both Feedback and Retry logic though. Feedback can work to talk the model out of something it would otherwise always hallucinate. Retry compensates for the stochastic nature of the model: 9 times out of 10 it makes a good decision, but 1 time out of 10 it makes a bad one.
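
Put together, the control flow looks roughly like this standalone sketch (the complete and validate functions are assumed; this is not AlphaWave’s code):

// Simplified feedback + retry loop -- illustrative only.
type Msg = { role: string; content: string };

async function completeWithRepair(
  complete: (messages: Msg[]) => Promise<string>,
  validate: (response: string) => Promise<{ valid: boolean; feedback?: string }>,
  messages: Msg[],
  maxFeedback = 2,
  maxRetries = 2
): Promise<string> {
  for (let retry = 0; retry <= maxRetries; retry++) {
    // Fork the history so repair attempts never pollute the main conversation.
    const fork = [...messages];
    let response = await complete(fork);
    for (let attempt = 0; attempt <= maxFeedback; attempt++) {
      const result = await validate(response);
      if (result.valid) {
        return response;
      }
      if (attempt === maxFeedback) {
        break; // out of feedback attempts on this fork; fall through to a clean retry
      }
      // Feedback: tell the model what was wrong and let it try again on the fork.
      fork.push({ role: 'assistant', content: response });
      fork.push({ role: 'user', content: result.feedback ?? 'That response was invalid. Try again.' });
      response = await complete(fork);
    }
  }
  throw new Error('Response failed validation after all feedback and retry attempts.');
}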

I just published version 0.2.0 of Promptrix which has a couple of nice enhancements:

  • Sections can now be rendered using auto, fixed, or proportional sizing strategies. Auto is the default: the section is rendered as-is. Fixed sections can be capped and will have their content truncated if they run over, which is great for capping the length of user messages. Proportional sections will be dynamically rendered based on the remaining input token budget.
  • Prompts can now be composed hierarchically. You can nest prompts within prompts within prompts and they’ll all roll up to a single message array or text-based prompt.
  • New GroupSection lets you take any list of prompts or sections and render them to a single “system”, “user”, or “assistant” message.
  • Comments for most classes and methods, for better VS Code auto-completion.

These changes really improve the compositional abilities of Promptrix, and the tweaks to the sizing strategies let you tune how your prompt squeezes information into an LLM’s context window. You should easily be able to create prompts that never blow up because they’re too big.
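
Here’s a rough usage sketch of the new pieces, assuming GroupSection takes a list of sections followed by the target role (the exact constructor arguments may differ; check the readme on the repo):

// Illustrative only -- the GroupSection argument order here is an assumption.
const groupedPrompt = new Prompt([
  new GroupSection([
    new SystemMessage(`Answer using the conversation below.`, 100),
    new ConversationHistory('history', 0.8)
  ], 'system'),
  new UserMessage(`{{$input}}`, 100)
]);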