AlphaWave: A state-of-the-art library for talking to LLMs

Sharing my new library, called AlphaWave, which is a highly opinionated library for talking to Large Language Models, like GPT. Calling it state-of-the-art could be a stretch, but I know a lot about talking to these models (I design SDKs for a living), and this is the most capable SDK I’m aware of or able to design at this point.

The library is published to NPM here and there’s a node.js sample you can run here. The sample is basically a terminal version of ChatGPT; the point is less the sample itself and more what this simple library is capable of. Here’s the quick feature breakdown:

  • Supports calling OpenAI and Azure OpenAI hosted models out of the box, but a simple plugin model lets you extend AlphaWave to support any LLM.
  • Promptrix integration means that all prompts are universal and work with either Chat Completion or Text Completion APIs.
  • Automatic history management. AlphaWave manages a prompt’s conversation history and all you have to do is tell it where to store it. It uses an in-memory store by default, but a simple plugin interface (provided by Promptrix) lets you store short-term memory, like conversation history, anywhere (see the sketch after this list).
  • State-of-the-art response repair logic. AlphaWave lets you provide an optional “response validator” plugin which it will use to validate every response returned from an LLM. Should a response fail validation, AlphaWave will automatically try to get the model to correct its mistake.
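
To give a feel for the memory plugin mentioned above, here’s a rough sketch of a custom short-term memory store. This is illustrative only: the method names mirror a typical key/value memory interface and are my assumption, so check the Promptrix repo for the exact contract.

class CustomMemoryStore {
    // Illustrative only: shaped like the default in-memory store. Method names
    // are assumptions; verify against Promptrix's memory interface before use.
    private readonly cache = new Map<string, any>();

    public has(key: string): boolean {
        return this.cache.has(key);
    }

    public get<TValue = any>(key: string): TValue {
        return this.cache.get(key) as TValue;
    }

    public set<TValue = any>(key: string, value: TValue): void {
        this.cache.set(key, value);
        // ...also persist the value to Redis, a database, etc. here
    }

    public delete(key: string): void {
        this.cache.delete(key);
    }

    public clear(): void {
        this.cache.clear();
    }
}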

It’s the response repair logic that’s the key bit, and I’d encourage you to read the write-up on the repo for the details of how I get the model to correct its output. In a nutshell: I first detect that the model made a mistake via validation, then I isolate the error by forking the conversation history, and then I attempt to get the model to correct its mistake via “feedback”.
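
To make that concrete, here’s a rough sketch of the kind of check a validator performs and the feedback it hands back. The actual plugin interface is documented in the repo, so treat the shape below as illustrative rather than AlphaWave’s API.

// Illustrative only: not AlphaWave's actual validator interface. The core idea
// is to inspect the model's reply and, on failure, return feedback text that
// AlphaWave can send back to the model on the forked history.
function validateBulletList(responseText: string): { valid: boolean; feedback?: string } {
    const lines = responseText
        .split('\n')
        .map(line => line.trim())
        .filter(line => line.length > 0);
    if (!lines.every(line => line.startsWith('- '))) {
        return {
            valid: false,
            feedback: 'Your reply must be a bulleted list. Respond with only lines that start with "- ".'
        };
    }
    return { valid: true };
}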

For those who just want to see code, here’s the setup logic for defining a simple ChatGPT-style wave:

import { OpenAIClient, AlphaWave } from "alphawave";
import { Prompt, SystemMessage, ConversationHistory, UserMessage, Message } from "promptrix";

// Create an OpenAI or AzureOpenAI client
const client = new OpenAIClient({
    apiKey: process.env.OpenAIKey!
});

// Create a wave
const wave = new AlphaWave({
    client,
    prompt: new Prompt([
        new SystemMessage('You are an AI assistant that is friendly, kind, and helpful', 50),
        new ConversationHistory('history', 1.0),
        new UserMessage('{{$input}}', 450)
    ]),
    prompt_options: {
        completion_type: 'chat',
        model: 'gpt-3.5-turbo',
        temperature: 0.9,
        max_input_tokens: 2000,
        max_tokens: 1000,
    }
});

Then all you have to do is call a single completePrompt() method with the user’s input and process the wave’s output:

// Route the user's message to the wave
const result = await wave.completePrompt(input);
switch (result.status) {
    case 'success':
        console.log((result.response as Message).content);
        break;
    default:
        if (result.response) {
            console.log(`${result.status}: ${result.response}`);
        } else {
            console.log(`A result status of '${result.status}' was returned.`);
        }
        break;
}

Here’s a screenshot of the sample output:

I’m about to start coding an AutoGPT-style Agent Framework for AlphaWave that actually works. Expect that to drop tomorrow :slight_smile:

5 Likes

I think you should refrain from the use of the term state-of-the-art.

It’s not really something you get to anoint yourself with. You first need to establish that it is objectively the best at something measurable.

That’s fair… As I said, it could be a stretch, and likely short-lived… With that said, it’s a very capable SDK.

On the measurable side, I’ve used the approaches leveraged in this SDK to make over 500 sequential model calls without hitting a hallucination I couldn’t walk the model back from using “feedback”.

1 Like

Once I get the Agent Framework implemented tomorrow, I’ll have some concrete demos that you can run to see the model hallucinate and then walk itself back from that hallucination. I have a Macbeth demo involving 16 agents that can act out any scene from the play. There’s a super reliable hallucination I’ll leave in the demo to show how you can use feedback to walk the model back from it.

4 Likes

I’ll be checking this out tomorrow. Sounds like a huge aid in my development work!

4 Likes

Thanks Bruce… Let me know if you need any advice. It should be pretty straightforward, but just DM me if you get stuck. I’m working on my Agent stuff so I’ll be online all day. TIP: you can add logPrompt: true to your client options to get a console dump of the model calls.
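
For reference, that tip just means flipping a flag in the client options from the setup snippet above:

import { OpenAIClient } from "alphawave";

// Same client setup as the earlier example, with prompt logging turned on
const client = new OpenAIClient({
    apiKey: process.env.OpenAIKey!,
    logPrompt: true
});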

I think the Agent bits are just going to be a basic implementation of my Self-INSTRUCT prototype. That’s what I can bang out in a day to let the community start playing with it. I have a bunch of thoughts for a more advanced Agent architecture, but it will take me longer to build, so know that this is just a checkpoint release that should still be more capable than other architectures like AutoGPT or BabyAGI.

As a bit of a preview, my next-gen Agent system will likely be based on the dialog system I designed for Bot Builder. Bot Builder Dialogs are essentially the Photoshop of Conversation Management Systems. They’re highly scalable and support advanced features like interruptions and orchestration. You can think of Agents as self-contained Dialogs that auto-orchestrate their calls to other Dialogs/Agents. There’s a feature that I originally designed for Cortana called consultation that’s particularly interesting in the context of Agents. Consultation will allow Agents to discuss things with other Agents, which should be fun :slight_smile:

1 Like

I published version 0.2.0 of AlphaWave today which includes a new JSONResponseValidator class. You can pass this validator into an AlphaWave and not only are you guaranteed to always get JSON back from the LLM, but if you give it a JSON schema you will only ever get JSON that matches your schema back from the model:

// JSONResponseValidator ships with AlphaWave 0.2.0; the Schema type comes from
// the JSON schema package AlphaWave uses (check the repo for the exact import).
import { JSONResponseValidator } from "alphawave";

interface AgentThoughts {
    thoughts: {
        thought: string;
        reasoning: string;
        plan: string;
    };
    command: {
        name: string;
        input: Record<string, any>;
    }
}

const agentThoughtsSchema: Schema = {
    type: "object",
    title: "AgentThoughts",
    description: "thoughts about the current state of the agent",
    properties: {
        thoughts: {
            type: "object",
            properties: {
                thought: {
                    type: "string"
                },
                reasoning: {
                    type: "string"
                },
                plan: {
                    type: "string"
                }
            },
            required: ["thought", "reasoning", "plan"]
        },
        command: {
            type: "object",
            properties: {
                name: {
                    type: "string"
                },
                input: {
                    type: "object"
                }
            },
            required: ["name", "input"]
        }
    },
    required: ["thoughts", "command"]
};

const thoughtValidator = new JSONResponseValidator(agentThoughtsSchema);

You can then create a wave with that validator like this:

// Create a wave
const wave = new AlphaWave({
    client,
    prompt: new Prompt([ ... ]),
    prompt_options: { ... },
    validator: thoughtValidator
});

This is where you’ll really start to see the power of AlphaWave. The JSONResponseValidator uses a fuzzy JSON parser which can correct for many of the common structural issues LLMs have with returning JSON. Models often forget to close off their JSON and such, so the fuzzy parser simply fixes those issues. If the validator fails to find any JSON in the returned response, or if it finds JSON but that JSON fails validation, AlphaWave will fork the conversation history to isolate the mistake, and then it will start giving the model “feedback” to try and get it to correct the mistake. GPT-4 will typically correct its mistake in a single turn, but even GPT-3.5 can do it; it may just take a few turns.
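
As a rough sketch of consuming the validated output, assuming the repaired and parsed JSON ends up on the response message content (check the repo for the exact result shape and status strings):

// Sketch only: assumes the parsed, schema-validated JSON is surfaced on the
// response message content; verify the exact result shape against the repo.
const result = await wave.completePrompt(input);
if (result.status === 'success') {
    const message = result.response as Message;
    const thoughts = message.content as unknown as AgentThoughts;
    console.log(`Next command: ${thoughts.command.name}`);
} else {
    // Any other status here means repair attempts did not produce valid JSON
    console.log(`Prompt failed with status '${result.status}'`);
}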

I’m planning to add a hybrid mode that will let you use a low cost model, like GPT-3.5, for your main conversation/task, but then switch to using GPT-4 when processing feedback. Conceptually, this is similar to a child asking a parent for help when they get stuck on a problem. This will let you keep cost down and processing speed up when things are working correctly and only pull out the big guns when the model gets stuck.
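
There’s nothing to configure for this yet, but conceptually the routing decision is tiny. A hypothetical sketch (none of this is AlphaWave API):

// Hypothetical sketch of the planned hybrid mode: use the low-cost model for
// normal turns and escalate to GPT-4 only when processing feedback (repair).
function selectModelForTurn(isFeedbackTurn: boolean): string {
    return isFeedbackTurn ? 'gpt-4' : 'gpt-3.5-turbo';
}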

4 Likes

Perfect. That is how I often use the API, split based on my experience of task difficulty (e.g., gpt-4 to develop specifications, gpt-3.5 to code).

Unfortunately, the code is often longer than the specs, and so I have to revert to gpt-4 just for the longer context. Will your hybrid mode be able to take that difference into account when deciding to switch?

1 Like

That’s an interesting suggestion… On a related note, you can also potentially use a classifier to decide which model to use based on the complexity of the user’s input. I’ll add a feature suggestion to add support for a PromptSelector to AlphaWave that lets you programmatically select the prompt and prompt_options that should be used for a given request. It should be an easy enough add, as the DefaultPromptSelector could always just choose the configured one.
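
A rough, purely illustrative sketch of what that could look like (nothing here exists in AlphaWave yet):

import { Prompt } from "promptrix";

// Purely illustrative: the idea is a selector that inspects the user's input
// (e.g., via a classifier) and returns the prompt and prompt_options to use for
// that request; a DefaultPromptSelector would simply return the configured ones.
interface PromptSelection {
    prompt: Prompt;
    prompt_options: Record<string, any>;
}

type PromptSelector = (input: string) => Promise<PromptSelection>;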

I’m also seeing the ChatGPT UI with plugins doing some type of repair itself now: if the response from the service comes back as an error, it attempts to fix the request with something else.

Not surprising, as it works. I know that Microsoft Copilot also does something similar. I’m just trying to make it easy for anyone to do these more advanced repair techniques. AlphaWave does them automatically so you don’t even need to think about it…

The project looks interesting. You mention history and templates. I’ll explain my case, as I’m unsure if your project could help. I have a node.js app. I use templates to construct the prompt. The values chosen for the templates are based on the initial user input. Other values may be interjected into subsequent prompts randomly to create variation. I have a problem with the chat history I’m trying to solve, where it loses context after about 3 interactions. I’ve tried summarizing the chat history and inserting the summary into the message stack, but haven’t gotten good results yet.

AlphaWave builds on my other project, Promptrix, for prompt management. Promptrix is a layout engine for creating prompts that can automatically scale the prompt to fit a max_input_tokens budget. You break your prompt into sections, and Promptrix will drop optional sections or truncate sections (coming soon) as you start running out of tokens. It also has built-in conversation history management that will auto-fit the conversation history to the prompt based on the number of remaining input tokens.
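
Using the same pieces from the setup example earlier, a templated prompt with auto-fitted history looks roughly like this (the token budgets are placeholders, and the {{$...}} placeholders follow the same template syntax as {{$input}} in the earlier example):

import { Prompt, SystemMessage, ConversationHistory, UserMessage } from "promptrix";

// Rough sketch reusing the Promptrix pieces from the earlier example. The
// numbers are per-section token budgets (placeholders here); the conversation
// history section is fitted to whatever input-token budget remains, so older
// turns get dropped instead of overflowing max_input_tokens.
const prompt = new Prompt([
    new SystemMessage('You are a {{$persona}} helping the user with {{$topic}}.', 100),
    new ConversationHistory('history', 1.0),
    new UserMessage('{{$input}}', 500)
]);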

I’m curious how long your prompt is that you’re running out of token budget after only 3 turns. I can typically fit anywhere from 5-10 turns of history, but it depends on the complexity of the prompt and the length of the responses.

Feel free to DM me if you’d like.

2 Likes