I just published the first draft of my new agent framework, built on top of AlphaWave. The framework enables the creation of semi-autonomous agents capable of completing complex, multi-step tasks. Unlike other agent experiments we’ve seen so far, like AutoGPT and BabyAGI, AlphaWave agents should actually complete the task they’re given more often than not. I could get into the technical details of why it’s superior to other agent frameworks, but let me just show you some code.
This is the entire TypeScript code for a simple agent that’s an expert at math:
import { Agent, AskCommand, FinalAnswerCommand, MathCommand } from "alphawave-agents";
import { OpenAIClient } from "alphawave";
import { config } from "dotenv";
import * as path from "path";
import * as readline from "readline";
// Read in .env file.
const ENV_FILE = path.join(__dirname, '..', '.env');
config({ path: ENV_FILE });
// Create an OpenAI or AzureOpenAI client
const client = new OpenAIClient({
    apiKey: process.env.OpenAIKey!
});
// Create an agent
const agent = new Agent({
    client,
    prompt: `You are an expert in math. Use the math command to assist users with their math problems.`,
    prompt_options: {
        completion_type: 'chat',
        model: 'gpt-3.5-turbo',
        temperature: 0.2,
        max_input_tokens: 2000,
        max_tokens: 1000,
    },
    initial_thought: {
        "thoughts": {
            "thought": "I need to ask the user for the problem they'd like me to solve",
            "reasoning": "This is the first step of the task and it will allow me to get the input for the math command",
            "plan": "- ask the user for the problem\n- use the math command to compute the answer\n- use the finalAnswer command to present the answer"
        },
        "command": {
            "name": "ask",
            "input": { "question": "Hi! I'm an expert in math. What problem would you like me to solve?" }
        }
    },
    logRepairs: true,
});
// Add commands to the agent
agent.addCommand(new AskCommand());
agent.addCommand(new FinalAnswerCommand());
agent.addCommand(new MathCommand());
// Listen for new thoughts
agent.events.on('newThought', (thought) => {
    console.log(`\x1b[2m[${thought.thoughts.thought}]\x1b[0m`);
});
// Create a readline interface object with the standard input and output streams
const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
});
// Define main chat loop
async function chat(botMessage: string|undefined) {
    // Show the bot's message
    if (botMessage) {
        console.log(`\x1b[32m${botMessage}\x1b[0m`);
    }

    // Prompt the user for input
    rl.question('User: ', async (input: string) => {
        // Check if the user wants to exit the chat
        if (input.toLowerCase() === 'exit') {
            // Close the readline interface and exit the process
            rl.close();
            process.exit();
        } else {
            // Route the user's message to the agent
            const result = await agent.completeTask(input);
            switch (result.status) {
                case 'success':
                case 'input_needed':
                    await chat(result.message);
                    break;
                default:
                    if (result.message) {
                        console.log(`${result.status}: ${result.message}`);
                    } else {
                        console.log(`A result status of '${result.status}' was returned.`);
                    }

                    // Close the readline interface and exit the process
                    rl.close();
                    process.exit();
                    break;
            }
        }
    });
}
// Start chat session
chat(`Hi! I'm an expert in math. What problem would you like me to solve?`);
Half of the code is just managing the command-line interface. The part we care about is the construction of the Agent instance. To define an agent, you give it a client, a very simple prompt, some options for which model to use, and, for gpt-3.5-turbo, a super important initial thought. (I use gpt-4 to generate this initial thought, as it’s far better at creating a plan that gpt-3.5 will actually follow.)
You then need to add the commands you want the agent to use. You should always include a FinalAnswerCommand, otherwise the agent doesn’t know how to finish the task, and I recommend adding an AskCommand, since the model will use it when it needs more input from the user to complete its task, or even just when it gets confused. From there you can add any other commands you’d like. This sample uses the MathCommand, which lets the model pass in small snippets of JavaScript for evaluation.
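To make that concrete, here’s a minimal sketch of how a command along those lines could evaluate a model-provided snippet. This is not the actual MathCommand implementation; the `SimpleMathCommand` class and its `execute` signature are my own assumptions for illustration.

```typescript
// A sketch of evaluating a small JavaScript expression the way a math
// command might. The class name and method signature are assumptions,
// not the real alphawave-agents MathCommand interface.
class SimpleMathCommand {
    readonly name = "math";
    readonly description = "Evaluates a small JavaScript math expression.";

    execute(input: { code: string }): string {
        // Evaluate the snippet in a fresh function scope (rather than eval)
        // so it can't accidentally capture local variables.
        const result = new Function(`"use strict"; return (${input.code});`)();
        return String(result);
    }
}

const math = new SimpleMathCommand();
console.log(math.execute({ code: "2 + 2" })); // prints "4"
```

Note that evaluating arbitrary model-generated code this way is fine for a demo, but a production command would want to sandbox it.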
Let’s see a sample run of this code:
I have four basic commands implemented so far: AskCommand, FinalAnswerCommand, MathCommand, and PromptCommand, with more on the way.
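You aren’t limited to the built-ins, either. Here’s a sketch of what a custom command might look like; the `Command` interface below is an assumption modeled on how the built-in commands are used (a name plus an execute method that returns text), so the real alphawave-agents interface may differ.

```typescript
// A hypothetical custom command. The Command interface here is an
// assumption for illustration, not the actual alphawave-agents contract.
interface Command {
    name: string;
    description: string;
    execute(input: Record<string, string>): Promise<string>;
}

class CurrentTimeCommand implements Command {
    name = "currentTime";
    description = "Returns the current date and time as an ISO string.";

    async execute(_input: Record<string, string>): Promise<string> {
        return new Date().toISOString();
    }
}
```

Assuming the interface is roughly right, you’d register it the same way as the built-ins: `agent.addCommand(new CurrentTimeCommand())`.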
I’m all about composition, though, so agents themselves are commands. That means you can define child agents that complete tasks a parent agent can initiate. What’s cool is that the agents actually interface with each other via natural language. They just talk to each other. I’ll try to cook up a sample that shows this off…
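The idea behind that composition can be sketched in a few lines. This is a toy illustration, not the AlphaWave API: because a child agent exposes the same shape as a command (a name plus an execute method), a parent can register and invoke it like any other command, passing plain text back and forth.

```typescript
// Toy illustration of "agents are commands". These classes are purely
// illustrative; they are not the alphawave-agents implementation.
type CommandLike = { name: string; execute(input: string): string };

class ChildMathAgent {
    name = "mathAgent";
    // A real child agent would run its own prompt/model loop; to keep
    // this sketch self-contained it just evaluates the expression.
    execute(input: string): string {
        const result = new Function(`"use strict"; return (${input});`)();
        return `The answer is ${result}.`;
    }
}

class ParentAgent {
    private commands = new Map<string, CommandLike>();

    addCommand(cmd: CommandLike): void {
        this.commands.set(cmd.name, cmd);
    }

    // The parent "talks" to the child in plain text.
    delegate(name: string, message: string): string {
        const cmd = this.commands.get(name);
        if (!cmd) throw new Error(`Unknown command: ${name}`);
        return cmd.execute(message);
    }
}

const parent = new ParentAgent();
parent.addCommand(new ChildMathAgent());
console.log(parent.delegate("mathAgent", "6 * 7")); // prints "The answer is 42."
```

Because the parent only sees text in and text out, swapping a hand-written command for a full child agent is transparent to it.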
Speaking of samples, you can find the math agent sample here, and I have several others coming: a flight booking agent; a Macbeth agent, which uses 16 agents to act out any scene from the play Macbeth; a programmer agent that can write and test simple JavaScript functions; and an Agent Dojo sample, my spin on BabyAGI, which is an agent that can train other agents to become experts on any given subject (machine learning, music theory, etc.). Look for all of those to start dropping over the next week or so.