As promised, here is an implementation of a command finder that uses embeddings to match a natural-language task string to the best tool. Would greatly appreciate feedback or contributions!

Right now there are a few terms embedded for “search web”, “save to file”, and “calculate expression”.
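For anyone curious how the matching works, here's a rough sketch in TypeScript of embedding-based command selection. The toy two-dimensional vectors and command names are illustrative only; real embeddings would come from an embeddings API:

```typescript
// Toy sketch of embedding-based command matching. The vectors here stand in
// for real embedding-API output; only the cosine-similarity ranking matters.

interface EmbeddedCommand {
    name: string;
    embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
    let dot = 0, magA = 0, magB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Return the name of the command whose embedding is closest to the task's.
function findBestCommand(taskEmbedding: number[], commands: EmbeddedCommand[]): string {
    let best = commands[0];
    let bestScore = -Infinity;
    for (const cmd of commands) {
        const score = cosineSimilarity(taskEmbedding, cmd.embedding);
        if (score > bestScore) {
            bestScore = score;
            best = cmd;
        }
    }
    return best.name;
}
```

In practice you'd embed each command's description once up front, embed the incoming task string per request, and pick the highest-scoring command.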

3 Likes

Eagerly awaiting what you’ve got.

A couple of questions:

  1. Do you have a good interface built for adding additional commands/APIs? In particular I’m thinking about WolframAlpha which allows 2,000 free API calls/month. Wolfram|Alpha APIs: Computational Knowledge Integration
  2. I know you’re targeting gpt-4 for this, but there may be some benefit to offloading some of the calls to gpt-3.5-turbo for speed and cost savings. I’m imagining using some external logic to determine which model to dispatch calls to.

My default is actually gpt-3.5-turbo, and it works well enough for many scenarios. One trick is that I use GPT-4 to generate the initial chain of thought, which I then pass into gpt-3.5-turbo. This seems to help it follow instructions better…
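A minimal sketch of that two-stage trick, assuming a generic chat-completion call (`complete` is a placeholder for whatever client you use; the prompts and model names are illustrative):

```typescript
// Hedged sketch: a stronger model drafts the chain of thought, then a cheaper
// model executes the task with that draft prepended to its prompt.

type CompleteFn = (model: string, prompt: string) => Promise<string>;

async function twoStageCompletion(
    task: string,
    complete: CompleteFn,
    plannerModel = "gpt-4",
    workerModel = "gpt-3.5-turbo"
): Promise<string> {
    // Stage 1: have the stronger model lay out the reasoning steps.
    const chainOfThought = await complete(
        plannerModel,
        `Think step by step about how to accomplish this task:\n${task}`
    );
    // Stage 2: give the cheaper model the plan so it follows instructions better.
    return complete(
        workerModel,
        `Task:\n${task}\n\nPlan to follow:\n${chainOfThought}\n\nCarry out the task.`
    );
}
```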

1 Like

I absolutely see this working, but I wonder if limitations around attention and the LLM’s ability to know what it has changed are a factor. Perhaps working from an outline that contains test results, and digging down from there, could serve as a means of debugging whenever it makes a breaking change?

There’s lots of room for improvement here… Especially with regards to having the model write and test more complex programs.

To better answer your question, though… For this task in particular you could probably run the entire outer loop using gpt-3.5-turbo. I’d definitely have the fixCode command, and probably even the writeCode command, use GPT-4. For fixCode in particular you need the deep analysis capabilities of GPT-4 to have it work out the issue with the code. You might be able to use gpt-3.5-turbo for writeCode, but I suspect you’d get closer to a working solution on the first try with gpt-4. The testCode command doesn’t make model calls at all, so no concerns there…
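The dispatch logic described above can be sketched as a simple routing function. The command names mirror the ones in the discussion; the sentinel for testCode is an assumption for illustration:

```typescript
// Illustrative model router: deep-analysis commands go to gpt-4, everything
// else (including the outer loop) runs on gpt-3.5-turbo.

const GPT4_COMMANDS = new Set(["fixCode", "writeCode"]);

function modelForCommand(command: string): string {
    // testCode makes no model calls at all; return a sentinel for it.
    if (command === "testCode") return "none";
    return GPT4_COMMANDS.has(command) ? "gpt-4" : "gpt-3.5-turbo";
}
```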

Commands will be super easy to add… It’s basically just a TypeScript interface that looks like this:

interface Command<TInput extends object, TContext> {
    // Called with the model's input, the shared agent memory, and a
    // host-provided context; resolves to a string result for the model.
    execute(input: TInput, memory: Memory, context: TContext): Promise<string>;
}

I’m planning to add a validate() method that lets the command validate the input parameters from the model before execute() is called. This will let you use JSON Schema to verify that the model isn’t hallucinating missing, additional, or invalid parameters. If validation fails, I have some techniques I can use to get the model to correct the input it’s passing…
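To illustrate the idea, here's a minimal hand-rolled check for missing, extra, or wrongly-typed parameters. In practice you'd likely use a real JSON Schema validator such as Ajv; the schema shape below is a simplification invented for this sketch:

```typescript
// Simplified stand-in for JSON Schema validation of model-supplied input.

interface ParamSchema {
    required: string[];
    properties: Record<string, "string" | "number" | "boolean">;
}

// Returns a list of error messages; an empty array means the input is valid.
function validateInput(schema: ParamSchema, input: Record<string, unknown>): string[] {
    const errors: string[] = [];
    for (const key of schema.required) {
        if (!(key in input)) errors.push(`missing parameter: ${key}`);
    }
    for (const [key, value] of Object.entries(input)) {
        const expected = schema.properties[key];
        if (!expected) {
            errors.push(`unexpected parameter: ${key}`);
        } else if (typeof value !== expected) {
            errors.push(`parameter ${key} should be ${expected}`);
        }
    }
    return errors;
}
```

The returned error messages can then be fed back to the model to ask it to correct the input.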

I have implemented a low-code version of AutoGPT using the PromptAppGPT framework.

PromptAppGPT significantly lowers the barrier to GPT application development, allowing anyone to implement their own AutoGPT-like application with 70 lines of low code.

The code:

---
author: Leo
name: My AutoGPT
description: Use gpt and executors to autonomously achieve whatever goal you set.
gptRound: multiple
failedRetries: 2
autoRun: true

sysTask:
  - executor: gpt
    prompt: |
      Constraints:
      1. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
      2. No user assistance
      3. Exclusively use the commands listed in double quotes e.g. "command name"

      Commands:
      1. Webpage Search: "doSearchWeb", args: "query": "<keywords_to_search>"
      2. Image Search: "doSearchImage", args: "query": "<keywords_to_search>"
      3. Task Complete: "doCompleteTask", args: "output": "<task_output>"

      Resources:
      1. Internet access for searches and information gathering.
      2. GPT-3.5 powered Agents for delegation of simple tasks.

      Performance Evaluation:
      1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
      2. Constructively self-criticize your big-picture behavior constantly.
      3. Reflect on past decisions and strategies to refine your approach.
      4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

      You should only respond in JSON format as described below 
      Response Format: 
      {
          "thoughts": {
              "text": "thought",
              "reasoning": "reasoning",
              "plan": "- short bulleted\n- list that conveys\n- long-term plan",
              "criticism": "constructive self-criticism",
              "speak": "thoughts summary to say to user"
          },
          "command": {
              "name": "command name",
              "args": {
                  "arg name": "value"
              }
          }
      }

userTask:
  - trigger: doSearchWeb
    executor: bingWeb
    prompt: |
      query: $e{"query": "(.*)"}
      limit: 2
    outputer: $e{RawInput} doGptNext
  - trigger: doSearchImage
    executor: bingImage
    prompt: |
      query: $e{"query": "(.*)"}
      limit: 2
    outputer: $e{RawInput} doGptNext
  - trigger: doGptNext
    executor: gpt
    prompt: Determine which next command to use, and respond using the format specified above.
  - trigger: doCompleteTask
    executor: log
    prompt: |
       $i{Task Complete:@textarea=$e{"output": "(.*)"}}
  - executor: gpt
    prompt: |
      $i{My Objectives:@textarea=Objectives:
      1. Recommend the best smartphone for business professionals in 2023.
      2. Explain why the smartphone is recommended and show the smartphone's image.}

The Running Results:

1 Like

My AutoGPT, based on the PromptAppGPT framework, will:

  1. Use the web search to get the latest smartphones
  2. Use the image search to get the image of the iPhone 14 Pro Max
  3. Complete the task with the explanation of the recommendation and the URL of the recommended smartphone.

@stevenic Hi Steven, I was wondering if you had any progress so far since some time has passed. Also I cannot find any code on your github page.

I’d be happy to know the current status of your project.

Thank you!

So the project is alive and well. It’s all under my AlphaWave project:

I have a full agent framework that works as well as any agent out there. Lately I’ve been focusing on RAG and I’m working to combine state-of-the-art RAG techniques with state-of-the-art agent techniques.

2 Likes

I’m really close to a general solution for reasoning over large corpuses of documents. I’ve essentially worked out the AI equivalent of map reduce for reasoning.
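A speculative sketch of what a map-reduce style reasoning loop over document chunks could look like (the prompts are illustrative and `ask` is a placeholder for a model call; this is my reading of the idea, not the actual AlphaWave implementation):

```typescript
// Map: answer the question against each chunk independently.
// Reduce: synthesize the partial answers into one final answer.

type AskFn = (prompt: string) => Promise<string>;

async function mapReduceReason(question: string, chunks: string[], ask: AskFn): Promise<string> {
    const partials = await Promise.all(
        chunks.map(chunk => ask(`Context:\n${chunk}\n\nQuestion: ${question}`))
    );
    return ask(`Combine these partial answers into one:\n${partials.join("\n---\n")}`);
}
```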

1 Like

Also, my latest project, Codepilot, should drop tomorrow or Monday. Codepilot is basically GitHub Copilot on steroids. It uses my Vector DB, Vectra, and is an expert on your specific codebase. This is the first nail in the coffin for the profession of developer.

3 Likes

Sounds very useful. I’ll be looking for the thread…

1 Like

I built the POC last week and I’m about 80% finished with the initial implementation. It will all be CLI based to start but no complex setup processes like GPT Engineer. You’ll need an OpenAI key but the tool will walk you through setting everything up and indexing your codebase.

Nice. I’ve got about 15k lines of this roguelike now…

I have a lot of ideas for how to improve Codepilot, but it’s already impressive what it’s capable of. It literally understands your code and can tell you how to add any new feature.

1 Like

Here’s a preview of the POC running in Teams. We’ve started porting the iOS Teams client from Objective-C to Swift, and we’re using it to do most of the porting. Our devs implemented the core logic for our adaptive card renderer, and Codepilot is more than capable of writing the renderers for all the individual elements.

You can also ask it to make improvements like adding a border property:

1 Like

Let it create a new branch, commit the change, and create a merge request.

#stopcopyandpaste!

1 Like

PR creation is on my todo list, as is modifying existing files. It will be able to create files out of the gate, similar to what GPT Engineer does. Modifying files is a trickier problem, so I just want to take some time to work through making sure modifications work correctly and reliably.

1 Like