From the Lex Fridman podcast with the Cursor team when they discuss The Future of Programming
I would say… Let’s just revisit this thread in 6 months and see where we are,
why have an API key set up like this instead of using dotenv?
This is an example of what I do with plain-language logic in just a standard ChatGPT 4o instance.
This is o1-preview, trained in session. Just some copy-paste logic I made.
My programmed model knows its function at hello.
That was just a quick single-page app. There’s no Node.js server, so the only thing you have to work with is local storage. The key gets stored locally, so it’s actually safe…
If this was something I intended to share or publish I’d obviously go the Node.js route. This app is personalized to me, so why do I need to complicate it with a server and everything? Local storage is fine.
oh it’s not shared, I see! thanks for the clarification
there are a lot of scams recently where they ask for a key and that results in it getting leaked to third parties, even some that the mods have taken down, hence my questioning
Great point… I probably should have the UI point out that the key is only saved locally.
I should point out that OpenAI’s realtime console sample does the same thing. It prompts you for a key and then stores it in local storage.
welp, if you are going to have someone put their keys into something, you most likely want to have them create a .env
having a text box that asks for a key, even if it just saves locally, always raises suspicion
one alternative is to host it as a freemium and provide the key.
another is to use an open-source LLM, for example via Ollama; that way there is no API key involved (and also no cost associated with it)
Point taken… this was a quick sample, and if you were ever to actually run it, it would be very clear that your key isn’t going anywhere. There’s no web server involved. It’s a static file called “invoice-builder.html”. I could put the key in a .env file, but the browser’s not going to let me read it.
If you’re worried about the web page posting it to some other service then you shouldn’t load that page. Putting the key in a .env file isn’t going to magically prevent the code that’s using the key from doing something evil.
ah, I see, so I guess you are using an inline JavaScript script in the HTML file to send a request to the API
Yes… just fetch(). There are zero third-party scripts even being used.
Your points are valid. If it wasn’t obvious to you then it likely won’t be obvious to others. I should have better described what was happening. My bad.
The actual generated code wasn’t the point of the post. The point was that I simply described what I wanted the program to do in pseudocode and then asked the model to convert that pseudocode to HTML. What I got out was a fully functional app that I actually used to generate and print an invoice.
your conversation programming language is awesome man, hope you get to release it
sorry to get a bit too much into the security of it all; it’s somewhat important, so hopefully it’s somewhat valuable
so, to get back to the topic, do you plan on expanding it? or perhaps releasing it?
No, security is important, so please don’t think the feedback isn’t appreciated. It’s secure, but I can see how just looking at the screenshot you might question things…
There’s not really anything to release because it already works in every LLM powered chat experience. This is simply a prompting technique if anything. If you’re using tools like cursor, v0, ChatGPT+Canvas, Claude Artifacts, etc. to generate code, then you’re already doing this. You’re just (likely) using those tools without any real structure to how you prompt them for code. Convo adds that structure.
As for expanding it… That’s what I’m currently doing… I’ve started a Sections folder where I’m building up collections of reusable Convo parts. Sections are the equivalent of modules in most programming languages. If you’re building a UI and want a sidebar that opens & closes you can just add the Sidebar Layout section to your prompt and the code that comes back should have a fully functional sidebar expressed in whatever language/framework you’re working with. Again, there’s no magic that I’m doing here. The LLM just knows how to do this. You should be able to take that section and append it to your v0 prompt and it should add a collapsible sidebar to your v0 component.
Here’s a more explicit example that uses a section called LLM Support:
If you’re using Python you can send the model a prompt like:
Create a .py file for making model calls:
Section: LLM Support
Use OpenAI for LLM support:
- Add a configuration setting for specifying the API key (hidden). Link for creating keys is https://platform.openai.com/api-keys
- Add a configuration setting for specifying the chat completion model:
- chatgpt-4o-latest (default)
- gpt-4o-mini
- o1-preview
- o1-mini
- Note that o1-preview and o1-mini don't support "system" messages or settings like "temperature".
You’ll get back a Python class like this:
import os
from typing import Dict, List, Optional

import openai  # note: this generated code targets the pre-1.0 openai package API


class LLMSupport:
    # Supported models
    SUPPORTED_MODELS = {
        "chatgpt-4o-latest",
        "gpt-4o-mini",
        "o1-preview",
        "o1-mini",
    }

    # Models that do not support "system" messages or "temperature"
    LIMITED_MODELS = {"o1-preview", "o1-mini"}

    def __init__(
        self,
        api_key: Optional[str] = None,
        model: Optional[str] = None,
    ):
        """
        Initializes the LLMSupport instance.

        Args:
            api_key (Optional[str]): OpenAI API key. If not provided, it will be read from
                the `OPENAI_API_KEY` environment variable.
            model (Optional[str]): The model to use for chat completions. If not provided,
                it will be read from the `OPENAI_MODEL` environment variable.
                Defaults to "chatgpt-4o-latest".

        Raises:
            ValueError: If the API key is not provided or the model is unsupported.
        """
        self.api_key = api_key or os.getenv("OPENAI_API_KEY")
        if not self.api_key:
            raise ValueError(
                "OpenAI API key not provided. Please set the OPENAI_API_KEY environment variable "
                "or pass it as a parameter."
            )
        openai.api_key = self.api_key

        self.model = model or os.getenv("OPENAI_MODEL", "chatgpt-4o-latest")
        if self.model not in self.SUPPORTED_MODELS:
            raise ValueError(
                f"Unsupported model '{self.model}'. Supported models are: "
                f"{', '.join(self.SUPPORTED_MODELS)}"
            )

    def get_chat_response(
        self,
        messages: List[Dict[str, str]],
        temperature: Optional[float] = None,
        max_tokens: Optional[int] = None,
        **kwargs
    ) -> str:
        """
        Generates a chat completion using the specified model.

        Args:
            messages (List[Dict[str, str]]): A list of messages for the chat completion.
            temperature (Optional[float]): Sampling temperature. Ignored for limited models.
            max_tokens (Optional[int]): The maximum number of tokens to generate.
            **kwargs: Additional arguments to pass to the OpenAI API.

        Returns:
            str: The generated response from the model.

        Raises:
            openai.error.OpenAIError: If an error occurs during the API call.
        """
        # Adjust parameters based on the model's capabilities
        if self.model in self.LIMITED_MODELS:
            # Remove system messages
            messages = [
                message for message in messages if message.get("role") != "system"
            ]
            # Remove temperature
            temperature = None

        try:
            response = openai.ChatCompletion.create(
                model=self.model,
                messages=messages,
                temperature=temperature,
                max_tokens=max_tokens,
                **kwargs
            )
            # Extract the assistant's reply
            return response.choices[0].message["content"].strip()
        except openai.error.OpenAIError as e:
            # Handle OpenAI API errors
            raise e
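In case it helps, here’s what calling that class might look like (my own sketch, assuming the generated code above is importable and OPENAI_API_KEY is set in your environment):

llm = LLMSupport(model="gpt-4o-mini")
reply = llm.get_chat_response(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
    temperature=0.7,
)
print(reply)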
You could ask for that same component as JavaScript, Rust, C#, C++, etc… and you will get the appropriate code back. Again, nothing special I’m doing other than that I’ve worked out the exact chunk of text that will give you a fully functional component to add to your project.
If nothing else, just think of Convo as a library of snippets that can add well-defined behaviors and components to any coding project.
The thing I’m trying to work out is whether you can compose these chunks of pseudocode together to build whole applications, like you can compose modules/functions in a more traditional programming language.
The answer is yes but there are some best practices so I’m just trying to work out what those best practices are.
I’ll share more details after I have this fleshed out a bit more, but using Convo I created a more capable version of Swarm tonight… I started about an hour ago and have a working client and server without having to write a single line of code. In fact, the client & server would have been able to talk to each other on the first shot, but I didn’t specify the websocket port to use, so it coded the server to port 3000 and the client to port 8080. I had it re-gen the client code to use port 3000 and it just connected and worked. Both sides written in pseudocode…
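To make the port detail concrete, here’s a minimal sketch (my illustration, not the actual generated code) of that kind of client/server pair using Python’s websockets package, with both sides agreeing on port 3000:

import asyncio
import websockets

WS_PORT = 3000  # shared constant; the first generation failed because client (8080) and server (3000) disagreed

async def handle_agent(websocket, path=None):
    # Echo each agent message back (the `path` arg keeps older websockets versions happy)
    async for message in websocket:
        await websocket.send(f"server received: {message}")

async def run_server():
    async with websockets.serve(handle_agent, "localhost", WS_PORT):
        await asyncio.Future()  # run until cancelled

async def run_client():
    async with websockets.connect(f"ws://localhost:{WS_PORT}") as ws:
        await ws.send("hello from agent")
        print(await ws.recv())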
Ok, it’s not the full implementation of Swarm yet, but it’s a transport system capable of rich multi-agent communication. So what features does it have?
My goal with this is to model every interaction that humans might have while performing tasks. All without having to write a single line of real code, because why not…
For a bit more background… I designed this framework, this framework, and this framework, plus a half dozen more not published to the web, all of which do some variation of this. This will be like my 20th pass at such a framework, which probably means that if I haven’t gotten it right yet, I’m probably not going to
I’ll share more when I have something more concrete working…
UPDATE:
I’m working on a central agent directory service that agents can register themselves with and use to look up other agents. They’ll use natural language for both tasks…
That sets up the ultimate in flexibility. For example, an agent will be able to register itself for a particular task but then specify that it only works M-F from 9am - 5pm PST.
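To picture it, here’s a hypothetical Python sketch of such a registry; every name and field here is my assumption for illustration, not the actual design:

from dataclasses import dataclass

@dataclass
class AgentRegistration:
    agent_id: str
    task_description: str         # natural language, e.g. "summarizes legal contracts"
    availability: str = "always"  # natural language, e.g. "M-F 9am - 5pm PST"

class AgentDirectory:
    def __init__(self):
        self._agents: dict[str, AgentRegistration] = {}

    def register(self, registration: AgentRegistration) -> None:
        self._agents[registration.agent_id] = registration

    def lookup(self, query: str) -> list[AgentRegistration]:
        # Placeholder substring match; the real service would presumably ask an
        # LLM to match the natural-language query against each task description.
        return [r for r in self._agents.values() if query.lower() in r.task_description.lower()]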
If anyone would like to discuss how identity in a decentralized agent system could work I’d love to chat. I’ve been exploring a PGP based system with some Proof of Work and key revocation extensions that GPT recommended.
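For anyone picking up that thread, here’s a hedged illustration of the signing half of the idea using Ed25519 keys from the cryptography package (my sketch only; the PGP web-of-trust, proof-of-work, and revocation pieces aren’t shown):

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Each agent holds a long-lived key pair; the public key acts as its identity.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

# The agent signs its directory registration so others can verify authorship.
registration = b"agent-42 | summarizes legal contracts | M-F 9am-5pm PST"
signature = private_key.sign(registration)

# Verification raises cryptography.exceptions.InvalidSignature on tampering.
public_key.verify(signature, registration)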
You funny man, I made my own type of version of that last night too lol
Inspired by this topic, here’s a brief outline of a book that I am contemplating writing (well, with LLMs), generated from my brief description
Title: Bridging the Gap: Enhancing Human-LLM Communication Through Pseudocode Refinement
Table of Contents
Introduction
1.1 The New Era of Human-AI Collaboration
1.2 Challenges in Communicating with Large Language Models (LLMs)
1.3 The Role of Pseudocode in Natural Language Communication
1.4 Objectives and Structure of This Book
Part I: Understanding the Dynamics
Chapter 2: Large Language Models Explained
2.1 What Are LLMs and How Do They Work?
2.2 Tokenization and the Explosion of Possibilities
2.3 Strengths and Limitations of LLMs in Code Generation
2.4 Reflecting on AI Interpretation of Human Instructions
Chapter 3: The Nature of Pseudocode in Natural Language
3.1 Defining Plain Natural Language Pseudocode
3.2 Differences Between Formal Code and Pseudocode
3.3 Common Pitfalls in Pseudocode Communication with LLMs
3.4 Case Studies: Misinterpretations and Misalignments
Part II: Enhancing Communication Through Detail
Chapter 4: Identifying Missing Pieces in Pseudocode
4.1 Material vs. Nitty-Gritty Details: Understanding the Spectrum
4.2 Techniques for Spotting Omissions
4.3 The Impact of Missing Information on LLM Outputs
4.4 Examples of Critical Missing Elements
Chapter 5: Strategies for Effective Detailing
5.1 Prioritizing Information: What’s Essential?
5.2 Balancing Brevity and Clarity
5.3 Using Hierarchical Structuring in Pseudocode
5.4 Incorporating Contextual Clues for LLMs
Chapter 6: Collaborative Refinement with LLMs
6.1 Interactive Prompting Techniques
6.2 Asking the Right Questions to Elicit Details
6.3 Utilizing LLM Feedback to Improve Pseudocode
6.4 Iterative Refinement Processes
Part III: Bridging Gaps Before Code Generation
Chapter 7: The Explosion of Token Space and How to Manage It
7.1 Understanding Token Variability in LLM Responses
7.2 Controlling Output through Constrained Inputs
7.3 Examples of Managing Token Diversity
Chapter 8: Stitching Missing Pieces Effectively
8.1 Techniques for Material Additions
8.2 Addressing Nitty-Gritty Details
8.3 When to Let LLMs Fill in the Gaps
8.4 Avoiding Over-Specification
Chapter 9: Refining Pseudocode for Robust Interpretation
9.1 Standardizing Language and Terminology
9.2 Clarifying Ambiguous Instructions
9.3 Using Examples and Analogies
9.4 Testing Pseudocode with Multiple LLMs
Part IV: Practical Applications and Case Studies
Chapter 10: Real-World Examples
10.1 Case Study: Building a Simple Algorithm
10.2 Case Study: Complex Systems and LLM Collaboration
10.3 Debugging Pseudocode with LLM Assistance
10.4 Success Stories in Human-LLM Partnerships
Chapter 11: Tools and Resources
11.1 Platforms for Experimenting with LLMs
11.2 Pseudocode Editors and Linters
11.3 Communities and Forums for Collaboration
11.4 Further Reading and Advanced Topics
Conclusion
Chapter 12: The Future of Human-LLM Interaction
12.1 Emerging Trends in AI Communication
12.2 Ethical Considerations and Best Practices
12.3 Continuous Learning and Adaptation
12.4 Final Thoughts and Encouragement
Appendices
A.1 Glossary of Terms
A.2 Sample Pseudocode Templates
A.3 Reference Guides for LLM Prompts
A.4 Bibliography
Can’t wait to read it! One thing I’d add is that it’s not just about using pseudocode to “instruct” machine-based agents. We already use it to instruct other humans. “Go clean your room” is just a pseudocode-based instruction that starts a room-cleaning program in another human.
I saw that in your screenshot… Microsoft asked me to explore creating a new multi-agent framework, and Convo was just something that emerged as part of that exploration. The fact that Swarm was just released is an interesting coincidence. I dug into Swarm and there’s not really anything overly groundbreaking in the ideas. The Routines idea I figured out in the AlphaWave Agents framework I built over a year ago; I called them Scripts, but it’s the same step-based instructions. AlphaWave does a limited form of their in-process handoffs as well.
True. In the introduction below, I think I should nuance that a little more.
Introduction
1.1 The New Era of Human-AI Collaboration
The rapid advancement of artificial intelligence has ushered in a new era where humans and machines collaborate more closely than ever before. Large Language Models (LLMs), such as GPT-4, have become powerful tools capable of understanding and generating human-like text. They assist in drafting emails, writing code, creating art, and even offering recommendations. This symbiotic relationship holds tremendous potential, but it also presents unique challenges that stem from the fundamental differences in how humans and AI interpret and generate language.
1.2 Challenges in Communicating with Large Language Models
While LLMs are adept at processing vast amounts of information and generating coherent responses, they rely heavily on the input they receive. Ambiguities, omissions, or imprecise instructions from users can lead to outputs that deviate from the intended goals. Unlike humans, who can infer context and fill in gaps based on shared experiences or intuition, LLMs interpret language based on learned patterns from their training data. This difference often leads to an “explosion of token space,” where the AI explores numerous possible interpretations, sometimes resulting in undesirable or unexpected outcomes.
1.3 The Role of Pseudocode in Natural Language Communication
Pseudocode serves as a bridge between human logic and machine execution. It allows us to express algorithms and processes in a way that is abstract enough to be easily understood by humans yet structured enough to be translated into actual code. When communicating with LLMs, using pseudocode in plain natural language can enhance clarity. However, the lack of precision and the omission of critical details can hinder the AI’s ability to generate accurate and functional code.
1.4 Objectives and Structure of This Book
This book aims to explore the intricate dynamics of human-LLM communication, particularly focusing on how we can refine our use of pseudocode to improve the collaborative process. By identifying common pitfalls and providing strategies to enhance the clarity and completeness of our instructions, we can harness the full potential of LLMs in code generation and other complex tasks.
The book is structured into four parts:
Part I: Understanding the Dynamics
Part II: Enhancing Communication Through Detail
Part III: Bridging Gaps Before Code Generation
Part IV: Practical Applications and Case Studies
By the end of this book, readers will have a comprehensive understanding of how to communicate more effectively with LLMs, leading to more accurate outputs and a more fruitful human-AI partnership.
This probably isn’t directly related to your book, but I thought I’d point out that we want to start capturing all of these pseudocode-to-code example pairs so that we can fine-tune better coding models. All of the current generation of coding models are trained off just the code half of the equation. They’re missing the intent, which is captured in the pseudocode.
If you start capturing the intent-to-code mappings, I suspect that not only will you end up with a model that’s SoTA at coding tasks, but it will likely be SoTA at reasoning, because the two tasks are closely intertwined.
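As a concrete illustration (a hypothetical record format of my own, not an established schema), an intent-to-code pair could be captured as a chat-style fine-tuning example like this:

import json

record = {
    "messages": [
        {
            "role": "user",
            "content": "Pseudocode: read numbers from a file, one per line, and print their average.",
        },
        {
            "role": "assistant",
            "content": (
                "with open('numbers.txt') as f:\n"
                "    values = [float(line) for line in f if line.strip()]\n"
                "print(sum(values) / len(values))"
            ),
        },
    ]
}
print(json.dumps(record))  # one JSONL line per captured pair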
I’ve been working on some pretty cool stuff for new consoles around multi-agent memory graphs that are editable through drag and drop, to make effective decisions in the future from the past