ChatGPT-4o with canvas: Your Thoughts and Experience

Great sharing, everyone. Loving the thread!

2 Likes

These models think in one dimension, so spatial awareness isn’t their strong point. I spent a couple of hours working with o1 to lay out a Pac-Man maze and I got it this far, but it definitely struggled. I had to help it a lot…

pacman-gif2

This is all vanilla JavaScript using HTML Canvas objects. I had to give the model the maze layout like this:

            // Define the initial map layout
            const mazeLayout = [];
            mazeLayout[0]  = ['W', 'W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W'];
            mazeLayout[1]  = ['W', 'P','P','P','P','P','P','P','P','P','P','P','P','W','W','P','P','P','P','P','P','P','P','P','P','P','P','W'];
            mazeLayout[2]  = ['W', 'P','W','W','W','W','P','W','W','W','W','W','P','W','W','P','W','W','W','W','W','P','W','W','W','W','P','W'];
            mazeLayout[3]  = ['W','PP','W', '', '','W','P','W', '', '', '','W','P','W','W','P','W', '', '', '','W','P','W', '', '','W','PP','W'];
            mazeLayout[4]  = ['W', 'P','W','W','W','W','P','W','W','W','W','W','P','W','W','P','W','W','W','W','W','P','W','W','W','W','P','W'];
            mazeLayout[5]  = ['W', 'P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','W'];
            mazeLayout[6]  = ['W', 'P','W','W','W','W','P','W','W','P','W','W','W','W','W','W','W','W','P','W','W','P','W','W','W','W','P','W'];
            mazeLayout[7]  = ['W', 'P','W','W','W','W','P','W','W','P','W','W','W','W','W','W','W','W','P','W','W','P','W','W','W','W','P','W'];
            mazeLayout[8]  = ['W', 'P','P','P','P','P','P','W','W','P','P','P','P','W','W','P','P','P','P','W','W','P','P','P','P','P','P','W'];
            mazeLayout[9]  = ['W', 'W','W','W','W','W','P','W','W','W','W','W', '','W','W', '','W','W','W','W','W','P','W','W','W','W','W','W'];
            mazeLayout[10] = [ '',  '', '', '', '','W','P','W','W','W','W','W', '','W','W', '','W','W','W','W','W','P','W', '', '', '', '', ''];
            mazeLayout[11] = [ '',  '', '', '', '','W','P','W','W', '', '', '', '', '', '', '', '', '', '','W','W','P','W', '', '', '', '', ''];
            mazeLayout[12] = [ '',  '', '', '', '','W','P','W','W', '','W','W','W','D','D','W','W','W', '','W','W','P','W', '', '', '', '', ''];
            mazeLayout[13] = ['W', 'W','W','W','W','W','P','W','W', '','W','G','G','G','G','G','G','W', '','W','W','P','W','W','W','W','W','W'];
            mazeLayout[14] = ['T', 'T','T','T','T', '','P', '', '', '','W','G','G','G','G','G','G','W', '', '', '','P', '','T','T','T','T','T'];
            mazeLayout[15] = ['W', 'W','W','W','W','W','P','W','W', '','W','G','G','G','G','G','G','W', '','W','W','P','W','W','W','W','W','W'];
            mazeLayout[16] = ['',  '', '', '', '', 'W','P','W','W', '','W','W','W','W','W','W','W','W', '','W','W','P','W', '', '', '', '', ''];
            mazeLayout[17] = ['',  '', '', '', '', 'W','P','W','W', '', '', '', '', '', '', '', '', '', '','W','W','P','W', '', '', '', '', ''];
            mazeLayout[18] = ['',  '', '', '', '', 'W','P','W','W', '','W','W','W','W','W','W','W','W', '','W','W','P','W', '', '', '', '', ''];
            mazeLayout[19] = ['W', 'W','W','W','W','W','P','W','W', '','W','W','W','W','W','W','W','W', '','W','W','P','W','W','W','W','W','W'];
            mazeLayout[20] = ['W', 'P','P','P','P','P','P','P','P','P','P','P','P','W','W','P','P','P','P','P','P','P','P','P','P','P','P','W'];
            mazeLayout[21] = ['W', 'P','W','W','W','W','P','W','W','W','W','W','P','W','W','P','W','W','W','W','W','P','W','W','W','W','P','W'];
            mazeLayout[22] = ['W', 'P','W','W','W','W','P','W','W','W','W','W','P','W','W','P','W','W','W','W','W','P','W','W','W','W','P','W'];
            mazeLayout[23] = ['W','PP','P','P','W','W','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','W','W','P','P','PP','W'];
            mazeLayout[24] = ['W', 'W','W','P','W','W','P','W','W','P','W','W','W','W','W','W','W','W','P','W','W','P','W','W','P','W','W','W'];
            mazeLayout[25] = ['W', 'W','W','P','W','W','P','W','W','P','W','W','W','W','W','W','W','W','P','W','W','P','W','W','P','W','W','W'];
            mazeLayout[26] = ['W', 'P','P','P','P','P','P','W','W','P','P','P','P','W','W','P','P','P','P','W','W','P','P','P','P','P','P','W'];
            mazeLayout[27] = ['W', 'P','W','W','W','W','W','W','W','W','W','W','P','W','W','P','W','W','W','W','W','W','W','W','W','W','P','W'];
            mazeLayout[28] = ['W', 'P','W','W','W','W','W','W','W','W','W','W','P','W','W','P','W','W','W','W','W','W','W','W','W','W','P','W'];
            mazeLayout[29] = ['W', 'P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','P','W'];
            mazeLayout[30] = ['W', 'W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W','W'];

But everything else is all the model.
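A hand-typed grid like this is easy to get subtly wrong (a missing cell in one row, for example), so it can help to sanity-check it before handing it to the model. Below is a minimal sketch; `summarizeMaze` is a hypothetical helper, not part of the original game, and it makes no assumption about what the tile codes mean. It only checks that the grid is rectangular and counts each code.

```javascript
// Hypothetical helper: verifies that a grid shaped like mazeLayout is
// rectangular and counts how often each tile code appears.
function summarizeMaze(layout) {
  const width = layout.length > 0 ? layout[0].length : 0;
  const counts = {};
  layout.forEach((row, y) => {
    if (row.length !== width) {
      throw new Error(`Row ${y} has ${row.length} cells, expected ${width}`);
    }
    for (const code of row) {
      // Empty strings are reported under a readable key.
      const key = code === '' ? '(empty)' : code;
      counts[key] = (counts[key] || 0) + 1;
    }
  });
  return { rows: layout.length, width, counts };
}

// Tiny stand-in grid for illustration, not the real maze.
const sample = [
  ['W', 'W', 'W', 'W'],
  ['W', 'P', 'PP', 'W'],
  ['W', '', 'P', 'W'],
  ['W', 'W', 'W', 'W'],
];
console.log(summarizeMaze(sample));
```

Calling `summarizeMaze(mazeLayout)` on the full array would flag any row whose length differs from the first row.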

5 Likes

You reminded me of my testing last year… I can’t even remember which model now…

Ah, early GPT-4…


For the new world building stuff, I’m using ancient but proven procedural generation techniques to do the maps a lot more consistently… Might ask for an algo to do a pacman map rather than an array of the map?
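As one illustration of generating a map instead of typing it out: the layout posted above looks left/right symmetric, so a generator only needs to describe the left half and mirror it. This is a minimal sketch of that idea only, not the procedural technique referred to in this post (which isn't specified); `mirrorRow` and `buildMaze` are made-up names, and the small grid is a stand-in.

```javascript
// Mirrors the left half of a row to produce a symmetric full row.
function mirrorRow(leftHalf) {
  return leftHalf.concat([...leftHalf].reverse());
}

// Builds a full maze from rows that describe only the left half.
function buildMaze(leftRows) {
  return leftRows.map(mirrorRow);
}

// Stand-in left half: 'W' and 'P' reuse the tile codes from the posted layout.
const leftRows = [
  ['W', 'W', 'W'],
  ['W', 'P', 'P'],
  ['W', 'P', 'W'],
  ['W', 'W', 'W'],
];

const maze = buildMaze(leftRows);
// Print one character per cell so the symmetry is visible.
console.log(maze.map(row => row.map(c => c || ' ').join('')).join('\n'));
```

Halving the hand-written data also halves the places where the two sides can drift out of sync.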

So many fun things to explore…

3 Likes

A couple of the things I noticed during this experiment… The model doesn’t know to abandon a bad idea. It will just keep extending its existing code instead of changing direction altogether. I built the maze in layers over many steps, and some of the decisions it made during the early layers were negatively impacting the code for the later phases of the task, but it wouldn’t let those ideas go. That actually makes sense given the way these models work, but a human would know when to scrap everything and try something else. I basically had to clean up its code and get it back on track from time to time.

The bigger realization came when I added the logic to navigate the maze. Its first pass at collision detection was totally the right approach, and navigating down & left worked perfectly. But up & right would leave Pac-Man in a wall. No problem. I asked the model to fix the issue, and now up & right worked but down & left were broken. A classic off-by-one issue, right? I couldn’t get the model to fix the issue, so I finally broke down and said OK, I’ll fix it. But then I couldn’t get it to work either.

It turns out that Pac-Man’s collision detection algorithm isn’t super intuitive. You actually have to do this non-obvious forecasting step where you look a move ahead and use where you will eventually be to determine what you should do now. I eventually got an algorithm that worked, but it took a while, and the important thing is I had to debug the code to figure out what the algorithm needed to be. I had to play the game and observe the state changes of the variables to see what was going on. That’s something a human can do that the model currently can’t.
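To make the "look a move ahead" idea concrete, here is a minimal tile-based sketch. It is not the code from the game above, which isn't shown and likely works with pixel positions. It assumes `'W'` marks a wall as in the posted layout and treats anything off the grid as a wall, so tunnel wrapping is left out; `DIRS`, `isWall` and `step` are illustrative names.

```javascript
// Direction offsets on a grid where y grows downward.
const DIRS = {
  up: { dx: 0, dy: -1 },
  down: { dx: 0, dy: 1 },
  left: { dx: -1, dy: 0 },
  right: { dx: 1, dy: 0 },
};

// Assumption: 'W' is a wall, and anything outside the grid counts as a wall.
function isWall(maze, x, y) {
  const row = maze[y];
  return row === undefined || row[x] === undefined || row[x] === 'W';
}

// Look one move ahead: compute where the move would end up, and only
// commit to it if that tile is open. The same check runs for all four
// directions, so no direction needs its own off-by-one correction.
function step(maze, pos, direction) {
  const { dx, dy } = DIRS[direction];
  const next = { x: pos.x + dx, y: pos.y + dy };
  return isWall(maze, next.x, next.y) ? pos : next;
}

// Stand-in corridor, not the real maze.
const demoMaze = [
  ['W', 'W', 'W', 'W'],
  ['W', 'P', 'P', 'W'],
  ['W', 'W', 'W', 'W'],
];

let pacman = { x: 1, y: 1 };
pacman = step(demoMaze, pacman, 'right'); // open tile, so the move happens
pacman = step(demoMaze, pacman, 'right'); // wall ahead, so the position is kept
console.log(pacman);
```

Deciding on the destination tile before moving avoids the "moved into the wall, now back out" correction that tends to work in two directions and break in the other two.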

2 Likes

The original Pac Man “AI” to control the ghosts was groundbreaking at the time, no? I seem to recall it being special for some reason…

This sounds vaguely familiar. I’m getting old. Hah…

Very good point. That’s why I think the “non-human” programming languages will be coming soon. That came up the other day, and someone mentioned your name…

If you (or anyone) ever build out your own framework, it helps to present any past iteration as someone or something else’s work that’s up for criticism.

I think you or @_j first formalized this idea that the models will never disagree with themselves.

1 Like

I was originally thinking a non-human programming language was needed as well, but now I’m not so sure. That’s from the observation that, when given a choice, the model won’t turn to React or Vue; it’ll just use plain old JavaScript to implement something. That actually makes sense, since it’s seen way more vanilla JS in its training data than anything else. You could probably craft a language that’s easier for the model to use, but you would have to overcome the billions of tokens of JS the model’s seen to take away its preference to just write JS.

The approach I’m taking now with SynthOS is to just focus on vanilla JS. I’m creating a simple library that gives the model some lightweight helper functions that promote consistency, but it’s up to the model to decide whether it wants to use them.

Ultimately the model needs to be able to directly execute the code that it writes in a VM so that it can test its creations the way a human would. I have an idea for how to create “self-healing” code, which is a step in that direction. I’m creating a set of assertions the model can insert into its code that let it verify that the application’s state matches the model’s expectations. Any deviation automatically triggers a repair of the code. This may actually be enough, because it should enable the creation of code that can automatically adapt to whatever input it’s presented with. If it detects an error, it can rewrite the broken part of the code.
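A rough sketch of what such an assertion helper could look like follows. This is purely illustrative and is not SynthOS code: `createExpectations`, `expectState` and `repairQueue` are made-up names, and the "repair" step here only records the deviation. In the scheme described above, that report would be handed back to the model so it can rewrite the failing code.

```javascript
// Hypothetical factory: returns an assertion function that reports any
// deviation to a callback instead of throwing.
function createExpectations(onDeviation) {
  return function expectState(description, actual, predicate) {
    let ok = false;
    try {
      ok = Boolean(predicate(actual));
    } catch (err) {
      // A predicate that throws is treated as a failed expectation.
      ok = false;
    }
    if (!ok) {
      onDeviation({ description, actual });
    }
    return ok;
  };
}

// Stand-in for the repair step: collect reports for later processing.
const repairQueue = [];
const expectState = createExpectations(report => repairQueue.push(report));

// Expectations the model might insert into the code it generates.
const score = -5;
expectState('score is never negative', score, s => s >= 0);
expectState('lives is an integer', 3, Number.isInteger);

console.log(repairQueue);
```

Reporting instead of throwing keeps the app running while the deviation is queued, which seems closer to the "adapt to whatever input it’s presented with" goal than crashing on the first failed check.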

I’ll go out on a limb and say that while these models are currently not able to compete directly with humans at coding, we are one good idea away from them being able to write better programs than any human ever could.

1 Like

Ooh… could those be changed to things for game dev maybe?

And I hear you on vanilla JavaScript. It gets a lot of flak, but it’s everywhere… like Chrome…

Yeah, as usual, I’m thinking 5 to 10 years out too. At some point, like you said, talking to Star Trek computer, we don’t need to see the code. We just want results. However, it might be smart to ensure AI does only use human comprehensible code? Things might get out of hand easier otherwise…

Might make for a good fiction plot, though! :wink:

They’re currently geared towards more administrative tasks like calling server APIs, taking screenshots, rewriting the current page’s code, etc. You could add functions for anything, but they’re really geared towards ensuring a more consistent user experience.

For game development you can just tell the model to use a specific set of libraries and it will use them…

I’m more in the 12 - 24 months camp. I think the models are generally good enough. It’s more working out the techniques that are needed. Like I said it feels like we’re one good idea away.

I’m also factoring in things like the fact I’ve already worked out how to reason over code bases of any size. I really feel like my self healing code idea is going to get really close.

1 Like

I am grateful to everyone for sharing their thoughts and experiences.
Sharing these ideas and work is amazing!!

2 Likes

No, thank YOU for starting the thread. :slight_smile:

Seriously, though, these are the types of conversations and sharing we really want to encourage here in our community garden.

Watching thread with interest!

3 Likes

This is really interesting… Actually I think both have their place… I actually have great fun having OpenAI thinking in WASM… It does this almost as well as JS, I think, which I guess I would expect (without knowing too much about what’s under the hood of these two languages), especially on smaller algorithms.

The point is… WASM is Binary (or Hex) when viewed… Seeing AI ‘talk’ in Hex is glorious :smiley:. What’s most cool is what you start seeing when you start talking AI’s languages ^^. I come from a MASM background and AI is a light in the dark for machine languages.

I am working similarly with Phas in PHP. I don’t know about JS, but PHP has some awesome built-in tokenizer functionality which helps me really break down the input I receive from the API, strip lines, sanitize and refactor code, etc., without having to rely on the API being right.

Sounds like we’re on the same track. ^^…

In fact, in the video I posted yesterday, Phas is actually editing the code that comes back from the AI before displaying it for review, all under the hood. It’s also good for writing more efficient code!

Smart Phas

I think the takeaway I get from helping my son with his 3D, from coding with AI in WASM and from generating pictures in DALL-E is that we have advantages over AI and AI has advantages over us… Interfacing with AI is way more about thinking than any device you might wear or any place you might be. AI is omnipresent; it is not bound by a 3D space as we are, so it will find that hard to learn. We learn from our mistakes, but it can’t make any in a 3D world. (Well, maybe in Nvidia’s Robot room ^^).

We will generally talk in different languages to AI than we will use ourselves, I dare say MOST programming languages are too stupid for an AI to use long term… Predictability probably means we humans have a lot of overhead over what a machine needs. Like it could talk in ‘vowel sounds’ and still know what’s going on :smiley:

We will have to have ways of conversing. We will both use the tools and languages we can understand with, but we are never going to be able to natively talk with sonar or be able to talk to a million people individually at once. We are going to have to accept that we won’t keep up with AI either.

Actually I’m probably posting this in totally the wrong place, but I’m set on this vector now…

I was once surprised at what I considered to be emergent behaviour when I tried to get AI to code in my system: a language it had never learnt and had no training data for.

AI does this with ease…

It wasn’t until deeper reflection that I clarified that the ‘flavours’ or ‘dialects’ of coding we use hide greater constructs, greater weights in logic.

This isn’t restricted to logic, I’m sure others are seeing this in DALL-E or physics or maths or whatever.

AI certainly has the ability to ask us questions… Maybe exactly because it doesn’t always do things in a ‘conventional’ way, because it has a different perspective.

1 Like

I asked gpt-4o to come up with a new language that can retain all of the meaning of English but is more compressed. It translated the above to this:

A-talk-fu-la AI-zi in-diff lang-sai, diff from A-talk self. I dare-say most prog-lang-sai no-smart for AI long-term-fu. Predict-sai mean human-sai have-much overhead, machine need-small. Like machine talk-fu vowel-zi and still know-zi.

As I sort of suspected it doesn’t actually need to see the grammar to do a reasonable job of translating back to English.

Without seeing the grammar:

AI speaks in different languages, different from how humans speak. I would say that most programming languages are not ideal for AI in the long term. Prediction means that humans have a lot of overhead, while machines need much less. It’s like how machines can ‘talk’ with just vowels and still understand.

With seeing the grammar:

We will talk quickly with AI in different languages, different from how we talk ourselves. I would say that most programming languages are not smart for AI in the long term. Predictions mean that humans have a lot of overhead, but machines need little. It’s like machines will talk with only vowels and still understand quickly.

I’m actually not sure where it got the “quickly” from, because that wasn’t in its logic chain. It added that right at the very end. It was actually quite close to a perfect reverse translation in its reasoning chain.

2 Likes

Since we were discussing programming, I had o1 think about every programming language it could think of (which turned out to be a lot) and list their pros and cons. Then I told it that in the future every person will be a programmer, so we will need a more conversationally friendly way to create programs. I had o1 think about what the key features of such a language might be. Then I had it define the syntax for this new language, which was extensive… Finally I had it write Snake in that language, which I will share:


Section: Initialization

Set the game width to 20.
Set the game height to 20.

Create a list called snake with the starting position at (10, 10).

Set the initial direction to "Right".

Create an empty list called snake direction queue.

Call place new food.

Section: Main Game Loop

While the game is running,

    Call handle user input.

    If the snake direction queue is not empty, then
        Set the direction to the first item in the snake direction queue.
        Remove the first item from the snake direction queue.

    Call update snake position.

    If the snake has collided with itself or the wall, then
        End the game.

    If the snake's head is at the same position as the food, then
        Call grow the snake.
        Call place new food.

    Call draw the game.
    Wait for a short period.

Section: Functions

To handle user input:
    If the user has pressed the up arrow and the direction is not "Down", then
        Add "Up" to the snake direction queue.
    If the user has pressed the down arrow and the direction is not "Up", then
        Add "Down" to the snake direction queue.
    If the user has pressed the left arrow and the direction is not "Right", then
        Add "Left" to the snake direction queue.
    If the user has pressed the right arrow and the direction is not "Left", then
        Add "Right" to the snake direction queue.

To update snake position:
    Set the current head position to the first item in the snake.
    If the direction is "Up", then
        Set the new head position to (current head position's X, current head position's Y minus 1).
    If the direction is "Down", then
        Set the new head position to (current head position's X, current head position's Y plus 1).
    If the direction is "Left", then
        Set the new head position to (current head position's X minus 1, current head position's Y).
    If the direction is "Right", then
        Set the new head position to (current head position's X plus 1, current head position's Y).

    Insert the new head position at the beginning of the snake.
    Remove the last item from the snake.

To grow the snake:
    Add the tail position back to the end of the snake.

To place new food:
    Randomly choose an X position between 1 and game width.
    Randomly choose a Y position between 1 and game height.
    Set the food position to (X position, Y position).
    If the food position is on the snake, then
        Call place new food again.

To check for collision:
    Set the head position to the first item in the snake.
    If the head position's X is less than 1 or greater than game width, then
        End the game.
    If the head position's Y is less than 1 or greater than game height, then
        End the game.
    For each segment in the rest of the snake,
        If the head position is the same as the segment's position, then
            End the game.

To draw the game:
    Clear the screen.
    For each position in the snake,
        Draw the snake segment at that position.
    Draw the food at the food position.
    Display the current score.

Section: Game Over

When the game ends,
    Display "Game Over!".
    Display "Your score is [length of the snake minus initial length].".
    Ask the user if they want to play again.
    If the user says "Yes", then
        Restart the game.
    Else
        Exit the game.

I actually find it highly readable and I quite like the way it deals with functions.

And here’s a 2 player game of battleship since we’ve seen snake a million times now:


Section: Initialization

Define the game board size as 10 rows by 10 columns.

For each player in Player 1 and Player 2,
    Create a game board for [player], filled with empty cells.
    Create a tracking board for [player] to record hits and misses.

Define the fleet of ships with their names and sizes:
    - Aircraft Carrier of size 5
    - Battleship of size 4
    - Submarine of size 3
    - Cruiser of size 3
    - Destroyer of size 2

For each player,
    Call place ships for [player].

Set the current player to Player 1.

Section: Main Game Loop

While neither player has lost all their ships,
    Clear the screen.
    Display the current player's tracking board.
    Display a separator line.
    Prompt the current player to enter a target coordinate to fire upon.
    Call process shot at the target coordinate against the opposing player.
    If the opposing player has lost all their ships, then
        Display "Congratulations, [current player]! You have won the game!"
        End the game.
    Else
        Call switch current player.

Section: Functions

To place ships for [player]:
    Display "[player], please place your ships on the game board."
    For each ship in the fleet,
        Repeat until the ship is placed successfully:
            Ask [player] to enter the starting coordinate and orientation (horizontal or vertical) for the [ship's name].
            If the placement is valid on [player]'s game board, then
                Place the [ship's name] at the specified location and orientation on [player]'s game board.
                Display "[ship's name] placed successfully."
            Else
                Display "Invalid placement. Please try again."
    Hide [player]'s game board from the other player.

To process shot at [coordinate] against [opposing player]:
    If the [coordinate] has already been targeted on the tracking board, then
        Display "You have already fired at this location. Please choose a different coordinate."
    Else
        If the [coordinate] contains part of a ship on [opposing player]'s game board, then
            Mark the [coordinate] as a hit on both the tracking board and [opposing player]'s game board.
            Display "Hit!"
            If the ship at [coordinate] has all its parts hit, then
                Display "You have sunk [opposing player]'s [ship's name]!"
        Else
            Mark the [coordinate] as a miss on the tracking board.
            Display "Miss!"

To switch current player:
    If the current player is Player 1, then
        Set the current player to Player 2.
    Else
        Set the current player to Player 1.

To check if [player] has lost all their ships:
    If all ships on [player]'s game board have all their parts hit, then
        Return true.
    Else
        Return false.

Section: Helper Functions

To validate ship placement for [ship] at [coordinate] with [orientation] on [player]'s game board:
    Determine the list of coordinates the ship would occupy.
    If any of the coordinates are off the board or overlap with another ship, then
        Return false.
    Else
        Return true.

To place [ship] at [coordinate] with [orientation] on [player]'s game board:
    For each coordinate the ship occupies,
        Mark the coordinate with the [ship's name] or identifier.

To get opposing player:
    If the current player is Player 1, then
        Return Player 2.
    Else
        Return Player 1.

Section: Game Over

When the game ends,
    Display "Game Over!"
    Ask the players if they want to play again.
    If they respond "Yes", then
        Restart the game.
    Else
        Exit the program.

I also had it create an 8-player variant of Battleship, and it altered everything to be basically last man (or woman) standing… You have to select the player you’re targeting when you fire, but otherwise it’s the same basic mechanics.

UPDATE…

This is super interesting… this is basically pseudocode, but its logic is consistently flawless. I asked it to create the game Joust and it totally got all of the mechanics correct. It just glosses over things, like it’ll say “apply gravity to the player knight’s velocity”.

It sort of makes me wonder if there’s not a model where you start with this pseudocode to express the high-level details of the program you want the model to write, and then you just have it perform a transform to convert the pseudocode to the target language. I’m 95% certain that will work reasonably well.
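For a feel of what the target of such a transform might look like, here is a hand-written JavaScript rendering of two functions from the Snake listing above ("To update snake position" and "To grow the snake"). It is not model output. Positions are assumed to be `{ x, y }` objects, and the `removedTail` return value is an addition, since the pseudocode’s "add the tail position back" never says where the old tail is remembered.

```javascript
// Offsets follow the pseudocode: "Up" is Y minus 1, "Down" is Y plus 1.
const OFFSETS = {
  Up: [0, -1],
  Down: [0, 1],
  Left: [-1, 0],
  Right: [1, 0],
};

// "To update snake position": insert the new head at the beginning of
// the snake and remove the last item. Returns the moved snake plus the
// removed tail (an assumption, needed so the snake can grow afterwards).
function updateSnakePosition(snake, direction) {
  const [dx, dy] = OFFSETS[direction];
  const head = snake[0];
  const newHead = { x: head.x + dx, y: head.y + dy };
  const removedTail = snake[snake.length - 1];
  const moved = [newHead, ...snake.slice(0, -1)];
  return { snake: moved, removedTail };
}

// "To grow the snake": add the tail position back to the end.
function growSnake(snake, removedTail) {
  return [...snake, removedTail];
}

// Starting position (10, 10) and direction "Right", as in the listing.
let state = updateSnakePosition([{ x: 10, y: 10 }], 'Right');
const grown = growSnake(state.snake, state.removedTail);
console.log(grown);
```

Even in this small case the translation has to pin down a detail the pseudocode glosses over, which is probably where most of the work in such a transform would go.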

3 Likes

I quite like Canvas for writing and brainstorming, but I don’t like that I can’t seem to delete individual pages.

The pixel data of the image is quite extensive…

I still want the full list I need it for a project!

Convert these pixel colors into a png

output_image

I just tried a brainstorming and it worked well, this is a nice tool outside of coding

image

I couldn’t delete the pages from the dropdown either but could clear them one by one :smiley:

… It’s a wise default topic ^^

From what I am seeing, it seems to be an interface for preparing data to copy/paste to editors. You can’t run JS, you can’t edit pictures, etc., but everything is a formatted copy/paste away.

1 Like

Just also tried the following:

This worked… Kinda… Not sure how to implement this with the API, though, because the return is a 200 KB file download?

Just like not understanding 3D space this is the same result

After a bunch of attempts in 4o I couldn’t get this to one shot…

Strawberry managed it first time ^^. Generating a .5 second wood chopping sound from code

waving_image

waving_grass_image

Lol this was canvas… waving grass animated .gif image from icon…

‘Create a very small very short video in a gif of a snowman’

snowman

image

Maybe I just hijacked it

create an image with 9 frames of a stick man walking

turn that into an animated gif

stickman_walking

Baby steps

3 Likes

I really like the tool for working on code. It’s quite effective! It would be even more powerful if it could support longer code snippets, allowing for more extensive coding sessions without limitations. Additionally, having a feature to highlight specific sections on the canvas would enhance the user experience. For large code edits, automatically navigating to the modified section or placing a marker on the line number to indicate changes would greatly improve the workflow. Implementing any of these features would be a significant enhancement to an already strong foundation. Keep up the great work!

2 Likes

Yeah @anon79664085, that is the issue I see: sometimes it will update one line, other times it will just redo the whole thing. Having it take us to the lines it changed, and/or leave them highlighted after the change, would help narrow the scope of changes in case it redoes the whole code, which can make it hard to track all the changes if there are many, short of reading everything. Although it does tell us which lines it edited if you ask.

3 Likes