Explicit vs implicit memory in code generation and other tasks

I apologize in advance for the confusing content of this post. I’m still thinking this through.

ChatGPT with GPT-3.5-turbo is lousy at Tic Tac Toe. It starts out well, but quite quickly forgets whether it is playing X or O, and can’t tell whether it has won or lost (GPT-4 does better on this example, but bear with me). I suspect the problem is that relying on chat history forces GPT-3.5 to implicitly build and maintain an internal state representation at the sub-symbolic level.
It is widely recognized that reasoning (i.e., process) improves when made explicit at the symbolic level (e.g., CoT or ReAct). I suggest the same is true of state (I’m probably not the first; I’d appreciate pointers).

I have been exploring this idea in a code-generation context, in which the central concept is a task-specific state class. My target is a class of two-player activities, e.g. the user and the AI playing Tic Tac Toe, or writing a blog post together. The basic code-generation flow is as follows (a minimal sketch of the generated pieces follows the list):

  1. create a process flow for the activity.
  2. create a state class that stores the data needed to track the state of an activity instance as it progresses through the process flow.
  3. create an assistant class with the methods and attributes needed to support the AI’s role in the process flow. It imports the state class, as well as a wrapper for GPT and any other tools provided.
  4. create a driver class that steps through the process flow, invoking the user and assistant classes where they appear in the flow.
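To make this concrete, here is a minimal sketch of what steps 2 and 4 might produce for Tic Tac Toe. Everything here is my own illustration of the shape of the idea, not actual generated output; the class and method names (`TicTacToeState`, `Driver`, etc.) are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a generated state class (step 2).
@dataclass
class TicTacToeState:
    board: list = field(default_factory=lambda: ["."] * 9)  # "." = empty cell
    user_mark: str = "X"
    ai_mark: str = "O"
    current_player: str = "X"  # whose turn it is

    def winner(self):
        """Return 'X' or 'O' if someone has three in a row, else None."""
        lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),
                 (0, 4, 8), (2, 4, 6)]
        for a, b, c in lines:
            if self.board[a] != "." and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return None

    def render(self):
        """Compact text rendering of the board, suitable for a prompt."""
        return "\n".join("".join(self.board[i:i + 3]) for i in (0, 3, 6))


# Hypothetical sketch of a generated driver (step 4): it steps through the
# process flow, invoking the user and the assistant where each appears.
class Driver:
    def __init__(self, state, assistant):
        self.state = state
        self.assistant = assistant

    def run(self):
        s = self.state
        while s.winner() is None and "." in s.board:
            if s.current_player == s.user_mark:
                move = int(input("Your move (0-8): "))
            else:
                move = self.assistant.choose_move(s)
            s.board[move] = s.current_player
            s.current_player = s.ai_mark if s.current_player == s.user_mark else s.user_mark
        print(s.render())
        print(f"Winner: {s.winner() or 'draw'}")
```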

The first draft of the system has successfully generated Python for Tic Tac Toe and Hangman (neither of which GPT-3.5-turbo can figure out on its own: it thinks it knows how to play, but then quickly forgets whether it is playing X or O). The assistant class calls GPT to choose its moves.
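A sketch of what that move-selection call might look like, continuing the hypothetical classes above. The `gpt` wrapper is an assumption here: it stands in for whatever LLM wrapper the generated code is given, taken to map a prompt string to a completion string.

```python
class Assistant:
    """Hypothetical sketch of a generated assistant class (step 3)."""

    def __init__(self, gpt):
        # `gpt` is assumed to map a prompt string to a completion string.
        self.gpt = gpt

    def choose_move(self, state):
        # The explicit state is rendered into the prompt, so the model never
        # has to reconstruct the board or its own mark from chat history.
        prompt = (
            f"You are playing Tic Tac Toe as '{state.ai_mark}'.\n"
            f"Board, rows of 3, cells numbered 0-8, '.' means empty:\n"
            f"{state.render()}\n"
            f"Reply with only the number of the cell you want to mark."
        )
        return int(self.gpt(prompt).strip())
```

With these pieces, `Driver(TicTacToeState(), Assistant(gpt)).run()` plays a full game, and the model sees the complete state on every move instead of inferring it from the transcript.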

The state class, not surprisingly, contains things like the board representation, whose turn it is, etc. The state is a convenient, compact representation of the information GPT needs to be intelligent about move selection. It also makes it possible to store the current state of an activity (e.g., using pickle) and resume it at a later time.
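Since everything needed to resume the activity lives in one state object, suspend-and-resume is a one-liner each way. A sketch (the helper names are hypothetical):

```python
import pickle

def save_activity(state, path="activity.pkl"):
    """Serialize the whole activity state to disk."""
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load_activity(path="activity.pkl"):
    """Restore a previously saved activity state."""
    with open(path, "rb") as f:
        return pickle.load(f)
```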

That’s it: make state explicit. I’ve started to explore this as an element of code generation. I’m not sure what it would mean to try to make it part of a prompt…

I played around a bit with this idea, and it’s quite powerful: have the model output its “memory” and inject it back into each prompt. I used it for playing 20 questions and Hangman.
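Here is a minimal sketch of that loop for Hangman, assuming the OpenAI Python client; the memory format and prompt wording are illustrative, not my actual setup. Note that no chat history is sent at all: the injected memory carries all the state.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def turn(memory, user_input):
    """One turn: hand the model its own memory, get back a reply and new memory."""
    system = (
        "You are playing Hangman with the user. After your reply, output a line "
        "'MEMORY:' followed by a compact summary of the game state: the secret "
        "word, the letters guessed so far, and the wrong guesses remaining."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system},
            # No chat history is included; only the memory string is injected.
            {"role": "user", "content": f"MEMORY:\n{memory}\n\nUSER:\n{user_input}"},
        ],
    )
    text = response.choices[0].message.content
    reply, _, new_memory = text.partition("MEMORY:")
    # If the model forgot to emit a memory block, keep the previous one.
    return reply.strip(), (new_memory.strip() or memory)

memory = "(no game in progress)"
while True:
    reply, memory = turn(memory, input("> "))
    print(reply)
```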