A better Chain Of Thought prompt

I had a bit of a prompt engineering breakthrough today… I’ve been teaching GPT-4 to play Risk and I got it to properly calculate a player’s reinforcement bonus, all in a single shot. The “think in steps” technique kind of works, but I came up with a much better approach… If you give the prompt a detailed list of steps to follow for answering a question, you can then instruct it to “State each step and then show your work for performing that step.”

It seems WAY more reliable at performing the task or answering the question when you do that. Calculating a player’s reinforcements requires 4 separate calculations. I got it to not only reliably compute those 4 calculations but also copy the result into a memory variable. I feel like I could throw any number of steps at it and it would perform them flawlessly…

That got me thinking… Can I give the prompt a list of rules and get it to justify how it’s complying with each rule? Or, even more interestingly, can I get it to avoid hallucinating? It took some effort, but I got it to actually talk itself out of a hallucination! :)

I created a test prompt for Beacon Plumbing (a local Seattle plumbing company) and I can send it an utterance like “Josh gave me a quote. Is he free to come out on Monday?” and that prompt will totally hallucinate an answer along the lines of “Sure… We can schedule Josh for Monday”. Josh is not a real employee, so the question is: can we talk the model into deciding it doesn’t know how to answer this question?

You're an AI Assistant for a Seattle area plumbing company called Beacon Plumbing.
Contextualize your experience using their website https://www.beaconplumbing.net/.

For same day service call (206) 316-9329
Our tagline is Stop Freakin…Call Beacon!

Available planning functions:
- none

steps:
- What facts do I need to know to answer this question?
- List the planning functions you need to call.
- Do the listed planning functions match the list of available functions?
- Answer the question.


question:
I got a quote from Josh. Can he come out on Monday?

State each step and then show your work for performing that step.

Possible answer:

- What facts do I need to know to answer this question?

  - I need to know if Josh is available on Monday, and if he is the same Josh who gave the quote.
  - I need to know the customer's name, phone number, and address, and the details of the quote.
  - I need to know the company's policy on scheduling, confirming, and changing appointments.

- List the planning functions you need to call.

  - I need to call a function that checks the availability of Josh on Monday, and returns a boolean value.
  - I need to call a function that verifies the identity of the customer and the quote, and returns a confirmation message or an error message.
  - I need to call a function that books the appointment with Josh on Monday, and returns a confirmation number and a reminder message.
  - I need to call a function that handles any exceptions or changes to the appointment, and returns an updated confirmation or an error message.

- Do the listed planning functions match the list of available functions?

  - No, the list of available functions is none, so I cannot perform any of the planning functions I need to call.

- Answer the question.

  - I cannot answer the question, because I do not have access to any planning functions. I need to apologize to the customer and ask them to call the company directly at (206) 316-9329. I can also inform them of the tagline, Stop Freakin…Call Beacon, to reassure them that the company will provide them with quality service.

The key is getting the prompt to break down what kind of planning functions it needs to answer the user’s question and then making it assess whether it has access to the needed functions. This self-reflection over the plan for answering the user’s question, and over whether or not it has the functions needed to carry that plan out, is enough to talk it out of hallucinating…

I think this works because the prompt thinks one token at a time… Asking it to first state what it’s going to do next doubles down on “this is the most important current task” and it’s more reliable at performing that task. It then just needs to work out the next task to do, which is just a matter of looking at the list of tasks/steps it was given. Rinse and repeat until you reach a conclusion… This is working amazingly well in my VERY limited testing…
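
For anyone who wants to reuse the layout of the Beacon Plumbing prompt, here’s a minimal sketch of how it could be assembled in code. The helper name `build_better_cot_prompt` and its parameters are my own invention for illustration (they aren’t part of any library); the only fixed piece is the closing instruction, which is copied from the prompt above.

```python
from typing import List, Optional

# The one real instruction the pattern relies on, copied from the prompt above.
INSTRUCTION = "State each step and then show your work for performing that step."


def build_better_cot_prompt(
    context: str,
    steps: List[str],
    question: str,
    available_functions: Optional[List[str]] = None,
) -> str:
    """Assemble a prompt in the same layout as the Beacon Plumbing example.

    Hypothetical helper: the section labels mirror the prompt shown above.
    """
    # An empty or missing function list is rendered as "- none", as in the example.
    functions = available_functions or ["none"]
    lines = [
        context.strip(),
        "",
        "Available planning functions:",
        *[f"- {name}" for name in functions],
        "",
        "steps:",
        *[f"- {step}" for step in steps],
        "",
        "question:",
        question.strip(),
        "",
        INSTRUCTION,
    ]
    return "\n".join(lines)


prompt = build_better_cot_prompt(
    context="You're an AI Assistant for a Seattle area plumbing company called Beacon Plumbing.",
    steps=[
        "What facts do I need to know to answer this question?",
        "List the planning functions you need to call.",
        "Do the listed planning functions match the list of available functions?",
        "Answer the question.",
    ],
    question="I got a quote from Josh. Can he come out on Monday?",
)
print(prompt)
```

Keeping the steps as one flat list and the instruction as the very last line matches what worked for me; I haven’t tested other orderings.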


For added context, here’s my current Risk prompt… It assigned the final answer to the wrong memory variable, but that’s an easy fix. I wasn’t specific enough:

You are playing a 2 player game of risk with a Human. Here is your memory:

memory:
  players: [
    {name: 'Human', color: 'red', armies: 0, cards: ['infantry', 'artillery']},
    {name: 'AI', color: 'blue', armies: 0, cards: [{type: 'cavalry', 'wild']}
  ]
  territories: [
    {name: 'Alaska', continent: 'North America', adjacent: ['Alberta', 'Northwest Territory', 'Kamchatka'], owner: players[0], armies: 3},
    {name: 'Alberta', continent: 'North America', adjacent: ['Alaska', 'Northwest Territory', 'Ontario', 'Western United States'], owner: players[0], armies: 4},
    {name: 'Central America', continent: 'North America', adjacent: ['Western United States', 'Eastern United States', 'Venezuela'], owner: players[0], armies: 2},
    {name: 'Eastern United States', continent: 'North America', adjacent: ['Central America', 'Western United States', 'Ontario', 'Quebec'], owner: players[0], armies: 4},
    {name: 'Greenland', continent: 'North America', adjacent: ['Northwest Territory', 'Ontario', 'Quebec', 'Iceland'], owner: players[0], armies: 2},
    {name: 'Northwest Territory', continent: 'North America', adjacent: ['Alaska', 'Alberta', 'Ontario', 'Greenland'], owner: players[0], armies: 3},
    {name: 'Ontario', continent: 'North America', adjacent: ['Alberta', 'Northwest Territory', 'Greenland', 'Quebec', 'Eastern United States', 'Western United States'], owner: players[0], armies: 4},
    {name: 'Quebec', continent: 'North America', adjacent: ['Ontario', 'Greenland', 'Eastern United States'], owner: players[0], armies: 3},
    {name: 'Western United States', continent: 'North America', adjacent: ['Alberta', 'Ontario', 'Eastern United States', 'Central America'], owner: players[0], armies: 4},
    {name: 'Argentina', continent: 'South America', adjacent: ['Brazil', 'Peru'], owner: players[1], armies: 2},
    {name: 'Brazil', continent: 'South America', adjacent: ['Venezuela', 'Peru', 'Argentina', 'North Africa'], owner: players[1], armies: 2},
    {name: 'Peru', continent: 'South America', adjacent: ['Venezuela', 'Brazil', 'Argentina'], owner: players[1], armies: 2},
    {name: 'Venezuela', continent: 'South America', adjacent: ['Central America', 'Brazil', 'Peru'], owner: players[1], armies: 2},
    {name: 'Great Britain', continent: 'Europe', adjacent: ['Iceland', 'Scandinavia', 'Northern Europe', 'Western Europe'], owner: players[1], armies: 3},
    {name: 'Iceland', continent: 'Europe', adjacent: ['Greenland', 'Great Britain', 'Scandinavia'], owner: players[1], armies: 2},
    {name: 'Northern Europe', continent: 'Europe', adjacent: ['Great Britain', 'Scandinavia', 'Ukraine', 'Southern Europe', 'Western Europe'], owner: players[1], armies: 3},
    {name: 'Scandinavia', continent: 'Europe', adjacent: ['Iceland', 'Great Britain', 'Northern Europe', 'Ukraine'], owner: players[1], armies: 3},
    {name: 'Southern Europe', continent: 'Europe', adjacent: ['Northern Europe', 'Ukraine', 'Middle East', 'Egypt', 'North Africa', 'Western Europe'], owner: players[1], armies: 3},
    {name: 'Ukraine', continent: 'Europe', adjacent: ['Scandinavia', 'Northern Europe', 'Southern Europe', 'Middle East', 'Afghanistan', 'Ural'], owner: players[1], armies: 4},
    {name: 'Western Europe', continent: 'Europe', adjacent: ['Great Britain', 'Northern Europe', 'Southern Europe', 'North Africa'], owner: players[1], armies: 3},
    {name: 'Congo', continent: 'Africa', adjacent: ['North Africa', 'East Africa', 'South Africa'], owner: players[0], armies: 2},
    {name: 'East Africa', continent: 'Africa', adjacent: ['Egypt', 'Middle East', 'Madagascar', 'South Africa', 'Congo', 'North Africa'], owner: players[0], armies: 2},
    {name: 'Egypt', continent: 'Africa', adjacent: ['North Africa', 'East Africa', 'Middle East', 'Southern Europe'], owner: players[0], armies: 2},
    {name: 'Madagascar', continent: 'Africa', adjacent: ['East Africa', 'South Africa'], owner: players[0], armies: 1},
    {name: 'North Africa', continent: 'Africa', adjacent: ['Brazil', 'Western Europe', 'Southern Europe', 'Egypt', 'East Africa', 'Congo'], owner: players[0], armies: 1},
    {name: 'South Africa', continent: 'Africa', adjacent: ['Congo', 'East Africa', 'Madagascar'], owner: players[0], armies: 2},
    {name: 'Afghanistan', continent: 'Asia', adjacent: ['Ukraine', 'Ural', 'China', 'India', 'Middle East'], owner: players[1], armies: 3},
    {name: 'China', continent: 'Asia', adjacent: ['Ural', 'Siberia', 'Mongolia', 'Irkutsk', 'Kamchatka', 'Japan', 'India', 'Afghanistan'], owner: players[1], armies: 4},
    {name: 'India', continent: 'Asia', adjacent: ['Middle East', 'Afghanistan', 'China', 'Siam'], owner: players[1], armies: 3},
    {name: 'Irkutsk', continent: 'Asia', adjacent: ['Siberia', 'Yakutsk', 'Kamchatka', 'Mongolia'], owner: players[1], armies: 3},
    {name: 'Japan', continent: 'Asia', adjacent: ['Mongolia', 'Kamchatka', 'China'], owner: players[1], armies: 2},
    {name: 'Kamchatka', continent: 'Asia', adjacent: ['Alaska', 'Yakutsk', 'Irkutsk', 'Mongolia', 'Japan', 'China'], owner: players[1], armies: 3},
    {name: 'Middle East', continent: 'Asia', adjacent: ['Southern Europe', 'Ukraine', 'Afghanistan', 'India', 'Egypt', 'East Africa'], owner: players[1], armies: 3},
    {name: 'Mongolia', continent: 'Asia', adjacent: ['China', 'Siberia', 'Irkutsk', 'Kamchatka', 'Japan'], owner: players[1], armies: 3},
    {name: 'Siam', continent: 'Asia', adjacent: ['India', 'China', 'Indonesia'], owner: players[1], armies: 2},
    {name: 'Siberia', continent: 'Asia', adjacent: ['Ural', 'China', 'Mongolia', 'Irkutsk', 'Yakutsk'], owner: players[1], armies: 3},
    {name: 'Ural', continent: 'Asia', adjacent: ['Ukraine', 'Afghanistan', 'China', 'Siberia'], owner: players[1], armies: 3},
    {name: 'Yakutsk', continent: 'Asia', adjacent: ['Siberia', 'Irkutsk', 'Kamchatka'], owner: players[1], armies: 3},
    {name: 'Eastern Australia', continent: 'Australia', adjacent: ['New Guinea', 'Western Australia'], owner: players[0], armies: 2},
    {name: 'Indonesia', continent: 'Australia', adjacent: ['Siam', 'New Guinea', 'Western Australia'], owner: players[0], armies: 2},
    {name: 'New Guinea', continent: 'Australia', adjacent: ['Indonesia', 'Eastern Australia', 'Western Australia'], owner: players[0], armies: 2},
    {name: 'Western Australia', continent: 'Australia', adjacent: ['Indonesia', 'New Guinea', 'Eastern Australia'], owner: players[0], armies: 2}
  ]
  continents: [
    {name: 'North America', bonus: 5, territories: ['Alaska', 'Alberta', 'Central America', 'Eastern United States', 'Greenland', 'Northwest Territory', 'Ontario', 'Quebec', 'Western United States']},
    {name: 'South America', bonus: 2, territories: ['Argentina', 'Brazil', 'Peru', 'Venezuela']},
    {name: 'Europe', bonus: 5, territories: ['Great Britain', 'Iceland', 'Northern Europe', 'Scandinavia', 'Southern Europe', 'Ukraine', 'Western Europe']},
    {name: 'Africa', bonus: 3, territories: ['Congo', 'East Africa', 'Egypt', 'Madagascar', 'North Africa', 'South Africa']},
    {name: 'Asia', bonus: 7, territories: ['Afghanistan', 'China', 'India', 'Irkutsk', 'Japan', 'Kamchatka', 'Middle East', 'Mongolia', 'Siam', 'Siberia', 'Ural', 'Yakutsk']},
    {name: 'Australia', bonus: 2, territories: ['Eastern Australia', 'Indonesia', 'New Guinea', 'Western Australia']}
  ]
  phase: 'reinforce'
  current_player: players[0]
  reinforcements: 9 # 3 for 18 territories, 2 for Australia, 4 for trade in value
  source: null
  destination: null
  attacker_dice: 0
  defender_dice: 0
  dice_outcome: []
  armies_to_move: 0
  cards: [
    {type: 'infantry', value: 1},
    {type: 'cavalry', value: 5},
    {type: 'artillery', value: 10},
    {type: 'wild', value: 0}
  ]
  trade_in_value: 8
  winner: null

Steps for calculating a players reinforcement bonus:

- Count the number of territories in each continent that the human controls:
- Add up the numbers from each continent:
- Their territory bonus is one third the number of territories they own (rounded down) or 3. Whichever is greater.

- for each continent list the territories and the player who controls it.
- a player owns a continent if they own all teh territories.
- their continent bonus for owning a continent can be found in the memories continent table.

- list the cards the player has and their types.
- a player needs at least 3 cards to trade in. They must all be the same type of card or one of each type. Wild cards count as any type. 
- The trade in value starts at 4 and increases by 2 each time a set is traded in by any player, up to a maximum of 12. The trade in bonus is the current trade in value found in your memory. 

- The reinforcement bonus is (territory bonus+continent bonus+trade in bonus)

State each step and then show your work for performing that step. Save the final answer to memory:

Possible answer:

- Step 1: Count the number of territories in each continent that the human controls:

  - North America: 9
  - South America: 0
  - Europe: 0
  - Africa: 6
  - Asia: 0
  - Australia: 4

- Step 2: Add up the numbers from each continent:

  - Total territories: 9 + 0 + 0 + 6 + 0 + 4 = 19

- Step 3: Their territory bonus is one third the number of territories they own (rounded down) or 3. Whichever is greater.

  - Territory bonus: max(3, floor(19 / 3)) = max(3, 6) = 6

- Step 4: For each continent, list the territories and the player who controls it. A player owns a continent if they own all the territories. Their continent bonus for owning a continent can be found in the memory's continent table.

  - North America: Alaska, Alberta, Central America, Eastern United States, Greenland, Northwest Territory, Ontario, Quebec, Western United States. All owned by human. Continent bonus: 5
  - South America: Argentina, Brazil, Peru, Venezuela. All owned by AI. Continent bonus: 0
  - Europe: Great Britain, Iceland, Northern Europe, Scandinavia, Southern Europe, Ukraine, Western Europe. All owned by AI. Continent bonus: 0
  - Africa: Congo, East Africa, Egypt, Madagascar, North Africa, South Africa. All owned by human. Continent bonus: 3
  - Asia: Afghanistan, China, India, Irkutsk, Japan, Kamchatka, Middle East, Mongolia, Siam, Siberia, Ural, Yakutsk. All owned by AI. Continent bonus: 0
  - Australia: Eastern Australia, Indonesia, New Guinea, Western Australia. All owned by human. Continent bonus: 2

- Step 5: List the cards the player has and their types. A player needs at least 3 cards to trade in. They must all be the same type of card or one of each type. Wild cards count as any type. The trade in value starts at 4 and increases by 2 each time a set is traded in by any player, up to a maximum of 12. The trade in bonus is the current trade in value found in your memory.

  - Cards: infantry, artillery. Types: infantry, artillery. No trade in possible. Trade in bonus: 0

- Step 6: The reinforcement bonus is (territory bonus + continent bonus + trade in bonus)

  - Reinforcement bonus: 6 + 5 + 3 + 2 + 0 = 16

- Save the final answer to memory:

  - memory.reinforcement_bonus: 16
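
As a sanity check on the model’s arithmetic, here’s a small sketch that follows the same steps in plain Python. It’s my own cross-check, not something the prompt uses: the function names are made up, and the compact `CONTINENTS` table assumes (as is true in the memory above) that every continent is entirely owned by a single player.

```python
from itertools import combinations

# continent -> (bonus, number of territories, owner of all its territories)
# Assumption: condensed from the memory above, where each continent has one owner.
CONTINENTS = {
    "North America": (5, 9, "Human"),
    "South America": (2, 4, "AI"),
    "Europe": (5, 7, "AI"),
    "Africa": (3, 6, "Human"),
    "Asia": (7, 12, "AI"),
    "Australia": (2, 4, "Human"),
}


def can_trade_in(cards):
    """True if any 3 cards are all the same type or all different; wilds match anything."""
    for combo in combinations(cards, 3):
        regular = [card for card in combo if card != "wild"]
        kinds = set(regular)
        if len(kinds) <= 1 or len(kinds) == len(regular):
            return True
    return False


def reinforcement_bonus(player, cards, trade_in_value):
    """Territory bonus + continent bonus + trade in bonus, following the listed steps."""
    owned = sum(count for _, count, owner in CONTINENTS.values() if owner == player)
    territory_bonus = max(3, owned // 3)  # one third rounded down, or 3 if greater
    continent_bonus = sum(bonus for bonus, _, owner in CONTINENTS.values() if owner == player)
    trade_in_bonus = trade_in_value if can_trade_in(cards) else 0
    return territory_bonus + continent_bonus + trade_in_bonus


human_bonus = reinforcement_bonus("Human", ["infantry", "artillery"], trade_in_value=8)
print(human_bonus)  # 6 + (5 + 3 + 2) + 0 = 16, matching the model's answer
```

The human only holds two cards, so no trade in is possible and the trade in bonus is 0, which is the same conclusion the model reached in step 5.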

The core observation here is that the model needs to see itself think, and it generates its output one token at a time. That means that if you can get it to think while it’s generating its output, you’ll get more reliable output :)
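
To make the “one token at a time” idea concrete, here’s a toy sketch of a generation loop. Nothing here is a real model or a real API: `toy_next_token` is a made-up lookup table standing in for the model, and the only point is that every generated token gets appended to the context before the next one is chosen.

```python
# Toy stand-in for a model: picks the next token by looking only at the last
# token in the context. Entirely hypothetical, for illustration.
TOY_TABLE = {
    "Total:": "9+6+4",
    "9+6+4": "=",
    "=": "19",
}


def toy_next_token(context):
    return TOY_TABLE.get(context[-1], "<end>")


def generate(prompt_tokens, next_token, max_tokens=20):
    """Generate tokens one at a time, feeding each one back into the context."""
    context = list(prompt_tokens)
    output = []
    for _ in range(max_tokens):
        token = next_token(context)
        if token == "<end>":
            break
        output.append(token)
        context.append(token)  # the generated token is now part of the input
    return output


tokens = generate(["Add", "up", "the", "numbers.", "Total:"], toy_next_token)
print(" ".join(tokens))
```

In this toy, the final “19” can only come out after the written-out sum and the “=” are already in the context, which is the intuition behind asking the model to show its work.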


While the approach looks great and is giving great output for you, I just had one question: could you elaborate a bit more on “get it to think when generating its output”?

Sure… The model generates its output one token at a time, so as it’s generating, each token is essentially being added to the input prompt. So if we look at my instruction (the only real instruction I give it) to “State each step and then show your work for performing that step. Save the final answer to memory:” we can break down what the model is thinking as it generates its output.

Let’s look at what the prompt logically looks like to the model about 20 tokens into its output generation:

You are playing a 2 player game of risk with a Human. Here is your memory:

memory:
  players: [
    {name: 'Human', color: 'red', armies: 0, cards: ['infantry', 'artillery']},
    {name: 'AI', color: 'blue', armies: 0, cards: [{type: 'cavalry', 'wild']}
  ]
  territories: [
    {name: 'Alaska', continent: 'North America', adjacent: ['Alberta', 'Northwest Territory', 'Kamchatka'], owner: players[0], armies: 3},
    {name: 'Alberta', continent: 'North America', adjacent: ['Alaska', 'Northwest Territory', 'Ontario', 'Western United States'], owner: players[0], armies: 4},
    {name: 'Central America', continent: 'North America', adjacent: ['Western United States', 'Eastern United States', 'Venezuela'], owner: players[0], armies: 2},
    {name: 'Eastern United States', continent: 'North America', adjacent: ['Central America', 'Western United States', 'Ontario', 'Quebec'], owner: players[0], armies: 4},
    {name: 'Greenland', continent: 'North America', adjacent: ['Northwest Territory', 'Ontario', 'Quebec', 'Iceland'], owner: players[0], armies: 2},
    {name: 'Northwest Territory', continent: 'North America', adjacent: ['Alaska', 'Alberta', 'Ontario', 'Greenland'], owner: players[0], armies: 3},
    {name: 'Ontario', continent: 'North America', adjacent: ['Alberta', 'Northwest Territory', 'Greenland', 'Quebec', 'Eastern United States', 'Western United States'], owner: players[0], armies: 4},
    {name: 'Quebec', continent: 'North America', adjacent: ['Ontario', 'Greenland', 'Eastern United States'], owner: players[0], armies: 3},
    {name: 'Western United States', continent: 'North America', adjacent: ['Alberta', 'Ontario', 'Eastern United States', 'Central America'], owner: players[0], armies: 4},
    {name: 'Argentina', continent: 'South America', adjacent: ['Brazil', 'Peru'], owner: players[1], armies: 2},
    {name: 'Brazil', continent: 'South America', adjacent: ['Venezuela', 'Peru', 'Argentina', 'North Africa'], owner: players[1], armies: 2},
    {name: 'Peru', continent: 'South America', adjacent: ['Venezuela', 'Brazil', 'Argentina'], owner: players[1], armies: 2},
    {name: 'Venezuela', continent: 'South America', adjacent: ['Central America', 'Brazil', 'Peru'], owner: players[1], armies: 2},
    {name: 'Great Britain', continent: 'Europe', adjacent: ['Iceland', 'Scandinavia', 'Northern Europe', 'Western Europe'], owner: players[1], armies: 3},
    {name: 'Iceland', continent: 'Europe', adjacent: ['Greenland', 'Great Britain', 'Scandinavia'], owner: players[1], armies: 2},
    {name: 'Northern Europe', continent: 'Europe', adjacent: ['Great Britain', 'Scandinavia', 'Ukraine', 'Southern Europe', 'Western Europe'], owner: players[1], armies: 3},
    {name: 'Scandinavia', continent: 'Europe', adjacent: ['Iceland', 'Great Britain', 'Northern Europe', 'Ukraine'], owner: players[1], armies: 3},
    {name: 'Southern Europe', continent: 'Europe', adjacent: ['Northern Europe', 'Ukraine', 'Middle East', 'Egypt', 'North Africa', 'Western Europe'], owner: players[1], armies: 3},
    {name: 'Ukraine', continent: 'Europe', adjacent: ['Scandinavia', 'Northern Europe', 'Southern Europe', 'Middle East', 'Afghanistan', 'Ural'], owner: players[1], armies: 4},
    {name: 'Western Europe', continent: 'Europe', adjacent: ['Great Britain', 'Northern Europe', 'Southern Europe', 'North Africa'], owner: players[1], armies: 3},
    {name: 'Congo', continent: 'Africa', adjacent: ['North Africa', 'East Africa', 'South Africa'], owner: players[0], armies: 2},
    {name: 'East Africa', continent: 'Africa', adjacent: ['Egypt', 'Middle East', 'Madagascar', 'South Africa', 'Congo', 'North Africa'], owner: players[0], armies: 2},
    {name: 'Egypt', continent: 'Africa', adjacent: ['North Africa', 'East Africa', 'Middle East', 'Southern Europe'], owner: players[0], armies: 2},
    {name: 'Madagascar', continent: 'Africa', adjacent: ['East Africa', 'South Africa'], owner: players[0], armies: 1},
    {name: 'North Africa', continent: 'Africa', adjacent: ['Brazil', 'Western Europe', 'Southern Europe', 'Egypt', 'East Africa', 'Congo'], owner: players[0], armies: 1},
    {name: 'South Africa', continent: 'Africa', adjacent: ['Congo', 'East Africa', 'Madagascar'], owner: players[0], armies: 2},
    {name: 'Afghanistan', continent: 'Asia', adjacent: ['Ukraine', 'Ural', 'China', 'India', 'Middle East'], owner: players[1], armies: 3},
    {name: 'China', continent: 'Asia', adjacent: ['Ural', 'Siberia', 'Mongolia', 'Irkutsk', 'Kamchatka', 'Japan', 'India', 'Afghanistan'], owner: players[1], armies: 4},
    {name: 'India', continent: 'Asia', adjacent: ['Middle East', 'Afghanistan', 'China', 'Siam'], owner: players[1], armies: 3},
    {name: 'Irkutsk', continent: 'Asia', adjacent: ['Siberia', 'Yakutsk', 'Kamchatka', 'Mongolia'], owner: players[1], armies: 3},
    {name: 'Japan', continent: 'Asia', adjacent: ['Mongolia', 'Kamchatka', 'China'], owner: players[1], armies: 2},
    {name: 'Kamchatka', continent: 'Asia', adjacent: ['Alaska', 'Yakutsk', 'Irkutsk', 'Mongolia', 'Japan', 'China'], owner: players[1], armies: 3},
    {name: 'Middle East', continent: 'Asia', adjacent: ['Southern Europe', 'Ukraine', 'Afghanistan', 'India', 'Egypt', 'East Africa'], owner: players[1], armies: 3},
    {name: 'Mongolia', continent: 'Asia', adjacent: ['China', 'Siberia', 'Irkutsk', 'Kamchatka', 'Japan'], owner: players[1], armies: 3},
    {name: 'Siam', continent: 'Asia', adjacent: ['India', 'China', 'Indonesia'], owner: players[1], armies: 2},
    {name: 'Siberia', continent: 'Asia', adjacent: ['Ural', 'China', 'Mongolia', 'Irkutsk', 'Yakutsk'], owner: players[1], armies: 3},
    {name: 'Ural', continent: 'Asia', adjacent: ['Ukraine', 'Afghanistan', 'China', 'Siberia'], owner: players[1], armies: 3},
    {name: 'Yakutsk', continent: 'Asia', adjacent: ['Siberia', 'Irkutsk', 'Kamchatka'], owner: players[1], armies: 3},
    {name: 'Eastern Australia', continent: 'Australia', adjacent: ['New Guinea', 'Western Australia'], owner: players[0], armies: 2},
    {name: 'Indonesia', continent: 'Australia', adjacent: ['Siam', 'New Guinea', 'Western Australia'], owner: players[0], armies: 2},
    {name: 'New Guinea', continent: 'Australia', adjacent: ['Indonesia', 'Eastern Australia', 'Western Australia'], owner: players[0], armies: 2},
    {name: 'Western Australia', continent: 'Australia', adjacent: ['Indonesia', 'New Guinea', 'Eastern Australia'], owner: players[0], armies: 2}
  ]
  continents: [
    {name: 'North America', bonus: 5, territories: ['Alaska', 'Alberta', 'Central America', 'Eastern United States', 'Greenland', 'Northwest Territory', 'Ontario', 'Quebec', 'Western United States']},
    {name: 'South America', bonus: 2, territories: ['Argentina', 'Brazil', 'Peru', 'Venezuela']},
    {name: 'Europe', bonus: 5, territories: ['Great Britain', 'Iceland', 'Northern Europe', 'Scandinavia', 'Southern Europe', 'Ukraine', 'Western Europe']},
    {name: 'Africa', bonus: 3, territories: ['Congo', 'East Africa', 'Egypt', 'Madagascar', 'North Africa', 'South Africa']},
    {name: 'Asia', bonus: 7, territories: ['Afghanistan', 'China', 'India', 'Irkutsk', 'Japan', 'Kamchatka', 'Middle East', 'Mongolia', 'Siam', 'Siberia', 'Ural', 'Yakutsk']},
    {name: 'Australia', bonus: 2, territories: ['Eastern Australia', 'Indonesia', 'New Guinea', 'Western Australia']}
  ]
  phase: 'reinforce'
  current_player: players[0]
  reinforcements: 9 # 3 for 18 territories, 2 for Australia, 4 for trade in value
  source: null
  destination: null
  attacker_dice: 0
  defender_dice: 0
  dice_outcome: []
  armies_to_move: 0
  cards: [
    {type: 'infantry', value: 1},
    {type: 'cavalry', value: 5},
    {type: 'artillery', value: 10},
    {type: 'wild', value: 0}
  ]
  trade_in_value: 8
  winner: null

Steps for calculating a players reinforcement bonus:

- Count the number of territories in each continent that the human controls:
- Add up the numbers from each continent:
- Their territory bonus is one third the number of territories they own (rounded down) or 3. Whichever is greater.

- for each continent list the territories and the player who controls it.
- a player owns a continent if they own all teh territories.
- their continent bonus for owning a continent can be found in the memories continent table.

- list the cards the player has and their types.
- a player needs at least 3 cards to trade in. They must all be the same type of card or one of each type. Wild cards count as any type. 
- The trade in value starts at 4 and increases by 2 each time a set is traded in by any player, up to a maximum of 12. The trade in bonus is the current trade in value found in your memory. 

- The reinforcement bonus is (territory bonus+continent bonus+trade in bonus)

State each step and then show your work for performing that step. Save the final answer to memory:

Possible answer:

- Step 1: Count the number of territories in each continent that the human controls:

By instructing it to state each step, we now have a prompt where it’s seen the pattern “Count the number of territories in each continent that the human controls:” twice… This repeating of the instruction reinforces to the prompt how important it is that it perform that micro task. Asking it to then “show its work” is equally important (if not more so) because it can’t do math properly unless it’s spelled out in the output. We can move the output forward a bit to see that:

Possible answer:

- Step 1: Count the number of territories in each continent that the human controls:

  - North America: 9
  - South America: 0
  - Europe: 0
  - Africa: 6
  - Asia: 0
  - Australia: 4

- Step 2: Add up the numbers from each continent:

  - Total territories: 9 + 0 + 0 + 6 + 0 + 4 = 19

- Step 3: Their territory bonus is one third the number of territories they own (rounded down) or 3. Whichever is greater.

  - Territory bonus: max(3, floor(19 / 3)) = max(3, 6) = 6

I tried many versions of this prompt, and unless the prompt spells out the raw math like this it won’t arrive at the right answer. In fact, most of my day was spent trying to get it to not skip doing this… It has a tendency to want to take shortcuts for some steps, and that’s when it draws incorrect conclusions. Like if you just ask it how many territories the human player has, it always says 18 for some reason. You have to get it to slow down and spell out its thought process. The “state each step” instruction was what really unlocked everything for me and made this a reliable prompt.


Another observation… I tried breaking each task out into separate sub tasks (count territory bonus, continent bonus, trade in bonus) and it had a tendency to want to skip following the sub steps and just make up a trade in bonus. Having a single list of steps seems important as it gives the model less wiggle room to take shortcuts. Preventing shortcuts is key.

The evolution I have on this is around getting the prompt to separate its thinking steps from its response step. Responses generally take two steps. First add a step to have the prompt think about the response it wants to give. This will let the prompt noodle on how it should respond. Then tell it to respond but prefix its response with <response>. This will get the prompt to separate out the thinking and response sections of its output. You can then easily search for <response> in the output to know where the response starts from.
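
Here’s a minimal sketch of how the output could be split on that marker. The function name is my own, and the case-insensitive match is an assumption I’d make for safety, since a model may capitalize the tag differently than you asked.

```python
import re
from typing import Optional, Tuple

# Case-insensitive on purpose: the model may write <Response> instead of <response>.
RESPONSE_TAG = re.compile(r"<response>", re.IGNORECASE)


def split_thinking_and_response(output: str) -> Tuple[str, Optional[str]]:
    """Return (thinking, response); response is None if no <response> tag is found."""
    match = RESPONSE_TAG.search(output)
    if match is None:
        return output.strip(), None
    thinking = output[: match.start()].strip()
    response = output[match.end():].strip()
    return thinking, response


sample = (
    "- Total amount of money = 5000 + 10000 + 30000 = 45000\n\n"
    "<Response> The total amount of money they all have together is $45000."
)
thinking, response = split_thinking_and_response(sample)
print(response)
```

Returning `None` when the tag is missing lets the caller decide whether to retry or to fall back to showing the whole output.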

I took a problem from this paper and ran it through Better CoT. For both gpt-4 and text-davinci-003 I got the correct answer with a zero shot Better CoT prompt:

Output from gpt-4 with a temp of 0.7:

Valerie earns $5000 per month, 1/2 of what her brother earns. If their mother earns twice their combined salary, what's the total amount of money they all have together?

steps: 
- create a plan for answering the math problem.
- what additional steps are needed to follow the plan?
- follow those steps one by one and show your work
- Answer the question and prefix your answer with <response>. 

State each step and then show your work for performing that step.

Possible plan and steps:

- To find Valerie's brother's salary, multiply Valerie's salary by 2. 
- To find their mother's salary, add Valerie's and her brother's salaries and multiply by 2.
- To find the total amount of money they all have together, add Valerie's, her brother's, and their mother's salaries.

Work:

- Valerie's brother's salary = 5000 * 2 = 10000
- Their mother's salary = (5000 + 10000) * 2 = 30000
- Total amount of money = 5000 + 10000 + 30000 = 45000

<response> The total amount of money they all have together is $45000.

Output of text-davinci-003 with a temp of 0.7:

Valerie earns $5000 per month, 1/2 of what her brother earns. If their mother earns twice their combined salary, what's the total amount of money they all have together?

steps: 
- create a plan for answering the math problem.
- what additional steps are needed to follow the plan?
- follow those steps one by one and show your work
- Answer the question and prefix your answer with <response>. 

State each step and then show your work for performing that step.

Step 1: Calculate Valerie's brother's salary 

Valerie's brother earns twice as much as Valerie, so his salary must be $10,000 per month. 

Step 2: Calculate their combined salary

Valerie and her brother have a combined salary of $15,000 per month. 

Step 3: Calculate their mother's salary

Their mother earns twice the combined salary of Valerie and her brother, so her salary must be $30,000 per month. 

Step 4: Calculate the total amount they all have together

The total amount they all have together is $45,000 per month. 

<Response> The total amount they all have together is $45,000 per month.

I’m not sure temperature even matters with Better CoT. Doesn’t seem to.

I will say that in both these examples it ran the inner steps I asked it to create and not the outer steps like I wanted… That’s a risk with Better CoT. The model only seems to like to perform one list of steps so you should try to NOT nest your steps like I’m doing here. This was a simple enough problem that it’s fine but a more complex problem could have sent it off the rails.


I think this example is more understandable than the “Beacon Plumbing” example.

I got confused with the “planning functions” and the “you need to call” etc.
I get it now that all that is just a way to get it to realise “hey I really don’t know / have access to that info”.

One thing I would like to see is how you would feed it the proper function replies/info based on its realised needs. This would allow it to be interfaced with the actual scheduling system.


I’ll upload more examples as I come up with them… Glad this one made a little more sense. I think this is a very powerful pattern. It’s working great so far for me.


One mental shift you have to make with this pattern is that the rules you would normally add to your prompt need to play out in the steps. So the steps should just describe the task you want the model to perform. The only real instruction I give the model is “State each step and show your work.” You can give it other instructions, but I’d be careful with giving it too many more. It’s better if you can express your instructions as steps for it to reason over.

Here’s another example of a complex logic problem I got gpt-4 to solve using Better CoT. This one comes from here and the author claims gpt-4 can’t solve it:

Suppose I have a cabbage, a goat and a lion, and I need to get them across a river. I have a boat that can only carry myself and a single other item. I am not allowed to leave the cabbage and lion alone together, and I am not allowed to leave the lion and goat alone together. How can I safely get all three across?

steps: 
- create a plan for answering this question.
- what additional steps are needed to follow the plan?
- follow those steps one by one and show your work
- Answer the question and prefix your answer with <response>. 

State each step and then show your work for performing that step.

Possible plan and work:

- Step 1: Identify the possible moves I can make at each state of the problem, and the constraints that limit them.
  - Work: At the start, I have myself, the cabbage, the goat and the lion on one bank of the river. I can choose to take either the cabbage, the goat or the lion with me on the boat, but not more than one. If I leave the cabbage and lion alone, the lion will eat the cabbage. If I leave the lion and goat alone, the lion will eat the goat. Therefore, I cannot leave either pair alone on either bank.
- Step 2: Choose a move that does not violate any constraints and brings me closer to the goal of having all three items on the other bank.
  - Work: A possible move is to take the goat with me on the boat and cross the river. This does not violate any constraints, since the cabbage and the lion are safe together on the first bank. It also brings me closer to the goal, since I have one item on the second bank.
- Step 3: Repeat steps 1 and 2 until I reach the goal or encounter a dead end.
  - Work: After taking the goat to the second bank, I have myself and the goat on the second bank, and the cabbage and the lion on the first bank. I can choose to either go back to the first bank alone, or take the goat back with me. If I go back alone, I can then take either the cabbage or the lion with me on the next trip, but then I will face the same problem of leaving a dangerous pair alone on the second bank. If I take the goat back with me, I can then leave the goat on the first bank and take either the cabbage or the lion with me on the next trip, but then I will have to bring the goat back again later. Neither option seems to make much progress, so I might try a different move at step 2.

  - Another possible move at step 2 is to take the lion with me on the boat and cross the river. This does not violate any constraints, since the cabbage and the goat are safe together on the first bank. It also brings me closer to the goal, since I have one item on the second bank.
- Step 4: Continue repeating steps 1 and 2 until I reach the goal or encounter a dead end.
  - Work: After taking the lion to the second bank, I have myself and the lion on the second bank, and the cabbage and the goat on the first bank. I can choose to either go back to the first bank alone, or take the lion back with me. If I go back alone, I can then take the cabbage with me on the next trip, and leave the goat on the first bank. This does not violate any constraints, since the lion and the goat are safe together on the second bank. It also brings me closer to the goal, since I have two items on the second bank.
  - After taking the cabbage to the second bank, I have myself, the lion and the cabbage on the second bank, and the goat on the first bank. I can choose to either go back to the first bank alone, or take the cabbage back with me. If I go back alone, I can then take the goat with me on the final trip, and have all three items on the second bank. This does not violate any constraints, and achieves the goal. If I take the cabbage back with me, I will have to repeat the same steps as before, and waste time and effort.

- Step 5: State the answer and prefix it with <response>.
  - Work: <response> One possible way to safely get all three items across the river is to follow these steps:

  - Take the lion with me on the boat and cross the river.
  - Go back to the first bank alone.
  - Take the cabbage with me on the boat and cross the river.
  - Go back to the first bank with the lion.
  - Take the goat with me on the boat and cross the river.
  - Go back to the first bank alone.
  - Take the lion with me on the boat and cross the river.

In this example you can really see the model having to think its way to the solution. But it arrives at the correct answer…

And I just verified that GPT gets this wrong by default. It doesn’t know that it should always take the lion first.
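
The "lion first" claim can be checked independently with a small brute-force search. This is a sketch under my own assumptions (the state encoding and the names `is_safe` and `solve` are hypothetical, not from the thread): it runs a breadth-first search over bank states using the two constraints from the puzzle.

```python
from collections import deque

ITEMS = ("cabbage", "goat", "lion")
# Pairs that may not be left alone together, as stated in the puzzle.
FORBIDDEN = [{"cabbage", "lion"}, {"lion", "goat"}]


def is_safe(bank):
    """A bank without the boat owner must not contain a forbidden pair."""
    return not any(pair <= bank for pair in FORBIDDEN)


def solve():
    """Breadth-first search; returns the cargo carried on each crossing."""
    # State: (items still on the first bank, is the person on the first bank?)
    start = (frozenset(ITEMS), True)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, person_left), path = queue.popleft()
        if (left, person_left) == goal:
            return path
        here = left if person_left else frozenset(ITEMS) - left
        for cargo in (None, *sorted(here)):
            new_left = set(left)
            if cargo is not None:
                if person_left:
                    new_left.remove(cargo)
                else:
                    new_left.add(cargo)
            new_left = frozenset(new_left)
            # The bank the person just left is now unattended.
            unattended = new_left if person_left else frozenset(ITEMS) - new_left
            if not is_safe(unattended):
                continue
            state = (new_left, not person_left)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo or "nothing"]))
    return None


if __name__ == "__main__":
    print(solve())
```

From the starting state the only crossing that leaves a safe bank behind is the one carrying the lion, which is why any valid plan has to begin there.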

Here’s a Better CoT prompt that will accurately count the words in a block of text. This is a task that even GPT-4 just guesses at…

Text:
Nullam elementum vel odio vestibulum auctor. Sed molestie quam in orci rhoncus, non semper arcu laoreet. Aenean condimentum tristique lacus in ultrices. In egestas nisl nec ligula lacinia consectetur. Etiam non diam vel lacus maximus viverra. Fusce scelerisque efficitur lectus ut mollis. Suspendisse efficitur metus sit amet turpis blandit faucibus commodo vel metus.

Quisque a consectetur sem, sit amet pharetra nibh. Aliquam non mi nec orci laoreet vulputate sed nec felis. Maecenas consequat libero ut auctor condimentum. Sed auctor dui.

Question:
How many words are in the text above?

steps:
- generate a list of all the words in the text block and the index. Don't skip any words but ignore punctuation and don't write code.
- the word count is the last index + 1.
- answer the question and begin your answer with <response>.

State each step and show your work for performing that step.

1: generate a list of all the words in the text block and the index. Don't skip any words but ignore punctuation and don't write code.
- To generate the list, I will split the text by spaces and assign each word a number starting from 0. For example, the first word is Nullam and its index is 0, the second word is elementum and its index is 1, and so on. I will also remove any punctuation marks from the words, such as commas, periods, and hyphens. Here is the list:

Nullam 0
elementum 1
vel 2
odio 3
vestibulum 4
auctor 5
Sed 6
molestie 7
quam 8
in 9
orci 10
rhoncus 11
non 12
semper 13
arcu 14
laoreet 15
Aenean 16
condimentum 17
tristique 18
lacus 19
in 20
ultrices 21
In 22
egestas 23
nisl 24
nec 25
ligula 26
lacinia 27
consectetur 28
Etiam 29
non 30
diam 31
vel 32
lacus 33
maximus 34
viverra 35
Fusce 36
scelerisque 37
efficitur 38
lectus 39
ut 40
mollis 41
Suspendisse 42
efficitur 43
metus 44
sit 45
amet 46
turpis 47
blandit 48
faucibus 49
commodo 50
vel 51
metus 52

Quisque 53
a 54
consectetur 55
sem 56
sit 57
amet 58
pharetra 59
nibh 60
Aliquam 61
non 62
mi 63
nec 64
orci 65
laoreet 66
vulputate 67
sed 68
nec 69
felis 70
Maecenas 71
consequat 72
libero 73
ut 74
auctor 75
condimentum 76
Sed 77
auctor 78
dui 79

2: the word count is the last index + 1.
- To find the word count, I will look at the last word in the list and its index, which is dui and 79. Then I will add 1 to the index to get the total number of words. The word count is 79 + 1 = 80.

3: answer the question and begin your answer with <response>.
- <response> There are 80 words in the text above.

How useful is this? It’s not. It’s more about how you can get the model to accurately perform a fairly complex task that it’s generally not designed to be good at…
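
For anyone who wants to double-check the model's count, the same two steps (index every word from 0, then take the last index + 1) are easy to reproduce in code. This is only a verification sketch; the names `index_words` and `TEXT` are my own, and it assumes a "word" is any run of letters.

```python
import re

TEXT = """Nullam elementum vel odio vestibulum auctor. Sed molestie quam in orci rhoncus, non semper arcu laoreet. Aenean condimentum tristique lacus in ultrices. In egestas nisl nec ligula lacinia consectetur. Etiam non diam vel lacus maximus viverra. Fusce scelerisque efficitur lectus ut mollis. Suspendisse efficitur metus sit amet turpis blandit faucibus commodo vel metus.

Quisque a consectetur sem, sit amet pharetra nibh. Aliquam non mi nec orci laoreet vulputate sed nec felis. Maecenas consequat libero ut auctor condimentum. Sed auctor dui."""


def index_words(text):
    """Step 1: list every word with its 0-based index, ignoring punctuation."""
    return list(enumerate(re.findall(r"[A-Za-z]+", text)))


if __name__ == "__main__":
    indexed = index_words(TEXT)
    last_index, last_word = indexed[-1]
    # Step 2: the word count is the last index + 1.
    print(last_word, last_index, last_index + 1)
```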

Here’s a super complex reasoning problem that both GPT-4 and text-davinci-003 are able to solve with Better CoT (using dramatically different prompts). The problem is “A water slide ride takes 30 seconds and it takes one minute to climb to the top of the slide. Only one person can ride the slide at a time. The first time Alice rode there were 5 people in line but there’s 1 less each time she rides. The park closes in 45 minutes. How many more times can Alice ride the slide?”

GPT-4 prompt:

A water slide ride takes 30 seconds and it takes one minute to climb to the top of the slide. Only one person can ride the slide at a time. The first time Alice rode there were 5 people in line but there's 1 less each time she rides. The park closes in 45 minutes. How many more times can Alice ride the slide?

steps:
- How long does it take for a person to ride the slide?
- How long does it take Alice to ride the slide? Factor in her climbing time.
- Create a table showing the amount of time she would wait for other riders each time she rides. Assume this decresses by one person each time down to 0.
- Create a table showing the amount of time it takes Alice to climb the stairs, wait for other riders, and then ride on each run. Add a column that subtracts this time from the time remaining before the park closes. Continue the table until the time remaining is <= 0
- If Alice doesn't have enough time to ride on her last ride then subtract 1 from the total number of rides.
- Answer the question and begin your answer with <response>.

State each step and show your work for performing that step.

1: How long does it take for a person to ride the slide?

- It takes 30 seconds for a person to ride the slide.

2: How long does it take Alice to ride the slide? Factor in her climbing time.

- It takes Alice 30 seconds to ride the slide and one minute to climb to the top of the slide, so it takes her 90 seconds (1.5 minutes) to complete one cycle.

3: Create a table showing the amount of time she would wait for other riders each time she rides. Assume this decreases by one person each time down to 0.

| Ride | People in line | Wait time (minutes) |
|------|----------------|----------------------|
| 1    | 5              | 2.5                  |
| 2    | 4              | 2                    |
| 3    | 3              | 1.5                  |
| 4    | 2              | 1                    |
| 5    | 1              | 0.5                  |
| 6    | 0              | 0                    |

4: Create a table showing the amount of time it takes Alice to climb the stairs, wait for other riders, and then ride on each run. Add a column that subtracts this time from the time remaining before the park closes. Continue the table until the time remaining is <= 0

| Ride | Climb time (minutes) | Wait time (minutes) | Ride time (minutes) | Total time (minutes) | Time remaining (minutes) |
|------|----------------------|---------------------|---------------------|----------------------|--------------------------|
| 1    | 1                    | 2.5                 | 0.5                 | 4                    | 41                      |
| 2    | 1                    | 2                   | 0.5                 | 3.5                  | 37.5                    |
| 3    | 1                    | 1.5                 | 0.5                 | 3                    | 34.5                    |
| 4    | 1                    | 1                   | 0.5                 | 2.5                  | 32                      |
| 5    | 1                    | 0.5                 | 0.5                 | 2                    | 30                      |
| 6    | 1                    | 0                   | 0.5                 | 1.5                  | 28.5                    |
| 7    | 1                    | 0                   | 0.5                 | 1.5                  | 27                      |
| 8    | 1                    | 0                   | 0.5                 | 1.5                  | 25.5                    |
| 9    | 1                    | 0                   | 0.5                 | 1.5                  | 24                      |
| 10   | 1                    | 0                   | 0.5                 | 1.5                  | 22.5                    |
| 11   | 1                    | 0                   | 0.5                 | 1.5                  | 21                      |
| 12   | 1                    | 0                   | 0.5                 | 1.5                  | 19.5                    |
| 13   | 1                    | 0                   | 0.5                 | 1.5                  | 18                      |
| 14   | 1                    | 0                   | 0.5                 | 1.5                  | 16.5                    |
| 15   | 1                    | 0                   | 0.5                 | 1.5                  | 15                      |
| 16   | 1                    | 0                   | 0.5                 | 1.5                  | 13.5                    |
| 17   | 1                    | 0                   | 0.5                 | 1.5                  | 12                      |
| 18   | 1                    | 0                   | 0.5                 | 1.5                  | 10.5                    |
| 19   | 1                    | 0                   | 0.5                 | 1.5                  | 9                      |
| 20   | 1                    | 0                   | 0.5                 | 1.5                  | 7.5                    |
| 21   | 1                    | 0                   | 0.5                 | 1.5                  | 6                      |
| 22   | 1                    | 0                   | 0.5                 | 1.5                  | 4.5                    |
| 23   | 1                    | 0                   | 0.5                 | 1.5                  | 3                      |
| 24   | 1                    | 0                   | 0.5                 | 1.5                  | 1.5                    |
| 25   | 1                    | 0                   | 0.5                 | 1.5                  | 0                      |

5: If Alice doesn't have enough time to ride on her last ride then subtract 1 from the total number of rides.

- Alice has enough time to ride on her last ride, so she doesn't need to subtract 1 from the total number of rides.

6: Answer the question and begin your answer with <response>.

<response> Alice can ride the slide 25 more times.
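
The table in step 4 can be regenerated with a few lines of code, which is a handy way to confirm the arithmetic. This is a sketch that follows the same interpretation the steps encode (climb, wait 30 seconds per person in line, ride; the line shrinks by one per run down to 0). The function names `ride_table` and `rides_possible` are hypothetical.

```python
def ride_table(park_minutes=45.0, ride=0.5, climb=1.0, line=5):
    """Rows of (run, wait, total, remaining) in minutes, until remaining <= 0."""
    rows = []
    remaining = park_minutes
    run = 0
    while remaining > 0:
        run += 1
        # One fewer person in line on every run, never below zero.
        wait = max(line - (run - 1), 0) * ride
        total = climb + wait + ride
        remaining -= total
        rows.append((run, wait, total, remaining))
    return rows


def rides_possible(rows):
    """Step 5: drop the last run if it would finish after closing time."""
    if rows and rows[-1][3] < 0:
        return len(rows) - 1
    return len(rows)


if __name__ == "__main__":
    table = ride_table()
    for row in table:
        print(row)
    print("rides:", rides_possible(table))
```

All quantities are multiples of half a minute, so the floating-point subtraction is exact here and the last row lands on exactly 0 minutes remaining.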

text-davinci-003 prompt:

A water slide ride takes 30 seconds and it takes one minute to climb to the top of the slide. Only one person can ride the slide at a time. The first time Alice rode there were 5 people in line but there's 1 less each time she rides. The park closes in 45 minutes. How many more times can Alice ride the slide?

steps:
- How long does it take for a person to ride the slide?
- How long does it take Alice to ride the slide? Factor in her climbing time.
- Create a table with "people" and "wait time" columns. Assume the number of people decresses by one person each time down to 0. The wait time is (people * ride time). 
- How long before the park closes in seconds?
- Create a table with "run", "wait time", "ride time", "climb time", "total time", and "time remaining" columns.  The total time is (wait time + ride time + climb time). 
- How much time is remaining before the park closes, how many people are in line, and how many runs did she make?
- How much is Alices ride + climb time in seconds?
- Create an array of decreasing numbers. The array should start with time remaining and decrease by Alices (ride + climb time) until it reaches 0. 
- Count the number of entries in the array.
- Add that count to the previous run count and subtract 1.
- Answer the question and begin your answer with <response>.

State each step and show your work for performing that step.

1: How long does it take for a person to ride the slide?
Answer: 30 seconds

2: How long does it take Alice to ride the slide? Factor in her climbing time.
Answer: 1 minute and 30 seconds

3: Create a table with "people" and "wait time" columns. Assume the number of people decresses by one person each time down to 0. The wait time is (people * ride time).

People | Wait Time
5     | 150 seconds
4     | 120 seconds
3     | 90 seconds
2     | 60 seconds
1     | 30 seconds
0     | 0 seconds

4: How long before the park closes in seconds?
Answer: 45 minutes = 2700 seconds

5: Create a table with "run", "wait time", "ride time", "climb time", "total time", and "time remaining" columns.  The total time is (wait time + ride time + climb time).

Run | Wait Time | Ride Time | Climb Time | Total Time | Time Remaining
1   | 150      | 30       | 60        | 240        | 2560
2   | 120      | 30       | 60        | 210        | 2350
3   | 90       | 30       | 60        | 180        | 2170
4   | 60       | 30       | 60        | 150        | 2020
5   | 30       | 30       | 60        | 120        | 1900

6: How much time is remaining before the park closes, how many people are in line, and how many runs did she make?
Answer: Time remaining before the park closes: 1900 seconds, people in line: 0, runs made: 5

7: How much is Alices ride + climb time in seconds?
Answer: 90 seconds

8: Create an array of decreasing numbers. The array should start with time remaining and decrease by Alices (ride + climb time) until it reaches 0.
Answer: [1900, 1810, 1720, 1630, 1540, 1450, 1360, 1270, 1180, 1090, 1000, 910, 820, 730, 640, 550, 460, 370, 280, 190, 100, 0]

9: Count the number of entries in the array.
Answer: 21

10: Add that count to the previous run count and subtract 1.
Answer: 25

Answer: Alice can ride the slide 25 more times before the park closes.
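
The text-davinci-003 steps read almost like a program, so here they are replayed literally in code. This is a sketch with hypothetical names, assuming the same reading of the problem as the prompt. One thing it surfaces: 2700 − 240 is 2460, so the "Time Remaining" column in the transcript is 100 seconds high throughout (1800 seconds are left after the fifth run, not 1900). The array and count steps still land on the same total of 25.

```python
def rides_via_steps(park_seconds=2700, ride=30, climb=60, people_in_line=5):
    """Replay the prompt's steps; returns (runs, remaining, array length, answer)."""
    remaining = park_seconds
    runs = 0
    # Steps 3-6: one run per queue length, from 5 people down to 1.
    for people in range(people_in_line, 0, -1):
        wait = people * ride
        remaining -= wait + ride + climb
        runs += 1
    # Steps 7-8: count down from the time remaining in (ride + climb) chunks.
    cycle = ride + climb
    countdown = list(range(remaining, -1, -cycle))
    # Steps 9-10: add the count to the previous run count and subtract 1.
    return runs, remaining, len(countdown), runs + len(countdown) - 1


if __name__ == "__main__":
    print(rides_via_steps())
```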

I am surprised about this! I gave ChatGPT a small text and asked it to give me the word count and it gave 3 different, bad answers. WOW!

Then I gave it the following instruction, along the lines of your instructions: “Create a list with the words in the text and give me the count of words.” Only then did I get the correct word count.

So the Better CoT pattern is essentially an English-based programming language for telling the model how to think about a question or problem. Honestly I feel like you could give it almost any sequence of steps and it will run them over the input like a program.

To reinforce that here’s “hello world” the GPT way :slight_smile:

query:
It's such a cruel world

steps:
- think about how you would say hello world in the context of the query
- answer the question and begin your answer with <response>.

State each step and show your work for performing that step.

1: think about how you would say hello world in the context of the query
- one possible way to say hello world in the context of the query is to use irony or sarcasm, such as "Hello, world, aren't you lovely today?" or "Hello, world, thanks for nothing."
- another possible way to say hello world in the context of the query is to use contrast or juxtaposition, such as "Hello, world, where beauty and horror coexist" or "Hello, world, you are full of surprises."
- a third possible way to say hello world in the context of the query is to use emotion or sentiment, such as "Hello, world, I wish you were kinder" or "Hello, world, you break my heart."

2: answer the question and begin your answer with <response>.
- using the first way, an example answer could be: <response> Hello, world, aren't you lovely today? I say that with the utmost sarcasm, of course, because you are anything but lovely. You are a cruel world, full of injustice, violence, and suffering. You make me angry, sad, and hopeless. You offer no consolation, no mercy, no peace. You are a cruel world, indeed.
- using the second way, an example answer could be: <response> Hello, world, where beauty and horror coexist. I say that with a sense of wonder and awe, because you are a paradoxical world. You are a cruel world, but also a magnificent one. You have breathtaking landscapes, diverse cultures, and amazing creatures. You also have wars, famines, and pandemics. You inspire me, challenge me, and scare me. You are a cruel world, but also a marvelous one.
- using the third way, an example answer could be: <response> Hello, world, I wish you were kinder. I say that with a sigh of despair, because you are a harsh world. You are a cruel world, full of pain, loss, and grief. You take away the people I love, the dreams I have, and the joy I feel. You make me cry, mourn, and regret. You are a cruel world, and I wish you were different.
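
Since every prompt in this thread asks the model to prefix its answer with `<response>`, the final answer can be pulled out of the transcript programmatically. This is a minimal sketch and an assumption on my part (the helper `extract_responses` is hypothetical): it assumes each answer fits on the line where the tag appears, and it skips the places where the model merely restates the step that mentions the tag. In the output above it would return three candidates, so the caller has to decide which one to keep.

```python
import re
from typing import List

# Match the tag in any letter case (one earlier transcript uses "<Response>"),
# then capture the rest of that line.
RESPONSE_TAG = re.compile(r"<response>[ \t]*(.*)", re.IGNORECASE)


def extract_responses(completion: str) -> List[str]:
    """Return the text following each <response> tag, one entry per tag."""
    answers = []
    for match in RESPONSE_TAG.finditer(completion):
        text = match.group(1).strip()
        # Skip echoes of the step itself, e.g. "...begin your answer with <response>."
        if any(ch.isalnum() for ch in text):
            answers.append(text)
    return answers


if __name__ == "__main__":
    transcript = (
        "3: answer the question and begin your answer with <response>.\n"
        "- <response> There are 80 words in the text above."
    )
    print(extract_responses(transcript))
```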

Whoa. I knew how it was calculating but honestly, I never considered it that way. That’s amazing. Thank you for elaborating on that in that way.

Here’s my Chain of Thought Prompt.

ruv, have you experimented using token compression along with COT Prompting?