PAPER: Are ChatGPT and GPT-4 Good Poker Players?

Since the introduction of ChatGPT and GPT-4, these models have been tested across a large number of tasks. Their adeptness across domains is evident, but their aptitude in playing games, and specifically their aptitude in the realm of poker has remained unexplored. Poker is a game that requires decision making under uncertainty and incomplete information. In this paper, we put ChatGPT and GPT-4 through the poker test and evaluate their poker skills. Our findings reveal that while both models display an advanced understanding of poker, encompassing concepts like the valuation of starting hands, playing positions and other intricacies of game theory optimal (GTO) poker, both ChatGPT and GPT-4 are NOT game theory optimal poker players.

Profitable strategies in poker are evaluated in expectations over large samples. Through a series of experiments, we first discover the characteristics of optimal prompts and model parameters for playing poker with these models. Our observations then unveil the distinct playing personas of the two models. We first conclude that GPT-4 is a more advanced poker player than ChatGPT. This exploration then sheds light on the divergent poker tactics of the two models: ChatGPT’s conservativeness juxtaposed against GPT-4’s aggression. In poker vernacular, when tasked to play GTO poker, ChatGPT plays like a nit, which means that it has a propensity to only engage with premium hands and folds a majority of hands. When subjected to the same directive, GPT-4 plays like a maniac, showcasing a loose and aggressive style of play. Both strategies, although relatively advanced, are not game theory optimal.


I think there’s still an opportunity here for a tool-assisted GPT-4 poker-playing bot.


I created a GPT for playing poker to see how good it is: Poker GPT. I don’t play poker, so I would like other people’s opinions on how ChatGPT performs.

  1. No, it is not good at poker, at least by default.
  2. Playing GTO doesn’t mean you will be profitable; more likely you will be losing.

Based on my experience with GPT-4 playing Tic-Tac-Toe, and failing at that, I am not really surprised that there are some obstacles to solve when teaching a language model to play poker.

I’d say one needs a function to get the probabilities and a prompt that provides a playstyle for the model to choose the next action.
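To make that concrete, here is a minimal sketch of what such a helper could look like. Everything in it is my own assumption rather than anything from the paper: the function names (`hit_probability`, `break_even_equity`, `build_prompt`), the prompt wording, and the simplification that the caller already knows how many outs the hand has. A real bot would need a proper hand evaluator or equity calculator instead.

```python
from math import comb
from typing import List


def hit_probability(outs: int, unseen: int, draws: int) -> float:
    """Chance that at least one of `outs` helpful cards shows up when
    `draws` more cards are dealt from `unseen` unknown cards."""
    miss = comb(unseen - outs, draws) / comb(unseen, draws)
    return 1.0 - miss


def break_even_equity(pot: float, to_call: float) -> float:
    """Win probability needed for a call to break even on average.
    `pot` is the pot size before our call, including the opponent's bet."""
    return to_call / (pot + to_call)


def build_prompt(playstyle: str, hand: List[str], board: List[str],
                 pot: float, to_call: float, outs: int) -> str:
    """Hypothetical prompt builder: hands the model the numbers
    instead of asking it to work them out itself."""
    unseen = 52 - len(hand) - len(board)  # cards we cannot see
    draws = 5 - len(board)                # community cards still to come
    p_hit = hit_probability(outs, unseen, draws)
    need = break_even_equity(pot, to_call)
    return (
        f"You are a {playstyle} poker player.\n"
        f"Hole cards: {' '.join(hand)}\n"
        f"Board: {' '.join(board)}\n"
        f"Pot: {pot}, amount to call: {to_call}\n"
        f"Chance to complete the draw: {p_hit:.1%}\n"
        f"Break-even equity for a call: {need:.1%}\n"
        "Choose exactly one action: fold, call, or raise."
    )


print(build_prompt("tight-aggressive", ["Ah", "Kh"], ["2h", "7h", "Qc"],
                   pot=100, to_call=50, outs=9))
```

The example is a flush draw on the flop: nine outs with two cards to come is roughly a 35% chance to hit, against about 33% needed to call. The point is that the model only has to pick an action that fits the playstyle, not do the arithmetic.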

Going back to the Tic-Tac-Toe case, if the model gets a complete analysis of the board and the possible actions, the error rate drops dramatically.
This fits the view that an LLM by itself is not the final solution but merely one part of a larger system that executes somewhat fuzzy logic.
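As a rough illustration of what a "complete analysis of the board" could mean, here is a small sketch. The board encoding (a 9-character string with `.` for empty cells), the function names, and the text layout are all assumptions of mine, not the exact setup I tested with.

```python
from typing import List

# All eight index triples that make three in a row on a 3x3 board.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def completing_moves(board: str, player: str) -> List[int]:
    """Empty cells that would give `player` three in a row right now."""
    moves = set()
    for line in WIN_LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == 2 and marks.count(".") == 1:
            moves.add(line[marks.index(".")])
    return sorted(moves)


def analyze(board: str, player: str) -> str:
    """Build a plain-text analysis that can be pasted into the prompt."""
    opponent = "O" if player == "X" else "X"
    legal = [i for i, cell in enumerate(board) if cell == "."]
    rows = [" ".join(board[r:r + 3]) for r in (0, 3, 6)]
    return "\n".join(rows + [
        f"Player to move: {player}",
        f"Legal moves: {legal}",
        f"Winning moves for {player}: {completing_moves(board, player)}",
        f"Moves that block {opponent}: {completing_moves(board, opponent)}",
    ])


# X holds cells 0 and 1, O holds cells 3 and 4, X is to move.
print(analyze("XX.OO....", "X"))
```

With this in the prompt, the model no longer has to reconstruct the board state from the conversation; it only has to choose among moves that are already labelled as legal, winning, or blocking.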