Calling all dungeon masters, I need a few testers (UPDATE!)

I’m building out an AI dungeon master, and it’s almost ready. If you’d be interested in playing a campaign as a PC, reply with your DM experience. Experience in multiple versions of DnD is a huge plus; really, the more the better. Either way it should be pretty cool in practice: voice/image narration and a persistent world via a Python database. The cool feature that I still need to finish is that it will operate on Discord, allowing individual players to play their characters in between sessions. The AI will update the world database accordingly in real time. Imagine ending a group session at the inn; between sessions you’re free to explore and make real impacts on the group session. Want to steal from a party member? Go ahead. If you succeed, the player will have no idea until he goes to use said item. You get the idea. I have some experience as a player, but none as a DM. Thus, I need former DMs to play an adventure with me and provide feedback on the AI DM’s behavior.


Hey theravak, I’ve been playing D&D weekly for over 30 years now, and I was thinking about the same thing. I am happy to do some testing.

Send me a friend request on discord. theravak#1526.

I like the idea. I was trying to build a Dungeon Master GPT, but I soon realized that the GPT was absolutely useless at remembering health, buffs and debuffs. The story building was great, the world was great. Everything was perfect until the GPT had to remember something. Unfortunately that was the end of my Dungeon Master GPT journey. Maybe when I get the honor of being part of the memory rollout, things might change. I’m pretty sure your solution will be better, with the database.
Kudos to you, great idea. :slight_smile:

Nice! As someone who has been building AI RPG tools for 3+ years now, I’m interested in what you’ve got going…

Using Actions to save worldstate?

ETA: Thanks for sharing with us. We just ask that you keep updates in this thread, so it’s easier for us to keep updated. Thanks!

I’ve been a game master for Legend of the Five Rings for 10 years. I recently tried to make my own bot and had some success, but that project ultimately ran into issues due to my limited coding experience. I’d really like a chance to try another TTRPG system for you.

Update! I have begun manually transcribing the OGL into a well-formatted .txt file to train the model(s). I’ve decided to go with DistilBERT operating in a “Master” role, directing information between the player-facing LLM (if DistilBERT can’t produce output in a well-organized narrative fashion, I was thinking Claude or Gemini via API, but I’m open to suggestions) and the Python database I’m building out. I’m using SQLite3 so I can reference data as JSON values (since literally all database entries will be whole-number values or strings). I want to try to get a test group together within the next week or so to see how it handles the basic setup. Then I plan on giving the trained DistilBERT model access to an MoE (3 or 4 Jurassic-1 Jumbo “slave” models trained on my transcribed OGL as well as role-specific data: combat, lore, dialogue). This should allow the Master DistilBERT model to focus on database accuracy/maintenance and “final-call” decisions regarding player actions as they relate to the core D20 system rules. All said, I’m excited to begin training the model and getting a test adventure put together. I’d like to get a test session done (with me acting as an intermediary of sorts and just testing the model’s output during gameplay) in the next couple of weeks, and I need experienced players or DMs to help, as my DnD player experience is from 20 years ago and my DM experience is nil. Stay tuned if you’re interested. If you have knowledge of training LLMs and getting them operational in an MoE setup, I’d love to contact you to bounce ideas off of.

I’ll be looking for playtesters in the next week or two.

I’ll be running a playtest in a week or two, and also I added an update about the project.

Yeah, even with perfect prompting, it’s going to start hallucinating without correct real-time data to reference. I’m hoping I can just use Python, SQLite, and a DistilBERT model trained on the OGL, several adventure modules, and some discretionary training data (Sun Tzu’s “The Art of War” comes to mind, as well as any DnD-based novels like R.A. Salvatore’s work). I’m going to implement the MoE eventually because I want to know how to set it up. All this is really just a case study for me to learn as much about AI as I can, so my end vision is DistilBERT networked with a Python database and an MoE comprised of three explicitly trained Jurassic-1 Jumbo models (a combat expert, a lore and world-crafting expert, and a TBD third expert, dialogue or something). I theorize that with a bit more compute (another 4000-series GPU), this setup could easily scale with plug-and-play additional experts in the MoE, as well as being able to tackle my plans to provide image narration once Stable Diffusion 3 is available. The only thing I’ve yet to figure out is that, once I have Stable Diffusion 3 to hopefully produce quality detailed maps of locations for me, I’ll need a coordinate system in the database to save both world location and local location. This is really a tool to allow DMs to craft powerful and detailed narratives by just working with and guiding the Master DistilBERT model, although I intend on getting it working in a sandbox mode. Think Dwarf Fortress but text only. I’m setting up the database carefully using CREATE TABLE IF NOT EXISTS statements so once the model is trained, it should be able to just start a group off somewhere that it already knows about in its lore data, and allow for dynamic questing where the model is free to build out the world as it needs, creating new towns, factions, and dungeons.
These AI generated parts of the world would have their starting tables in the database built out and populated by DistilBERT, and would save in the world-state data for future adventures to explore as well. It sounds like a lot but once I have the database where I want it to start, training the models shouldn’t take more than a couple days. I’ll say a week of me tightening the screws or so and I should be ready to let her cook. I’ll keep this thread informed every few days, and I look forward to testing my project with anyone willing!
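
A minimal sketch of the idempotent table setup and the two-level coordinate idea described above, using Python's built-in sqlite3 module. The table names, column names, and the add_location helper are hypothetical, chosen only for illustration; the real schema may look quite different.

```python
import sqlite3


def init_world(conn: sqlite3.Connection) -> None:
    """Create the starting tables only if they are missing, so reruns are safe."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS locations (
            id      INTEGER PRIMARY KEY,
            name    TEXT NOT NULL UNIQUE,
            world_x INTEGER NOT NULL,  -- position on the world map
            world_y INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS placements (
            id          INTEGER PRIMARY KEY,
            location_id INTEGER NOT NULL REFERENCES locations(id),
            label       TEXT NOT NULL,
            local_x     INTEGER NOT NULL,  -- position on that location's own map
            local_y     INTEGER NOT NULL
        );
        """
    )


def add_location(conn: sqlite3.Connection, name: str, world_x: int, world_y: int) -> int:
    """Insert a location if its name is new, and return its id either way."""
    conn.execute(
        "INSERT OR IGNORE INTO locations (name, world_x, world_y) VALUES (?, ?, ?)",
        (name, world_x, world_y),
    )
    row = conn.execute("SELECT id FROM locations WHERE name = ?", (name,)).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
init_world(conn)
init_world(conn)  # calling it again changes nothing
inn_id = add_location(conn, "Roadside Inn", 12, -4)
conn.execute(
    "INSERT INTO placements (location_id, label, local_x, local_y) VALUES (?, ?, ?, ?)",
    (inn_id, "hearth", 3, 7),
)
```

Keeping world coordinates on the location row and local coordinates on a separate table keyed by location is one way to store both levels; a generated town would simply add one row to the first table and as many rows as needed to the second.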

Also, please explain the memory rollout. Is this database building pointless in the long run? (I’ll still do it for the experience, just wondering.)


Disclaimer first: I’m a grumpy developer who’s seen the absurdity of corporate IT projects. I look at IT projects from a pessimistic viewpoint, so don’t be discouraged by my comments. You and only you know what projects are fun for you.

Disclaimer #2: I have almost zero experience playing RPGs. Frankly speaking, I always find it amusing how players can trust that the GM is not picking sides :wink:

I don’t think you should focus on training or fine-tuning models for your use case. Instead you should focus on prompting and framing the model. When you focus on this “low-level” (or better to say “sidequest”) stuff it distracts you from the original idea. It eats your energy that you planned to spend elsewhere. So I suggest that you take one of the ready LLM APIs and work with that.

Automated creative worldbuilding I find fascinating. You must somehow ensure both creativity and consistency where the world is already determined. I’m sure the issue that LLMs can’t do the “consistency” part well is fundamental and can’t be changed. So you must embed the LLM in a system that ensures the consistency. There is an algorithm called “wave function collapse”, but it doesn’t use LLMs. It contains a randomization step, and I think that step might be replaced by an LLM-based choice.

But wave function collapse is not good enough for immersive worlds, I think. I don’t know exactly how to say it: it operates on a single level of detail. It operates on tiles (or cubes, or whatever dimensionality you want) of constant size. Anything smaller than a tile is not randomized (not creative), and anything larger than a few tiles lacks “sense”; I mean WFC can’t “see” patterns larger than a few tiles. So maybe some “recursive WFC”? Nested WFCs, one for every level of detail? Just speculating.
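
To make the "randomization step" concrete, here is a minimal single-level sketch of a simplified tile-based wave function collapse in Python. The tile names and adjacency rules are hypothetical, and the choose parameter is only a placeholder showing where an LLM-backed choice could plug in; no LLM is actually called, and this is not a full WFC implementation (no weights, no backtracking).

```python
import random

# Hypothetical tile set: which tiles may sit next to which (symmetric rules).
ALLOWED = {
    "sea": {"sea", "coast"},
    "coast": {"sea", "coast", "land"},
    "land": {"coast", "land"},
}


def random_choice(options, x, y, grid):
    """Default randomization step; an LLM-backed function could replace it."""
    return random.choice(sorted(options))


def neighbors(x, y, w, h):
    """Yield the in-bounds orthogonal neighbors of cell (x, y)."""
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < w and 0 <= ny < h:
            yield nx, ny


def propagate(grid, start, w, h):
    """Drop neighbor options that no remaining option of the changed cell allows."""
    stack = [start]
    while stack:
        x, y = stack.pop()
        for nx, ny in neighbors(x, y, w, h):
            compatible = {
                t for t in grid[ny][nx]
                if any(t in ALLOWED[s] for s in grid[y][x])
            }
            if compatible != grid[ny][nx]:
                if not compatible:
                    raise ValueError("contradiction: no tile fits here")
                grid[ny][nx] = compatible
                stack.append((nx, ny))


def collapse(w, h, choose=random_choice):
    """Fill a w-by-h grid, always collapsing the cell with the fewest options."""
    grid = [[set(ALLOWED) for _ in range(w)] for _ in range(h)]
    while True:
        open_cells = [
            (len(grid[y][x]), x, y)
            for y in range(h) for x in range(w)
            if len(grid[y][x]) > 1
        ]
        if not open_cells:
            break
        _, x, y = min(open_cells)
        grid[y][x] = {choose(grid[y][x], x, y, grid)}
        propagate(grid, (x, y), w, h)
    return [[next(iter(cell)) for cell in row] for row in grid]


world = collapse(6, 4)
```

The consistency part lives entirely in ALLOWED and propagate; whatever choose returns can only ever be one of the options the rules still permit, which is the kind of embedding the post argues for.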

Instead of storing JSON in the database, store individual values in individual columns. It is considered good practice, and it is the first level of database normalization. JSON or key-value columns are good only if the schema of your data is unknown at the time you write the system.
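
A small sketch of what one value per column buys you, using Python's sqlite3 module. The npcs table, its columns, and the sample rows are hypothetical and only cover a handful of character-sheet fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One typed column per value, instead of a single JSON blob per character.
conn.execute(
    """
    CREATE TABLE npcs (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        char_class TEXT NOT NULL,
        level      INTEGER NOT NULL,
        hit_points INTEGER NOT NULL,
        strength   INTEGER NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO npcs (name, char_class, level, hit_points, strength) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Brakka", "fighter", 3, 12, 16),
        ("Ilse", "wizard", 2, 9, 8),
        ("Tom", "rogue", 1, 20, 10),
    ],
)

# With real columns the database itself can do the arithmetic and the filtering.
conn.execute(
    "UPDATE npcs SET hit_points = hit_points - ? WHERE name = ?", (5, "Brakka")
)
wounded = conn.execute(
    "SELECT name, hit_points FROM npcs WHERE hit_points < 10 ORDER BY hit_points"
).fetchall()
```

With a JSON blob, the same damage update would mean reading the whole document, parsing it, changing one field, and writing it all back.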

Let me know if my whining is helpful at all :wink: cheers!

No, you’re right about getting distracted, but fortunately for me I can work on this project almost full time, as I had planned to go back to school this year but instead I’m just going to teach myself certain skills with the help of Gemini. I guess the big grey area is training the models and getting them to produce outputs that fit within the DnD d20 system framework. Gemini said that given my small dataset (manually transcribed .txt scraped from the Wizards of the Coast Open Game License for 5th Edition, and a handful of adventure modules), far below a single gigabyte, the MoE is probably overkill. I’ll have to see how it does once I load the trained model into memory. When I first started contemplating it, I kept thinking of the MUDs from the 80s/90s early internet. I plan on doing research to see if I can emulate how they maintained world states.

Very insightful. I almost had a heart attack because halfway through reading your response I had a power surge and my computer wouldn’t boot back up, but it’s working now. I had initially set it up using INTEGER and VARCHAR values, but as I built functions to fill in my starting database, I realized that all my values would either be text strings or whole numbers. So I decided to put that on hold, since I have to get the rule books transcribed before anything else really. I was going to use PyCharm with the Tabnine coding assistant. I don’t think I fully understand JSON properly. Is it a data type, or a SQL command that interprets data as a string value when talking to the database? Either way, I feel like there should be a single value type that would cover -100 to 100 AND the rest of the data consisting of text strings. Let me know if I’m on the right track or way off, because the one thing I DON’T wanna do is build this out with the help of AI and then have to do it all over again down the road because I forgot to clear something with Gemini lol.

If you have some code on github or somewhere maybe you can share it? I can look at it and maybe suggest some solutions?

I guess what I’m asking is this: there were like 32 values to define for each character (34 really, but I had a metadata column and one for a unique numerical identifier for every NPC). I basically just used a blank 5E character sheet and created a function to define all the information that a standard player would have for his character, and I know for a fact that that’ll never change (unless 6th Edition all of a sudden incorporates floating point math). And from what I understand, the LLM I’m using can do simple math by calling string values with JSON? It makes sense in my head, but I’m getting into uncharted territory when it comes to Python and SQL. Any clarification you can offer will be great; either way, I have a good starting point for a new Gemini chat.

Thanks, that’s awesome. I’ll reach out next week with some snippets once I start the new database. I’m going to finish transcribing the PDFs and try to train the model just to see how it handles me throwing complex situational rule checks its way.

To me it looks like you know exactly what parameters an NPC has, so it is best to store every parameter in a separate column. If a month or a year later it turns out that you need an additional column, you can always modify the database (like ALTER TABLE ... ADD column_name).
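
For illustration, a minimal sketch of adding a column to an existing table in SQLite from Python. The npcs table and the exhaustion column are hypothetical; the point is only that existing rows survive and pick up the default value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE npcs (id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "hit_points INTEGER NOT NULL)"
)
conn.execute("INSERT INTO npcs (name, hit_points) VALUES ('Brakka', 18)")

# Later on, a new parameter turns out to be needed. In SQLite, a NOT NULL
# column added this way must come with a non-NULL default.
conn.execute("ALTER TABLE npcs ADD COLUMN exhaustion INTEGER NOT NULL DEFAULT 0")

# PRAGMA table_info lists one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(npcs)")]
brakka = conn.execute("SELECT name, hit_points, exhaustion FROM npcs").fetchone()
```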

I don’t understand what you mean by “LLM can do simple math by calling string values with JSON”. An LLM just outputs text, and you can interpret that text as JSON containing instructions for what math operations to perform. But often this is an arbitrary choice: you can choose YAML or XML instead and it will be just as good.
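
A minimal sketch of that idea in Python: the model's reply is treated as plain text, parsed as JSON, validated, and only then turned into a database update, so the arithmetic happens in SQL rather than in the model. The instruction format (target, stat, delta), the npcs table, and the apply_instruction helper are all hypothetical, and the reply string is hard-coded here in place of a real model call.

```python
import json
import sqlite3

# Column names cannot be bound as SQL parameters, so they are whitelisted.
ALLOWED_STATS = {"hit_points", "gold"}


def apply_instruction(conn: sqlite3.Connection, llm_text: str) -> bool:
    """Parse the model's text as JSON and apply it; return False if it is unusable."""
    try:
        instruction = json.loads(llm_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(instruction, dict):
        return False
    target = instruction.get("target")
    stat = instruction.get("stat")
    delta = instruction.get("delta")
    if not isinstance(target, str):
        return False
    if not isinstance(stat, str) or stat not in ALLOWED_STATS:
        return False
    if not isinstance(delta, int) or isinstance(delta, bool):
        return False
    cursor = conn.execute(
        f"UPDATE npcs SET {stat} = {stat} + ? WHERE name = ?", (delta, target)
    )
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE npcs (name TEXT PRIMARY KEY, hit_points INTEGER NOT NULL, "
    "gold INTEGER NOT NULL)"
)
conn.execute("INSERT INTO npcs VALUES ('Brakka', 18, 40)")

# Stand-in for text returned by the model.
reply = '{"target": "Brakka", "stat": "hit_points", "delta": -6}'
applied = apply_instruction(conn, reply)
hit_points = conn.execute(
    "SELECT hit_points FROM npcs WHERE name = 'Brakka'"
).fetchone()[0]
```

The same shape would work with YAML or XML; only the parsing line changes. Rejecting anything that does not validate keeps a malformed or hallucinated reply from touching the world state.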

So basically my trained LLM will output text that is then interpreted by the database via JSON? I think I was confusing JSON with a type of data like INT or VARCHAR. This does help though. Am I correct in assuming I should consider JSON the intermediary language between my Python scripts and my LLM? The LLM will just be reading values 99% of the time; the “builder” or “sandbox” mode I’m envisioning will come way later, and my goal now is to get the model trained and producing quality outputs. Can values like VARCHAR contain numbers? Is there a maximum length? At the end of the day, what matters is that the LLM understands that if it references a value that has numerical data, there’s most likely some sort of math that needs to be done. Then on the flip side, any text data it has to reference is for decisions regarding rules, or it needs the accurate text data to incorporate into its output for some reason. Thanks so much for your help, I’ll be in touch. I’m starting to confuse myself now lol.
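
On the VARCHAR questions, a small sketch of how SQLite specifically behaves (other database engines differ): a VARCHAR column gets text affinity and the declared length is not enforced, while an INTEGER column converts numeric-looking text into an integer. The demo table and its column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (as_text VARCHAR(5), as_int INTEGER)")

# The same string goes into both columns; each column's affinity decides
# how the value ends up being stored.
conn.execute("INSERT INTO demo VALUES (?, ?)", ("-100", "-100"))

# SQLite does not enforce the (5) in VARCHAR(5), so a longer string fits.
conn.execute(
    "INSERT INTO demo VALUES (?, ?)", ("a string longer than five characters", 100)
)

rows = conn.execute(
    "SELECT as_text, typeof(as_text), as_int, typeof(as_int) "
    "FROM demo ORDER BY rowid"
).fetchall()
```

So a text column can hold digits, but they are stored as text; keeping numbers in INTEGER columns lets comparisons and arithmetic behave numerically.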

I’ve played D&D for years. I’ve DM’d a few times. Would be happy to playtest if you’re still looking.