I'd like to add AI text parsing to my game and I have a few questions. [what tools / model, bundling, etc]

Good day, everyone.

I’m working on a small cognitive development game. The game shows the player a photo and, in the background, creates an array of facts and properties about the image. For example: [“contains dogs”, “woman in a dress”, “umbrella on the window sill”, “pot on a stove”, “bowl of apples on the coffee table”, “bird on a tree branch”, “dog sleeping on a mat”]

The user needs to write a short report or essay about the image, and the AI would check whether the text mentions the factors from the array. So it’s all text-based operations.

ChatGPT (GPT-4), through the web browser, was able to do this flawlessly. I was thoroughly impressed at how accurately it could do this. I even asked it to give a simple response as a percentage of how many factors were mentioned, and it did.

The images and the factors about the images are pre-baked and already exist in the game. They’re not generated. The AI would only need to report how many of the factors from the array the user has mentioned in their essay.
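To make the task concrete, here is a minimal Python sketch of the scoring step using plain word matching instead of an AI model. It is only a baseline under stated assumptions: the stopword list, the crude plural stripping, and the function names are all made up for illustration, and it will miss paraphrases (for example “puppy” for “dog”) that a language model would catch.

```python
import re

# Hypothetical stopword list; tune it for the real factor arrays.
STOPWORDS = {"a", "an", "the", "on", "in", "of", "contains"}

def content_words(text: str) -> set:
    """Lowercase, split into words, drop stopwords, strip a simple plural 's'."""
    words = re.findall(r"[a-z]+", text.lower())
    result = set()
    for w in words:
        if w in STOPWORDS:
            continue
        # Crude plural handling: "dogs" -> "dog", but leave "dress" alone.
        if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
            w = w[:-1]
        result.add(w)
    return result

def mentioned_factors(essay: str, factors: list) -> list:
    """Return the factors whose content words all appear in the essay."""
    essay_words = content_words(essay)
    return [f for f in factors if content_words(f) <= essay_words]

def coverage_percent(essay: str, factors: list) -> float:
    """Percentage of factors the essay mentions (0.0 for an empty list)."""
    if not factors:
        return 0.0
    return 100.0 * len(mentioned_factors(essay, factors)) / len(factors)
```

Calling `coverage_percent(essay, factors)` with the pre-baked array for an image gives the percentage to show the player. Swapping the body of `mentioned_factors` for a model call would keep the rest of the game code unchanged.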


  1. How can I incorporate this and bundle it with a game I can ship? Ideally I’d like something that runs locally on the user’s computer instead of web APIs, since web API calls might get expensive, especially if trolls catch wind of it and go on an API call spree. I can see it running as a secondary service in the background alongside the game, maybe as a localhost web application that the game talks to using HTTP web requests. Alternatively, the game could write a file to disk, have the service detect the change and write a response file, and then read that file back. What else is there? It only needs to happen once.

  2. What models can I try for this? I’m specifically looking for models that can tell whether an essay mentions specific subjects from the array.

  3. How would I be able to bundle this together with the game so that the user doesn’t have to install dependencies and pip packages?
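The localhost-service idea from question 1 could look roughly like the sketch below, written with only the Python standard library. Everything here is an assumption for illustration: the JSON shape (`essay`, `factors`), the names `score_essay`, `ScoreHandler`, and `make_server`, and the placeholder scorer, which just does a case-insensitive phrase match where a bundled local model would be called.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def score_essay(essay: str, factors: list) -> dict:
    # Placeholder scorer: exact, case-insensitive phrase match.
    # A real build would call the bundled local model here instead.
    text = essay.lower()
    hits = [f for f in factors if f.lower() in text]
    percent = 100.0 * len(hits) / len(factors) if factors else 0.0
    return {"mentioned": hits, "percent": percent}

class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body sent by the game and score it.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result = score_essay(payload["essay"], payload["factors"])
            status = 200
        except (ValueError, KeyError, TypeError, AttributeError):
            result = {"error": "expected JSON with 'essay' and 'factors'"}
            status = 400
        body = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging

def make_server(port: int = 0) -> HTTPServer:
    # Bind to loopback only so the service is not reachable from the network.
    # Port 0 asks the OS for any free port; read it back from server_port.
    return HTTPServer(("127.0.0.1", port), ScoreHandler)
```

The game would start this process, call `make_server(port).serve_forever()` in it, and POST the essay to `http://127.0.0.1:<port>/`. For question 3, a freezing tool such as PyInstaller can package a script like this into a single executable so players don’t need Python or pip, though the size of any bundled model is a separate concern.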


Unity has UnityWebRequest for making calls to, e.g., OpenAI’s API, so you could use that for the analysis. But if you’re doing the “umbrella on the windowsill” analysis at runtime, with some kind of computer vision, and it’s not pre-baked for each image, then it’s a very different problem (though if a vision endpoint is available, you could again use Unity and its web request class).


So really any engine that is capable of web requests can do this.

The only issue I have with talking to OpenAI directly is that it might end up expensive with a lot of users sending requests. I don’t suppose there’s a model dedicated to just this sort of task that can be run locally?

Also, I would imagine the request would have to be sent to my own web server, which would then process it, right? I’d have to use my OpenAI credentials, and it’s probably not a good idea to store them in the game.

Welcome to the forum.

Yes, the costs do add up quickly, especially with GPT-4-Vision… I’m not even sure the rate limit is high enough for what you describe right now, though that will change in the future.

Here’s the game I’ve been tinkering with. It’s been on hold while I think through some things - mainly cost…

You can see my progress this summer in this thread where I documented my journey…

Hope you stick around. We’ve got a great community with a lot of talented people.

You’re totally right, I was thinking in builder mode rather than release mode. I guess whatever the app is, it can send the prompt to your server, which acts as a middleman: the server does the actual API interaction and relays the response back to the user.
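A rough Python sketch of that middleman, assuming a per-client rate limit to blunt the API call spree worry raised earlier. The names `RateLimiter`, `relay`, `call_model`, and the `MODEL_API_KEY` environment variable are hypothetical; `call_model` stands in for whatever function actually talks to the provider, and the key is read on the server so it never ships with the game.

```python
import os
import time
from typing import Callable

class RateLimiter:
    """Allow at most `limit` requests per client in any `window` seconds."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self.history = {}       # client_id -> timestamps of recent requests

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        # Keep only the timestamps that are still inside the window.
        recent = [t for t in self.history.get(client_id, [])
                  if now - t < self.window]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self.history[client_id] = recent
        return allowed

def relay(client_id: str, prompt: str, limiter: RateLimiter,
          call_model: Callable[[str, str], str]) -> dict:
    """Forward a prompt to the model on behalf of a game client."""
    if not limiter.allow(client_id):
        return {"ok": False, "error": "rate limit exceeded"}
    # The key lives only in the server's environment (hypothetical name).
    api_key = os.environ.get("MODEL_API_KEY", "")
    return {"ok": True, "reply": call_model(prompt, api_key)}
```

The server’s request handler would call `relay` with some client identifier (an account ID or IP address) and return the resulting dict to the game as JSON.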