Good day, everyone.
I’m working on a small cognitive development game. The game presents the player with a photo and, in the background, creates an array of factors and properties describing the image. For example: ["contains dogs", "woman in a dress", "umbrella on the window sill", "pot on a stove", "bowl of apples on the coffee table", "bird on a tree branch", "dog sleeping on a mat"]
The user needs to write a short report or essay about the image, and the AI would analyze whether the text mentions the factors from the array. So it’s all text-based.
ChatGPT (GPT-4), used through the web browser, was able to do this flawlessly. I was thoroughly impressed by how accurately it could do this. I even asked it to give a simple response stating what percentage of the factors were mentioned, and it did.
The images and the factors about the images are pre-baked and already exist in the game. They’re not generated. The AI would only need to report how many of the factors in the array the user has mentioned in their essay.
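To make the task concrete, here is a minimal Python sketch of the input and output I have in mind. The names `score_essay` and `naive_mentions` are hypothetical, and the word-overlap check is only a placeholder for whatever model ends up doing the real judging. It would miss paraphrases such as "a dog asleep on a rug" for "dog sleeping on a mat", which is exactly why I’m looking for a model.

```python
import re

# Words ignored when comparing a factor to the essay (an arbitrary, illustrative list).
STOP_WORDS = {"a", "an", "the", "on", "in", "of", "contains"}


def naive_mentions(essay: str, factor: str) -> bool:
    """Placeholder judge: True if every content word of the factor occurs in the essay."""
    essay_words = set(re.findall(r"[a-z]+", essay.lower()))
    for word in re.findall(r"[a-z]+", factor.lower()):
        if word in STOP_WORDS:
            continue
        # Crude singular/plural tolerance: "dogs" matches "dog" and vice versa.
        variants = {word, word + "s", word[:-1] if word.endswith("s") else word}
        if not variants & essay_words:
            return False
    return True


def score_essay(essay, factors, judge=naive_mentions):
    """Report which factors the essay mentions and the percentage covered.

    `judge` is any callable (essay, factor) -> bool, so a model-backed
    check could be swapped in later without changing the caller.
    """
    matched = [f for f in factors if judge(essay, f)]
    missed = [f for f in factors if f not in matched]
    percent = round(100 * len(matched) / len(factors), 1) if factors else 0.0
    return {"matched": matched, "missed": missed, "percent": percent}
```

For example, with the essay "A woman in a red dress stands near a dog. There is a pot on the stove." and the factors ["contains dogs", "woman in a dress", "pot on a stove", "bird on a tree branch"], this returns a `percent` of 75.0 and lists "bird on a tree branch" as missed.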
How can I incorporate this / bundle it with a game I can ship? Ideally there’s something that can run locally on a user’s computer instead of relying on web APIs, since web API calls might get expensive, especially if trolls catch wind of it and go on an API call spree. I can see it running as a secondary service in the background alongside the game, maybe as a localhost web application that the game talks to over HTTP requests. Alternatively, the game could write a file to disk, the service could detect the change and write a response file, and the game could then open that file to read the result. What else? It only needs to happen once.
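A rough sketch of the localhost idea, using only the Python standard library so nothing extra has to be installed. Everything here is an assumption for illustration: the JSON shape (`essay`, `factors`), the names `analyze` and `make_server`, and the substring check, which just stands in for the model call.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def analyze(essay, factors):
    """Stand-in for the model call: case-insensitive substring check."""
    matched = [f for f in factors if f.lower() in essay.lower()]
    percent = round(100 * len(matched) / len(factors), 1) if factors else 0.0
    return {"matched": matched, "percent": percent}


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body the game sends: {"essay": "...", "factors": [...]}
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result, status = analyze(payload["essay"], payload["factors"]), 200
        except (ValueError, KeyError, TypeError):
            result, status = {"error": "expected JSON with 'essay' and 'factors'"}, 400
        body = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the background service quiet; the game does not need request logs.
        pass


def make_server(port=0):
    """Bind to the loopback interface only, so other machines cannot reach it.

    Port 0 lets the OS pick a free port; read it back from server.server_address.
    """
    return HTTPServer(("127.0.0.1", port), Handler)
```

The game would start this alongside itself, for example with `make_server(8765).serve_forever()` (the port number is arbitrary), and POST the essay and factor list to it. Binding to 127.0.0.1 keeps the service off the network, which sidesteps the API-spree worry.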
What models can I try for this? I’m specifically looking for models that can tell whether an essay mentions specific subjects from the array.
How would I bundle this with the game so that the user doesn’t have to install dependencies and pip packages?