DeepMind SIMA and training in multiple environments

The SIMA paper mentioned that training on multiple games yielded better results than specializing on JUST the evaluated game. Even if the game wasn’t in the model’s training set and was approached in a “zero-shot” manner, the “renaissance model” would sometimes even outperform the narrowly trained model (Goat Simulator, Satisfactory).

To what extent is that just because it got better at vision, versus something higher level? Surely a tree in Hydroneer, Valheim, and Satisfactory all look different, so SIMA would be much better at “go to that tree” because it has a more robust idea of what a tree can look like.

So is the benefit limited specifically to vision? Or can the benefits exist at the conceptual level: would a GUI-based puzzle game about alchemy benefit a model that plays a text adventure about baking? Is it possible to work backwards, starting from a target domain and letting an LLM design a variety of games for the model to train on? The Game Developers Conference 2024 starts tomorrow; it’ll be interesting to see how many developers have pivoted into providing bespoke simulation environments for humanoid robots.


Are you talking about this one?

It’s possible that I don’t understand it, but as far as I can tell, the only thing this paper seems to be saying is: “our handcrafted agent is better than a gimped agent”.

I don’t see this in the data tbh. :thinking:

If the horizontal line at 100% relative performance represents the specialized agent, then in a few cases (e.g. Satisfactory) the zero-shot agent, which was trained on most of the other games but not Satisfactory, did a slightly better job. By “gimped”, are you saying you don’t think the environment-specialized agent is a good benchmark?
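
In case it helps pin down what I mean by “relative performance”, here’s a minimal sketch of how I read that normalization. This is my interpretation of the plot, not the paper’s actual evaluation code, and the numbers are made up purely for illustration:

```python
def relative_performance(agent_success_rate: float,
                         specialized_success_rate: float) -> float:
    """Success rate normalized against the environment-specialized agent, in %.

    Under this reading, the specialized agent always sits at the 100% line,
    and a zero-shot agent can land above or below it.
    """
    return 100.0 * agent_success_rate / specialized_success_rate


# Hypothetical numbers (not from the paper): the zero-shot agent succeeds on
# 33% of tasks, the environment-specialized agent on 30%.
zero_shot = relative_performance(agent_success_rate=0.33,
                                 specialized_success_rate=0.30)
print(f"{zero_shot:.0f}%")  # 110% -> the zero-shot agent edges out the specialist
```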