Dynamic Valid Action List Environment in OpenAI Gym

Hello everyone, I’m doing a robotics reinforcement learning project.
Currently, my agent always has 12 action options, but not all 12 are valid in every state. For example, one observation might allow actions 1, 2, and 7, another might allow actions 3, 5, 6, 8, 9, and 11, and sometimes no action is valid at all. The list of valid actions is produced by a checker function I write based on the observation image. I want to write a custom environment and train it directly with a baseline RL library such as Stable-Baselines3 or RLlib, so my question is how to express this changing action set in my custom environment class so that the libraries can still train on it.
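
Here is a minimal sketch of the kind of environment I have in mind. It assumes the classic gym API (newer Stable-Baselines3 releases use gymnasium instead, where `reset()` returns `(obs, info)` and `step()` returns five values) and sb3-contrib's MaskablePPO, which looks for an `action_masks()` method on the environment. The `_check_valid_actions` helper and the observation shape are hypothetical placeholders for my image-based checker:

```python
import gym
import numpy as np
from gym import spaces


class MaskedRobotEnv(gym.Env):
    """12 discrete actions; which ones are valid depends on the observation."""

    def __init__(self):
        super().__init__()
        self.action_space = spaces.Discrete(12)
        # Placeholder observation space: a 64x64 grayscale image.
        self.observation_space = spaces.Box(
            low=0, high=255, shape=(64, 64, 1), dtype=np.uint8
        )
        self._obs = None

    def _check_valid_actions(self, obs):
        # Hypothetical checker: return a length-12 boolean array where True
        # marks a valid action. Replace with the real observation-image logic.
        mask = np.zeros(12, dtype=bool)
        mask[[1, 2, 7]] = True  # e.g. only actions 1, 2, 7 are valid here
        return mask

    def action_masks(self):
        # MaskablePPO queries this each step to mask out invalid actions.
        return self._check_valid_actions(self._obs)

    def reset(self):
        self._obs = self.observation_space.sample()  # placeholder start state
        return self._obs

    def step(self, action):
        self._obs = self.observation_space.sample()  # placeholder dynamics
        reward, done, info = 0.0, False, {}
        return self._obs, reward, done, info
```

Training would then look like any other SB3 algorithm:

```python
from sb3_contrib import MaskablePPO

env = MaskedRobotEnv()
model = MaskablePPO("CnnPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
```

One caveat: a mask with no valid actions breaks masked sampling, so the usual workaround is to end the episode (or include a no-op action) whenever the checker returns an empty list. For RLlib the common pattern is different: expose a Dict observation space with an "action_mask" entry and apply the mask inside a custom model.
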
Thanks in advance!


Hi, thank you for your response. I'll definitely check this out. By the way, what is the AI and KM conversation about? Is it automatically generated?
