Gymnasium (Single-Agent)
Wraps POGEMA as a standard single-agent Gymnasium environment. Only works with num_agents=1.
Basic Usage
from pogema import pogema_v0, GridConfig
env = pogema_v0(GridConfig(
    integration='gymnasium',
    num_agents=1,
    size=8,
    density=0.3,
))
obs, info = env.reset()
action = env.action_space.sample()
obs, reward, terminated, truncated, info = env.step(action)
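The single reset/step call above extends to a full episode in the usual Gymnasium way. The following is a minimal sketch: the helper name run_random_episode and the max_steps cap are illustrative and not part of POGEMA, and the code assumes only the reset/step signatures shown above.

```python
def run_random_episode(env, max_steps=256):
    """Play one episode with random actions; return (total_reward, steps).

    `max_steps` is a hypothetical safety cap, not a POGEMA setting.
    """
    obs, info = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        # Single-agent API: one action in, one observation out.
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        # Stop on either termination signal from the Gymnasium step tuple.
        if terminated or truncated:
            break
    return total_reward, steps
```

With the env created above, run_random_episode(env) returns the summed reward and the number of steps taken.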
With Stable-Baselines3
from stable_baselines3 import PPO
from pogema import pogema_v0, GridConfig
env = pogema_v0(GridConfig(integration='gymnasium', num_agents=1, size=8))
model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=100_000)
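After training, the model can be rolled out on the same environment to get a rough performance estimate. This is a sketch under assumptions: the helper evaluate and its episodes/max_steps parameters are hypothetical, and the int(...) cast assumes the environment expects a plain integer action (Stable-Baselines3's predict returns the action as a NumPy value).

```python
def evaluate(model, env, episodes=10, max_steps=256):
    """Return the mean episode reward of `model` on `env`.

    `model` only needs a Stable-Baselines3-style predict(obs, deterministic=...)
    method that returns (action, state).
    """
    returns = []
    for _episode in range(episodes):
        obs, info = env.reset()
        episode_return = 0.0
        for _step in range(max_steps):
            action, _state = model.predict(obs, deterministic=True)
            # Assumption: the env takes a plain int, so convert the NumPy value.
            obs, reward, terminated, truncated, info = env.step(int(action))
            episode_return += float(reward)
            if terminated or truncated:
                break
        returns.append(episode_return)
    return sum(returns) / len(returns)
```

For example, evaluate(model, env, episodes=5) averages the return over five episodes using the model and env defined above.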
Key Differences from Multi-Agent API
- step() takes a single int action, not a list
- reset() returns a single observation, not a list
- observation_space and action_space are for one agent
- Compatible with any Gymnasium-based RL library
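The differences listed above can be checked at runtime with a small sanity helper. This is an illustrative sketch, not part of POGEMA: the name describe_api is hypothetical, and the integer check assumes the action space samples an integer-like value (as Gymnasium's Discrete space does).

```python
from numbers import Integral


def describe_api(env):
    """Report whether `env` exposes the single-agent call shapes."""
    obs, info = env.reset()
    action = env.action_space.sample()
    result = env.step(action)
    return {
        # Single-agent: the observation is not wrapped in a per-agent list.
        "obs_is_list": isinstance(obs, (list, tuple)),
        # Single-agent: the action is one integer, not a list of integers.
        "action_is_int": isinstance(action, Integral),
        # Gymnasium step tuple: obs, reward, terminated, truncated, info.
        "step_fields": len(result),
    }
```

For the single-agent environment, the expected shape under these assumptions is obs_is_list False, action_is_int True, and step_fields 5.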