Core Specifications
-------------------

Environment Class Overview
^^^^^^^^^^^^^^^^^^^^^^^^^^

The Environment class contains all the functions needed for the interactions between the agents and the environment.
The base Environment class is derived from rllib.env.MultiAgentEnv (https://github.com/ray-project/ray).

The functions are specific to each realization of Flatland (e.g. Railway, Vaccination, ...).
In particular, we retain the rllib interface in the use of the ``step()`` function, which accepts a dictionary of actions indexed by the agents' handles (returned by ``get_agent_handles()``) and returns dictionaries of observations, rewards, dones and infos.

.. code-block:: python

    class Environment:
        """Base interface for multi-agent environments in Flatland.

        Agents are identified by agent ids (handles).

        Examples:
            >>> obs, info = env.reset()
            >>> print(obs)
            {
                "train_0": [2.4, 1.6],
                "train_1": [3.4, -3.2],
            }
            >>> obs, rewards, dones, infos = env.step(
            ...     action_dict={"train_0": 1, "train_1": 0})
            >>> print(rewards)
            {
                "train_0": 3,
                "train_1": -1,
            }
            >>> print(dones)
            {
                "train_0": False,    # train_0 is still running
                "train_1": True,     # train_1 is done
                "__all__": False,    # the env is not done
            }
            >>> print(infos)
            {
                "train_0": {},  # info for train_0
                "train_1": {},  # info for train_1
            }
        """

        def __init__(self):
            pass

        def reset(self):
            """
            Resets the env and returns observations from agents in the
            environment.

            Returns
            -------
            obs : dict
                New observations for each agent.
            info : dict
                Optional info values for each agent id.
            """
            raise NotImplementedError()

        def step(self, action_dict):
            """
            Performs an environment step with simultaneous execution of
            actions for agents in action_dict.

            Returns observations from agents in the environment.
            The returns are dicts mapping from agent_id strings to values.

            Parameters
            ----------
            action_dict : dict
                Dictionary of actions to execute, indexed by agent id.

            Returns
            -------
            obs : dict
                New observations for each ready agent.
            rewards : dict
                Reward values for each ready agent.
            dones : dict
                Done values for each ready agent. The special key
                "__all__" (required) is used to indicate env termination.
            infos : dict
                Optional info values for each agent id.
            """
            raise NotImplementedError()

        def render(self):
            """
            Perform rendering of the environment.
            """
            raise NotImplementedError()

        def get_agent_handles(self):
            """
            Returns a list of agents' handles to be used as keys in the
            step() function.
            """
            raise NotImplementedError()
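
To make the interface concrete, the following is a minimal sketch of a toy environment that implements the same four methods. It is not part of Flatland: the class name ``CountdownEnv``, its ``start`` and ``n_agents`` parameters, and the reward values are hypothetical and chosen only for illustration. It assumes that ``reset()`` returns an ``(obs, info)`` pair, as in the class docstring example, and it is written as a standalone class so that no Flatland import path has to be assumed.

```python
class CountdownEnv:
    """Toy environment following the Environment interface (hypothetical).

    Each agent owns a counter. Action 1 decrements the counter, any other
    action leaves it unchanged. An agent is done when its counter is zero.
    """

    def __init__(self, start=2, n_agents=2):
        self.start = start
        self.handles = ["train_%d" % i for i in range(n_agents)]
        self.counters = {}

    def get_agent_handles(self):
        # Handles are the keys expected in the action dict passed to step().
        return list(self.handles)

    def reset(self):
        self.counters = {h: self.start for h in self.handles}
        obs = {h: [c] for h, c in self.counters.items()}
        info = {h: {} for h in self.handles}
        return obs, info

    def step(self, action_dict):
        obs, rewards, dones, infos = {}, {}, {}, {}
        for handle, action in action_dict.items():
            if action == 1 and self.counters[handle] > 0:
                self.counters[handle] -= 1
            finished = self.counters[handle] == 0
            obs[handle] = [self.counters[handle]]
            rewards[handle] = 1 if finished else -1  # illustrative values
            dones[handle] = finished
            infos[handle] = {}
        # The "__all__" key is required and signals env termination.
        dones["__all__"] = all(c == 0 for c in self.counters.values())
        return obs, rewards, dones, infos

    def render(self):
        print(self.counters)


env = CountdownEnv(start=2)
obs, info = env.reset()
handles = env.get_agent_handles()
dones = {"__all__": False}
steps = 0
while not dones["__all__"]:
    # Every agent picks action 1 (decrement) on every step.
    obs, rewards, dones, infos = env.step({h: 1 for h in handles})
    steps += 1
```

The loop at the end shows the intended usage pattern: build the action dictionary from ``get_agent_handles()``, call ``step()``, and stop once ``dones["__all__"]`` is true.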