Warning
This package is in maintenance mode; please use Stable-Baselines3 (SB3) for an up-to-date version. You can find a migration guide in the SB3 documentation.
Evaluation Helper
stable_baselines.common.evaluation.evaluate_policy(model: BaseRLModel, env: Union[gym.core.Env, stable_baselines.common.vec_env.base_vec_env.VecEnv], n_eval_episodes: int = 10, deterministic: bool = True, render: bool = False, callback: Optional[Callable] = None, reward_threshold: Optional[float] = None, return_episode_rewards: bool = False) → Union[Tuple[float, float], Tuple[List[float], List[int]]]

Runs the policy for n_eval_episodes episodes and returns the average reward. This helper is made to work with only one environment.

Parameters:
- model – (BaseRLModel) The RL agent you want to evaluate.
- env – (gym.Env or VecEnv) The gym environment. In the case of a VecEnv this must contain only one environment.
- n_eval_episodes – (int) Number of episodes to evaluate the agent
- deterministic – (bool) Whether to use deterministic or stochastic actions
- render – (bool) Whether to render the environment or not
- callback – (callable) callback function to do additional checks, called after each step.
- reward_threshold – (float) Minimum expected reward per episode; an error is raised if the performance is not met
- return_episode_rewards – (bool) If True, a list of rewards per episode will be returned instead of the mean.
Returns: (float, float) Mean reward per episode, std of reward per episode. Returns ([float], [int]), the per-episode rewards and episode lengths, when return_episode_rewards is True.
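
The behaviour described above can be illustrated with a small, self-contained sketch. It is not the library's source code: `CountdownEnv`, `ConstantModel`, and `evaluate_sketch` are hypothetical names invented for this example, the callback signature is an assumption, rendering is left out, and the use of a population standard deviation and a `ValueError` for the threshold check are assumptions as well. It only shows the two return shapes and the role of `reward_threshold`.

```python
import statistics


class CountdownEnv:
    """Hypothetical toy environment: every step yields reward 1.0 and the
    episode ends after `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the observation is just the step counter

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done, {}


class ConstantModel:
    """Hypothetical stand-in for an agent exposing predict() -> (action, state)."""

    def predict(self, obs, state=None, deterministic=True):
        return 0, state  # always the same action, no recurrent state


def evaluate_sketch(model, env, n_eval_episodes=10, deterministic=True,
                    callback=None, reward_threshold=None,
                    return_episode_rewards=False):
    """Sketch of an evaluation loop over a single environment."""
    episode_rewards, episode_lengths = [], []
    for _ in range(n_eval_episodes):
        obs = env.reset()
        done, state = False, None
        total_reward, length = 0.0, 0
        while not done:
            action, state = model.predict(obs, state=state,
                                          deterministic=deterministic)
            obs, reward, done, _info = env.step(action)
            total_reward += reward
            length += 1
            if callback is not None:
                # Assumption: this argument list is hypothetical and chosen
                # for the sketch; it is called once after every step.
                callback(obs, reward, done)
        episode_rewards.append(total_reward)
        episode_lengths.append(length)

    mean_reward = statistics.mean(episode_rewards)
    # Assumption: population standard deviation over the episodes.
    std_reward = statistics.pstdev(episode_rewards)

    if reward_threshold is not None and mean_reward < reward_threshold:
        # Assumption: the error type and the exact comparison are illustrative.
        raise ValueError("Mean reward below threshold: "
                         f"{mean_reward:.2f} < {reward_threshold:.2f}")

    if return_episode_rewards:
        return episode_rewards, episode_lengths
    return mean_reward, std_reward


# Default call: a (mean, std) pair.
mean_reward, std_reward = evaluate_sketch(ConstantModel(), CountdownEnv(), n_eval_episodes=3)

# With return_episode_rewards=True: per-episode rewards and lengths.
rewards, lengths = evaluate_sketch(ConstantModel(), CountdownEnv(),
                                   n_eval_episodes=3, return_episode_rewards=True)
```

With the real helper, the corresponding call would take the form `evaluate_policy(model, env, n_eval_episodes=10)`, where `model` is a trained agent and `env` is a single gym environment or a VecEnv containing exactly one environment.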