envs ¶
rlib's environment subpackage.
Targets the modern Gymnasium 5-tuple API directly:
- :class:RLEnv, the abstract base class for rlib wrappers (provides __getattr__ delegation, unwrapped, context-manager support).
- :class:RLVecEnv, the abstract base for vectorised env runners (BatchEnv and DummyBatchEnv).
- :func:make, a re-export of :func:gymnasium.make.
Built-in custom envs (ApplePicker-v0, ApplePickerDeterministic-v0)
are registered with Gymnasium at import time so gymnasium.make("ApplePicker-v0")
works out of the box.
RLEnv ¶
Bases: ABC
Concrete-friendly base class for rlib adapters and wrappers.
Subclasses must implement :meth:reset and :meth:step to honour
the canonical 5-tuple / (obs, info) contract. All other
convenience members (unwrapped, __getattr__ forwarding,
context-manager support, render, close, spec, ...)
delegate to self.env when present so wrapper classes get sane
defaults for free.
A subclass that does not wrap another env (e.g. a hand-written
simulator) should set self.env = None and override the relevant
members directly.
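The delegation pattern described above can be sketched roughly as follows. This is a hypothetical stand-in, not the rlib source: the names SketchEnv, Counter and ScaleReward are invented for illustration, and the real RLEnv may differ in detail (for example in how it guards private attributes).

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchEnv(ABC):
    """Hypothetical stand-in for RLEnv (illustration only)."""

    env: Any = None  # wrapped env, or None for a standalone simulator

    @abstractmethod
    def reset(self, *, seed: Any = None, options: Any = None) -> tuple[Any, dict]:
        ...

    @abstractmethod
    def step(self, action: Any) -> tuple[Any, float, bool, bool, dict]:
        ...

    @property
    def unwrapped(self) -> Any:
        # Walk down the wrapper chain until nothing is left to unwrap.
        return self if self.env is None else self.env.unwrapped

    def __getattr__(self, name: str) -> Any:
        # Called only when normal lookup fails; forward to the wrapped env.
        if name.startswith("_") or self.env is None:
            raise AttributeError(name)
        return getattr(self.env, name)

    def close(self) -> None:
        if self.env is not None:
            self.env.close()

    def __enter__(self) -> "SketchEnv":
        return self

    def __exit__(self, *exc_info: Any) -> None:
        self.close()


class Counter(SketchEnv):
    """Hand-written simulator: wraps nothing, so env stays None."""

    max_steps = 3

    def __init__(self) -> None:
        self.env = None
        self.t = 0

    def reset(self, *, seed: Any = None, options: Any = None) -> tuple[Any, dict]:
        self.t = 0
        return self.t, {}

    def step(self, action: Any) -> tuple[Any, float, bool, bool, dict]:
        self.t += 1
        return self.t, 1.0, False, self.t >= self.max_steps, {}


class ScaleReward(SketchEnv):
    """Wrapper: only reset and step are written out, the rest is inherited."""

    def __init__(self, env: SketchEnv, scale: float) -> None:
        self.env = env
        self.scale = scale

    def reset(self, *, seed: Any = None, options: Any = None) -> tuple[Any, dict]:
        return self.env.reset(seed=seed, options=options)

    def step(self, action: Any) -> tuple[Any, float, bool, bool, dict]:
        obs, reward, terminated, truncated, info = self.env.step(action)
        return obs, reward * self.scale, terminated, truncated, info


with ScaleReward(Counter(), scale=0.5) as env:
    obs, info = env.reset(seed=0)
    obs, reward, terminated, truncated, info = env.step(0)
    # max_steps is not defined on the wrapper; it is forwarded to Counter.
    print(reward, env.max_steps, type(env.unwrapped).__name__)  # 0.5 3 Counter
```

In this sketch the wrapper only writes out reset and step; attribute forwarding, unwrapped and context-manager support all come from the base class.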
reset abstractmethod ¶
reset(*, seed: Any = None, options: Any = None) -> tuple[Any, dict]
Reset the environment and return (obs, info).
Source code in rlib/envs/base.py
step abstractmethod ¶
step(action: Any) -> tuple[Any, float, bool, bool, dict]
Step the environment and return the modern 5-tuple.
Source code in rlib/envs/base.py
RLVecEnv ¶
Bases: ABC
Abstract base for rlib's vectorised environment runners.
Agent rollout code in this library has historically consumed the
legacy 4-tuple (obs, rewards, dones, infos). We keep that
agent-facing shape on purpose: the per-env 5-tuple lives on the
wrapper side, and RLVecEnv implementations are responsible for
collapsing terminated/truncated into a single done flag
in one place (this base class's :meth:merge_done helper).
reset abstractmethod ¶
reset() -> Any
Return a stacked batch of initial observations.
Source code in rlib/envs/base.py
step abstractmethod ¶
step(actions: Any) -> Any
Step every sub-env and return (obs, rewards, dones, infos).
Source code in rlib/envs/base.py
merge_done staticmethod ¶
merge_done(terminated: bool, truncated: bool) -> bool
Single canonical place where done = terminated or truncated.
Centralised so that future agents wanting to distinguish the two (e.g. for correct value bootstrapping on truncation) have a single place to change.
Source code in rlib/envs/base.py
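To illustrate why the terminated/truncated distinction matters for bootstrapping, here is a minimal sketch. It is not the code from base.py: merge_done below simply restates the documented rule, and td_target is a hypothetical helper (with an assumed discount gamma) showing what an agent that keeps the two flags separate might compute.

```python
def merge_done(terminated: bool, truncated: bool) -> bool:
    # The documented rule: either flag ends the episode for legacy agents.
    return bool(terminated or truncated)


def td_target(reward: float, next_value: float, terminated: bool,
              gamma: float = 0.99) -> float:
    # Hypothetical helper: bootstrap from the next state's value unless the
    # episode truly terminated. A time-limit truncation still bootstraps.
    return reward + (0.0 if terminated else gamma * next_value)


# A truncated step is "done" for the legacy 4-tuple consumer...
done = merge_done(terminated=False, truncated=True)

# ...but its target should still include the bootstrapped value.
target_truncated = td_target(1.0, next_value=10.0, terminated=False)
target_terminal = td_target(1.0, next_value=10.0, terminated=True)
```

An agent that only sees the merged done flag cannot tell these two targets apart, which is the case the centralised helper leaves room to fix later.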
merge_info staticmethod ¶
merge_info(info: dict, terminated: bool, truncated: bool) -> dict
Annotate info with the legacy TimeLimit.truncated key.
Mirrors Gymnasium's behaviour so any agent that inspects the info dict for truncation sees the same value regardless of backend.
Source code in rlib/envs/base.py
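One plausible shape for this helper is sketched below. The exact rule is an assumption: the key is set to True only when the episode was truncated without also terminating, and the input dict is copied rather than mutated. The actual implementation in base.py may make different choices.

```python
def merge_info(info: dict, terminated: bool, truncated: bool) -> dict:
    # Assumption: copy so the sub-env's own info dict is left untouched.
    merged = dict(info)
    # Assumption: flag only pure time-limit endings, not true terminations.
    merged["TimeLimit.truncated"] = bool(truncated and not terminated)
    return merged


original = {"score": 3}
annotated = merge_info(original, terminated=False, truncated=True)
print(annotated)  # {'score': 3, 'TimeLimit.truncated': True}
```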
BatchEnv ¶
BatchEnv(env_constructor: Callable[..., RLEnv], env_id: str, num_envs: int, blocking: bool = False, make_args: dict | None = None, **env_args)
Bases: RLVecEnv
Run num_envs envs in parallel, one subprocess each.
Source code in rlib/envs/vec_env.py
DummyBatchEnv ¶
DummyBatchEnv(env_constructor: Callable[..., RLEnv], env_id: str, num_envs: int, make_args: dict | None = None, **env_args)
Bases: RLVecEnv
Synchronous (in-process) vec env runner.
Lower overhead than :class:BatchEnv for cheap envs where
multi-processing is not worth it.
Source code in rlib/envs/vec_env.py
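The idea of a synchronous runner can be sketched in a few lines. This is not rlib's DummyBatchEnv: TinySyncVecEnv and CountTo are hypothetical names, the constructor signature is simplified, results are returned as plain lists instead of stacked arrays, and the auto-reset of finished sub-envs is an assumption about typical vec-env behaviour.

```python
from typing import Any, Callable


class CountTo:
    """Toy 5-tuple env that terminates after `limit` steps."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.t = 0

    def reset(self, *, seed: Any = None, options: Any = None) -> tuple[int, dict]:
        self.t = 0
        return self.t, {}

    def step(self, action: Any) -> tuple[int, float, bool, bool, dict]:
        self.t += 1
        return self.t, 1.0, self.t >= self.limit, False, {}


class TinySyncVecEnv:
    """Hypothetical in-process runner (illustration only)."""

    def __init__(self, make_env: Callable[[], Any], num_envs: int) -> None:
        self.envs = [make_env() for _ in range(num_envs)]

    def reset(self) -> list:
        # Keep only the observation from each sub-env's (obs, info) pair.
        return [env.reset()[0] for env in self.envs]

    def step(self, actions: list) -> tuple[list, list, list, list]:
        obs, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            o, r, terminated, truncated, info = env.step(action)
            done = bool(terminated or truncated)  # same rule as merge_done
            if done:
                # Assumption: finished sub-envs are reset immediately.
                o, _ = env.reset()
            obs.append(o)
            rewards.append(r)
            dones.append(done)
            infos.append(info)
        return obs, rewards, dones, infos


vec = TinySyncVecEnv(lambda: CountTo(2), num_envs=2)
first_obs = vec.reset()
step1 = vec.step([0, 0])
step2 = vec.step([0, 0])
```

Because every sub-env is stepped in a simple loop in the calling process, there is no inter-process communication cost, which is why this style suits cheap envs.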