DDQN
Synchronous n-step Double DQN.
DDQNTrainerConfig (dataclass)
DDQNTrainerConfig(
    train_mode: TrainMode = TrainMode.NSTEP,
    returns: Returns = Returns.NSTEP,
    total_steps: int = 50000000,
    nsteps: int = 5,
    gamma: float = 0.99,
    lambda_: float = 0.95,
    validate_freq: int = 1000000,
    num_val_episodes: int = 50,
    max_val_steps: int = 10000,
    log_dir: str = 'logs/',
    model_dir: str = 'models/',
    save_freq: int = 0,
    log_scalars: bool = True,
    update_target_freq: int = 0,
    render_freq: int = 0,
    epsilon_start: float = 1.0,
    epsilon_final: float = 0.01,
    epsilon_steps: float = 1000000.0,
    epsilon_test: float = 0.01,
)
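The `epsilon_start`, `epsilon_final`, and `epsilon_steps` fields suggest an annealed epsilon-greedy exploration schedule. This page does not state how the trainer interpolates between the two values, so the sketch below assumes a linear decay. The helper name `linear_epsilon` is hypothetical and is not part of `rlib`; the default arguments simply mirror the config defaults shown above.

```python
def linear_epsilon(
    step: int,
    epsilon_start: float = 1.0,
    epsilon_final: float = 0.01,
    epsilon_steps: float = 1_000_000.0,
) -> float:
    """Hypothetical helper (not rlib API): linearly anneal epsilon.

    Assumes epsilon moves from `epsilon_start` to `epsilon_final` over
    `epsilon_steps` environment steps and then stays at `epsilon_final`.
    """
    # Fraction of the annealing period completed, clamped to [0, 1].
    fraction = min(max(step / epsilon_steps, 0.0), 1.0)
    return epsilon_start + fraction * (epsilon_final - epsilon_start)


if __name__ == "__main__":
    for step in (0, 250_000, 500_000, 1_000_000, 2_000_000):
        print(step, round(linear_epsilon(step), 4))
```

Under this assumption, `epsilon_test` would be the fixed exploration rate used during validation episodes, separate from the annealed training value.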
SyncDDQN
SyncDDQN(envs, agent: DQN, target_agent: DQN, val_envs, action_size, config: DDQNTrainerConfig)
Bases: SyncMultiEnvTrainer
Synchronous Double-DQN trainer (n-step or one-step TD).
Source code in rlib/DDQN/trainer.py
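To illustrate what an n-step Double DQN target looks like, here is a minimal, library-independent sketch. It is not the code in `rlib/DDQN/trainer.py`; the function name `nstep_double_dqn_target` and its arguments are assumptions made for illustration. The idea it shows is the standard one: the online network (`agent`) selects the bootstrap action, the target network (`target_agent`) evaluates it, and the rewards from the n collected steps are discounted backward from that bootstrap value. With a single reward this reduces to the one-step TD target.

```python
from typing import Sequence


def nstep_double_dqn_target(
    rewards: Sequence[float],
    dones: Sequence[bool],
    online_q_next: Sequence[float],
    target_q_next: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Hypothetical helper (not rlib API): n-step Double DQN target for one env.

    rewards, dones: the n transitions collected from the starting state.
    online_q_next:  online-network Q-values at the state after the n-th step.
    target_q_next:  target-network Q-values at that same state.
    """
    # Double DQN: pick the action with the online network...
    best_action = max(range(len(online_q_next)), key=lambda a: online_q_next[a])
    # ...but evaluate it with the target network.
    ret = target_q_next[best_action]
    # Accumulate discounted rewards backward; a terminal step cuts the bootstrap.
    for reward, done in zip(reversed(rewards), reversed(dones)):
        ret = reward + gamma * ret * (0.0 if done else 1.0)
    return ret


if __name__ == "__main__":
    target = nstep_double_dqn_target(
        rewards=[1.0, 0.0],
        dones=[False, False],
        online_q_next=[0.2, 0.9],
        target_q_next=[5.0, 2.0],
        gamma=0.5,
    )
    print(target)
```

In the usage example the online network prefers action 1, so the bootstrap value is the target network's estimate for action 1 (2.0) rather than its own maximum (5.0). That decoupling of action selection from action evaluation is what distinguishes Double DQN from plain DQN.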