gxm.wrappers.Discretize#
- class Discretize(env, actions, unwrap=True)#
Bases: Wrapper
Wrapper that discretizes a continuous action space. It maps a finite set of discrete actions onto the continuous action space of the underlying environment. The actions are specified as a list of continuous actions \(A\). The action space exposed by the wrapper is then \(\{0, 1, \ldots, |A|-1\}\), where index \(i\) selects the \(i\)-th entry of \(A\).
>>> import gxm
>>> import jax.numpy as jnp
>>> from gxm.wrappers import Discretize
>>> env = gxm.make("Gymnasium/Pendulum-v1")
>>> # Three torques, shape (|A|, D) = (3, 1)
>>> actions = jnp.array([[-2.0], [0.0], [2.0]])
>>> env = Discretize(env, actions)
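To make the index-to-action mapping concrete, here is a minimal sketch in plain JAX that does not use gxm at all. The helper name `to_continuous` is hypothetical, and treating the mapping as a simple table lookup is an assumption about what the wrapper does conceptually, not a description of gxm's implementation.

```python
import jax
import jax.numpy as jnp

# Table of continuous actions, shape (|A|, D) = (3, 1).
actions = jnp.array([[-2.0], [0.0], [2.0]])


def to_continuous(index):
    """Hypothetical helper: map an index in {0, ..., |A|-1} to a continuous action."""
    return actions[index]


# A single index yields one continuous action of shape (D,).
single = to_continuous(2)

# The same lookup works under jit and on a batch of indices.
batch = jax.jit(to_continuous)(jnp.array([0, 2, 1]))
```

Because the lookup is ordinary array indexing, it should compose with `jax.jit` and `jax.vmap` like any other JAX operation.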
The actions passed to the Discretize wrapper need to be of shape \((|A|, D)\), where \(|A|\) is the number of discrete actions and \(D\) is the dimensionality of the continuous action space of the wrapped environment.
- __init__(env, actions, unwrap=True)#
- Parameters:
  - env (Environment) – The environment to wrap.
  - actions (Any) – The discrete set of actions to map to.
  - unwrap (bool) – Whether to unwrap the environment or treat it as part of the base environment.
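One way to build an `actions` table with the required \((|A|, D)\) shape is to take an evenly spaced grid over each action dimension. The sketch below is only an illustration: `make_action_grid` is a hypothetical helper, not part of gxm, and the bounds passed to it are example values.

```python
import jax.numpy as jnp


def make_action_grid(low, high, num_per_dim):
    """Hypothetical helper: return an (|A|, D) grid with |A| = num_per_dim ** D."""
    low = jnp.atleast_1d(jnp.asarray(low, dtype=jnp.float32))
    high = jnp.atleast_1d(jnp.asarray(high, dtype=jnp.float32))
    # One evenly spaced axis per action dimension.
    axes = [jnp.linspace(lo, hi, num_per_dim) for lo, hi in zip(low, high)]
    # Cartesian product of the axes, flattened to one row per discrete action.
    mesh = jnp.meshgrid(*axes, indexing="ij")
    return jnp.stack([m.reshape(-1) for m in mesh], axis=-1)


# One-dimensional action space: 3 actions, shape (3, 1).
pendulum_actions = make_action_grid(-2.0, 2.0, 3)

# Two-dimensional action space: 3 * 3 = 9 actions, shape (9, 2).
planar_actions = make_action_grid([-1.0, -1.0], [1.0, 1.0], 3)
```

Note that a full grid grows exponentially with \(D\), so for higher-dimensional action spaces a hand-picked list of actions may be the more practical choice.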
Methods
- __init__(env, actions[, unwrap])
- get_wrapper(wrapper_type) – Retrieve the first wrapper of a specific type from the environment.
- has_wrapper(wrapper_type) – Check if the environment or any of its wrappers is of a specific type.
- init(key) – Initialize the environment and return the initial state.
- reset(key, env_state) – Reset the environment to its initial state.
- step(key, env_state, action) – Perform a step in the environment given an action.
Attributes
- unwrap
- unwrapped – Retrieve the base environment by unwrapping all wrappers.
- id – The unique identifier of the environment.
- action_space – The action space of the environment.
- observation_space – The observation space of the environment.
- action_space: Space#
The action space of the environment.
- actions: Any#
- env: Environment#
- id: str#
The unique identifier of the environment.
- init(key)#
Initialize the environment and return the initial state.
- Parameters:
  - key (Array) – A JAX random key for any stochastic initialization.
- Return type: tuple[EnvironmentState, Timestep]
- Returns: A tuple containing the initial environment state and the initial timestep.
- observation_space: Space#
The observation space of the environment.
- reset(key, env_state)#
Reset the environment to its initial state.
- Parameters:
  - key (Array) – A JAX random key for any stochasticity in the environment.
  - env_state (EnvironmentState) – The current state of the environment.
- Return type: tuple[EnvironmentState, Timestep]
- Returns: A tuple containing the reset environment state and the initial timestep.
- step(key, env_state, action)#
Perform a step in the environment given an action.
- Parameters:
  - key (Array) – A JAX random key for any stochasticity in the environment.
  - env_state (EnvironmentState) – The current state of the environment.
  - action (Any) – The action to take in the environment.
- Return type: tuple[EnvironmentState, Timestep]
- Returns: A tuple containing the new environment state and the resulting timestep.
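The following self-contained sketch illustrates how a discretizing wrapper could translate an integer action inside `step(key, env_state, action)` before delegating to the wrapped environment. Everything in it is a hypothetical stand-in written for illustration: `ToyEnv`, `ToyTimestep`, and `ToyDiscretize` are not gxm classes, the timestep fields are invented, and gxm's actual implementation may differ.

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp


class ToyTimestep(NamedTuple):
    """Hypothetical stand-in; gxm's real Timestep may have different fields."""
    obs: jnp.ndarray
    reward: jnp.ndarray


class ToyEnv:
    """Hypothetical 1-D environment: the state is a position, the action a displacement."""

    def init(self, key):
        state = jnp.zeros((1,))
        return state, ToyTimestep(obs=state, reward=jnp.zeros(()))

    def step(self, key, env_state, action):
        new_state = env_state + action
        reward = -jnp.abs(new_state).sum()
        return new_state, ToyTimestep(obs=new_state, reward=reward)


class ToyDiscretize:
    """Sketch of a discretizing wrapper: look up the continuous action, then delegate."""

    def __init__(self, env, actions):
        self.env = env
        self.actions = jnp.asarray(actions)  # shape (|A|, D)

    def init(self, key):
        return self.env.init(key)

    def step(self, key, env_state, action):
        # `action` is an integer index; translate it before calling the wrapped env.
        return self.env.step(key, env_state, self.actions[action])


env = ToyDiscretize(ToyEnv(), jnp.array([[-1.0], [0.0], [1.0]]))

key = jax.random.PRNGKey(0)
key, init_key, act_key, step_key = jax.random.split(key, 4)

env_state, timestep = env.init(init_key)
# Sample a uniformly random index from {0, ..., |A|-1}.
action = jax.random.randint(act_key, (), 0, env.actions.shape[0])
env_state, timestep = env.step(step_key, env_state, action)
```

The explicit key splitting mirrors the signatures documented above, where each of `init`, `reset`, and `step` takes its own JAX random key.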