gxm.wrappers.Discretize#

class Discretize(env, actions, unwrap=True)#

Bases: Wrapper

Wrapper that discretizes a continuous action space. Maps a discrete set of actions to the continuous action space of the environment. The actions are specified as a list of continuous actions \(A\). The action space of the wrapped environment is then \(\{0, 1, \ldots, |A|-1\}\).

>>> import jax.numpy as jnp
>>> import gxm
>>> from gxm.wrappers import Discretize
>>> env = gxm.make("Gymnasium/Pendulum-v1")
>>> actions = jnp.array([[-2.0], [0.0], [2.0]])  # shape (|A|, D) = (3, 1)
>>> env = Discretize(env, actions)

The actions passed to the Discretize wrapper need to be of shape \((|A|, D)\), where \(|A|\) is the number of discrete actions and \(D\) is the dimensionality of the continuous action space of the wrapped environment.
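For action spaces with \(D > 1\), one way to obtain a table of shape \((|A|, D)\) is to take the Cartesian product of a few levels per dimension. The sketch below uses only the standard library; `action_grid` is a hypothetical helper and not part of gxm, and the two-dimensional action space is assumed purely for illustration.

```python
import itertools

def action_grid(levels_per_dim):
    """Build a nested list of shape (|A|, D) from per-dimension levels.

    Hypothetical helper for illustration; it is not part of gxm.
    """
    # Every combination of one level per dimension becomes one discrete action.
    return [list(combo) for combo in itertools.product(*levels_per_dim)]

# Assumed 2-dimensional action space with three levels per dimension.
table = action_grid([[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]])
num_actions, dim = len(table), len(table[0])
print(num_actions, dim)
```

The nested list would presumably be converted with `jnp.array(table)` before being passed as `actions`. Note that \(|A|\) grows multiplicatively with the number of levels per dimension, so coarse grids are usually preferable for higher-dimensional action spaces.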

__init__(env, actions, unwrap=True)#

Parameters:

env (Environment) – The environment to wrap.
actions (Any) – The continuous actions that the discrete action indices map to, with shape \((|A|, D)\).
unwrap (bool) – Whether to unwrap the environment or treat it as part of the base environment.

Methods

`__init__`(env, actions[, unwrap])
`get_wrapper`(wrapper_type)	Retrieve the first wrapper of a specific type from the environment.
`has_wrapper`(wrapper_type)	Check if the environment or any of its wrappers is of a specific type.
`init`(key)	Initialize the environment and return the initial state.
`reset`(key, env_state)	Reset the environment to its initial state.
`step`(key, env_state, action)	Perform a step in the environment given an action.

Attributes

`unwrap`
`unwrapped`	Retrieve the base environment by unwrapping all wrappers.
`env`
`actions`
`id`	The unique identifier of the environment.
`action_space`	The action space of the environment.
`observation_space`	The observation space of the environment.

action_space: Space#

The action space of the environment.

actions: Any#

env: Environment#

id: str#

The unique identifier of the environment.

init(key)#

Initialize the environment and return the initial state.

Parameters:

key (Array) – A JAX random key for any stochastic initialization.

Return type:

tuple[EnvironmentState, Timestep]

Returns:

A tuple containing the initial environment state and the initial timestep.

observation_space: Space#

The observation space of the environment.

reset(key, env_state)#

Reset the environment to its initial state.

Parameters:

key (Array) – A JAX random key for any stochasticity in the environment.
env_state (EnvironmentState) – The current state of the environment.

Return type:

tuple[EnvironmentState, Timestep]

Returns:

A tuple containing the reset environment state and the initial timestep.

step(key, env_state, action)#

Perform a step in the environment given an action.

Parameters:

key (Array) – A JAX random key for any stochasticity in the environment.
env_state (EnvironmentState) – The current state of the environment.
action (Any) – The action to take in the environment.

Return type:

tuple[EnvironmentState, Timestep]

Returns:

A tuple containing the new environment state and the resulting timestep.
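With this wrapper, the `action` passed to `step` is an index in \(\{0, 1, \ldots, |A|-1\}\) rather than a continuous vector. Conceptually, the index selects one row of the action table before the underlying environment is stepped. The following stand-alone sketch illustrates that lookup under the assumption that the mapping is a plain row selection; `to_continuous` and `ACTIONS` are hypothetical names, and this is not gxm's implementation.

```python
# Continuous actions with shape (|A|, D) = (3, 1), as in the Pendulum example.
ACTIONS = [[-2.0], [0.0], [2.0]]

def to_continuous(action_index, actions=ACTIONS):
    """Return the continuous action stored at a discrete index.

    Assumption: the wrapper resolves indices by row lookup. This is a
    stand-alone illustration, not gxm's implementation.
    """
    # Reject indices outside the discrete action space {0, ..., |A|-1}.
    if not 0 <= action_index < len(actions):
        raise IndexError(
            f"action index {action_index} is outside 0..{len(actions) - 1}"
        )
    return actions[action_index]

# Index 2 selects the last row of the table.
torque = to_continuous(2)
```

The explicit bounds check is a design choice of this sketch: plain Python lists would otherwise accept negative indices silently, which would hide an invalid discrete action.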
