gxm.wrappers.Discretize

class Discretize(env, actions, unwrap=True)

Bases: Wrapper

Wrapper that discretizes a continuous action space by mapping a finite set of discrete actions onto the continuous action space of the underlying environment. The available actions are given as a list of continuous actions \(A\). The action space exposed by the wrapper is then \(\{0, 1, \ldots, |A|-1\}\), where each index selects the corresponding entry of \(A\).

>>> import gxm
>>> import jax.numpy as jnp
>>> from gxm.wrappers import Discretize
>>> env = gxm.make("Gymnasium/Pendulum-v1")
>>> actions = jnp.array([[-2.0], [0.0], [2.0]])  # shape (|A|, D) = (3, 1)
>>> env = Discretize(env, actions)

The actions passed to the Discretize wrapper must have shape \((|A|, D)\), where \(|A|\) is the number of discrete actions and \(D\) is the dimensionality of the continuous action space of the wrapped environment.
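The shape requirement can be read as a lookup table: each row is one continuous action, and a discrete action is a row index. The following is a minimal sketch in plain JAX that does not call gxm. The variable names and the grid construction for \(D > 1\) are illustrative assumptions, not part of the library.

```python
import jax
import jax.numpy as jnp

# Lookup table of shape (|A|, D): here |A| = 3 torque levels and D = 1.
actions = jnp.array([[-2.0], [0.0], [2.0]])
num_actions, action_dim = actions.shape

# A discrete action is a row index into the table.
continuous = actions[2]  # shape (D,)

# Indexing with an integer array maps a whole batch of indices at once.
batch = actions[jnp.array([0, 2, 1, 1])]  # shape (4, D)

# For D > 1, one option (an assumption, not a gxm helper) is a Cartesian
# grid over per-dimension levels, flattened to shape (|A|, D).
levels = jnp.linspace(-1.0, 1.0, 3)
grid = jnp.stack(jnp.meshgrid(levels, levels, indexing="ij"), axis=-1)
grid = grid.reshape(-1, 2)  # 3 * 3 = 9 actions, each of dimension 2

# Sampling a uniformly random discrete action from {0, ..., |A| - 1}.
key = jax.random.PRNGKey(0)
index = jax.random.randint(key, (), 0, num_actions)
```

A one-dimensional array such as `jnp.array([-2.0, 0.0, 2.0])` has shape `(3,)` rather than `(3, 1)`, so adding the trailing axis keeps the table in the documented \((|A|, D)\) layout.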

__init__(env, actions, unwrap=True)
Parameters:
  • env (Environment) – The environment to wrap.

  • actions (Any) – The continuous actions that the discrete indices map to, with shape \((|A|, D)\).

  • unwrap (bool) – Whether to unwrap the environment or treat it as part of the base environment.

Methods

  • __init__(env, actions[, unwrap])

  • get_wrapper(wrapper_type) – Retrieve the first wrapper of a specific type from the environment.

  • has_wrapper(wrapper_type) – Check if the environment or any of its wrappers is of a specific type.

  • init(key) – Initialize the environment and return the initial state.

  • reset(key, env_state) – Reset the environment to its initial state.

  • step(key, env_state, action) – Perform a step in the environment given an action.

Attributes

  • unwrap

  • unwrapped – Retrieve the base environment by unwrapping all wrappers.

  • env

  • actions

  • id – The unique identifier of the environment.

  • action_space – The action space of the environment.

  • observation_space – The observation space of the environment.

action_space: Space

The action space of the environment.

actions: Any
env: Environment
id: str

The unique identifier of the environment.

init(key)

Initialize the environment and return the initial state.

Parameters:
  • key (Array) – A JAX random key for any stochastic initialization.

Return type:

tuple[EnvironmentState, Timestep]

Returns:

A tuple containing the initial environment state and the initial timestep.

observation_space: Space

The observation space of the environment.

reset(key, env_state)

Reset the environment to its initial state.

Parameters:
  • key (Array) – A JAX random key for any stochasticity in the environment.

  • env_state (EnvironmentState) – The current state of the environment.

Return type:

tuple[EnvironmentState, Timestep]

Returns:

A tuple containing the reset environment state and the initial timestep.

step(key, env_state, action)

Perform a step in the environment given an action.

Parameters:
  • key (Array) – A JAX random key for any stochasticity in the environment.

  • env_state (EnvironmentState) – The current state of the environment.

  • action (Any) – The action to take in the environment.

Return type:

tuple[EnvironmentState, Timestep]

Returns:

A tuple containing the new environment state and the resulting timestep.
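To illustrate how a discretizing wrapper might translate an action before delegating to the wrapped environment, here is a self-contained sketch. `ToyEnv` and `ToyDiscretize` are hypothetical stand-ins that only mirror the `init(key)` and `step(key, env_state, action)` signatures documented above. They are not the gxm implementation, and a plain array stands in for the `Timestep` object.

```python
import jax
import jax.numpy as jnp


class ToyEnv:
    """Hypothetical continuous-action environment; not part of gxm."""

    def init(self, key):
        state = jax.random.uniform(key, (1,), minval=-1.0, maxval=1.0)
        return state, state  # (env_state, array standing in for a Timestep)

    def step(self, key, env_state, action):
        new_state = env_state + 0.1 * action  # deterministic toy dynamics
        return new_state, new_state


class ToyDiscretize:
    """Illustrative wrapper: looks up the continuous action, then delegates."""

    def __init__(self, env, actions):
        self.env = env
        self.actions = jnp.asarray(actions)  # lookup table of shape (|A|, D)

    def init(self, key):
        return self.env.init(key)

    def step(self, key, env_state, action):
        # `action` is an index in {0, ..., |A| - 1}.
        return self.env.step(key, env_state, self.actions[action])


env = ToyDiscretize(ToyEnv(), jnp.array([[-2.0], [0.0], [2.0]]))

# Split the key so that each call receives its own subkey.
key = jax.random.PRNGKey(0)
key, init_key = jax.random.split(key)
env_state, obs = env.init(init_key)
start = env_state

for _ in range(3):
    key, step_key = jax.random.split(key)
    env_state, obs = env.step(step_key, env_state, 2)  # index 2 selects +2.0
```

Because the state is passed in and returned explicitly, the same pattern composes with `jax.jit` or `jax.lax.scan` when the underlying environment is itself written in JAX.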