Class MultiArmedBandits

Inheritance Relationships

Base Type

  • public bitrl::envs::EnvBase<TimeStep<bool>, MultiArmedBanditsSpace>

Class Documentation

class MultiArmedBandits : public bitrl::envs::EnvBase<TimeStep<bool>, MultiArmedBanditsSpace>

Environment for simulating multi-armed bandits. Each bandit (arm) is represented as a Bernoulli distribution. At each step, only one bandit can be executed.
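
To make the Bernoulli description concrete, here is a minimal standalone sketch of the idea. It does not use the bitrl API: the class name BernoulliBanditsSketch, its pull() method, and the to_reward() helper are hypothetical names introduced only for illustration, and the reward values are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in, NOT the bitrl class: each arm is a Bernoulli
// distribution with its own success probability.
class BernoulliBanditsSketch {
public:
    BernoulliBanditsSketch(std::vector<double> success_probs, unsigned seed)
        : probs_(std::move(success_probs)), rng_(seed) {}

    // Execute exactly one arm and report whether it paid out.
    bool pull(std::size_t arm) {
        assert(arm < probs_.size());
        std::bernoulli_distribution dist(probs_[arm]);
        return dist(rng_);
    }

    std::size_t n_arms() const noexcept { return probs_.size(); }

private:
    std::vector<double> probs_;  // success probability per arm
    std::mt19937 rng_;           // random engine shared by all arms
};

// Map the boolean outcome to a scalar reward (values are illustrative).
inline double to_reward(bool success, double success_reward, double fail_reward) {
    return success ? success_reward : fail_reward;
}
```

In this sketch the boolean outcome plays the role that the bool in TimeStep<bool> appears to play in the real class, and to_reward() mirrors the split between success_reward() and fail_reward(). How the actual environment combines them is not stated on this page.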

Public Types

typedef EnvBase<TimeStep<bool>, MultiArmedBanditsSpace> base_type

The base type.

typedef base_type::time_step_type time_step_type

The time step type we return every time a step in the environment is performed.

typedef base_type::state_space_type state_space_type

The type describing the state space for the environment.

typedef base_type::action_space_type action_space_type

The type of the action space for the environment.

typedef base_type::action_type action_type

The type of the action to be undertaken in the environment.

typedef base_type::state_type state_type

The type of the state of the environment.

Public Functions

MultiArmedBandits()

MultiArmedBandits Constructor.

virtual void make(const std::string &version, const std::unordered_map<std::string, std::any> &options) final override

Builds the environment.

Parameters:
  • version – The version of the environment to build.

  • options – Options to use for building the environment. Concrete classes may choose to hold a copy.
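
Because the options are passed as a std::unordered_map<std::string, std::any>, a caller or implementer has to recover typed values with std::any_cast. The following is a sketch of one way to do that safely. The helper option_or and the Options alias are hypothetical, and the option keys the environment actually recognizes are not listed on this page.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <unordered_map>

using Options = std::unordered_map<std::string, std::any>;

// Hypothetical helper: return the value stored under `key` if it is present
// and holds a T, otherwise return `fallback`.
template <typename T>
T option_or(const Options &options, const std::string &key, T fallback) {
    const auto it = options.find(key);
    if (it == options.end()) {
        return fallback;  // key not supplied
    }
    // The pointer overload of std::any_cast yields nullptr on a type
    // mismatch instead of throwing std::bad_any_cast.
    if (const T *value = std::any_cast<T>(&it->second)) {
        return *value;
    }
    return fallback;  // key supplied with an unexpected type
}
```

A call such as option_or<double>(options, "some_key", 1.0) then falls back to 1.0 whenever the key is absent or stored with a different type. The key name here is a placeholder, not a documented option.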

virtual void close() final override

Closes the environment.

virtual time_step_type reset(uint_t seed, const std::unordered_map<std::string, std::any> &options) final override

Reset the environment.

Parameters:
  • seed – The seed to use for resetting the environment.

  • options – Options to use for resetting the environment.
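
The usual reason to pass a seed on reset is reproducibility: the same seed should give the same sequence of Bernoulli outcomes. The sketch below illustrates that property with the standard library only. The helper sample_outcomes is hypothetical, and whether the environment reseeds its generator in exactly this way is an assumption.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper: draw n Bernoulli(p) outcomes from an engine that is
// freshly seeded with `seed`, as a reset might do.
inline std::vector<bool> sample_outcomes(unsigned seed, double p, int n) {
    assert(n >= 0);
    std::mt19937 rng(seed);
    std::bernoulli_distribution dist(p);
    std::vector<bool> outcomes;
    outcomes.reserve(static_cast<std::vector<bool>::size_type>(n));
    for (int i = 0; i < n; ++i) {
        outcomes.push_back(dist(rng));
    }
    return outcomes;
}
```

Calling sample_outcomes twice with the same seed, probability, and length returns identical sequences, which is what makes seeded experiments repeatable on a given standard library implementation.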

virtual time_step_type step(const action_type &action) final override

Steps in the environment by performing the given action.

Parameters:
  • action – The action to execute in the environment.

Returns:

An instance of time_step_type

inline uint_t n_actions() const noexcept

Return the number of actions.

inline real_t success_reward() const noexcept

Returns the success reward.

inline real_t fail_reward() const noexcept

Returns the fail reward.

Public Static Attributes

static const std::string name = "MultiArmedBandits"

The name of the environment.