Agent

class sorrel.agents.Agent(observation_spec: ObservationSpec, action_spec: ActionSpec, model: BaseModel, location=None)

An abstract class for agents, which are a special type of entity.

Note that this is a subclass of sorrel.entities.Entity.

observation_spec

The observation specification to use for this agent.

Type:

sorrel.observation.observation_spec.ObservationSpec

model

The model that this agent uses.

Type:

sorrel.models.base_model.BaseModel

action_space

The range of actions that the agent is able to take, represented by a list of integers.

Warning

Currently, each element in action_space should be equal to its own index. In other words, action_space should be a list of consecutive integers in increasing order, starting at 0.

For example, if the agent has 4 possible actions, it should have action_space = [0, 1, 2, 3].
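The constraint in this warning can be checked with a few lines of plain Python. The following is a minimal sketch; the helper name is_valid_action_space is hypothetical and is not part of sorrel.

```python
def is_valid_action_space(action_space: list) -> bool:
    """Return True if action_space is exactly [0, 1, ..., n - 1].

    Hypothetical helper for illustration; sorrel does not provide it.
    """
    # Each element must equal its own index, so the list must match range(n).
    return list(action_space) == list(range(len(action_space)))


valid = is_valid_action_space([0, 1, 2, 3])      # consecutive, starts at 0
shifted = is_valid_action_space([1, 2, 3, 4])    # does not start at 0
unordered = is_valid_action_space([0, 2, 1, 3])  # not in increasing order
```

A check like this could be run once when the agent is constructed, so that a malformed action_space is caught before training starts.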

Attributes that override the default values of the parent class (Entity):
  • has_transitions - Defaults to True instead of False.

Methods

Abstract Methods

abstractmethod Agent.reset() None

Reset the agent (and its memory).

abstractmethod Agent.pov(world: W) ndarray

Defines the agent’s observation function.

Parameters:

world (Gridworld) – the environment that this agent is observing.

Returns:

the observed state.

Return type:

np.ndarray

abstractmethod Agent.get_action(state: ndarray) int

Gets the action to take based on the current state from the agent’s model.

Parameters:

state (np.ndarray) – the current state observed by the agent.

Returns:

the action chosen by the agent’s model given the state.

Return type:

int

abstractmethod Agent.act(world: W, action: int) float

Act on the environment.

Parameters:
  • world (Gridworld) – the environment in which the agent is acting.

  • action (int) – an element from this agent’s action space indicating the action to take.

Returns:

the reward associated with the action taken.

Return type:

float

abstractmethod Agent.is_done(world: W) bool

Determines if the agent is done acting given the environment.

For example, this may depend on the experiment’s maximum number of turns, as set in the agent’s cfg file.

Parameters:

world (Gridworld) – the environment that the agent is in.

Returns:

whether the agent is done acting. False by default.

Return type:

bool
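As an illustration of a turn-based stopping rule, here is a minimal sketch. TurnLimitedWorld and its turn and max_turns fields are hypothetical stand-ins and are not taken from sorrel; a real implementation would read these values from wherever the experiment stores them.

```python
from dataclasses import dataclass


@dataclass
class TurnLimitedWorld:
    """Hypothetical stand-in for a world; it only tracks turn counters."""
    turn: int = 0
    max_turns: int = 100


def is_done(world: TurnLimitedWorld) -> bool:
    """Report done once the turn counter reaches the configured maximum."""
    return world.turn >= world.max_turns


early = is_done(TurnLimitedWorld(turn=5, max_turns=10))
finished = is_done(TurnLimitedWorld(turn=10, max_turns=10))
```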

Non-Abstract Methods

Agent.add_memory(state: ndarray, action: int, reward: float, done: bool) None

Add an experience to the memory.

Parameters:
  • state (np.ndarray) – the state to be added.

  • action (int) – the action taken by the agent.

  • reward (float) – the reward received by the agent.

  • done (bool) – whether the episode terminated after this experience.
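The sketch below shows one way an experience could be stored. It assumes a bounded first-in, first-out buffer; the class MemorySketch and its capacity parameter are hypothetical, and the actual storage used by the agent's model may differ.

```python
from collections import deque


class MemorySketch:
    """Hypothetical experience store; not the sorrel implementation."""

    def __init__(self, capacity: int = 1000):
        # A deque with maxlen silently drops the oldest item when full.
        self.buffer = deque(maxlen=capacity)

    def add_memory(self, state, action: int, reward: float, done: bool) -> None:
        # Store one (state, action, reward, done) experience tuple.
        self.buffer.append((state, int(action), float(reward), bool(done)))


memory = MemorySketch(capacity=2)
memory.add_memory([0.0, 1.0], 0, 0.0, False)
memory.add_memory([1.0, 0.0], 1, 1.0, False)
memory.add_memory([1.0, 1.0], 2, 0.5, True)  # evicts the first experience
```

Plain lists stand in for the np.ndarray states here so that the example has no third-party dependencies.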

Agent.transition(world: W) None

Processes a full transition step for the agent.

This function does the following:
  • Gets the current state from the environment through pov().

  • Gets the action based on the current state through get_action().

  • Changes the environment based on the action and obtains the reward through act().

  • Determines whether the agent is done through is_done().

Parameters:

world (Gridworld) – the environment that this agent is acting in.
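The order of calls described for transition() can be sketched as follows. CounterAgent, the dictionary-based world, and transition_sketch are hypothetical stand-ins that do not use sorrel. The sketch returns the intermediate values only so they can be inspected, whereas Agent.transition() itself returns None.

```python
class CounterAgent:
    """Hypothetical stand-in agent; it does not subclass sorrel.agents.Agent."""

    def __init__(self, max_turns: int):
        self.turn = 0
        self.max_turns = max_turns

    def pov(self, world: dict) -> list:
        # Observe the world: here, a copy of its cell values.
        return list(world["cells"])

    def get_action(self, state: list) -> int:
        # Fixed rule in place of a model: pick the index of the smallest cell.
        return state.index(min(state))

    def act(self, world: dict, action: int) -> float:
        # Change the world, count the turn, and return a constant reward.
        world["cells"][action] += 1
        self.turn += 1
        return 1.0

    def is_done(self, world: dict) -> bool:
        return self.turn >= self.max_turns


def transition_sketch(agent: CounterAgent, world: dict) -> tuple:
    state = agent.pov(world)           # 1. observe
    action = agent.get_action(state)   # 2. choose an action
    reward = agent.act(world, action)  # 3. act and collect the reward
    done = agent.is_done(world)        # 4. check for termination
    return state, action, reward, done


world = {"cells": [2, 0, 1]}
agent = CounterAgent(max_turns=2)
first = transition_sketch(agent, world)
second = transition_sketch(agent, world)
```

Note that the state is captured before act() modifies the world, so it reflects what the agent saw when it chose the action.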