Agent

class sorrel.agents.Agent(observation_spec: ObservationSpec, action_spec: ActionSpec, model: BaseModel, location=None)

An abstract base class for agents, which are a special type of entity.

Note that this is a subclass of sorrel.entities.Entity.

observation_spec

The observation specification to use for this agent.

Type: sorrel.observation.observation_spec.ObservationSpec

model

The model that this agent uses.

Type: sorrel.models.base_model.BaseModel

action_space: The range of actions that the agent is able to take, represented by a list of integers.

Warning

Currently, each element in action_space must equal its own index. In other words, action_space should be a list of consecutive integers in increasing order, starting at 0.

For example, if the agent has 4 possible actions, it should have action_space = [0, 1, 2, 3].
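
The constraint above can be checked with a short helper. This is only a sketch: is_valid_action_space is a hypothetical name and is not part of sorrel.

```python
def is_valid_action_space(action_space: list[int]) -> bool:
    """Return True if action_space is exactly [0, 1, ..., n - 1].

    Hypothetical helper for illustration; not part of sorrel.
    """
    return list(action_space) == list(range(len(action_space)))


print(is_valid_action_space([0, 1, 2, 3]))  # consecutive, starts at 0
print(is_valid_action_space([1, 2, 3, 4]))  # does not start at 0
print(is_valid_action_space([0, 2, 1, 3]))  # not in increasing order
```

Only the first call satisfies the constraint; the other two lists break the index-equals-value rule.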

Attributes that override parent (Entity)’s default values:

has_transitions - Defaults to True instead of False.

Methods

Abstract Methods

abstractmethod Agent.reset() → None: Reset the agent (and its memory).

abstractmethod Agent.pov(world: W) → ndarray

Defines the agent’s observation function.

Parameters: world (Gridworld) – the environment that this agent is observing.
Returns: the observed state.
Return type: np.ndarray

abstractmethod Agent.get_action(state: ndarray) → int

Gets the action to take based on the current state from the agent’s model.

Parameters: state (np.ndarray) – the current state observed by the agent.
Returns: the action chosen by the agent’s model given the state.
Return type: int

abstractmethod Agent.act(world: W, action: int) → float

Act on the environment.

Parameters:

world (Gridworld) – the environment in which the agent is acting.
action (int) – an element from this agent’s action space indicating the action to take.

Returns:

the reward associated with the action taken.

Return type:

float

abstractmethod Agent.is_done(world: W) → bool

Determines if the agent is done acting given the environment.

This might be based on the experiment’s maximum number of turns from the agent’s cfg file.

Parameters: world (Gridworld) – the environment that the agent is in.
Returns: whether the agent is done acting. False by default.
Return type: bool
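
To illustrate how the five abstract methods fit together, here is a minimal sketch of a concrete agent. It is an assumption-laden stand-in rather than sorrel code: SketchAgent mirrors the documented method names only, LineWorld and RandomWalker are hypothetical, and plain Python lists are used in place of ndarray so the example runs with the standard library alone.

```python
import random
from abc import ABC, abstractmethod


class SketchAgent(ABC):
    """Stand-in mirroring the abstract methods documented above (NOT sorrel's Agent)."""

    def __init__(self, action_space: list[int]):
        self.action_space = action_space

    @abstractmethod
    def reset(self) -> None: ...

    @abstractmethod
    def pov(self, world) -> list[float]: ...

    @abstractmethod
    def get_action(self, state: list[float]) -> int: ...

    @abstractmethod
    def act(self, world, action: int) -> float: ...

    @abstractmethod
    def is_done(self, world) -> bool: ...


class LineWorld:
    """Hypothetical 1-D world: cells 0..size-1, with the goal in the last cell."""

    def __init__(self, size: int = 5, max_turns: int = 20):
        self.size = size
        self.max_turns = max_turns
        self.turn = 0


class RandomWalker(SketchAgent):
    """Hypothetical agent that moves left (action 0) or right (action 1) at random."""

    def __init__(self, seed: int = 0):
        super().__init__(action_space=[0, 1])
        self.rng = random.Random(seed)
        self.position = 0

    def reset(self) -> None:
        self.position = 0

    def pov(self, world: LineWorld) -> list[float]:
        # Observation: normalised position and normalised distance to the goal.
        goal = world.size - 1
        return [self.position / goal, (goal - self.position) / goal]

    def get_action(self, state: list[float]) -> int:
        # A real agent would query its model here; this one ignores the state.
        return self.rng.choice(self.action_space)

    def act(self, world: LineWorld, action: int) -> float:
        step = 1 if action == 1 else -1
        # Clamp the new position to the bounds of the world.
        self.position = min(max(self.position + step, 0), world.size - 1)
        world.turn += 1
        return 1.0 if self.position == world.size - 1 else 0.0

    def is_done(self, world: LineWorld) -> bool:
        # Done once the world's turn limit is reached.
        return world.turn >= world.max_turns


world = LineWorld()
agent = RandomWalker()
state = agent.pov(world)
action = agent.get_action(state)
reward = agent.act(world, action)
print(state, action, reward, agent.is_done(world))
```

In a real subclass, get_action would typically delegate to the agent's model, and is_done might compare against a turn limit taken from the experiment configuration.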

Non-Abstract Methods

Agent.add_memory(state: ndarray, action: int, reward: float, done: bool) → None

Add an experience to the memory.

Parameters:

state (np.ndarray) – the state to be added.
action (int) – the action taken by the agent.
reward (float) – the reward received by the agent.
done (bool) – whether the episode terminated after this experience.
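
The shape of a stored experience can be pictured with a small buffer. This sketch assumes a fixed-capacity, first-in-first-out memory; Experience and SketchMemory are hypothetical names, and sorrel's actual memory implementation may differ.

```python
from collections import deque
from typing import NamedTuple


class Experience(NamedTuple):
    """One (state, action, reward, done) record, matching add_memory's parameters."""

    state: list[float]
    action: int
    reward: float
    done: bool


class SketchMemory:
    """Hypothetical fixed-capacity buffer; not sorrel's implementation."""

    def __init__(self, capacity: int):
        # A deque with maxlen drops the oldest item once capacity is exceeded.
        self.buffer = deque(maxlen=capacity)

    def add(self, state: list[float], action: int, reward: float, done: bool) -> None:
        self.buffer.append(Experience(state, action, reward, done))

    def __len__(self) -> int:
        return len(self.buffer)


memory = SketchMemory(capacity=2)
memory.add([0.0, 1.0], 1, 0.0, False)
memory.add([0.25, 0.75], 1, 0.0, False)
memory.add([0.5, 0.5], 0, 1.0, True)  # evicts the oldest experience
print(len(memory), memory.buffer[-1])
```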

Agent.transition(world: W) → None

Processes a full transition step for the agent.

This function does the following:

- Gets the current state from the environment through pov().
- Gets the action based on the current state through get_action().
- Changes the environment based on the action and obtains the reward through act().
- Determines if the agent is done through is_done().

Parameters: world (Gridworld) – the environment that this agent is acting in.
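
The four steps listed for transition() can be sketched as a free function. This is an illustration under stated assumptions, not sorrel's implementation: StubWorld, StubAgent, and transition_sketch are hypothetical, and the sketch returns the collected values for inspection, whereas the documented method returns None.

```python
class StubWorld:
    """Hypothetical world that only counts turns."""

    def __init__(self, max_turns: int = 3):
        self.turn = 0
        self.max_turns = max_turns


class StubAgent:
    """Hypothetical agent exposing the four methods that transition() relies on."""

    def pov(self, world):
        return [float(world.turn)]

    def get_action(self, state):
        return 0  # always pick the first action

    def act(self, world, action):
        world.turn += 1
        return 1.0

    def is_done(self, world):
        return world.turn >= world.max_turns


def transition_sketch(agent, world):
    state = agent.pov(world)           # 1. observe the current state
    action = agent.get_action(state)   # 2. choose an action from the state
    reward = agent.act(world, action)  # 3. act on the world, collect the reward
    done = agent.is_done(world)        # 4. check whether the agent is done
    return state, action, reward, done


world, agent = StubWorld(), StubAgent()
done = False
while not done:
    state, action, reward, done = transition_sketch(agent, world)
    print(state, action, reward, done)
```

The returned tuple has the same four fields that add_memory() accepts, so a caller could pass it on to the agent's memory.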