


A maze similar to Lava World in OpenAI Gym


Create a maze environment.

Class CustomLavaWorldEnv

class CustomLavaWorldEnv

Bases: MazeEnv

A maze similar to Lava World in OpenAI Gym

__init__(map_size=15, mode=None, **kwargs)
  • map_size – A single int or a tuple (h, w), representing the map size.

  • visible_size – A single int or a tuple (h, w), representing the visible size. The agent will be at the center of the visible window, and the out-of-border part will be colored with the obstacle color.

  • obs_ratio – Obstacle ratio (how many obstacles will be in the map).

  • enable_path_checking – Enable path computation in map construction. Turn it off only when you are sure the maze is valid.

  • random_action_mapping – Whether to enable random action mapping. If enabled, the result of performing each action is shuffled. If a single bool True is provided, a random shuffle is used. Otherwise, it should be a list with the same length as the action space (5 when no-action is enabled, 4 otherwise).

  • enable_noaction – Whether to enable no-action operation.

  • dense_reward – Whether the reward is dense.

  • reward_move – Reward for a valid move. In the dense reward setting, it should be a positive number; in the sparse reward setting, it is expected to be a non-positive number.

  • reward_noaction – Reward for a no-action.

  • reward_final – Reward when you arrive at the final point.

  • reward_error – Reward when you perform an invalid move.

  • state_mode – State mode, either ‘DEFAULT’ or ‘RENDER’.
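The visible_size behaviour described above (agent at the center, out-of-border cells drawn as obstacles) can be illustrated with a small stand-alone sketch. This is not the library's implementation: the helper names as_hw and visible_window, the cell codes 0 and 1, and the choice of h // 2 as the center offset are all assumptions made for illustration.

```python
FREE = 0      # hypothetical cell code for a free cell
OBSTACLE = 1  # hypothetical cell code for an obstacle


def as_hw(size):
    """Accept a single int or an (h, w) tuple, as map_size and visible_size do."""
    if isinstance(size, int):
        return size, size
    h, w = size
    return int(h), int(w)


def visible_window(grid, point, visible_size):
    """Crop an (h, w) window around `point`; cells outside the map become obstacles."""
    h, w = as_hw(visible_size)
    rows, cols = len(grid), len(grid[0])
    top, left = point[0] - h // 2, point[1] - w // 2
    window = []
    for r in range(top, top + h):
        row = []
        for c in range(left, left + w):
            inside = 0 <= r < rows and 0 <= c < cols
            row.append(grid[r][c] if inside else OBSTACLE)
        window.append(row)
    return window


grid = [[FREE] * 4 for _ in range(4)]
# Agent in the top-left corner: the row above and the column to the left are padded.
window = visible_window(grid, (0, 0), 3)
```

With the agent in a corner, the padded cells show up as obstacles in the first row and first column of the window.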

append_stat(name, value)
finish(*args, **kwargs)
play_one_episode(func, ret_states=False, ret_actions=False, restart_kwargs=None, finish_kwargs=None, max_steps=10000)
restart(obstacles=None, start_point=None, final_point=None)
property action_delta

Action deltas: the tuple (dy, dx) applied when you perform action i.
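As a rough illustration of how a (dy, dx) delta turns an action index into a move, consider the sketch below. The action order in ACTION_DELTA, the helper try_move, the 0/1 cell codes, and the rule that an invalid move leaves the agent in place are assumptions for this example; the real values come from the action_delta property.

```python
# Hypothetical action order: up, right, down, left.
ACTION_DELTA = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def try_move(grid, point, action):
    """Return (new_point, moved). Cells equal to 1 are obstacles in this sketch."""
    dy, dx = ACTION_DELTA[action]
    r, c = point[0] + dy, point[1] + dx
    inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
    if inside and grid[r][c] != 1:
        return (r, c), True
    # Out of bounds or blocked: stay where we are and report an invalid move.
    return point, False


grid = [
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
]
after_right, ok_right = try_move(grid, (0, 0), 1)   # free cell to the right
after_up, ok_up = try_move(grid, (0, 0), 0)         # would leave the map
after_wall, ok_wall = try_move(grid, (0, 1), 1)     # obstacle to the right
```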

property action_mapping

If random action mapping is enabled, return the internal mapping
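The two accepted forms of random_action_mapping (a bare True, or an explicit list as long as the action space) could be handled along the following lines. This is a sketch only: make_action_mapping is a hypothetical helper, and reading mapping[action] as "the action that is actually performed" is an assumption about the direction of the mapping.

```python
import random


def make_action_mapping(spec, nr_actions, rng=None):
    """Build a mapping list from None/False, True, or an explicit list."""
    if spec is None or spec is False:
        return list(range(nr_actions))  # identity: no remapping
    if spec is True:
        rng = rng or random.Random()
        mapping = list(range(nr_actions))
        rng.shuffle(mapping)  # random permutation of the actions
        return mapping
    mapping = list(spec)
    if len(mapping) != nr_actions:
        raise ValueError(
            "expected %d entries, got %d" % (nr_actions, len(mapping))
        )
    return mapping


# 4 actions without no-action, 5 with it.
shuffled = make_action_mapping(True, 4, random.Random(0))
explicit = make_action_mapping([1, 0, 3, 2], 4)
effective_action = explicit[0]  # action 0 is remapped to action 1 here
```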

property action_space
property canvas

Return the raw canvas (full)

property canvas_size

Canvas size

property current_point

Current point (r, c)

property current_state
property distance_mat

Distance matrix

property distance_prev

Distance-prev matrix

property final_point

Finish point (r, c)

property inv_distance_mat
property inv_distance_prev
property lv_finals
property lv_obstacles
property lv_starts
property map_size

Map size

property obstacles
property origin_canvas

Return the original canvas (at time 0, full)

property quick_distance_mat

Distance matrix. This is computed during the first run of SPFA, so you can use it if you ensure that all valid points are in the same connected component.

property quick_distance_prev

Distance-prev matrix. See also quick_distance_mat.
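For readers unfamiliar with SPFA (Shortest Path Faster Algorithm), here is a minimal sketch of how a distance matrix and a matching predecessor ("prev") matrix can be computed on a grid with unit step costs. The function spfa, the 0/1 cell codes, and the use of None for "no predecessor" are assumptions for illustration, not the library's internals.

```python
from collections import deque

INF = float("inf")
NEIGHBORS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def spfa(grid, source):
    """Return (dist, prev) from `source`; cells equal to 1 are obstacles."""
    rows, cols = len(grid), len(grid[0])
    dist = [[INF] * cols for _ in range(rows)]
    prev = [[None] * cols for _ in range(rows)]
    in_queue = [[False] * cols for _ in range(rows)]

    dist[source[0]][source[1]] = 0
    queue = deque([source])
    in_queue[source[0]][source[1]] = True

    while queue:
        r, c = queue.popleft()
        in_queue[r][c] = False
        for dy, dx in NEIGHBORS:
            nr, nc = r + dy, c + dx
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            # Relax the edge; re-queue the neighbor only if it improved.
            if dist[r][c] + 1 < dist[nr][nc]:
                dist[nr][nc] = dist[r][c] + 1
                prev[nr][nc] = (r, c)
                if not in_queue[nr][nc]:
                    queue.append((nr, nc))
                    in_queue[nr][nc] = True
    return dist, prev


grid = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
dist, prev = spfa(grid, (0, 0))
```

Cells that cannot be reached from the source keep an infinite distance, which is why a single run is only sufficient when all valid points share one connected component.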

property rewards

A tuple of 4 values, representing the rewards for each action: (Move, Noaction, Arrive at final point, Move Err)

property shortest_path

One of the shortest paths from start to finish, as a list of points (r, c)
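A path of this form can be recovered from a predecessor matrix by walking backwards from the final point. The sketch below assumes a hypothetical prev layout in which prev[r][c] holds the (r, c) of the previous cell on a shortest path, or None when there is none; the actual layout of distance_prev may differ.

```python
def reconstruct_path(prev, start, final):
    """Follow predecessor links from `final` back to `start`; [] if unreachable."""
    path = [final]
    while path[-1] != start:
        r, c = path[-1]
        parent = prev[r][c]
        if parent is None:
            return []
        path.append(parent)
    path.reverse()
    return path


# Hand-written predecessor matrix for a 3x3 map whose center cell is blocked.
prev = [
    [None, (0, 0), (0, 1)],
    [(0, 0), None, (0, 2)],
    [(1, 0), (2, 0), (1, 2)],
]
path = reconstruct_path(prev, (0, 0), (2, 2))
blocked = reconstruct_path(prev, (0, 0), (1, 1))
```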

property start_point

Start point (r, c)

property stats
property unwrapped
property visible_size

Visible size

Class MazeEnv

class MazeEnv

Bases: SimpleRLEnvBase

Create a maze environment.

__init__(map_size=14, visible_size=None, obs_ratio=0.3, enable_path_checking=True, random_action_mapping=None, enable_noaction=False, dense_reward=False, reward_move=None, reward_noaction=0, reward_final=10, reward_error=-2, state_mode='DEFAULT')
  • map_size – A single int or a tuple (h, w), representing the map size.

  • visible_size – A single int or a tuple (h, w), representing the visible size. The agent will be at the center of the visible window, and the out-of-border part will be colored with the obstacle color.

  • obs_ratio – Obstacle ratio (how many obstacles will be in the map).

  • enable_path_checking – Enable path computation in map construction. Turn it off only when you are sure the maze is valid.

  • random_action_mapping – Whether to enable random action mapping. If enabled, the result of performing each action is shuffled. If a single bool True is provided, a random shuffle is used. Otherwise, it should be a list with the same length as the action space (5 when no-action is enabled, 4 otherwise).

  • enable_noaction – Whether to enable no-action operation.

  • dense_reward – Whether the reward is dense.

  • reward_move – Reward for a valid move. In the dense reward setting, it should be a positive number; in the sparse reward setting, it is expected to be a non-positive number.

  • reward_noaction – Reward for a no-action.

  • reward_final – Reward when you arrive at the final point.

  • reward_error – Reward when you perform an invalid move.

  • state_mode – State mode, either ‘DEFAULT’ or ‘RENDER’.

append_stat(name, value)
finish(*args, **kwargs)
play_one_episode(func, ret_states=False, ret_actions=False, restart_kwargs=None, finish_kwargs=None, max_steps=10000)
restart(obstacles=None, start_point=None, final_point=None)
property action_delta

Action deltas: the tuple (dy, dx) applied when you perform action i.

property action_mapping

If random action mapping is enabled, return the internal mapping

property action_space
property canvas

Return the raw canvas (full)

property canvas_size

Canvas size

property current_point

Current point (r, c)

property current_state
property distance_mat

Distance matrix

property distance_prev

Distance-prev matrix

property final_point

Finish point (r, c)

property inv_distance_mat
property inv_distance_prev
property map_size

Map size

property obstacles
property origin_canvas

Return the original canvas (at time 0, full)

property quick_distance_mat

Distance matrix. This is computed during the first run of SPFA, so you can use it if you ensure that all valid points are in the same connected component.

property quick_distance_prev

Distance-prev matrix. See also quick_distance_mat.
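The condition attached to the quick_* properties (all valid points in one connected component) can be checked independently with a breadth-first search. The sketch below is one possible way to do that on a plain 0/1 grid; all_free_cells_connected is a hypothetical helper and not part of the documented API.

```python
from collections import deque


def all_free_cells_connected(grid):
    """True if every free cell (0) is reachable from every other free cell."""
    rows, cols = len(grid), len(grid[0])
    free = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c] == 0]
    if not free:
        return True

    seen = {free[0]}
    queue = deque([free[0]])
    while queue:
        r, c = queue.popleft()
        for dy, dx in ((-1, 0), (0, 1), (1, 0), (0, -1)):
            nr, nc = r + dy, c + dx
            if (
                0 <= nr < rows
                and 0 <= nc < cols
                and grid[nr][nc] == 0
                and (nr, nc) not in seen
            ):
                seen.add((nr, nc))
                queue.append((nr, nc))
    # Connected exactly when the search reached every free cell.
    return len(seen) == len(free)


connected = all_free_cells_connected([[0, 0, 0], [0, 1, 0], [0, 0, 0]])
split = all_free_cells_connected([[0, 1, 0]])
```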

property rewards

A tuple of 4 values, representing the rewards for each action: (Move, Noaction, Arrive at final point, Move Err)
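To make the order of the four entries concrete, the sketch below picks one entry per step outcome. The Rewards namedtuple and pick_reward are hypothetical, the value -1 for the move reward is an arbitrary choice for a sparse setting (the documented default is None), and the assumption that the final reward replaces the move reward rather than adding to it is not confirmed by the text above.

```python
from collections import namedtuple

# Same order as the documented tuple: (Move, Noaction, Arrive at final point, Move Err).
Rewards = namedtuple("Rewards", ["move", "noaction", "final", "error"])


def pick_reward(rewards, moved, is_noaction, reached_final):
    """Select one of the four rewards for a single step outcome."""
    if is_noaction:
        return rewards.noaction
    if not moved:
        return rewards.error   # invalid move, e.g. into an obstacle
    if reached_final:
        return rewards.final
    return rewards.move


# noaction, final, and error use the defaults from the MazeEnv signature.
rewards = Rewards(move=-1, noaction=0, final=10, error=-2)
step_reward = pick_reward(rewards, moved=True, is_noaction=False, reached_final=False)
goal_reward = pick_reward(rewards, moved=True, is_noaction=False, reached_final=True)
```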

property shortest_path

One of the shortest paths from start to finish, as a list of points (r, c)

property start_point

Start point (r, c)

property stats
property unwrapped
property visible_size

Visible size