jaclearn.rl.envs.maze.maze#
Classes
CustomLavaWorldEnv | A maze similar to Lava World in OpenAI Gym.
MazeEnv | Create a maze environment.
Class CustomLavaWorldEnv
- class CustomLavaWorldEnv[source]#
Bases:
MazeEnv
A maze similar to Lava World in OpenAI Gym
- __init__(map_size=15, mode=None, **kwargs)[source]#
- Parameters:
map_size – A single int or a tuple (h, w), representing the map size.
visible_size – A single int or a tuple (h, w), representing the visible size. The agent will be at the center of the visible window, and the out-of-border part will be colored with the obstacle color.
obs_ratio – Obstacle ratio (how many obstacles will be in the map).
enable_path_checking – Enable path computation during map construction. Turn it off only when you are sure the maze is valid.
random_action_mapping – Whether to enable random action mapping, i.e., shuffling the effect of each action. If a single bool True is provided, a random shuffle is used. Otherwise, it should be a list with the same length as the action space (5 when no-action is enabled, 4 otherwise).
enable_noaction – Whether to enable no-action operation.
dense_reward – Whether the reward is dense.
reward_move – Reward for a valid move. In the dense-reward setting it should be a positive number; in the sparse-reward setting it is expected to be non-positive.
reward_noaction – Reward for a no-action.
reward_final – Reward when you arrive at the final point.
reward_error – Reward when you perform an invalid move.
state_mode – State mode, either ‘DEFAULT’ or ‘RENDER’.
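A minimal construction sketch (the import path follows this module's name; the keyword values are illustrative, and extra keyword arguments are assumed to be forwarded to MazeEnv):

    from jaclearn.rl.envs.maze.maze import CustomLavaWorldEnv

    # A 15x15 lava-world maze with the default sparse rewards.
    # Keyword arguments beyond map_size and mode are forwarded to MazeEnv.
    env = CustomLavaWorldEnv(map_size=15, enable_noaction=False, reward_final=10)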
- __new__(**kwargs)#
- action(action)#
- append_stat(name, value)#
- clear_stats()#
- evaluate_one_episode(func)#
- finish(*args, **kwargs)#
- play_one_episode(func, ret_states=False, ret_actions=False, restart_kwargs=None, finish_kwargs=None, max_steps=10000)#
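A hedged sketch of driving one episode with a random policy, using the env constructed above; it assumes func maps the current state to an action index and that play_one_episode handles restarting and finishing internally:

    import random

    def random_policy(state):
        # 4 move actions when enable_noaction is False, 5 otherwise.
        return random.randrange(4)

    # Play a single episode, capped at 100 steps.
    env.play_one_episode(random_policy, max_steps=100)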
- property action_delta#
Action deltas: the tuple (dy, dx) applied to the current point when you perform action i.
- property action_mapping#
If random action mapping is enabled, return the internal mapping
- property action_space#
- property canvas#
Return the raw canvas (full)
- property canvas_size#
Canvas size
- property current_point#
Current point (r, c)
- property current_state#
- property distance_mat#
Distance matrix
- property distance_prev#
Distance-prev matrix
- property final_point#
Finish point (r, c)
- property inv_distance_mat#
- property inv_distance_prev#
- property lv_finals#
- property lv_obstacles#
- property lv_starts#
- property map_size#
Map size
- property obstacles#
- property origin_canvas#
Return the original canvas (at time 0, full)
- property quick_distance_mat#
Distance matrix, computed during the first run of SPFA; if you ensure that all valid points are in the same connected component, you can use it directly.
- property quick_distance_prev#
Distance-prev matrix; see also quick_distance_mat.
- property rewards#
A tuple of 4 values, representing the rewards for each action type: (move, no-action, arrive at final point, invalid move).
- property shortest_path#
One of the shortest paths from start to finish, as a list of points (r, c).
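For illustration, a sketch that inspects this path; that the list includes both endpoints is an assumption:

    # The path is assumed to include both endpoints, so its length
    # minus one is the optimal number of moves.
    path = env.shortest_path
    assert path[0] == env.start_point and path[-1] == env.final_point
    print('optimal episode length:', len(path) - 1)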
- property start_point#
Start point (r, c)
- property stats#
- property unwrapped#
- property visible_size#
Visible size
Class MazeEnv
- class MazeEnv[source]#
Bases:
SimpleRLEnvBase
Create a maze environment.
- __init__(map_size=14, visible_size=None, obs_ratio=0.3, enable_path_checking=True, random_action_mapping=None, enable_noaction=False, dense_reward=False, reward_move=None, reward_noaction=0, reward_final=10, reward_error=-2, state_mode='DEFAULT')[source]#
- Parameters:
map_size – A single int or a tuple (h, w), representing the map size.
visible_size – A single int or a tuple (h, w), representing the visible size. The agent will be at the center of the visible window, and the out-of-border part will be colored with the obstacle color.
obs_ratio – Obstacle ratio (how many obstacles will be in the map).
enable_path_checking – Enable path computation during map construction. Turn it off only when you are sure the maze is valid.
random_action_mapping – Whether to enable random action mapping, i.e., shuffling the effect of each action. If a single bool True is provided, a random shuffle is used. Otherwise, it should be a list with the same length as the action space (5 when no-action is enabled, 4 otherwise).
enable_noaction – Whether to enable no-action operation.
dense_reward – Whether the reward is dense.
reward_move – Reward for a valid move. In the dense-reward setting it should be a positive number; in the sparse-reward setting it is expected to be non-positive.
reward_noaction – Reward for a no-action.
reward_final – Reward when you arrive at the final point.
reward_error – Reward when you perform an invalid move.
state_mode – State mode, either ‘DEFAULT’ or ‘RENDER’.
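A construction sketch contrasting the sparse- and dense-reward settings described above (all values are illustrative):

    from jaclearn.rl.envs.maze.maze import MazeEnv

    # Sparse rewards: moves cost nothing here; only arrival pays off.
    sparse_env = MazeEnv(map_size=14, obs_ratio=0.3, dense_reward=False,
                         reward_move=0, reward_final=10, reward_error=-2)

    # Dense rewards: each valid move earns a small positive reward, and the
    # agent only observes a 7x7 window centered on itself.
    dense_env = MazeEnv(map_size=(9, 13), visible_size=7,
                        dense_reward=True, reward_move=1)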
- __new__(**kwargs)#
- action(action)#
- append_stat(name, value)#
- clear_stats()#
- evaluate_one_episode(func)#
- finish(*args, **kwargs)#
- play_one_episode(func, ret_states=False, ret_actions=False, restart_kwargs=None, finish_kwargs=None, max_steps=10000)#
- property action_delta#
Action deltas: the tuple (dy, dx) applied to the current point when you perform action i.
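A sketch of predicting the next cell from action_delta when stepping manually; the restart() call (suggested by restart_kwargs in play_one_episode) and the stay-put behavior on an invalid move are assumptions:

    env = MazeEnv(map_size=14)
    env.restart()                 # assumed entry point, per restart_kwargs above

    r, c = env.current_point
    dy, dx = env.action_delta[0]  # displacement for action 0
    env.action(0)                 # perform the move

    # If the move was valid, the agent now sits at (r + dy, c + dx);
    # otherwise it stays put and collects reward_error.
    print(env.current_point, (r + dy, c + dx))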
- property action_mapping#
If random action mapping is enabled, return the internal mapping
- property action_space#
- property canvas#
Return the raw canvas (full)
- property canvas_size#
Canvas size
- property current_point#
Current point (r, c)
- property current_state#
- property distance_mat#
Distance matrix
- property distance_prev#
Distance-prev matrix
- property final_point#
Finish point (r, c)
- property inv_distance_mat#
- property inv_distance_prev#
- property map_size#
Map size
- property obstacles#
- property origin_canvas#
Return the original canvas (at time 0, full)
- property quick_distance_mat#
Distance matrix, computed during the first run of SPFA; if you ensure that all valid points are in the same connected component, you can use it directly.
- property quick_distance_prev#
Distance-prev matrix; see also quick_distance_mat.
- property rewards#
A tuple of 4 values, representing the rewards for each action type: (move, no-action, arrive at final point, invalid move).
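A small sketch unpacking the tuple in the documented order, continuing the example above:

    # (move, no-action, arrive at final point, invalid move)
    reward_move, reward_noaction, reward_final, reward_error = env.rewards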
- property shortest_path#
One of the shortest paths from start to finish, as a list of points (r, c).
- property start_point#
Start point (r, c)
- property stats#
- property unwrapped#
- property visible_size#
Visible size