Learning Modes for Sequential Decision Making Using Stochastic Search
To develop plans and operate autonomously, robots need knowledge bases encoded in the Planning Domain Definition Language (PDDL). The purpose of this research is to explore learning methods that reduce the human supervision needed to acquire these knowledge bases. A reinforcement learning environment was developed in which a learning agent explores the meta-space of all possible knowledge bases. A reward signal, computed by evaluating a candidate knowledge base against a problem set, was created to guide the agent toward a correct model. Further research will explore the use of heuristics and generalizations to improve the agent’s learning outcomes.
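
The environment and reward described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a knowledge base can be modeled as a set of candidate statements, that each problem is a pass/fail check on a knowledge base, and that the search is a simple random hill climb. All names (KnowledgeBaseEnv, stochastic_search, the example statements) are hypothetical, and a real system would call a PDDL planner where the toy checks appear.

```python
import random
from typing import Callable, FrozenSet, List, Tuple

# Hypothetical stand-ins: a knowledge base is a set of candidate statements,
# and a problem is a check that reports whether a knowledge base solves it.
KnowledgeBase = FrozenSet[str]
Problem = Callable[[KnowledgeBase], bool]


class KnowledgeBaseEnv:
    """Toy environment whose states are candidate knowledge bases."""

    def __init__(self, candidates: List[str], problems: List[Problem]) -> None:
        self.candidates = candidates
        self.problems = problems
        self.state: KnowledgeBase = frozenset()

    def reset(self) -> KnowledgeBase:
        # Start every episode from the empty knowledge base.
        self.state = frozenset()
        return self.state

    def reward(self, kb: KnowledgeBase) -> float:
        # Reward is the fraction of the problem set that this knowledge base solves.
        return sum(problem(kb) for problem in self.problems) / len(self.problems)

    def step(self, action: int) -> Tuple[KnowledgeBase, float]:
        # An action toggles one candidate statement in or out of the knowledge base.
        self.state = self.state ^ frozenset({self.candidates[action]})
        return self.state, self.reward(self.state)


def stochastic_search(
    env: KnowledgeBaseEnv, steps: int, seed: int = 0
) -> Tuple[KnowledgeBase, float]:
    """Random hill climb: keep a random toggle only if reward does not drop."""
    rng = random.Random(seed)
    best = env.reset()
    best_reward = env.reward(best)
    for _ in range(steps):
        env.state = best  # always propose a change to the best knowledge base so far
        candidate, candidate_reward = env.step(rng.randrange(len(env.candidates)))
        if candidate_reward >= best_reward:
            best, best_reward = candidate, candidate_reward
    return best, best_reward


# Toy problem set: the first problem needs "a"; the second needs "b" and is
# broken by the incorrect statement "c".
toy_problems: List[Problem] = [
    lambda kb: "a" in kb,
    lambda kb: "b" in kb and "c" not in kb,
]
toy_env = KnowledgeBaseEnv(["a", "b", "c"], toy_problems)
learned_kb, learned_reward = stochastic_search(toy_env, steps=200)
print(sorted(learned_kb), learned_reward)
```

Scoring by the fraction of problems solved gives the agent partial credit for partially correct knowledge bases, which is one plausible way a problem-set evaluation could yield a usable learning signal.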