Constrained Learning from Demonstration

Injecting abstract behavioral constraints into Learning from Demonstration techniques for faster, safer, and more reliable robot skill learning.

by Carl Mueller on May 02, 2019

Carl’s research focuses on developing new techniques that let non-expert users communicate complicated and abstract information to robotic learning systems. It centers on Learning from Demonstration (LfD): a set of techniques that enable human users to teach a robot how to perform a task without needing programming knowledge. Traditionally, such techniques rely on teleoperation, kinesthetic learning, or imitation learning, most of which record low-level data such as end-effector position and robot configuration. These low-level state spaces have limited bandwidth for capturing many of the important factors and abstract concepts relevant to successful skill learning and execution. This is in stark contrast to how human beings teach one another. For example, a parent might teach a child to ride a bike by demonstration, but they likely also use some combination of language, gesturing, object signalling, and positive or negative feedback to convey higher-level concepts.

Unfortunately, most robotic learning systems do not yet utilize such information and instead rely on the low-level data mentioned above. It’s as though, to teach someone how to pedal a bike, we’ve opted to physically move their legs in a pedaling motion. However, advances in natural language understanding, augmented reality interfaces, task and motion planning, and machine learning all help to enrich LfD with multi-modal techniques. In his first paper, Carl introduced the idea of a ‘concept constraint’: an abstract restriction on the robot’s behavior that is essential to successful skill learning and execution (e.g., keeping a cup upright until it is over the target). Encoded as boolean operators, these constraints can be communicated via natural language and evaluated against the state of the environment, thereby providing more sophisticated information alongside demonstration data. The result is a more efficient robot learner capable of safer, more robust, and more trustworthy task execution.
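
To make the idea of a boolean constraint concrete, here is a minimal Python sketch of the cup example. It is an illustration only, not the implementation from Carl’s paper: the `State` fields, the tilt and radius thresholds, and the helper names are all hypothetical, and "upright until over the target" is simplified to a stateless check (the cup is either upright or over the target) applied to each recorded waypoint.

```python
from dataclasses import dataclass
from math import hypot
from typing import Callable, List, Tuple

# Hypothetical environment state; the field names are assumptions for illustration.
@dataclass
class State:
    cup_tilt_deg: float             # cup tilt away from vertical, in degrees
    cup_xy: Tuple[float, float]     # cup position in the table plane, in meters
    target_xy: Tuple[float, float]  # target position in the table plane, in meters

# A constraint is a boolean function of the environment state.
Constraint = Callable[[State], bool]

def upright(max_tilt_deg: float = 10.0) -> Constraint:
    """True while the cup is tilted no more than max_tilt_deg (assumed threshold)."""
    return lambda s: abs(s.cup_tilt_deg) <= max_tilt_deg

def over_target(radius: float = 0.05) -> Constraint:
    """True while the cup is within radius meters of the target (assumed threshold)."""
    return lambda s: hypot(s.cup_xy[0] - s.target_xy[0],
                           s.cup_xy[1] - s.target_xy[1]) <= radius

def either(a: Constraint, b: Constraint) -> Constraint:
    """Logical OR of two constraints."""
    return lambda s: a(s) or b(s)

# Simplified "keep the cup upright until over the target".
upright_until_over_target = either(upright(), over_target())

def violations(trajectory: List[State], constraint: Constraint) -> List[int]:
    """Indices of waypoints at which the constraint evaluates to False."""
    return [i for i, s in enumerate(trajectory) if not constraint(s)]

demo = [
    State(2.0, (0.40, 0.00), (0.00, 0.00)),   # upright, far from target
    State(35.0, (0.20, 0.00), (0.00, 0.00)),  # tilted, far from target
    State(80.0, (0.01, 0.02), (0.00, 0.00)),  # tilted, but over the target
]

print(violations(demo, upright_until_over_target))  # prints [1]
```

Only the second waypoint is flagged, because the cup is tilted there before it reaches the target. A learner could use such flags to discard or down-weight demonstration data that violates a constraint, though how the constraints actually feed into learning is beyond what this sketch shows.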