Learning tasks across different environments

27 July 2010

In the future, robots will be expected to learn a task and then execute it in a variety of realistic situations. Reinforcement-learning and planning algorithms are designed for exactly this purpose. One of the main challenges, however, is ensuring that actions learned in one environment can be reused in new and unforeseen situations in real time.

To address this challenge, Stolle et al. have developed a series of algorithms, which they demonstrate on complex tasks such as solving a marble maze or making Boston Dynamics' Little Dog navigate over rough terrain (see video below).

The first ingredient of success is to have robots learn which action to take based on local features, meaning features as seen from the robot's own viewpoint (e.g. "there is a wall to the right"). These local features can then be recognized in new environments whenever the robot finds itself in a similar situation. In contrast, many existing algorithms rely on global information, for example "perform this action in position (x,y,z)". Change the environment, however, and such global policies typically become useless.
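To make the idea concrete, here is a minimal sketch (not from the paper; the feature names, actions, and grid-world sensing are illustrative assumptions) of a policy keyed on locally sensed features rather than global coordinates:

```python
# Hypothetical sketch: a policy keyed on features the robot can sense
# from its own viewpoint, so it transfers to any environment where the
# same local situation occurs. Feature names and actions are made up
# for illustration; they are not from Stolle et al.

def sense_local_features(robot_x, robot_y, walls):
    """Return the set of locally observable features (assumed grid sensing)."""
    features = set()
    if (robot_x + 1, robot_y) in walls:
        features.add("wall_right")
    if (robot_x - 1, robot_y) in walls:
        features.add("wall_left")
    if (robot_x, robot_y + 1) in walls:
        features.add("wall_ahead")
    return frozenset(features)

# Policy learned in one maze: local situation -> action.
local_policy = {
    frozenset({"wall_right"}): "move_forward",
    frozenset({"wall_right", "wall_ahead"}): "turn_left",
    frozenset(): "move_forward",
}

def act(robot_x, robot_y, walls):
    """Pick an action from the local situation, with a default fallback."""
    features = sense_local_features(robot_x, robot_y, walls)
    return local_policy.get(features, "move_forward")

# A maze the policy never saw still triggers the learned action
# whenever the local situation matches one seen during training.
print(act(2, 5, {(3, 5)}))          # wall to the right -> "move_forward"
print(act(2, 5, {(3, 5), (2, 6)}))  # wall right + ahead -> "turn_left"
```

A global policy would instead index actions by the absolute position (2, 5), which stops being meaningful as soon as the maze changes.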

The second ingredient has robots build libraries of action sequences (trajectories) that can bring a robot from a given state to a desired goal. At run time, the robot applies the actions from the trajectory whose start state is nearest to its current state. This strategy is attractive because it is computationally cheap and does not require large amounts of fast memory.
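The lookup step can be sketched as follows (a toy illustration under assumed 2D states and made-up action names, not the authors' implementation):

```python
# Hypothetical sketch of a trajectory library: each entry pairs a start
# state with the action sequence that reached the goal from there. At
# run time the robot replays the actions of the nearest stored start
# state -- a simple lookup rather than fresh planning, which is why the
# approach is cheap in both computation and memory.
import math

library = [
    # (start_state, actions leading to the goal)
    ((0.0, 0.0), ["north", "north", "east"]),
    ((4.0, 1.0), ["west", "north"]),
    ((2.0, 5.0), ["east", "east", "south"]),
]

def nearest_trajectory(state):
    """Return the action sequence stored for the nearest start state."""
    def distance(entry):
        (sx, sy), _ = entry
        return math.hypot(state[0] - sx, state[1] - sy)
    _, actions = min(library, key=distance)
    return actions

print(nearest_trajectory((3.5, 1.2)))  # nearest start is (4.0, 1.0)
```

In a real system the state would be higher-dimensional and the distance metric task-specific, but the core operation stays the same: a nearest-neighbor query into a precomputed library.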

Finally, don't miss the following video of Little Dog climbing over a fence. This special-purpose behavior can be reused in a variety of situations.

Sabine Hauert is President of Robohub and Associate Professor at the Bristol Robotics Laboratory

©2021 - ROBOTS Association

