Readers of this blog will know that I’ve become very excited by the potential of robots with simulation-based internal models in recent years. So far we’ve demonstrated their potential in simple ethical robots and as the basis for rational imitation. Our most recent publication instead examines the potential of robots with simulation-based internal models for safety. Of course it’s not hard to see why the ability to model and predict the consequences of both your own and others’ actions can help you to navigate the world more safely than without that ability.
Our paper Simulation-Based Internal Models for Safer Robots demonstrates the value of anticipation in what we call the corridor experiment. Here a smart robot (equipped with a simulation based internal model which we call a consequence engine) must navigate to the end of a corridor while maintaining a safe space around it at all times despite five other robots moving randomly in the corridor – in much the same way you and I might have to navigate down a busy office corridor while others are coming in the opposite direction.
Here is the abstract from our paper:
In this paper, we explore the potential of mobile robots with simulation-based internal models for safety in highly dynamic environments. We propose a robot with a simulation of itself, other dynamic actors and its environment, inside itself. Operating in real time, this simulation-based internal model is able to look ahead and predict the consequences of both the robot’s own actions and those of the other dynamic actors in its vicinity. Hence, the robot continuously modifies its own actions in order to actively maintain its own safety while also achieving its goal. Inspired by the problem of how mobile robots could move quickly and safely through crowds of moving humans, we present experimental results which compare the performance of our internal simulation-based controller with a purely reactive approach as a proof-of-concept study for the practical use of simulation-based internal models.
So, does it work? Thanks to some brilliant experimental work by Christian Blum the answer is a resounding yes. The best way to understand what’s going on is with this wonderful gif animation of one experimental run below. The smart robot (blue) starts at the left and has the goal of safely reaching the right hand end of the corridor – its actual path is also shown in blue. Meanwhile 5 (red) robots are moving randomly (including bouncing off walls) and their actual paths are also shown in red; these robots are equipped only with simple obstacle avoidance behaviours. The larger blue circle shows blue’s ‘attention radius’ – to reduce computational effort blue will only model red robots within this radius. The yellow paths in front of the red robots in blue’s attention radius show blue’s predictions of how those robots will move (taking into account collisions with the corridor walls and with blue and each other). The light blue projection in front of blue shows which of the 34 next possible actions of blue that is internally modelled is actually chosen as the next action (which, as you will see, sometimes includes standing still).
What do the results show us? Christian ran lots of trials – 88 simulations and 54 real robot experiments – over four experiments: (1) the baseline in simulation – in which the blue robot has only a simple reactive collision avoidance behaviour, (2) the baseline with real robots, (3) using the consequence engine (CE) in the blue robot in simulation, and (4) using the consequence engine in the blue robot with real robots. In the results below (a) shows the time taken for the blue robot to reach the end of the corridor, (b) shows the distance that the blue robot covers while reaching the end of the corridor, (c) shows the “danger ratio” experienced by the blue robot, and (d) shows the number of consequence engine runs per timestep in the blue robot. The danger ratio is the percentage of the run time that anther robot is within the blue robot’s safety radius.
For a relatively small cost in additional run time and distance covered, panels (a) and (b), the danger ratio is very significantly reduced from a mean value of ~20% to a mean value of zero, panel (c). Of course there is a computational cost, and this is reflected in panel (d); the baseline experiment has no consequence engine and hence runs no simulations, whereas the smart robot runs an average of between 8 and 10 simulations per time-step. This is exactly what we would expect: predicting the future clearly incurs a computational overhead.