Noisy imitation speeds up group learning


by Alan Winfield
03 December 2013




Broadly speaking there are two kinds of learning: individual learning and social learning. Individual learning means learning something entirely on your own, without reference to anyone else who might have learned the same thing before. The flip side of individual learning is social learning, which means learning from someone else. We humans are pretty good at both, although we very rarely have to truly work something out from first principles. Most of what we learn, we learn from teachers, parents, grandparents and countless others. We learn everything from how to make chicken soup to flying an aeroplane by watching others who already know the recipe (or wrote it down), or have mastered the skill. For modern humans I reckon it’s pretty hard to think of anything we have truly learned on our own; perhaps learning to control our own bodies as babies, leading to crawling and walking, is a candidate for individual learning (although as babies we are surrounded by others who already know how to walk – would we walk at all if everyone else got around on all fours?). Learning to ride a bicycle is perhaps also one of those things no-one can really teach you – although it would be interesting to compare someone who has never seen a bicycle, or anyone riding one, with those (most of us) who watch others riding bicycles long before climbing on one ourselves.

In robotics we are very interested in both kinds of learning, and methods for programming robots that can learn are well known. A standard method for individual learning is called reinforcement learning (RL). It’s a laborious process in which the robot tries out lots and lots of actions and gets feedback on whether each action helps or hinders it in getting closer to its goal – actions that help are reinforced, so the robot is more likely to try them again, while actions that hinder are weakened, so it is less likely to; it’s a bit like shouting “warm, hot, cold, colder…” in a game of hide-and-seek. It’s fair to say that RL in robotics is pretty slow; robots are not good individual learners, but that’s because, in general, they have no prior knowledge. As a fair comparison, think of how long it would take you to learn how to make fire from first principles if you had no idea that getting something hot may, if you have the right materials and are persistent, create fire, or that rubbing things together can make them hot. Roboticists are also very interested in developing robots that can learn socially, especially by imitation. Robots that you can program by showing them what to do (called programming by demonstration) clearly have a big advantage over robots that have to be explicitly programmed for each new skill.
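To give a flavour of how individual reinforcement learning works in practice, here is a minimal sketch in Python of tabular Q-learning on a toy grid. Everything in it – the grid size, the rewards, the learning parameters – is invented for illustration; it is not the controller running on the e-pucks.

import random

N = 5                                           # toy 5x5 grid: start top right, goal bottom left
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right
Q = {}                                          # Q[(state, action)] -> estimated value

def step(state, a):
    """Apply action a; reward on reaching the goal, a small cost otherwise."""
    r, c = state
    dr, dc = ACTIONS[a]
    nxt = (min(max(r + dr, 0), N - 1), min(max(c + dc, 0), N - 1))
    done = nxt == (N - 1, 0)
    return nxt, (1.0 if done else -0.01), done

alpha, gamma, eps = 0.5, 0.9, 0.1               # learning rate, discount, exploration rate
for episode in range(500):
    s, done = (0, N - 1), False
    while not done:
        # mostly exploit the best-looking action so far, sometimes explore
        if random.random() < eps:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda x: Q.get((s, x), 0.0))
        s2, reward, done = step(s, a)
        # the feedback step: helpful actions are reinforced, unhelpful ones weakened
        best_next = max(Q.get((s2, x), 0.0) for x in range(len(ACTIONS)))
        old = Q.get((s, a), 0.0)
        Q[(s, a)] = old + alpha * (reward + gamma * best_next - old)
        s = s2

Even on this tiny problem the robot needs hundreds of episodes of trial and error, which is why RL on real robots takes hours.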

Within the Artificial Culture project, PhD student (now Dr) Mehmet Erbas developed a new way of combining social learning by imitation with individual reinforcement learning, and the paper setting out the method, together with results from simulation and real robots, has been published in the journal Adaptive Behavior. Let me explain the experiments with real robots, and what we have learned from them.

Here’s our experiment. We have two small wheeled robots called e-pucks. Each robot has its own compartment and must – using individual (reinforcement) learning – learn how to navigate from the top right-hand corner to the bottom left-hand corner of its compartment. Learning this way is slow, taking hours. But in this experiment the robots also have the ability to learn socially, by watching each other. Every so often one of the robots will stop its individual learning and drive itself out of its own compartment to the small opening at the bottom left of the other compartment. There it will stop and simply watch the other robot, while that robot is learning, for a few minutes. Using a movement imitation algorithm the watching robot will (socially) learn a fragment of what the other robot is doing, then combine this knowledge into what it is individually learning. The robot then runs back to its own compartment and resumes its individual learning. We call this combination of social and individual learning ‘imitation-enhanced learning’.
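In outline, the combined loop might be sketched as follows. The method names (reinforcement_learning_step, observe_fragment) and the simple ‘value bonus’ used here to fold the observed fragment into the watcher’s own estimates are hypothetical stand-ins for illustration – the combination method in the paper differs in detail.

def merge_fragment(Q, fragment, bonus=0.1):
    """Fold an observed (state, action) fragment into the watcher's own value
    table by giving the copied moves a small bonus. This combination rule is a
    hypothetical stand-in, not the paper's method."""
    for state, action in fragment:
        Q[(state, action)] = Q.get((state, action), 0.0) + bonus

def imitation_enhanced_learning(robot, peer, total_steps, observe_every=100):
    """One robot's learning loop: individual RL, punctuated by brief episodes
    of watching the other robot. The robot object and its methods are assumed
    for this sketch."""
    for t in range(1, total_steps + 1):
        robot.reinforcement_learning_step()          # ordinary individual RL
        if t % observe_every == 0:
            # drive to the opening and watch the other robot for a while...
            fragment = robot.observe_fragment(peer)  # movement imitation
            # ...then blend what was seen into this robot's own estimates
            merge_fragment(robot.Q, fragment)
            # ...and return to the compartment to resume individual learning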

In order to test the effectiveness of our new imitation-enhanced learning algorithm we first run the experiment with imitation turned off, so the robots learn only individually. This gives us a baseline for comparison. We then run two experiments with imitation-enhanced learning. In the first we wait until one robot has completed its individual learning, so that it is an ‘expert’; the other robot then learns using its combination of individual learning and social learning from the expert. Not surprisingly, learning this way is faster.

This graph shows individual learning only as the solid black line, and imitation-enhanced learning from an expert as the dashed line. In both cases learning is more or less complete when the curve flattens from rising steeply to horizontal. We see that individual learning takes around 360 minutes (6 hours). With the benefit of an expert to watch, learning time drops to around 60 minutes.

The second experiment is even more interesting. Here we start the two robots at the same time, so that both are equally inexpert. You might think that watching another novice wouldn’t help at all, but remarkably each robot learns faster when it can observe, from time to time, the other inexpert robot than when learning entirely on its own. As the second graph shows, the speedup isn’t as dramatic – but imitation-enhanced learning is still faster.

Think of it this way. It’s like two novice cooks, neither of whom knows how to make chicken soup. Each is trying to figure it out by trial and error but, from time to time, they can watch each other. Even though it’s pretty likely that each will copy some things that lead to worse chicken soup, on average and over time, each hapless cook will learn how to make chicken soup a bit faster than if they were learning entirely alone.

In the paper we analyse what’s going on when one robot imitates part of the semi-learned sequence of moves of the other. And here we see something completely unexpected. Because the robots imitate each other imperfectly – when one robot watches another and then tries to copy what it saw, the copy will not be perfect – from time to time one inexpert robot will miscopy the other, and the miscopy, by chance, helps it to learn. To use the chicken soup analogy: it’s as if you spy on the other cook, try to copy what they’re doing, get it wrong and, by accident, end up with better chicken soup.
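You can picture the effect as noise injected into the copied fragment. In the sketch below the miscopying is an explicit random substitution with a made-up error rate; on the real robots the noise comes for free, from imperfect sensing and movement, not from a random number generator.

import random

def noisy_copy(fragment, num_actions=4, error_rate=0.15):
    """Return an imperfect copy of an observed (state, action) fragment.
    The error rate is a made-up figure for illustration, not measured
    from the robots."""
    copied = []
    for state, action in fragment:
        if random.random() < error_rate:
            # miscopy: substitute a different move; once in a while this
            # happens, by chance, to be better than the move observed
            action = random.randrange(num_actions)
        copied.append((state, action))
    return copied

Occasionally the substituted move happens to be better than the one actually observed, so the miscopying acts like extra exploration for the watching robot.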

This is deeply interesting because it suggests that when we learn in groups, making mistakes – noisy social learning – can actually speed up learning for each individual and for the group as a whole.

Full Reference
Mehmet D. Erbas, Alan F. T. Winfield and Larry Bull (2013), Embodied imitation-enhanced reinforcement learning in multi-agent systems, Adaptive Behavior. Published online 29 August 2013.





Alan Winfield is Professor in robotics at UWE Bristol. He communicates about science on his personal blog.




