In this episode, Abate De Mey interviews Guido De Croon about Evolutionary Robotics and its use to design behaviors for flying robots. They discuss a recent paper by Kirk Scheper et al., in which the DelFly UAV robot learns to fly through an open window when trapped inside a room thanks to a controller optimized using a Genetic Algorithm. The controller is programmed using a Behavior Tree Framework, which is more intuitive and adaptable than the traditional Neural Network framework. This helps the user to manually adapt the controller to handle the differences between the simulation and the real world. They go on to discuss the challenges and benefits of using Evolutionary Robotics to learn robot behaviors.
Video of the Evolutionary Robotics strategy used to develop a controller for the DelFly robot:
Guido De Croon
Guido de Croon is Assistant Professor at the Micro Air Vehicle lab of Delft University of Technology in the Netherlands. His research interest lies with computationally efficient algorithms for robot autonomy, with a particular focus on computer vision and evolutionary robotics.
This transcript has been edited for clarity.
Abate De Mey: Welcome to the Robots Podcast. Could you please introduce yourself to our listeners?
Guido De Croon: Hi, I’m Guido De Croon and I’m assistant professor in the Micro Air Vehicle Lab of TU Delft in the Netherlands. I work on the artificial intelligence of tiny robots.
Abate De Mey: You also do work with evolutionary robotics. Can you talk about evolutionary robotics, and what are some real world applications of them?
Guido De Croon: Evolutionary robotics is a domain in which we chase the following dream: we would like to have a general method such that, if we give a robot a task, we only need to come up with the fitness function, a way of saying how well a robot is doing the task. Then, there’s an artificial evolutionary process that takes over the design work and evolves the robots, like the body, sensors, actuators and the controller, to do this task.
Abate De Mey: What are some things that you optimize using evolutionary robotics tactics?
Guido De Croon: Typically, it’s a task you want the robot to do. You come up with a fitness function, which is actually a difficult part of evolutionary robotics. In the last few years, we’ve also seen special fitness functions like rewarding robots to do novel stuff. We call this novelty search. In my case, I’ve been evolving robots to do different tasks, such as finding odor sources. Recently, I’ve been evolving small robots to perform the task of flying in a room, avoiding obstacles, finding a window and flying through the window.
Abate De Mey: What are some limitations of evolutionary robotics?
Guido De Croon: I don’t think there are fundamental limitations to evolutionary robotics because we will have technologies in the end to perform an actual evolution with it. I think the major limitation at the moment is that if you want to do evolution on real robots, it will happen in real time, which means it can actually take a very long time. In the worst case, it really takes as long as a real evolution. You can imagine that … Especially if we want a complex robot to do a complex task, then this might take a really long time.
That’s why most people in evolutionary robotics evolve robots in simulation. You have a simulation with your robot in it. It has a task to do. You can test out thousands of what we call individuals, so different robot solutions. You can have the best ones procreate and create offspring. Then at the end of this artificial evolutionary process in simulation, you need to transfer the best solution to the real world. This is a difficult step: simulation goes fast partly because you don’t model everything that’s happening in the real world. The real robot you use is typically different from the one that was used in simulation, so this is a challenge.
There’s a difference between simulation and reality. In the field of evolutionary robotics, it’s called the reality gap. Crossing the reality gap is another major challenge at the moment.
The second limitation is that the complexity of the tasks it performs is pretty limited. We have quite a few studies on finding light or avoiding obstacles, things that can be done with simple controllers. The question is: how well does the approach scale to more complex problems, such as finding odor sources in turbulent conditions, or really navigating in an environment, being able to find something and then return to the home location?
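The loop described above (test many individuals, let the best ones procreate, keep the best solution) can be sketched in a few lines of Python. This is only an illustrative toy, not the setup used in the DelFly work: the `fitness` function, the population size, the crossover scheme, and the mutation scale are all assumptions chosen for the example, and a real study would replace `fitness` with a run of the robot simulator.

```python
import random


def fitness(genome):
    # Hypothetical stand-in for "how well the robot did the task".
    # Higher is better; the optimum is every gene equal to 0.5.
    return -sum((g - 0.5) ** 2 for g in genome)


def evolve(pop_size=30, genome_len=5, generations=40, mutation_std=0.1, seed=0):
    rng = random.Random(seed)
    # Each individual is one candidate solution (here: a list of parameters).
    population = [[rng.random() for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate everyone and keep the better half as parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0.0, mutation_std) for g in child]  # mutation
            children.append(child)
        # Parents survive unchanged, so the best fitness never gets worse.
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

The individual returned at the end is the one that would then have to be transferred to the real robot.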
Abate De Mey: What are some tactics you use to overcome the reality gap?
Guido De Croon: There are different tactics that have been developed. One way is to do an evolution in simulation, then taking the solution and continuing the evolution on real robots. You perform a lot of the evolution in simulation, and then just perform final adaptations on the platform. Still, the real robot may be quite different. It may take generations before you get the right solution. This way of handling the reality gap still requires a long time.
There are other ways though. There are ways in which the mismatch between simulation and reality is modeled during evolution. What they do then is once every so many generations, for example, once in every hundred generations, they try out an individual on real robots. They take the difference in the performance. They try to model this in the simulation and try to adapt the simulation and predict if solutions will work in the real world. That’s another approach.
I think that the final approach, which may be very successful, is to combine evolutionary learning, which is a slow process, with developmental capabilities. They call this evo-devo. It means that you evolve the robot and its controller, but you also evolve adaptive mechanisms that will allow the robot to also learn online, so during operation.
Abate De Mey: Have you implemented evo-devo before in a project?
Guido De Croon: Yes, some while ago. Until now it’s generally acknowledged that this will be important, but the golden bullet there has not been discovered yet. I do think that in the future this is promising.
The question why not is perhaps a good one as well. Why didn’t we try that yet? Also, because the learning mechanism has to be really fast. In my case, I focus on robots such as small flying robots. Suppose you combine evolutionary learning with reinforcement learning, still the robot will have to do reinforcement learning. Also reinforcement learning itself can take quite some time. It involves risks for the robot. Then you don’t really solve your problem. You really need to find the right combinations of developmental algorithms and evolutionary algorithms.
Abate De Mey: Could you tell us about a project where you have implemented evolutionary robotics?
Guido De Croon: Yes. I’ve implemented evolutionary robotics in quite a few projects. I think a very good example is that we recently implemented evolutionary robotics for our tiny flying robot called the DelFly Explorer. This is a dragonfly-like robot. It weighs 20 grams. In those 20 grams, we have a stereo vision system, so two cameras, and a small processor with which the DelFly can see obstacles and can do obstacle avoidance.
Now, we developed this DelFly Explorer a few years ago. At the time, we showed that it could avoid obstacles and fly around completely by itself, without any human intervention or any off-board computer. Our PhD student Sjoerd Tijmons made what I think is a very elegant algorithm to control the robot. The main idea behind the algorithm is that the DelFly always keeps a space in front of it, a free space, large enough to make a turnaround maneuver. If it sees an obstacle coming into the zone, it will do this prefixed maneuver. It always turns in front of the obstacle and then it can circle as long as it wants until it finds a new flight direction. This, I think, can be compared a bit with some insect behaviors that you see in nature. We were, of course, very happy with it.
Then we thought now we want to do another task, so we want, for example, our DelFly to fly through a window, to fly in a room, find a window and fly through it. Do we then have to search for another Sjoerd, another PhD student, to come up with another very nice algorithm? The answer actually is no. It will be great if we can give this task to an evolutionary process and this evolutionary process finds a solution that’s not only effective and robust, but also as computationally efficient as the solution we found ourselves for obstacle avoidance. This is very important for these small flying robots because the processor that we have onboard, for example, has only 192 kilobytes of memory.
The nice thing about evolutionary robotics is that you can beforehand define the structure that evolution has to evolve, and you can ensure that it’s computationally efficient enough. If it then finds a solution, then it fits within the constraints of the onboard platform. This project we did last year, it gave us really some insight into this matter and into how to cross the reality gap.
Abate De Mey: How successful was the DelFly, which was optimized with the evolutionary robotics framework, at flying through the window?
Guido De Croon: That’s, of course, the most important question. The approach we took is quite different from what is typically done in evolutionary robotics. Evolutionary robotics is, of course, bio-inspired, inspired by natural evolution. This also typically implies that neural networks are used as the robot’s controller because this is also bio-inspired. It’s parallel. Of course, neural networks are very capable, especially the deep neural networks. The problem of neural networks in evolutionary robotics is that it’s pretty hard to understand what evolution came up with. I’m not saying it’s impossible, not at all.
For multiple tasks, I’ve evolved neural networks, for example, also for odor source finding. When you do that afterwards, you can study it. You have the neural networks. You can, for example, clamp neurons to a certain value, or you can influence the input that the robot has, and you can analyze what happens then to the actions. Actually you can understand what it’s doing, what the strategy is.
When I was doing a study on odor source localization, I had robots in simulation performing really well. I tried to understand the networks. Then I was like, “When can you say that you really understand this?” In order to test whether I understood the strategy, I took the strategy that evolution found in the neural network and re-implemented it as a finite-state machine. Then I ran the finite-state machine, and it got a similar performance as the original robot.
Then I thought, “Hey, this finite-state machine, if you look at it, you understand immediately what the robot is doing.” Suppose that you have this type of controller and you put it on a real robot, if something is not working correctly, then you can probably understand what is going wrong and adapt it yourself. That’s why when we wanted to evolve a controller for the real DelFly, we took a similar kind of controller. Instead of a finite-state machine, we took a behavior tree.
A behavior tree is a controller that is used a lot in the gaming industry because it’s pretty intuitive. In comparison with finite-state machines, it does not have what we call a state explosion. When you add complexity to the controller, it doesn’t immediately lead to a huge explosion of the controller structure. With the behavior tree, you can solve complex problems and the controller will still look pretty compressed, so pretty small.
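To make the idea of a behavior tree concrete, here is a minimal sketch in Python. It is not the tree that was evolved for the DelFly: the node types are the standard sequence and selector nodes, while the blackboard keys (`obstacle_distance`, `window_visible`, `turn_rate`), the thresholds, and the command values are hypothetical and picked only for illustration.

```python
class Condition:
    """Leaf that succeeds when a predicate on the blackboard holds."""
    def __init__(self, predicate):
        self.predicate = predicate

    def tick(self, blackboard):
        return bool(self.predicate(blackboard))


class Action:
    """Leaf that writes a turn-rate command and always succeeds."""
    def __init__(self, turn_rate):
        self.turn_rate = turn_rate

    def tick(self, blackboard):
        blackboard["turn_rate"] = self.turn_rate
        return True


class Sequence:
    """Succeeds only if every child succeeds; stops at the first failure."""
    def __init__(self, *children):
        self.children = children

    def tick(self, blackboard):
        return all(child.tick(blackboard) for child in self.children)


class Selector:
    """Tries children in order; stops at the first one that succeeds."""
    def __init__(self, *children):
        self.children = children

    def tick(self, blackboard):
        return any(child.tick(blackboard) for child in self.children)


# Hypothetical tree: avoid obstacles first, else head for the window, else search.
tree = Selector(
    Sequence(Condition(lambda bb: bb["obstacle_distance"] < 1.0), Action(turn_rate=1.0)),
    Sequence(Condition(lambda bb: bb["window_visible"]), Action(turn_rate=0.0)),
    Action(turn_rate=0.3),
)


def decide(obstacle_distance, window_visible):
    blackboard = {"obstacle_distance": obstacle_distance,
                  "window_visible": window_visible}
    tree.tick(blackboard)
    return blackboard["turn_rate"]


print(decide(0.5, True), decide(3.0, True), decide(3.0, False))
```

Because each branch is a readable condition followed by an action, it is easy to see which part of the tree is active at any moment and to adjust a single threshold or command by hand.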
In our case, for the DelFly flying through the window, we evolve the behavior trees. Evolution came up with a very elegant solution that was smaller than the hand designed one that we made ourselves and that had a higher success rate in simulation. Then, of course, the question was if we have this behavior tree, can we now understand it? Will this help us if we go to the real world?
We took the controller from simulation and put it into the real DelFly. Of course, the first time we flew it or the first few times, it would just crash against the wall. That was not very nice. Since we could easily see in what part of the tree it was active, we could easily adapt some parameters in the behavior tree to really boost the performance and get it more similar to the simulation performance.
Your original question was how successful was it? Well, in simulation it would get through the window 90% of the time. After tuning it on the real robot, it only flew through 53% of the time or 55. Don’t test me about the exact number. That’s, of course, much, much lower and really not satisfactory. The reason was that the robot which was evolved in simulation would, let’s say, take a bit more risk. It would fly very close to the window border which in simulation didn’t matter. In the real world, we were testing in a pretty artificial environment, so a small room that we built indoors with a window in it. It turned out that even in this indoor space, there were drafts and these drafts were coming through the window, and they were pushing the DelFly away from the window basically. If you then have a risky strategy of going a bit close to the border and you’re pushed away by the draft, then you actually hit the window border a bit more often than in simulation.
It taught us, yes, these behavior trees, they’re easier to understand. You can then manually adapt them for the real robot, but it doesn’t solve the reality gap completely, which, of course, would have been almost too easy. The robot exploited the fact that there was no wind in the simulation environment. We have now, of course, added wind to the simulation environment, and we’re re-evolving and seeing how that goes. It’s not the full answer to this problem, but it is a very promising step.
Abate De Mey: How adaptable would the controller that you created with evolutionary robotics be to a room with different parameters, different dimensions?
Guido De Croon: That’s also a very good question. Let’s say, partly a different room would be no problem. For example, our DelFly uses stereo vision. If you change the texture in the room, it would be okay. If you would make the walls completely textureless, that could be a problem actually. Then you would need other visual inputs, for example.
In this specific article that we wrote on this study, we only use one size of room. I think it was square and then it would fly through the window. One of the pitfalls of evolutionary robotics is that the solution that you evolved in simulation will exploit everything it can from that environment. If you only evolve in a square room, then it’s possible that it will generalize to rectangular rooms. It’s also possible that your solution will be less good. You have to vary the parameters in simulation that you think will vary in reality. That’s, by the way, another thing that we’re doing at the moment for this task.
I think that there’s hardly an answer to that because I think the real answer would be to have the robot also develop by itself in the real world afterwards. There’s still a few other strategies. For example, one of the problems we had in this first study was that in simulation, we had the robot just set a rudder command. No, sorry. An aileron command. That means that it would directly set the ailerons of the robot. If you do that in real life, then this never gives the same result because, for example … Ailerons are like little things behind the wings that make it turn around, that make it change heading.
These ailerons on the real robot are already a bit asymmetric, for example. In simulation, we didn’t have a very accurate model of the DelFly, not at all. If you would give an input of plus one, it would turn, for example, with I don’t know, some degrees per second to the right. Minus one would be the exact opposite reaction. In reality, this is not the case. It will turn perhaps easier to the right than to the left. It can even depend on environmental conditions whether it’s really turning with, I don’t know, 10 degrees per second or not.
One of the things we did now as well is to abstract away from those very low level commands and to take a bit higher level. The robot now sets, for example, a turn rate. It has a lower level controller that can actually execute this turn rate. In that sense, you also narrow the reality gap. We’re searching through all these kinds of possibilities and seeing how far we will get with that.
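The benefit of commanding a turn rate instead of a raw aileron deflection can be illustrated with a small Python sketch. Everything here is assumed for the example: the toy airframe `plant_turn_rate` with its asymmetric gains, the simple integral controller, and the numbers are not taken from the DelFly.

```python
def plant_turn_rate(aileron, gain_right=12.0, gain_left=8.0):
    """Toy airframe: the same deflection turns faster to the right than to the left."""
    gain = gain_right if aileron >= 0.0 else gain_left
    return gain * aileron


def track_turn_rate(desired_rate, steps=50, ki=0.05):
    """Low-level loop: adjust the aileron until the measured turn rate matches."""
    aileron = 0.0
    for _ in range(steps):
        measured = plant_turn_rate(aileron)         # e.g. from a gyro on a real robot
        aileron += ki * (desired_rate - measured)   # integral action on the error
        aileron = max(-1.0, min(1.0, aileron))      # respect actuator limits
    return plant_turn_rate(aileron)


# Open loop: equal and opposite aileron commands give unequal turn rates.
print(plant_turn_rate(0.5), plant_turn_rate(-0.5))

# Closed loop: equal and opposite turn-rate commands are both tracked.
print(track_turn_rate(6.0), track_turn_rate(-6.0))
```

In this toy setting, a high-level controller that outputs turn rates behaves the same whether or not the airframe is symmetric, which is one way of narrowing the gap between a crude simulation model and the real robot.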
Abate De Mey: What are the largest hurdles that evolutionary robotics has to overcome to have widespread usage?
Guido De Croon: I think one is this reality gap that I’ve been talking about. One of the main reasons to use evolutionary robotics, especially for, for example, small robots, is that with fairly limited resources, computational, sensory, you’re still able to perform complex tasks. The way in which they do this is typically pretty ingenious. For example, we were talking about this odor source localization or if you think about fruit flies in your kitchen, they always find the … Well, at least in my kitchen, they do. They always find the bananas and the apples, especially in the summer. They really invade my kitchen. They solve this problem of finding the banana mostly by using the scent.
When you look at roboticists, then some of the robotic solutions are pretty complex. What they do is, for example, the robot, in its mind, is going to simulate how odor spreads through turbulent air. When it smells some odor, it will run some probabilistic dynamical model to see which locations in its environment have the highest probability of containing the source of the odor. That’s one approach. It’s computationally very complex.
If you look at fruit flies, then they actually just alternate between two behaviors. If they don’t smell any odor, they will do casting, which means that they fly back and forth across the wind. If they do then smell something, then they will start surging, which means that they move upwind. When they lose the odor again, they will start casting again. This is, of course, a very rough description of their behavior. By alternating between these two behaviors, in the end, they find the fruit. They do this in a computationally efficient way.
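The two alternating behaviors can be written down as a very small decision rule, sketched here in Python. This is a rough illustration only: casting is modeled as zigzagging at right angles to the wind and surging as heading straight upwind, and the function name, the heading convention (degrees, with `wind_direction_deg` being the direction the wind blows toward), and the side-switching scheme are assumptions of this sketch.

```python
def cast_and_surge(odor_detected, wind_direction_deg, cast_side):
    """Return (behavior, heading_deg, next_cast_side).

    cast_side is +1 or -1 and flips after every casting step,
    so repeated casting produces a zigzag across the wind.
    """
    upwind = (wind_direction_deg + 180.0) % 360.0
    if odor_detected:
        # Odor present: fly straight against the wind.
        return "surge", upwind, cast_side
    # No odor: fly crosswind, alternating sides on each call.
    heading = (upwind + 90.0 * cast_side) % 360.0
    return "cast", heading, -cast_side


print(cast_and_surge(True, 0.0, 1))
print(cast_and_surge(False, 0.0, 1))
print(cast_and_surge(False, 0.0, -1))
```

The whole strategy needs one binary odor reading, a wind direction, and a single bit of memory, which is why it is so cheap compared with simulating odor dispersion.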
Now, what we want with evolutionary robotics is for any given task, for any given robot to be able to find such very smart strategies that work for the particular robots. If you want evolutionary robotics to be widely used, it really needs to work on the real robots. You need to cross the reality gap and the challenge then is not only to cross it, but to cross it and still keep this ability for finding these elegant, sensory motor solutions. That’s one large hurdle.
I think the second hurdle, I think I also mentioned it, is the complexity of the task. It’s, of course, nice to have a robot that goes around and avoids obstacles. How can we evolve robots to do all kinds of tasks? The robot, for example, has to fly around in a greenhouse, has to deposit some larvae on plants or something and at the same time look for ripe fruits, come back to its nest to recharge, collaborate with the other drones in the greenhouse in order not to fly in each other’s way and to effectively cover all the area in the greenhouse. How do you use an evolutionary algorithm to arrive at such a complex solution?
I think that’s also a huge challenge. At the moment, we’re looking at these behavior trees also for this reason because behavior trees allow for certain sub behaviors to be very easily reused as a sub behavior in another behavior tree. I don’t know, of course, if this will be a step towards such a solution. I think that’s a large hurdle as well because if you want it to be widespread, it needs to work on the actual robot, and the robots need to do useful and sometimes complex tasks.
Abate De Mey: Are evolutionary algorithms a one size fits all strategy for creating robot behaviors?
Guido De Croon: Not yet, but that’s the dream, Abate. We’re trying to replace ourselves as roboticists. That’s the dream that you can really give us a task. We just put the same evolutionary algorithm at work. It will be great, of course, in the future if you can evolve both the body, and the sensors, and the actuators, and the brain to come up with the ideal solution for this task.
Abate De Mey: How can our listeners learn more about evolutionary robotics?
Guido De Croon: That’s a good question. There is a book that was written a while ago by Stefano Nolfi and Dario Floreano. They talked about the earlier work in this area. I think there’s really some exciting work going on at the moment in evolutionary robotics. The book is a good starting point. More recent work, I would say Google is your friend.
Abate De Mey: Well, thank you very much for coming in today and discussing evolutionary robotics with us, Guido.
Guido De Croon: You’re welcome. It was my pleasure.