Developing trust in autonomous robots: Seminar with Michael Wagner
I recently had the opportunity to hear a talk from a colleague I have worked with many times over the years. The talk was all about how to build safe and robust systems – a critical topic that we often pass over, but one that must be addressed if robots are to enter the workforce alongside people.
Below are some key takeaway points from the talk (with some embellishment by me), followed by a video clip of the talk itself.
- One way to determine how close people can safely be to an autonomous vehicle is to look at how quickly the vehicle can stop and the distance it travels in that time. When computing the time to stop, you need to account for human reaction time, signal latencies, and the mechanical stopping time of the vehicle.
- You can improve on that time by having an independent safety-critical system onboard that evaluates safety at run-time and can stop the vehicle faster than a human can. I really like this idea and have taken a similar (more basic) approach in the past.
- IEC 61508 is a key safety standard for determining how safe a system needs to be. It provides guidance on which traits map to a given safety integrity level. However, dynamically changing parameters and AI generally make a system fall into the NOT safe categories.
- For safety critical systems we need to look at the worst case scenarios (and not the nominal or best case).
- We need to test, since some scenarios will violate assumptions that we did not even know that we made.
- “No amount of experimentation can ever prove me right; a single experiment can prove me wrong” – Albert Einstein
- Software cares about the range of the inputs, not necessarily about how long you run the code for. (Duration can also matter and needs to be tested for memory usage and other unexpected problems that only show up after the code has been running for a while.)
- The value of a test is not the test itself, but how you improve the system and learn from it.
- Field testing is very important. Field test till errors diminish. Then simulate system for the “next million miles.”
- The question becomes how to simulate and error-check the code. One method is robustness testing, which uses automated methods to find problems so that the developers know what to fix.
- With robustness testing you can apply random inputs to the system and see what causes it to crash or hang. This lets you simulate inputs and not just scenarios.
- You can also probabilistically modify the real world inputs that are entering your system, and verify the outputs.
- Use system requirements to generate safety rules that are checked in real-time. The great thing about safety rules is that if a rule ever fails, you know something is wrong, and the code/test/robot can stop.
- You can also add a dictionary test to your code, where you control the inputs to see what the outputs will be. You can build the dictionary up over time, so that as you continue testing the system you do not reintroduce old errors that you have already seen.
- Temporal logic is useful for creating rules such as “a message must arrive within x seconds,” as opposed to “a message must arrive eventually.” Tests need to reflect that.
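As a concrete illustration of the stopping-distance point above, here is a back-of-the-envelope sketch. This is my own toy model (not from the talk), assuming constant deceleration and that the vehicle travels at full speed during the reaction and latency delays; the function and parameter names are mine:

```python
def stopping_distance(speed_mps: float,
                      reaction_s: float,
                      latency_s: float,
                      decel_mps2: float) -> float:
    """Worst-case distance traveled before the vehicle halts.

    The vehicle keeps moving at full speed during human reaction
    time and signal latency, then brakes at constant deceleration.
    """
    if decel_mps2 <= 0:
        raise ValueError("deceleration must be positive")
    # Distance covered before braking even begins.
    delay_dist = speed_mps * (reaction_s + latency_s)
    # Distance covered while braking: v^2 / (2a).
    brake_dist = speed_mps ** 2 / (2.0 * decel_mps2)
    return delay_dist + brake_dist

# Human in the loop: 5 m/s, 1.5 s reaction, 0.2 s latency, 3 m/s^2 braking.
human = stopping_distance(5.0, 1.5, 0.2, 3.0)
# Onboard safety system: no human reaction time, much lower latency.
monitor = stopping_distance(5.0, 0.0, 0.05, 3.0)
```

Comparing `human` and `monitor` shows why the independent onboard monitor shrinks the required standoff distance: the delay term dominates at low speeds.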
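The robustness-testing idea above can be sketched as a tiny fuzz harness (my own toy code, not the tooling from the talk): random inputs, mixed with known-troublesome extreme values, are thrown at a function, and any input that makes it raise is recorded so the developer knows what to fix.

```python
import random

def fuzz(func, n_trials=1000, seed=0):
    """Apply random inputs to func; return the inputs that made it raise."""
    rng = random.Random(seed)
    # Known-troublesome values mixed in with uniform random ones.
    extremes = [0.0, -0.0, float("inf"), -float("inf"), float("nan"), 1e308]
    failures = []
    for _ in range(n_trials):
        x = rng.choice(extremes) if rng.random() < 0.3 else rng.uniform(-1e6, 1e6)
        try:
            func(x)
        except Exception as exc:
            failures.append((x, exc))
    return failures

# A toy sensor-processing routine that forgot to guard against zero range.
def inverse_range(r_m):
    return 1.0 / r_m

crashes = fuzz(inverse_range)
# Each recorded crash is an input the developer now knows must be handled.
```

Note that this simulates *inputs*, not full scenarios, which is exactly the point made above: you can exercise input ranges far faster than you can drive them in the field.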
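The run-time safety-rule idea, including the temporal-logic-style deadline (“a message within x seconds,” not “eventually”), can be sketched as a small monitor. This is a hypothetical illustration, not Wagner's actual system; the rules, class, and names are mine:

```python
class SafetyMonitor:
    """Check simple safety rules at run-time; any violation latches a stop."""

    def __init__(self, max_speed_mps: float, msg_deadline_s: float):
        self.max_speed_mps = max_speed_mps
        self.msg_deadline_s = msg_deadline_s
        self.last_msg_time = None
        self.stopped = False

    def on_message(self, t_s: float, speed_mps: float) -> None:
        self.last_msg_time = t_s
        # Rule derived from a system requirement: never exceed the speed limit.
        if speed_mps > self.max_speed_mps:
            self.stopped = True

    def check(self, now_s: float) -> bool:
        """Return True only while all rules hold; once a rule fails, stay stopped."""
        # Temporal rule: a message must have arrived within the deadline.
        if self.last_msg_time is None or now_s - self.last_msg_time > self.msg_deadline_s:
            self.stopped = True
        return not self.stopped

# Simulated timestamps: a 2 m/s limit and a 0.5 s message deadline.
monitor = SafetyMonitor(max_speed_mps=2.0, msg_deadline_s=0.5)
monitor.on_message(0.0, speed_mps=1.0)
ok_early = monitor.check(0.4)   # fresh message, within limits
ok_late = monitor.check(1.0)    # message is now stale: rule fails
```

The nice property, as noted above, is that a failed rule is unambiguous: something is wrong, and the robot can stop.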
Things that commonly fail testing:
- Improper handling of floating point numbers (out-of-bounds, NaN, infinite (inf), etc.)
- Array indexing
- Memory issues: leaks, allocating too much or too little, buffer overflows, etc.
- Time going backwards and old/stale data (I have seen this error more times than I can remember)
- Problems handling dynamic state
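Several of the failure modes above (non-finite floats, out-of-bounds values, stale data, time going backwards) can be caught with explicit input validation. A minimal sketch, with hypothetical parameter names and limits of my own choosing:

```python
import math

def validate_range_reading(distance_m: float, stamp_s: float, now_s: float,
                           max_range_m: float = 30.0,
                           max_age_s: float = 0.2) -> float:
    """Return the reading if it passes all checks; raise ValueError otherwise."""
    # Floating point: NaN and inf must be caught explicitly. NaN fails
    # every comparison, so a plain bounds check alone would miss it.
    if not math.isfinite(distance_m):
        raise ValueError(f"non-finite distance: {distance_m}")
    if not 0.0 <= distance_m <= max_range_m:
        raise ValueError(f"distance out of bounds: {distance_m}")
    # Stale data and time going backwards.
    age_s = now_s - stamp_s
    if age_s < 0:
        raise ValueError("timestamp is in the future (clock went backwards?)")
    if age_s > max_age_s:
        raise ValueError(f"stale reading: {age_s:.3f}s old")
    return distance_m
```

Guards like these are cheap, and they turn the silent failure modes listed above into loud, immediate ones.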
And now the part you have been waiting for:
If you liked this article, you may also be interested in:
- Programming safety into self-driving cars
- Safety concerns for collaborative robots
- ShanghAI Lectures: Weidong Chen “Enhancing Autonomy and Safety of Assistive Robots”
- Quadrocopter failsafe algorithm: Recovery after propeller loss