One of the latest papers in the Journal Autonomous Robots presents ManyEars, an open framework for robot audition.
Making robots that are able to localize, track and separate multiple sound sources, even in noisy places, is essential for their deployment in our everyday environments. This could for example allow them to process human speech, even in crowded places, or identify noises of interest and where they came from. Unlike vision however, there are few software and hardware tools that can easily be integrated to robotic platforms.
The ManyEars open source framework allows users to easily experiment with robot audition. The software, which can be downloaded here, is compatible with ROS (Robot Operating System). Its modular design makes it possible to interface with different microphone configurations and hardware, thereby allowing the same software package to be used for different robots. A Graphical User Interface is provided for tuning parameters and visualizing information about the sound sources in real-time. The figure below illustrates the architecture of the software library. It is composed of five modules: Preprocessing, Localization, Tracking, Separation and Postprocessing.
To make use of the software, a computer, a sound card and microphones are required. ManyEars can be used with commercially available sound cards and microphones. However, commercial sound cards present limitations when used for embedded robotic applications: they can be expensive and have functionalities which are not required for robot audition. They also require significant amount of power and size. For these reasons, the authors introduce a customized microphone board and sound card available as an open hardware solution that can be used on your robot and interfaced with the software package. The board uses an array of microphones, instead of only one or two, thereby allowing a robot to localize, track, and separate multiple sound sources.
The framework is demonstrated using the microphones array on the IRL-1 robot shown below. The placement of the microphones is marked by red circles. Results show that the robot is able to track two human speakers producing uninterrupted speech sequences, even when they are moving, and crossing paths. For videos of the IRL-1, check out the lab’s YouTube Channel.
For more information, you can read the following paper:
The ManyEars open framework, F. Grondin, D. Létourneau, F. Ferland, V. Rousseau, F. Michaud, Autonomous Robots – Springer US, Feb 2013