Computational approaches for describing real-world spatial sound scenes
Sound is rich with information about the surrounding environment. If you stand on a city sidewalk with your eyes closed and listen, you will hear the sounds of events happening around you: birds chirping, squirrels scurrying, people talking, doors opening, an ambulance speeding by, a truck idling. You will also likely be able to perceive the location of each sound source, where it is going, and how fast it is moving. This project will build innovative technologies that allow computers to extract this rich information from sound. By not only identifying which sound sources are present but also estimating the spatial location and movement of each source, sound sensing technology will be able to better describe our environments using microphone-enabled everyday devices, e.g., smartphones, headphones, smart speakers, hearing aids, home cameras, and mixed-reality headsets.

For hearing-impaired individuals, the developed technologies have the potential to provide alerts to dangerous situations in urban or domestic environments. For city agencies, acoustic sensors will be able to more accurately quantify traffic, construction, and other activities in urban environments. For ecologists, this technology can help monitor and study wildlife more accurately. Moreover, this information complements what computer vision can sense, since sound carries information about events that are not easily visible, such as sources that are small (e.g., insects), far away (e.g., a distant jackhammer), or simply hidden behind another object (e.g., an ambulance approaching from around a building's corner).