New method enables high-quality speech separation

Researchers have developed a novel audio-visual model for isolating and enhancing the speech of a desired speaker in a video. The team's deep-network-based model combines visual and auditory signals to isolate and enhance any speaker in any video, even in challenging real-world scenarios: video conferences where multiple participants often talk at once, or noisy bars filled with background noise, music, and competing conversations.
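The article does not describe the model's internals, but the core idea of audio-visual speech enhancement can be sketched in a toy form: embed the mixture spectrogram and the target speaker's face features per frame, fuse them, and predict a time-frequency mask that is applied to the mixture. The function and weight names below (`fuse_and_mask`, `W_a`, `W_v`, `W_out`) are hypothetical, and random weights stand in for a trained network; a real system would use learned convolutional and recurrent layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_and_mask(mix_spec, face_emb, W_a, W_v, W_out):
    """Toy audio-visual masking (NOT the researchers' actual model).

    mix_spec: (T, F) magnitude spectrogram of the mixed audio
    face_emb: (T, D) per-frame embedding of the target speaker's face
    Returns the masked (enhanced) spectrogram of shape (T, F).
    """
    audio_feat = mix_spec @ W_a                                # (T, H) audio embedding
    visual_feat = face_emb @ W_v                               # (T, H) visual embedding
    fused = np.concatenate([audio_feat, visual_feat], axis=1)  # (T, 2H) fused features
    mask = sigmoid(fused @ W_out)                              # (T, F) mask in (0, 1)
    return mask * mix_spec                                     # suppress non-target energy

# Toy dimensions: 100 frames, 257 frequency bins, 512-dim face embedding.
T, F, D, H = 100, 257, 512, 64
mix_spec = np.abs(rng.standard_normal((T, F)))
face_emb = rng.standard_normal((T, D))
W_a = rng.standard_normal((F, H)) * 0.01   # untrained stand-in weights
W_v = rng.standard_normal((D, H)) * 0.01
W_out = rng.standard_normal((2 * H, F)) * 0.01

enhanced = fuse_and_mask(mix_spec, face_emb, W_a, W_v, W_out)
print(enhanced.shape)  # (100, 257)
```

Because the mask lies in (0, 1), the output spectrogram never exceeds the mixture at any time-frequency bin; conditioning the mask on the face embedding is what lets the same network pick out different speakers from the same mixture.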