A computer-implemented method for operating a haptic device, the haptic device comprising a plurality of tactile displays configured to provide haptic stimuli to a user, the method including the steps of (a) processing an audio signal derived from an audio ...
Performing speaker diarization while uniquely identifying the speakers in a collection of audio recordings is a challenging task. Based on our previous work on speaker diarization and linking, we developed a system for diarizing longitudinal TV show data s ...
The integration of audio and visual information improves speech recognition performance, especially in the presence of noise. In these circumstances it is necessary to introduce audio and visual weights to control the contribution of each modality to the re ...
We revisit the problem of blocking artifacts and their suppression in generic frame-based speech/audio applications. We provide a perceptual characterization of the artifacts by using dynamic auditory models. We propose some short-time-Fourier-transform-ba ...
Audio-visual speech recognition promises to improve the performance of speech recognizers, especially when the audio is corrupted, by adding information from the visual modality, more specifically, from the video of the speaker. However, the number of visu ...
For parametric stereo and multi-channel audio coding, it has been proposed to use level difference, time difference, and coherence cues between audio channels to represent the perceptual spatial features of stereo and multi-channel audio signals. In practi ...
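To make the three cues named in the last abstract concrete, here is a minimal Python sketch that estimates a level difference, a time difference, and a coherence value for one pair of channel frames. It is an illustration under stated assumptions, not the method of the cited work: the function name `spatial_cues`, the `max_lag` search range, the full-band (rather than per-subband) analysis, and the use of the normalized cross-correlation peak as the coherence estimate are all choices made for this example.

```python
import numpy as np

def spatial_cues(left, right, max_lag=32, eps=1e-12):
    """Estimate (level difference in dB, time difference in samples, coherence).

    Hypothetical helper for illustration only. Both frames must have the
    same length, greater than max_lag. A positive time difference means
    the right channel lags the left channel.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if left.shape != right.shape or left.ndim != 1:
        raise ValueError("left and right must be 1-D frames of equal length")
    if len(left) <= max_lag:
        raise ValueError("frame must be longer than max_lag")

    # Level difference: ratio of channel energies, in decibels.
    e_l = np.sum(left ** 2)
    e_r = np.sum(right ** 2)
    level_diff_db = 10.0 * np.log10((e_l + eps) / (e_r + eps))

    # Normalized cross-correlation over a limited lag range.
    norm = np.sqrt(e_l * e_r) + eps
    best_lag, best_corr = 0, -np.inf
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = np.dot(left[:n - lag], right[lag:])
        else:
            c = np.dot(left[-lag:], right[:n + lag])
        c /= norm
        if c > best_corr:
            best_lag, best_corr = lag, c

    # Time difference: lag of the correlation peak.
    # Coherence (assumed definition): height of that peak.
    return level_diff_db, best_lag, best_corr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.standard_normal(1000)
    right = np.zeros_like(left)
    right[5:] = 0.5 * left[:-5]  # right = left delayed by 5 samples, half amplitude
    print(spatial_cues(left, right))
```

In the demo, the right channel is a half-amplitude copy of the left delayed by five samples, so the estimated level difference should come out near 6 dB, the time difference at 5 samples, and the coherence close to 1. A practical coder would typically compute such cues per frequency band and per time frame.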