TY - JOUR
T1 - Gaze-contingent auditory displays for improved spatial attention in virtual reality
AU - Vinnikov, Margarita
AU - Allison, Robert S.
AU - Fernandes, Suzette
N1 - Publisher Copyright: © 2017 ACM.
PY - 2017/4
Y1 - 2017/4
AB - Virtual reality simulations of group social interactions are important for many applications, including the virtual treatment of social phobias, crowd and group simulation, collaborative virtual environments (VEs), and entertainment. In such scenarios, when compared to the real world, audio cues are often impoverished. As a result, users cannot rely on subtle spatial audio-visual cues that guide attention and enable effective social interactions in real-world situations. We explored whether gaze-contingent audio enhancement techniques driven by inferring audio-visual attention in virtual displays could be used to enable effective communication in cluttered audio VEs. In all of our experiments, we hypothesized that visual attention could be used as a tool to modulate the quality and intensity of sounds from multiple sources to efficiently and naturally select spatial sound sources. For this purpose, we built a gaze-contingent display (GCD) that allowed tracking of a user's gaze in real time and modifying the volume of the speakers' voices contingent on the current region of overt attention. We compared six different techniques for sound modulation with a base condition providing no attentional modulation of sound. The techniques were compared in terms of source recognition and preference in a set of user studies. Overall, we observed that users liked the ability to control the sounds with their eyes. They felt that a rapid change in attenuation with attention but not the elimination of competing sounds (partial rather than absolute selection) was most natural. In conclusion, audio GCDs offer potential for simulating rich, natural social and other interactions in VEs. They should be considered for improving both performance and fidelity in applications related to social behaviour scenarios or when the user needs to work with multiple audio sources of information.
KW - Gaze-contingent displays
KW - Sound modulation
KW - User experience
KW - Visual-audio attention
UR - http://www.scopus.com/inward/record.url?scp=85018743124&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85018743124&partnerID=8YFLogxK
U2 - https://doi.org/10.1145/3067822
DO - https://doi.org/10.1145/3067822
M3 - Article
SN - 1073-0516
VL - 24
JO - ACM Transactions on Computer-Human Interaction
JF - ACM Transactions on Computer-Human Interaction
IS - 3
M1 - 19
ER -