AIResearch

Robots Sound Better When They Use Spatial Audio

A new study shows that spatial audio can make robots seem warmer and less discomforting, enhancing human collaboration without negative side effects.

AI Research
March 26, 2026
4 min read

As robots become more common in homes and workplaces, how they communicate with humans is a critical design consideration. Sound offers a powerful but often overlooked channel for interaction, encompassing everything from the mechanical noises robots naturally make to intentionally designed auditory cues. A new study from Stanford University explores how different types of sounds affect human perception and behavior during human-robot collaboration, with a particular focus on a novel approach using spatial audio delivered through augmented reality headsets. The findings suggest that carefully designed auditory interfaces can significantly enhance the user experience, making robots seem more approachable and improving task coordination without the drawbacks of traditional beeps and buzzes.

The researchers conducted a three-part experimental study involving 51 participants to investigate the effects of consequential, functional, and spatial sounds. Consequential sounds are the natural operational noises produced by a robot, such as the whirring of motors. In the first experiment, they tested whether these sounds negatively impacted perceptions of a Kinova Gen3 manipulator, a lightweight robotic arm suitable for household tasks. Surprisingly, the results showed no significant differences in how participants felt about the robot across four conditions: watching muted videos, videos with sound, being in person with reduced sound, or being in person with full sound. This indicates that for quiet robots like the Kinova, consequential sounds do not inherently cause negative reactions, contradicting earlier hypotheses that such noises would lead to lower liking or a reduced desire to be near the robot.

To understand how sound can be intentionally designed for better communication, the study compared two augmented sound conditions against a baseline of only consequential sounds. The functional sound condition used six distinct non-spatial sounds, similar to smartphone alerts, to signal robot states like startup, movement, and handover readiness. The spatial sound condition, implemented through audio augmented reality on a HoloLens 2 headset, featured a continuous calming guitar loop that moved through space to indicate the robot's gripper position in real-time. This approach combined functional information with transformative elements aimed at evoking positive emotions. Participants engaged in a collaborative Lego-building task with the robot under each sound condition, allowing the researchers to measure perceptions of warmth, competence, and discomfort using standardized surveys.
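The core idea of the spatial condition is that the sound's apparent position tracks the gripper in real time. The study implemented this as audio augmented reality on a HoloLens 2; as a rough intuition for how such tracking can drive a stereo cue, here is a minimal, hypothetical sketch (not the paper's implementation) that maps the gripper's position relative to the listener to an azimuth, then to left/right channel gains via constant-power panning:

```python
import math

def constant_power_pan(azimuth_deg: float) -> tuple[float, float]:
    """Map a source azimuth (-90 = hard left, +90 = hard right) to
    left/right channel gains with constant-power panning, so the
    perceived loudness stays steady as the source moves."""
    clamped = max(-90.0, min(90.0, azimuth_deg))
    pan = (clamped + 90.0) / 180.0          # normalize to [0, 1]
    angle = pan * math.pi / 2.0
    return math.cos(angle), math.sin(angle)  # (left_gain, right_gain)

def gripper_azimuth(listener_xy, gripper_xy, facing_deg=0.0) -> float:
    """Azimuth of the gripper relative to the listener's facing
    direction, in degrees (positive = to the listener's right).
    Facing 0 degrees is taken as the +y direction."""
    dx = gripper_xy[0] - listener_xy[0]
    dy = gripper_xy[1] - listener_xy[1]
    bearing = math.degrees(math.atan2(dx, dy))
    # Wrap the relative angle into (-180, 180]
    return (bearing - facing_deg + 180.0) % 360.0 - 180.0
```

In a real loop, each audio frame would recompute the azimuth from the robot's reported gripper pose and scale the calming loop's samples by the two gains; a full spatializer like the HoloLens's would add distance attenuation and head-related filtering on top of this.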

The data revealed nuanced insights into how these sound designs influence human-robot interaction. While statistical analysis showed no significant overall differences in warmth, competence, or discomfort across the three conditions, exploratory post-hoc comparisons highlighted emerging trends. Specifically, participants rated the spatial sound condition as warmer and less discomforting compared to the consequential sound condition, with a p-value of 0.042 for reduced discomfort. Qualitative feedback from post-experiment surveys, summarized in Table I, showed that participants associated the spatial sounds with conveying robot movement, directional cues, and a calming effect, whereas functional sounds were seen as more direct but sometimes abrupt or distracting. Overall, 56.1% of participants preferred the spatial sounds, 31.7% preferred the functional sounds, and only 12.2% preferred the consequential sounds alone, indicating a strong user preference for augmented auditory feedback.

These findings have important implications for the future of human-robot collaboration, especially as robots integrate into daily environments like homes and offices. The study found that safety-related information was the highest priority for participants when considering what robots should communicate through sound, as shown in Figure 6. Spatial audio, delivered via head-mounted augmented reality devices, offers a way to provide personalized, localized cues that minimize distraction to others and enhance situational awareness. For example, the spatial sound design helped participants better understand the robot's trajectory and readiness, potentially reducing accidents and improving task efficiency. The researchers suggest that such auditory interfaces could be particularly beneficial for mobile robots in larger workspaces, where directional cues might help humans anticipate and avoid collisions.

However, the study acknowledges several limitations that point to areas for future research. The sample size for some experiments was relatively small, with 41 participants in the collaborative task, which may have limited the detection of stronger effects. The use of a stationary manipulator in a confined workspace means that the benefits of spatial audio might be even more pronounced with mobile robots or in more dynamic environments. Additionally, individual differences in auditory sensitivity, such as preferences for pitch and volume, influenced perceptions, highlighting the need for adaptable sound designs that can cater to diverse users. Future work should explore how these sounds perform in multi-robot settings and across varying age groups and hearing abilities to ensure inclusivity and effectiveness in real-world applications.

Original Source

Read the complete research paper

View on arXiv

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.

Connect on LinkedIn