Sound
AI Learns to Balance Speech Sounds and Emotion for Natural Voices
A new training method helps AI generate more natural-sounding speech by teaching it to distinguish between different sounds while capturing emotional nuances, outperforming existing approaches in clarity and speaker similarity.
Better Data, Not More Data, Drives AI Audio Breakthrough
A new study shows that high-quality labels, not massive datasets, are key to training AI that understands speech, music, and sounds, outperforming models trained on five times more data.
AI Gains a Better Ear for Real Conversations
A new open-source pipeline processes messy, overlapping speech to train AI assistants that can listen and talk simultaneously, making human-computer interaction more natural.
AI Models Struggle to Read Music Like Humans Do
A new benchmark reveals that large language and vision models fail to understand complete musical scores, with visual recognition performing especially poorly, but a text-based notation offers a path forward.
Robots Sound Better When They Use Spatial Audio
A new study shows that spatial audio can make robots seem warmer and less discomforting, enhancing human collaboration without negative side effects.
AI Masters Emotional Speech by Focusing on Voice
A new method improves text-to-speech systems by targeting the most expressive parts of speech, enabling more natural and emotional synthetic voices for virtual assistants and audiobooks.
AI Listens for Depression in Everyday Speech
A new system analyzes voice patterns at home to detect early signs of depression, linking subtle acoustic changes to clinical symptoms without compromising privacy.
SceneGuard: How Audible Background Noise Could Protect Your Voice from AI Cloning
In an era where AI voice cloning technologies are becoming increasingly sophisticated, the privacy risks associated with unauthorized speech synthesis are more pressing than ever. Deep learning models…
Step-Audio-R1: Unlocking Deliberate Reasoning in Audio Intelligence
For years, artificial intelligence has thrived on chain-of-thought reasoning, enabling models to tackle complex problems in text and vision by thinking step-by-step, but the audio domain has stubbornl…
New Audio Codec Balances Compression and Speed
The OBHS algorithm achieves high efficiency for real-time streaming with minimal latency.
Quantum Harmonic Oscillator Simulated on IBM Quantum Chip
For the first time, researchers have successfully simulated a quantum harmonic oscillator on IBM's quantum hardware, demonstrating particle dynamics under time-varying forces that were previously impo…
AI Generates High-Fidelity Audio That's Nearly Indistinguishable from Original Recordings
New neural network approach creates high-resolution audio from low-quality sources, achieving near-perfect quality that listeners can't distinguish from real recordings in most cases.
AI Learns to Understand Speech Patterns Without Human Labels
A new method discovers phonetic units in speech by predicting future audio segments, outperforming traditional reconstruction approaches.
AI Finds Optimal Way to Recognize Speakers
A new scoring method proves mathematically ideal for identifying and verifying speakers, improving accuracy in voice-based systems without complex tweaks.
AI Improves Voice Assistant Understanding by Embracing Uncertainty
A new method uses AI to interpret speech more accurately by considering multiple possible transcriptions, reducing errors in noisy environments.
AI Judges People by Their Voice Tone
AI judges your emotions and leadership potential based on voice tone alone, revealing hidden biases that could impact hiring and therapy decisions.
AI Creates Living Sound Archive for Artists
AI preserves artistic legacies by generating new works in an artist's style long after they're gone. This living sound archive offers a revolutionary way to keep creativity alive forever.
AI Audio Models Shrink Without Losing Accuracy
AI audio models shrink by 80% without losing accuracy, making them perfect for phones and smart devices. This breakthrough reduces energy use and carbon emissions while keeping performance high.
AI Learns to Filter Speech Recognition Errors
AI learns to filter out speech recognition errors, dramatically improving accuracy for names and technical terms. This breakthrough makes virtual assistants and transcription services far more reliable.
AI Struggles to Understand Children's Voices
Voice AI fails kids: new research reveals that AI misunderstands children's speech four times more often, creating barriers to education and technology access.
AI Can't Actually Hear Music, Study Reveals
A new study exposes a critical gap in artificial intelligence: while systems read musical notation perfectly, they fail with real audio recordings.
AI Generates Realistic Room Sounds from Text
Now AI can create realistic room sounds from simple text descriptions, revolutionizing audio production. Imagine describing any space and instantly hearing its exact acoustic character.
AI Edits Speech Like Never Before
A new open-source model transforms synthetic voices with precise emotional control and paralinguistic editing, outperforming commercial systems without complex disentanglement methods.
AI Sound Generators Show Surprising Limits
New analysis reveals text-to-audio models often produce repetitive or inaccurate sounds, challenging their use in creative fields.