AIResearch

Sound

AI

AI Learns to Balance Speech Sounds and Emotion for Natural Voices

A new training method helps AI generate more natural-sounding speech by teaching it to distinguish between different sounds while capturing emotional nuances, outperforming existing approaches in clarity and speaker similarity.

Apr 4 4 min read
Sound

Better Data, Not More Data, Drives AI Audio Breakthrough

A new study shows that high-quality labels, not massive datasets, are key to training AI that understands speech, music, and sounds: the resulting models outperform those trained on five times more data.

Mar 31 4 min read
AI

AI Gains a Better Ear for Real Conversations

A new open-source pipeline processes messy, overlapping speech to train AI assistants that can listen and talk simultaneously, making human-computer interaction more natural.

Mar 30 4 min read
Data

AI Models Struggle to Read Music Like Humans Do

A new benchmark reveals that large language and vision models fail to understand complete musical scores, with visual recognition performing especially poorly, but a text-based notation offers a path forward.

Mar 27 4 min read
Sound

Robots Sound Better When They Use Spatial Audio

A new study shows that spatial audio can make robots seem warmer and less unsettling, enhancing human collaboration without negative side effects.

Mar 26 4 min read
Data

AI Masters Emotional Speech by Focusing on Voice

A new method improves text-to-speech systems by targeting the most expressive parts of speech, enabling more natural and emotional synthetic voices for virtual assistants and audiobooks.

Mar 26 3 min read
Science

AI Listens for Depression in Everyday Speech

A new system analyzes voice patterns at home to detect early signs of depression, linking subtle acoustic changes to clinical symptoms without compromising privacy.

Mar 26 4 min read
AI

SceneGuard: How Audible Background Noise Could Protect Your Voice from AI Cloning

In an era where AI voice cloning technologies are becoming increasingly sophisticated, the privacy risks associated with unauthorized speech synthesis are more pressing than ever. Deep learning models…

Nov 22 4 min read
Data

Step-Audio-R1: Unlocking Deliberate Reasoning in Audio Intelligence

For years, artificial intelligence has thrived on chain-of-thought reasoning, enabling models to tackle complex problems in text and vision by thinking step-by-step, but the audio domain has stubbornl…

Nov 22 4 min read
Network

New Audio Codec Balances Compression and Speed

The OBHS algorithm achieves high efficiency for real-time streaming with minimal latency.

Nov 20 3 min read
Quantum Computing

Quantum Harmonic Oscillator Simulated on IBM Quantum Chip

For the first time, researchers have successfully simulated a quantum harmonic oscillator on IBM's quantum hardware, demonstrating particle dynamics under time-varying forces that were previously impo…

Nov 15 3 min read
AI

AI Generates High-Fidelity Audio That's Nearly Indistinguishable from Original Recordings

A new neural network approach creates high-resolution audio from low-quality sources, achieving near-perfect quality that listeners can't distinguish from real recordings in most cases.

Nov 14 3 min read
Sound

AI Learns to Understand Speech Patterns Without Human Labels

A new method discovers phonetic units in speech by predicting future audio segments, outperforming traditional reconstruction approaches.

Nov 14 3 min read
AI

AI Finds Optimal Way to Recognize Speakers

A new scoring method proves mathematically ideal for identifying and verifying speakers, improving accuracy in voice-based systems without complex tweaks.

Nov 14 3 min read
Sound

AI Improves Voice Assistant Understanding by Embracing Uncertainty

A new method uses AI to interpret speech more accurately by considering multiple possible transcriptions, reducing errors in noisy environments.

Nov 14 3 min read
Science

AI Judges People by Their Voice Tone

AI judges your emotions and leadership potential based on voice tone alone, revealing hidden biases that could impact hiring and therapy decisions.

Nov 14 3 min read
AI

AI Creates Living Sound Archive for Artists

AI preserves artistic legacies by generating new works in an artist's style long after they're gone. This living sound archive offers a revolutionary way to keep creativity alive forever.

Nov 14 3 min read
Sound

AI Audio Models Shrink Without Losing Accuracy

AI audio models shrink by 80% without losing accuracy, making them perfect for phones and smart devices. This breakthrough reduces energy use and carbon emissions while keeping performance high.

Nov 14 3 min read
AI

AI Learns to Filter Speech Recognition Errors

AI learns to filter out speech recognition errors, dramatically improving accuracy for names and technical terms. This breakthrough makes virtual assistants and transcription services far more reliable.

Nov 14 3 min read
AI

AI Struggles to Understand Children's Voices

Voice AI Fails Kids: Why Your Child's Commands Go Unheard. New research reveals that AI misunderstands children's speech four times more often, creating barriers to education and technology access.

Nov 14 3 min read
Science

AI Can't Actually Hear Music, Study Reveals

A new study finds that AI can't actually hear music, exposing a critical gap in artificial intelligence. While systems read musical notation perfectly, they fail with real audio recordings.

Nov 14 3 min read
Games

AI Generates Realistic Room Sounds from Text

Now AI can create realistic room sounds from simple text descriptions, revolutionizing audio production. Imagine describing any space and instantly hearing its exact acoustic character.

Nov 14 3 min read
Sound

AI Edits Speech Like Never Before

A new open-source model transforms synthetic voices with precise emotional control and paralinguistic editing, outperforming commercial systems without complex disentanglement methods.

Nov 6 3 min read
Science

AI Sound Generators Show Surprising Limits

New analysis reveals text-to-audio models often produce repetitive or inaccurate sounds, challenging their use in creative fields.

Nov 5 3 min read