Englander Institute for Precision Medicine

The Sounds of Science

New research by EIPM's Dr. Olivier Elemento and colleagues

Sound as Biomarker

Dr. Anaïs Rameau is the granddaughter of a poet who lost his voice to cancer. Though he died when she was young, that spiritual connection steered Dr. Rameau into otolaryngology, where she now helps patients with difficulty speaking, swallowing or breathing.

Recently, Dr. Rameau has drawn attention for her pioneering research using patient voice analysis to detect aspiration — the inhalation of food or liquid into the lungs. Aspiration is made worse by difficulty swallowing and can lead to pneumonia, the top cause of death for people with certain age-related conditions, such as dementia and Parkinson’s disease.

“My grandfather didn’t die from his cancer. He died of pneumonia possibly brought on by changes in his swallowing,” Dr. Rameau recalls. “The changes in swallowing are audible. We can hear it. And now, we are using acoustic analysis to detect difficulties swallowing and coughing that might not be otherwise apparent to screen for pneumonia risk.”

Dr. Rameau recently earned a Beeson Emerging Leaders Career Development Award from the National Institute on Aging to develop a bedside app that tests for aspiration using sound. She now investigates whether vocal analysis, aided by artificial intelligence, could provide biomarkers for a broad array of physical, neurological and mental illnesses, ranging from cancer and heart disease to Alzheimer’s and autism. All can have associated changes to the voice, and ultimately, Dr. Rameau says, their biomarkers could lead to earlier, more accurate diagnoses, allow doctors to monitor patients’ health from afar, and even gauge how well treatments are working.

“The voice is connected to so many physical and mental processes,” Dr. Rameau says. “We think it is a window into illness and dysfunction.”

A Little Birdy Told Me

Intrigued by the potential of this under-explored area of medical science, but lacking acoustics expertise, Dr. Rameau turned to the highly regarded Cornell Lab of Ornithology, where scientists have spent decades studying birdsongs. There she met Dr. Holger Klinck, director of the K. Lisa Yang Center for Conservation Bioacoustics and one of the world’s foremost experts in the sounds of the animal kingdom.

Dr. Klinck’s team developed Raven Pro, powerful software used to analyze animal vocalizations: birdsongs especially, but also sounds made by elephants, whales and other “critters,” as Dr. Klinck calls his subjects.

Raven Pro turns birdsongs, whale songs, elephant calls and primate chatter into beautiful visual representations. To the uninitiated, a chart of the trills of a melodic birdsong might resemble the elegant calligraphy of a Chinese brushstroke master. The multi-hued spectrograms go deeper still, displaying sound in three dimensions: time, pitch and intensity (volume) in colorful heat maps.
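For readers curious how a recording becomes such a picture, the three dimensions described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Raven Pro's implementation or the method used by either lab: the function names, the frame sizes and the synthetic two-tone signal (standing in for a real recording) are all assumptions made for the example.

```python
import cmath
import math


def spectrogram(samples, sample_rate, frame_size=64, hop=32):
    """Return (times, freqs, power); power[t][f] is the intensity at time t and pitch f."""
    # A Hann window tapers each frame so its edges do not smear energy across pitches.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1
    freqs = [k * sample_rate / frame_size for k in range(n_bins)]
    times, power = [], []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(n_bins):
            # Discrete Fourier transform of one short frame at frequency bin k.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            row.append(abs(coeff) ** 2)
        times.append((start + frame_size / 2) / sample_rate)
        power.append(row)
    return times, freqs, power


def dominant_pitch(freqs, row):
    """Return the frequency with the highest intensity in one time slice."""
    return freqs[max(range(len(row)), key=row.__getitem__)]


# Hypothetical test signal: a 500 Hz tone followed by a 1500 Hz tone.
sample_rate = 8000
low = [math.sin(2 * math.pi * 500 * n / sample_rate) for n in range(800)]
high = [math.sin(2 * math.pi * 1500 * n / sample_rate) for n in range(800)]
times, freqs, power = spectrogram(low + high, sample_rate)

print(dominant_pitch(freqs, power[0]), dominant_pitch(freqs, power[-1]))
```

Each row of `power` is one moment in time, each column one pitch, and each value the intensity that a heat map would show as color. In this toy signal the strongest pitch should shift from 500 Hz in the first slice to 1500 Hz in the last.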

Dr. Klinck and his team draw out understanding of animal communication and behavior from these representations, and that is how they came to help Dr. Rameau screen for swallowing dysfunction and pneumonia risk from a simple cough. The two labs used Raven Pro to compare and analyze spectrograms of patient coughs to identify cough weakness and to show that people with weaker coughs were at greater risk of pneumonia.

Dr. Rameau was then able to use these very same tools to evaluate the effectiveness of corrective injection laryngoplasty — the injection of biological and biomimetic materials into the larynx to help patients with trouble swallowing and weakened voice.

“Just as it can tell us what’s wrong, sound can also tell us what’s right,” Dr. Rameau says.

Feed the Algorithm

More recently, Dr. Rameau has been testing her hypothesis through advances in artificial intelligence. She has teamed with Dr. Olivier Elemento, director of the Englander Institute for Precision Medicine and co-principal investigator of the NIH-funded BRIDGE2AI Voice project that is exploring the use of machine learning to analyze voice recordings for biomarkers. AI has already been used to analyze digitized X-rays, CT scans, MRIs, biopsy slides and even written medical records to search for hidden patterns of disease and dysfunction. And as telemedicine boomed during the COVID pandemic, so did interest in digital biomarkers of sound — among other reasons, to potentially distinguish between cough types in patients being treated remotely.

Dr. Elemento says the greatest challenge is not training the models, but finding the data. Unlike the large language model GPT for words, or ImageNet for digital images, there is no ready collection of standardized voice recordings from which to learn how the human voice is affected by dementia, dysphagia or any number of medical conditions. “We have to create the data,” he says.

Dr. Elemento estimates BRIDGE2AI Voice will need 40,000 samples from recruited and consenting participants to amass a suitable dataset from which to train an effective and universal model that can diagnose any number of conditions from human sounds. By starting from scratch, Dr. Elemento says, BRIDGE2AI Voice can avoid the sort of racial, gender, age, geographic and other biases that have compromised other medical AI projects. Numerous promising medical AI studies have been challenged, if not negated, for underrepresenting minority populations in their data collection.

“We are thinking ahead of time, instead of after the fact, about bias, about privacy, about protecting patients’ medical data,” he says. “It’s an important advantage.”

Lend a Voice

Helping BRIDGE2AI reach those lofty standards is Alexandros Sigaras, a computer scientist and member of the Englander Institute for Precision Medicine, who is creating the app used to collect all those samples and will eventually train the AI models based on them. The app not only collects the requisite audio samples but also includes extensive medical questionnaires and cognitive challenges that will be used later to match diagnoses to voice samples.

“Years from now, you wouldn’t want to be studying the link between voice and dementia only to find you hadn’t collected the right data,” Sigaras says. “That takes extreme care in the collection phase. We want to get it right the first time.”

On the threshold of a promising data collection effort, Dr. Rameau returns to her grandfather, whose cancer was evident to his wife in a worsening hoarseness that preceded his diagnosis.

“If my grandmother could understand the connection between my grandfather’s voice and his health,” Dr. Rameau says, “why not an algorithm?”

# # #

The above article first ran in Weill Cornell Medicine's Summer 2024 Impact magazine.
