Background: Monitoring symptoms of bipolar disorder (BD) is a challenge faced by mental health services. Speech patterns are crucial in assessing the current experiences, emotions, and thought patterns of people with BD. Natural language processing (NLP) and acoustic signal processing may support ongoing BD assessment within a mobile health (mHealth) framework.

Objective: Using both acoustic and NLP-based features from the speech of people with BD, we built an app-based tool and tested its feasibility and performance to remotely assess the individual clinical status.

Methods: We carried out a pilot, observational study, sampling adults diagnosed with BD from the caseload of the Nord Milano Mental Health Trust (Italy) to explore the relationship between selected speech features and symptom severity and to test their potential to remotely assess mental health status. Symptom severity assessment was based on clinician ratings, using the Young Mania Rating Scale (YMRS) and Montgomery-Åsberg Depression Rating Scale (MADRS) for manic and depressive symptoms, respectively. Leveraging a digital health tool embedded in a mobile app, which records and processes speech, participants self-administered verbal performance tasks. Both NLP-based and acoustic features were extracted, testing associations with mood states and exploiting machine learning approaches based on random forest models.

Results: We included 32 subjects (mean [SD] age 49.6 [14.3] years; 50% [16/32] females) with a MADRS median (IQR) score of 13 (21) and a YMRS median (IQR) score of 5 (16). Participants freely managed the digital environment of the app, without perceiving it as intrusive and reporting an acceptable system usability level (average score 73.5, SD 19.7). Small-to-moderate correlations between speech features and symptom severity were uncovered, with sex-based differences in predictive capability. Higher latency time (ρ=0.152), increased silences (ρ=0.416), and vocal perturbations correlated with depressive symptomatology. Pressure of speech based on the mean intraword time (ρ=-0.343) and lower voice instability based on jitter-related parameters (ρ ranging from -0.19 to -0.27) were detected for manic symptoms. However, a higher contribution of NLP-based and conversational features, rather than acoustic features, was uncovered, especially for predictive models for depressive symptom severity (NLP-based: R²=0.25, mean squared error [MSE]=110.07, mean absolute error [MAE]=8.17; acoustics: R²=0.11, MSE=133.75, MAE=8.86; combined: R²=0.16, MSE=118.53, MAE=8.68).

Conclusions: Remotely collected speech patterns, including both linguistic and acoustic features, are associated with symptom severity levels and may help differentiate clinical conditions in individuals with BD during their mood state assessments. In the future, multimodal, smartphone-integrated digital ecological momentary assessments could serve as a powerful tool for clinical purposes, remotely complementing standard, in-person mental health evaluations.
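
As a reading aid for the model comparison reported above, the sketch below shows how the three metrics (R², MSE, MAE) are conventionally computed from observed and predicted severity scores. It is a minimal illustration only: the function name `regression_metrics` and the sample scores are hypothetical, and the abstract does not state how the authors' evaluation was implemented.

```python
from statistics import mean


def regression_metrics(y_true, y_pred):
    """Return (r2, mse, mae) for paired observed and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    # Residual = observed score minus predicted score.
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mse = mean(r * r for r in residuals)   # mean squared error
    mae = mean(abs(r) for r in residuals)  # mean absolute error

    # R² compares the model's squared error with that of always
    # predicting the mean observed score.
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed scores are equal")
    ss_res = sum(r * r for r in residuals)
    r2 = 1.0 - ss_res / ss_tot

    return r2, mse, mae


# Hypothetical scores, invented for illustration only (not study data).
observed = [4, 10, 13, 22, 31]
predicted = [8, 9, 15, 18, 25]
r2, mse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.2f}, MSE={mse:.2f}, MAE={mae:.2f}")
```

Under these definitions, a higher R² together with lower MSE and MAE indicates a closer fit, which is the sense in which the NLP-based model outperforms the acoustic and combined models in the figures quoted above.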

Crocamo, C., Cioni, R., Canestro, A., Nasti, C., Palpella, D., Piacenti, S., et al. (2025). Acoustic and Natural Language Markers for Bipolar Disorder: A Pilot, mHealth Cross-Sectional Study. JMIR FORMATIVE RESEARCH, 9 [10.2196/65555].

Acoustic and Natural Language Markers for Bipolar Disorder: A Pilot, mHealth Cross-Sectional Study

Crocamo C.; Cioni R. M.; Canestro A.; Nasti C.; Palpella D.; Piacenti S.; Bartoccetti A.; Re M.; Barattieri di San Pietro C.; Bartoli F.; Carra G.
2025

Journal article - Scientific article
acoustic; app; applications; bipolar; bipolar disorders; digital mental health; emotion; emotional; machine learning; markers; mental health; mental illness; mHealth; mobile health; multimodal; natural language processing; NLP; psychiatric; psychiatry; remote assessment; speech; symptom severity; verbal; vocal; voice;
English
20-Aug-2024
2025
9
e65555
open
Files in this record:
Crocamo-2025-JMIR Formative Research-VoR.pdf (open access)

Description: This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/)
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 737.69 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/550662