Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble

Open access, English

Rizwan, Muhammad; Mushtaq, Muhammad Faheem; Rafiq, Maryam; Mehmood, Arif; Diez, Isabel de la Torre; Gracia Villar, Mónica (monica.gracia@uneatlantico.es); Garay, Helena (helena.garay@uneatlantico.es); Ashraf, Imran (2024) Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble. Computers, Materials & Continua, 78 (2). pp. 2047-2066. ISSN 1546-2226

Full text: TSP_CMC_37347.pdf (861kB)
Available under License Creative Commons Attribution.

Abstract

Predicting depression intensity from microblogs and social media posts has numerous benefits and applications, including the early prediction of psychological disorders and stress in individuals or the general public. A major challenge is that existing studies perform only binary classification of depression rather than predicting its intensity; moreover, noisy data makes it difficult to identify genuine depression in social media text. This study begins by collecting relevant tweets and generating a corpus of 210,000 public tweets using Twitter's public application programming interfaces (APIs). To reduce noise in the corpus, a strategy is devised to retain only depression-related tweets by creating a list of relevant hashtags. Furthermore, an algorithm is developed to annotate the data into three depression classes, ‘Mild,’ ‘Moderate,’ and ‘Severe,’ based on the International Classification of Diseases-10 (ICD-10) depression diagnostic criteria. Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus. Next, a FastText-based model is applied and tuned with different preprocessing techniques and hyperparameter settings, which raises depression classification performance to an 84% F1 score and 90% accuracy, significantly above the baselines. Finally, a FastText-based weighted soft voting ensemble (WSVE) is proposed to boost performance by combining FastText with several other classifiers and weighting each model according to its individual performance. The proposed WSVE outperformed all baselines as well as FastText alone, with an F1 of 89%, 5 percentage points higher than FastText alone, and an accuracy of 93%, 3 percentage points higher than FastText alone. The proposed model better captures the contextual features of the relatively small sample class and aids the early detection of depression intensity from tweets.
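
The weighted soft voting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the member classifiers other than FastText (`logreg`, `svm`), their scores, the per-class probabilities, and the choice of weights proportional to validation F1 are all assumptions made for illustration. Only the three class labels and the 0.84 F1 for FastText come from the abstract.

```python
from typing import Dict, List

LABELS = ["Mild", "Moderate", "Severe"]  # classes named in the abstract


def normalize_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-model scores into weights that sum to 1 (assumed weighting rule)."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    return {name: score / total for name, score in scores.items()}


def weighted_soft_vote(probas: Dict[str, List[float]],
                       weights: Dict[str, float]) -> List[float]:
    """Weighted average of each model's class-probability vector."""
    n_classes = len(next(iter(probas.values())))
    combined = [0.0] * n_classes
    for name, vector in probas.items():
        if len(vector) != n_classes:
            raise ValueError(f"{name} returned {len(vector)} probabilities")
        for i, p in enumerate(vector):
            combined[i] += weights[name] * p
    return combined


def predict_label(probas: Dict[str, List[float]],
                  weights: Dict[str, float],
                  labels: List[str]) -> str:
    """Pick the class with the highest combined probability."""
    combined = weighted_soft_vote(probas, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return labels[best]


# Hypothetical validation F1 scores; only the FastText value is from the abstract.
f1_scores = {"fasttext": 0.84, "logreg": 0.70, "svm": 0.66}
weights = normalize_weights(f1_scores)

# Hypothetical per-class probabilities for a single tweet.
tweet_probas = {
    "fasttext": [0.2, 0.5, 0.3],
    "logreg": [0.5, 0.3, 0.2],
    "svm": [0.4, 0.4, 0.2],
}
combined = weighted_soft_vote(tweet_probas, weights)
label = predict_label(tweet_probas, weights, LABELS)
```

Because the weights sum to 1 and each input vector is a probability distribution, the combined vector is also a distribution, so a stronger model shifts the final vote without silencing the others.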

Item Type: Article
Uncontrolled Keywords: Depression classification; deep learning; FastText; machine learning
Subjects: Subjects > Engineering
Subjects > Psychology
Divisions: Europe University of Atlantic > Research > Scientific Production
Fundación Universitaria Internacional de Colombia > Research > Scientific Production
Ibero-american International University > Research > Scientific Production
Ibero-american International University > Research > Articles and books
Universidad Internacional do Cuanza > Research > Scientific Production
Date Deposited: 14 Mar 2024 23:30
Last Modified: 14 Mar 2024 23:30
URI: https://repositorio.unib.org/id/eprint/11264


Full text: /17061/1/fspor-1-1565900.pdf (English, open access)

Tensiomyography, functional movement screen and counter movement jump for the assessment of injury risk in sport: a systematic review of original studies of diagnostic tests

Background: Scientific research should be carried out to prevent sports injuries. For this purpose, new assessment technologies must be used to analyze and identify the risk factors for injury. The main objective of this systematic review was to compile, synthesize and integrate international research published in different scientific databases on Countermovement Jump (CMJ), Functional Movement Screen (FMS) and Tensiomyography (TMG) tests and technologies for the assessment of injury risk in sport. In this way, the review determines the current state of knowledge on this topic and allows a better understanding of the existing problems, facilitating the development of future lines of research. Methodology: A structured search was carried out following the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) guidelines and the PICOS model until November 30, 2024, in the MEDLINE/PubMed, Web of Science (WOS), ScienceDirect, Cochrane Library, SciELO, EMBASE, SPORTDiscus and Scopus databases. The risk of bias was assessed and the PEDro scale was used to analyze methodological quality. Results: A total of 510 articles were obtained in the initial search. After applying the inclusion and exclusion criteria, the final sample comprised 40 articles. These studies maintained a high standard of quality. The review revealed the effects of the CMJ, FMS and TMG methods for sports injury assessment, indicating the sample population, sport modality, assessment methods, type of research design, study variables, main findings and intervention effects. Conclusions: The CMJ vertical jump allows us to evaluate the power capacity of the lower extremities, both unilaterally and bilaterally, detect neuromuscular asymmetries and evaluate fatigue. Likewise, FMS could be used to assess an athlete's basic movement patterns, mobility and postural stability. Finally, TMG is a non-invasive method to assess the contractile properties of superficial muscles, monitor the effects of training, detect muscle asymmetries and symmetries, provide information on muscle tone and evaluate fatigue. Therefore, these three methods should be considered as assessment tests and technologies to individualize training programs and identify injury risk factors.

Scientific Production

Álvaro Velarde-Sotres (alvaro.velarde@uneatlantico.es), Antonio Bores-Cerezal (antonio.bores@uneatlantico.es), Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Julio Calleja-González

Full text: /17139/1/s41598-025-89266-9.pdf (English, open access)

Harnessing AI forward and backward chaining with telemetry data for enhanced diagnostics and prognostics of smart devices

In the rapidly evolving landscape of artificial intelligence (AI) and the Internet of Things (IoT), the significance of device diagnostics and prognostics is paramount for guaranteeing the dependable operation and upkeep of intricate systems. The capacity to precisely diagnose and preemptively predict potential failures holds the potential to considerably amplify maintenance efficiency, diminish downtime, and optimize resource allocation. The wealth of information offered by telemetry data gathered from IoT devices presents an opportunity for diagnostics and prognostics applications. However, extracting valuable insights and making well-timed decisions from this extensive data reservoir remains a formidable challenge. This study proposes a novel AI-driven framework that integrates forward chaining and backward chaining algorithms to analyze telemetry data from IoT devices. The proposed methodology utilizes rule-based inference to detect real-time anomalies and predict potential future failures, providing a dual-layered approach for diagnostics and prognostics. The results show that the diagnostics engine using forward chaining detects real-time issues like “High Temperature” and “Low Pressure,” while the prognostics engine with backward chaining predicts potential future occurrences of these issues, enabling proactive prevention measures. The experimental results demonstrate that adopting this approach could offer valuable assistance to authorities and stakeholders. Accurate early diagnosis and prediction of potential failures have the capability to greatly improve maintenance efficiency, minimize downtime, and optimize cost.
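
The two inference styles named in this abstract can be shown with a small Python sketch. It is an illustration under stated assumptions, not the framework from the paper: the rule set, the telemetry field names (`temperature_c`, `pressure_kpa`, `fan_rpm`), the thresholds, and the "Cooling Fault" conclusion are hypothetical. Only the "High Temperature" and "Low Pressure" labels come from the abstract. Forward chaining starts from observed facts and derives every conclusion it can; backward chaining starts from a goal and checks whether the facts support it.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion)

# Hypothetical rule base; only the first two conclusions appear in the abstract.
RULES: List[Rule] = [
    (frozenset({"temp_above_limit"}), "High Temperature"),
    (frozenset({"pressure_below_limit"}), "Low Pressure"),
    (frozenset({"High Temperature", "fan_speed_low"}), "Cooling Fault"),
]


def telemetry_to_facts(reading: Dict[str, float]) -> Set[str]:
    """Map raw telemetry to symbolic facts (thresholds are assumed values)."""
    facts: Set[str] = set()
    if reading["temperature_c"] > 80.0:
        facts.add("temp_above_limit")
    if reading["pressure_kpa"] < 90.0:
        facts.add("pressure_below_limit")
    if reading["fan_rpm"] < 1000.0:
        facts.add("fan_speed_low")
    return facts


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Data-driven: fire rules repeatedly until no new fact is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and premises <= derived:
                derived.add(conclusion)
                changed = True
    return derived


def backward_chain(goal: str, facts: Set[str], rules: List[Rule],
                   seen: Optional[FrozenSet[str]] = None) -> bool:
    """Goal-driven: a goal holds if it is a fact or some rule's premises all hold."""
    if goal in facts:
        return True
    seen = seen or frozenset()
    if goal in seen:  # guard against cyclic rules
        return False
    seen = seen | {goal}
    return any(
        all(backward_chain(p, facts, rules, seen) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )


reading = {"temperature_c": 92.0, "pressure_kpa": 101.0, "fan_rpm": 800.0}
facts = telemetry_to_facts(reading)
diagnosis = forward_chain(facts, RULES)
cooling_fault_supported = backward_chain("Cooling Fault", facts, RULES)
low_pressure_supported = backward_chain("Low Pressure", facts, RULES)
```

In this sketch the forward pass plays the role of the diagnostics engine, while the backward pass only answers whether one named issue is supported by the current facts; how the paper projects such goals into future occurrences is not specified in the abstract.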

Scientific Production

Muhammad Shoaib Farooq, Rizwan Pervez Mir, Atif Alvi, Kilian Tutusaus (kilian.tutusaus@uneatlantico.es), Eduardo García Villena (eduardo.garcia@uneatlantico.es), Fadwa Alrowais, Hanen Karamti, Imran Ashraf

Full text: /16577/1/nutrients-17-00521-v2.pdf (English, open access)

Nut Consumption Is Associated with Cognitive Status in Southern Italian Adults

Background: Nut consumption has been considered a potential protective factor against cognitive decline. The aim of this study was to test whether higher total and specific nut intake was associated with better cognitive status in a sample of older Italian adults. Methods: A cross-sectional analysis of 883 older adults (>50 y) was conducted. A 110-item food frequency questionnaire was used to collect information on the consumption of various types of nuts. The Short Portable Mental Status Questionnaire was used to assess cognitive status. Multivariate logistic regression analyses were performed to calculate odds ratios (ORs) and 95% confidence intervals (CIs) for the association between nut intake and cognitive status after adjusting for potential confounding factors. Results: The median intake of total nuts was 11.7 g/day and served as a cut-off to categorize low and high consumers (mean intake 4.3 g/day vs. 39.7 g/day, respectively). Higher total nut intake was significantly associated with a lower prevalence of impaired cognitive status among older individuals (OR = 0.35, 95% CI: 0.15, 0.84) after adjusting for potential confounding factors. Notably, this association remained significant after additional adjustment for adherence to the Mediterranean dietary pattern as an indicator of diet quality (OR = 0.32, 95% CI: 0.13, 0.77). No significant associations were found between cognitive status and specific types of nuts. Conclusions: Habitual nut intake is associated with better cognitive status in older adults.
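
For readers unfamiliar with how an odds ratio and its 95% confidence interval are read, the following Python sketch computes an unadjusted OR from a 2x2 table using the standard log-odds (Woolf) interval. This is a simplification and not the study's method: the study used multivariate logistic regression with confounder adjustment, and the counts below are hypothetical values chosen only to make the arithmetic visible.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical counts, not taken from the study.
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(a=20, b=80, c=40, d=60)
```

An interval that lies entirely below 1, as in the results reported above, indicates lower odds of the outcome in the exposed group at the chosen confidence level.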

Scientific Production

Justyna Godos, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Evelyn Frias-Toral, Raynier Zambrano-Villacres, Angel Olider Rojas Vistorte (angel.rojas@uneatlantico.es), Vanessa Yélamos Torres (vanessa.yelamos@funiber.org), Maurizio Battino (maurizio.battino@uneatlantico.es), Fabio Galvano, Sabrina Castellano, Giuseppe Grosso

Full text: /16760/1/peerj-cs-2652.pdf (English, open access)

Novel transfer learning approach for hand drawn mathematical geometric shapes classification

Hand-drawn mathematical geometric shapes are geometric figures, such as circles, triangles, squares, and polygons, sketched manually using pen and paper or digital tools. These shapes are fundamental in mathematics education and geometric problem-solving, serving as intuitive visual aids for understanding complex concepts and theories. Recognizing hand-drawn shapes accurately enables more efficient digitization of handwritten notes, enhances educational tools, and improves user interaction with mathematical software. This research proposes an innovative machine learning algorithm for the automatic classification of mathematical geometric shapes to identify and interpret these shapes from handwritten input, facilitating seamless integration with digital systems. We utilized a benchmark dataset of mathematical shapes comprising a total of 20,000 images in eight classes: circle, kite, parallelogram, square, rectangle, rhombus, trapezoid, and triangle. We introduced a novel machine-learning algorithm, CnN-RFc, that uses convolutional neural networks (CNN) for spatial feature extraction and the random forest classifier for probabilistic feature extraction from image data. Experimental results illustrate that, using the CnN-RFc method, the Light Gradient Boosting Machine (LGBM) algorithm surpasses state-of-the-art approaches with an accuracy of 98% for hand-drawn shape classification. Applications of the proposed mathematical geometric shape classification algorithm span various domains, including education, where it enhances interactive learning platforms and provides instant feedback to students.

Scientific Production

Aneeza Alam, Ali Raza, Nisrean Thalji, Laith Abualigah, Helena Garay (helena.garay@uneatlantico.es), Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Imran Ashraf

Full text: /10290/1/Influence%20of%20E-learning%20training%20on%20the%20acquisition%20of%20competences%20in%20basketball%20coaches%20in%20Cantabria.pdf (English, open access)

Influence of E-learning training on the acquisition of competences in basketball coaches in Cantabria

The main aim of this study was to analyse the influence of e-learning training on the acquisition of competences in basketball coaches in Cantabria. The current landscape of basketball coach training shows an increasing demand for innovative training models and emerging pedagogies, including e-learning-based methodologies. The study sample consisted of fifty students from these courses, all above 16 years of age (36 males, 14 females). Among them, 16% resided outside the autonomous community of Cantabria, 10% resided more than 50 km from the city of Santander, 36% between 10 and 50 km, 14% less than 10 km, and 24% resided within Santander city. Data were collected through a Google Forms survey distributed by the Cantabrian Basketball Federation to training course students. Participation was voluntary and anonymous. The survey, consisting of 56 questions, was validated by two sports and health doctors and two senior basketball coaches. The collected data were processed and analysed using Microsoft® Excel version 16.74, and the results were expressed in percentages. The analysis revealed that 24.60% of the students trained through the e-learning methodology considered themselves fully qualified as basketball coaches, contrasting with 10.98% of those trained via traditional face-to-face methodology. The results of the study provide insights into important characteristics that can be adjusted and improved within the investigated educational process. Moreover, the study concludes that e-learning training effectively qualifies basketball coaches in Cantabria.

Scientific Production

Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Álvaro Velarde-Sotres (alvaro.velarde@uneatlantico.es), Javier Jorge, Kamil Giglio