Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble
Article
Open access
English
Predicting depression intensity from microblogs and social media posts has numerous benefits and applications, including the early detection of psychological disorders and stress in individuals or the general public. Two challenges stand out: existing studies mostly perform binary classification of depression rather than predicting its intensity, and noisy data makes it difficult to identify genuine depression in social media text. This study begins by collecting a corpus of 210,000 public tweets through the Twitter public application programming interfaces (APIs). To reduce noise in the corpus, a list of relevant hashtags is created and used to retain only depression-related tweets. Furthermore, an algorithm is developed to annotate the data into three depression classes, ‘Mild,’ ‘Moderate,’ and ‘Severe,’ based on the International Classification of Diseases-10 (ICD-10) depression diagnostic criteria. Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus. A FastText-based model is then applied and tuned with different preprocessing techniques and hyperparameter settings, which significantly improves classification performance over the baselines, reaching an 84% F1 score and 90% accuracy. Finally, a FastText-based weighted soft voting ensemble (WSVE) is proposed to boost performance by combining it with several other classifiers and weighting each model according to its individual performance. The proposed WSVE outperformed all baselines as well as FastText alone, with an F1 score of 89% (5 percentage points higher than FastText alone) and an accuracy of 93% (3 percentage points higher).
The proposed model better captures the contextual features of the relatively small sample class and aids the early detection of depression intensity from tweets.
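The weighted soft voting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the model names, the per-class probabilities, and the weights below are hypothetical values chosen for illustration, and the abstract does not state how the weights were derived beyond "according to their individual performances".

```python
from typing import Dict, List, Sequence

# Class order follows the three labels named in the abstract.
CLASSES = ["Mild", "Moderate", "Severe"]


def weighted_soft_vote(
    probas: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> List[float]:
    """Return the weight-normalised average of per-model class probabilities."""
    total = sum(weights[name] for name in probas)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = [0.0] * len(CLASSES)
    for name, model_probas in probas.items():
        if len(model_probas) != len(CLASSES):
            raise ValueError(f"{name}: expected {len(CLASSES)} probabilities")
        for i, value in enumerate(model_probas):
            combined[i] += weights[name] * value / total
    return combined


def predict_label(
    probas: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> str:
    """Pick the class with the highest combined probability."""
    combined = weighted_soft_vote(probas, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return CLASSES[best]


# Hypothetical class probabilities from three models for one tweet.
example_probas = {
    "fasttext": [0.1, 0.2, 0.7],
    "naive_bayes": [0.5, 0.3, 0.2],
    "logistic_regression": [0.4, 0.4, 0.2],
}
# Hypothetical weights, e.g. a validation score per model (assumption).
example_weights = {
    "fasttext": 0.8,
    "naive_bayes": 0.6,
    "logistic_regression": 0.6,
}

if __name__ == "__main__":
    print(weighted_soft_vote(example_probas, example_weights))
    print(predict_label(example_probas, example_weights))
```

Soft voting averages probabilities rather than counting votes, so a confident, highly weighted model can outvote two weaker models that lean the other way.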
Rizwan, Muhammad; Mushtaq, Muhammad Faheem; Rafiq, Maryam; Mehmood, Arif; Diez, Isabel de la Torre; Gracia Villar, Mónica (monica.gracia@uneatlantico.es); Garay, Helena (helena.garay@uneatlantico.es) and Ashraf, Imran
(2024)
Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble.
Computers, Materials & Continua, 78 (2).
pp. 2047-2066.
ISSN 1546-2226
Text: TSP_CMC_37347.pdf. Available under License Creative Commons Attribution. Download (861kB)
Item Type: | Article |
---|---|
Keywords: | Depression classification; deep learning; FastText; machine learning |
Subjects: | Subjects > Engineering; Subjects > Psychology |
Divisions: | Universidad Europea del Atlántico > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Universidad Internacional Iberoamericana México > Research > Scientific Production; Universidad Internacional Iberoamericana Puerto Rico > Research > Articles and books; Universidad Internacional do Cuanza > Research > Scientific Production |
Deposited: | 14 Mar 2024 23:30 |
Last Modified: | 14 Mar 2024 23:30 |
URI: | https://repositorio.unib.org/id/eprint/11264 |
Detection and classification of brain tumor using a hybrid learning model in CT scan images
Accurate diagnosis of brain tumors is critical in understanding the prognosis in terms of the type, growth rate, location, removal strategy, and overall well-being of the patients. Among the modalities used for the detection and classification of brain tumors, a computed tomography (CT) scan is often performed as an early-stage procedure for minor symptoms like headaches. Automated procedures based on artificial intelligence (AI) and machine learning (ML) methods are used to detect and classify brain tumors in CT scan images. However, the key challenges in achieving the desired outcome are associated with the model’s complexity and generalization. To address these issues, we propose a hybrid model that extracts features from CT images using classical machine learning. Additionally, although MRI is a common modality for brain tumor diagnosis, its high cost and longer acquisition time make CT scans a more practical choice for early-stage screening and widespread clinical use. The proposed framework has several stages: image acquisition, pre-processing, feature extraction, feature selection, and classification. The hybrid architecture combines features from ResNet50, AlexNet, LBP, HOG, and median intensity. The most relevant features are selected with the SelectKBest algorithm, using a scoring function to rank them, which optimizes the performance of the proposed model. In addition, the proposed model incorporates data augmentation to handle imbalanced datasets. Classification is performed by a multilayer perceptron (MLP) neural network. Unlike most existing hybrid approaches, which primarily target MRI-based brain tumor classification, our method is specifically designed for CT scan images, addressing their unique noise patterns and lower soft-tissue contrast.
To the best of our knowledge, this is the first work to integrate LBP, HOG, median intensity, and deep features from both ResNet50 and AlexNet in a structured fusion pipeline for CT brain tumor classification. The proposed hybrid model is tested on data from numerous sources and achieved an accuracy of 94.82%, precision of 94.52%, specificity of 98.35%, and sensitivity of 94.76% compared to state-of-the-art models. While MRI-based models often report higher accuracies, the proposed model achieves 94.82% on CT scans, within 3–4% of leading MRI-based approaches, demonstrating strong generalization despite the modality difference. The proposed hybrid model, combining hand-crafted and deep learning features, effectively improves brain tumor detection and classification accuracy in CT scans. It has the potential for clinical application, aiding in early and accurate diagnosis. Unlike MRI, which is often time-intensive and costly, CT scans are more accessible and faster to acquire, making them suitable for early-stage screening and emergency diagnostics. This reinforces the practical and clinical value of the proposed model in real-world healthcare settings.
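The feature-selection stage mentioned in this abstract (keep the k highest-scoring features before classification) can be sketched in plain Python. This is only an illustration of the idea behind a SelectKBest-style step: the abstract does not name the scoring function, so the one-way ANOVA F-statistic used here is an assumption, and the toy feature matrix and the function names `f_score` and `select_k_best` are hypothetical.

```python
from typing import Dict, List, Sequence


def f_score(column: Sequence[float], labels: Sequence[int]) -> float:
    """One-way ANOVA F-statistic of a single feature across the classes."""
    groups: Dict[int, List[float]] = {}
    for value, label in zip(column, labels):
        groups.setdefault(label, []).append(value)
    n_samples, n_groups = len(column), len(groups)
    if n_groups < 2 or n_samples <= n_groups:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = sum(column) / n_samples
    means = {label: sum(vals) / len(vals) for label, vals in groups.items()}
    # Variation of class means around the overall mean.
    between = sum(
        len(vals) * (means[label] - grand_mean) ** 2
        for label, vals in groups.items()
    )
    # Variation of samples around their own class mean.
    within = sum(
        (v - means[label]) ** 2 for label, vals in groups.items() for v in vals
    )
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (n_groups - 1)) / (within / (n_samples - n_groups))


def select_k_best(
    rows: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> List[int]:
    """Return the (sorted) column indices of the k highest-scoring features."""
    n_features = len(rows[0])
    scores = [f_score([row[j] for row in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data (hypothetical): 6 samples, 3 features, 2 classes.
# Feature 1 has the same mean in both classes and carries no class signal.
toy_rows = [
    [1.0, 5.0, 0.2],
    [1.1, 3.0, 0.1],
    [0.9, 4.0, 0.3],
    [3.0, 4.0, 0.9],
    [3.2, 5.0, 0.8],
    [2.9, 3.0, 1.0],
]
toy_labels = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    print(select_k_best(toy_rows, toy_labels, k=2))
```

In a pipeline like the one described, the selected column indices would be applied to the fused feature vectors before they are passed to the classifier.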
Roja Ghasemi, Naveed Islam, Samin Bayat, Muhammad Shabir, Shahid Rahman, Farhan Amin, Isabel de la Torre, Ángel Gabriel Kuc Castilla (angel.kuc@uneatlantico.es), Debora L. Ramírez-Vargas (debora.ramirez@unini.edu.mx)
Ultra Wideband radar-based gait analysis for gender classification using artificial intelligence
Gender classification plays a vital role in various applications, particularly in security and healthcare. While several biometric methods such as facial recognition, voice analysis, activity monitoring, and gait recognition are commonly used, their accuracy and reliability often suffer due to challenges like body part occlusion, high computational costs, and recognition errors. This study investigates gender classification using gait data captured by Ultra-Wideband radar, offering a non-intrusive and occlusion-resilient alternative to traditional biometric methods. A dataset comprising 163 participants was collected, and the radar signals underwent preprocessing, including clutter suppression and peak detection, to isolate meaningful gait cycles. Spectral features extracted from these cycles were transformed using a novel integration of Feedforward Artificial Neural Networks and Random Forests, enhancing discriminative power. Among the models evaluated, the Random Forest classifier demonstrated superior performance, achieving 94.68% accuracy and a cross-validation score of 0.93. The study highlights the effectiveness of Ultra-Wideband radar and the proposed transformation framework in advancing robust gender classification.
Adil Ali Saleem, Hafeez Ur Rehman Siddiqui, Muhammad Amjad Raza, Sandra Dudley, Julio César Martínez Espinosa (ulio.martinez@unini.edu.mx), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Isabel de la Torre Díez
Polyphenols are naturally occurring compounds that can be found in plant-based foods, including fruits, vegetables, nuts, seeds, herbs, spices, and beverages, the use of which has been linked to enhanced brain health and cognitive function. These natural molecules are broadly classified into two main groups: flavonoids and non-flavonoid polyphenols, the latter including phenolic acids, stilbenes, and tannins. Flavonoids are primarily known for their potent antioxidant properties, which help neutralize harmful reactive oxygen species (ROS) in the brain, thereby reducing oxidative stress, a key contributor to neurodegenerative diseases. In addition to their antioxidant effects, flavonoids have been shown to modulate inflammation, enhance neuronal survival, and support neurogenesis, all of which are critical for maintaining cognitive function. Phenolic acids possess strong antioxidant properties and are believed to protect brain cells from oxidative damage. Neuroprotective effects of these molecules can also depend on their ability to modulate signaling pathways associated with inflammation and neuronal apoptosis. Among polyphenols, hydroxycinnamic acids such as caffeic acid have been shown to enhance blood-brain barrier permeability, which may increase the delivery of other protective compounds to the brain. Another compound of interest is represented by resveratrol, a stilbene extensively studied for its potential neuroprotective properties related to its ability to activate the sirtuin pathway, a molecular signaling pathway involved in cellular stress response and aging. Lignans, on the other hand, have shown promise in reducing neuroinflammation and oxidative stress, which could help slow the progression of neurodegenerative diseases and cognitive decline. 
Polyphenols belonging to different subclasses, such as flavonoids, phenolic acids, stilbenes, and lignans, exert neuroprotective effects by regulating microglial activation, suppressing pro-inflammatory cytokines, and mitigating oxidative stress. These compounds act through multiple signaling pathways, including NF-κB, MAPK, and Nrf2, and they may also influence the genetic regulation of inflammation and immune responses at the brain level. Despite their potential for brain health and cognitive function, polyphenols are often characterized by low bioavailability, which deserves attention when considering their therapeutic potential. Future translational studies are needed to better understand the right dosage, the overall diet, the correct target population, and the ideal formulations for overcoming bioavailability limitations.
Justyna Godos, Giuseppe Carota, Giuseppe Caruso, Agnieszka Micek, Evelyn Frias-Toral, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Julién Brito Ballester (julien.brito@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es), Carmen Lilí Rodríguez Velasco (carmen.rodriguez@uneatlantico.es), José L. Quiles (jose.quiles@uneatlantico.es)
A systematic review of deep learning methods for community detection in social networks
Introduction: The rapid expansion of generated data through social networks has introduced significant challenges, which underscores the need for advanced methods to analyze and interpret these complex systems. Deep learning has emerged as an effective approach, offering robust capabilities to process large datasets, and uncover intricate relationships and patterns. Methods: In this systematic literature review, we explore research conducted over the past decade, focusing on the use of deep learning techniques for community detection in social networks. A total of 19 studies were carefully selected from reputable databases, including the ACM Library, Springer Link, Scopus, Science Direct, and IEEE Xplore. This review investigates the employed methodologies, evaluates their effectiveness, and discusses the challenges identified in these works. Results: Our review shows that models like graph neural networks (GNNs), autoencoders, and convolutional neural networks (CNNs) are some of the most commonly used approaches for community detection. It also examines the variety of social networks, datasets, evaluation metrics, and employed frameworks in these studies. Discussion: However, the analysis highlights several challenges, such as scalability, understanding how the models work (interpretability), and the need for solutions that can adapt to different types of networks. These issues stand out as important areas that need further attention and deeper research. This review provides meaningful insights for researchers working in social network analysis. It offers a detailed summary of recent developments, showcases the most impactful deep learning methods, and identifies key challenges that remain to be explored.
Mohamed El-Moussaoui, Mohamed Hanine, Ali Kartit, Mónica Gracia Villar (monica.gracia@uneatlantico.es), Helena Garay (helena.garay@uneatlantico.es), Isabel de la Torre Díez
Transformer-based ECG classification for early detection of cardiac arrhythmias
Electrocardiogram (ECG) classification plays a critical role in the early detection and monitoring of cardiovascular diseases. This study presents a Transformer-based deep learning framework for automated ECG classification, integrating advanced preprocessing, feature selection, and dimensionality reduction techniques to improve model performance. The pipeline begins with signal preprocessing, where raw ECG data are denoised, normalized, and relabeled for compatibility with attention-based architectures. Principal component analysis (PCA), correlation analysis, and feature engineering are applied to retain the most informative features. To assess the discriminative quality of the selected features, t-distributed stochastic neighbor embedding (t-SNE) is used for visualization, revealing clear class separability in the transformed feature space. The refined dataset is then input to a Transformer-based model trained with optimized loss functions, regularization strategies, and hyperparameter tuning. The proposed model demonstrates strong performance on the MIT-BIH benchmark dataset, showing results consistent with or exceeding prior studies. However, due to differences in datasets and evaluation protocols, these comparisons are indicative rather than conclusive. The model effectively classifies ECG signals into categories such as Normal, atrial premature contraction (APC), ventricular premature contraction (VPC), and Fusion beats. These results underscore the effectiveness of Transformer-based models in biomedical signal processing and suggest potential for scalable, automated ECG diagnostics. However, deployment in real-time or resource-constrained settings will require further optimization and validation.
Sunnia Ikram, Amna Ikram, Harvinder Singh, Malik Daler Ali Awan, Sajid Naveed, Isabel De la Torre Díez, Henry Fabian Gongora (henry.gongora@uneatlantico.es), Thania Chio Montero