Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble

Rizwan, Muhammad; Mushtaq, Muhammad Faheem; Rafiq, Maryam; Mehmood, Arif; Diez, Isabel de la Torre; Gracia Villar, Mónica; Garay, Helena and Ashraf, Imran (2024) Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble. Computers, Materials & Continua, 78 (2). pp. 2047-2066. ISSN 1546-2226
Access: Open. Language: English.
Contact: Mónica Gracia Villar (monica.gracia@uneatlantico.es); Helena Garay (helena.garay@uneatlantico.es)

Full text:	TSP_CMC_37347.pdf (861 kB), available under a Creative Commons Attribution license.
Official URL:	http://doi.org/10.32604/cmc.2024.037347

Abstract

Predicting depression intensity from microblogs and social media posts has many applications, including the early detection of psychological disorders and stress in individuals or the general public. Existing work faces two major challenges: most studies perform only binary classification of depression rather than predicting its intensity, and noisy data makes it difficult to identify genuine depression in social media text. This study first collects a corpus of 210,000 public tweets using the Twitter public application programming interfaces (APIs). To reduce noise in the corpus, a list of relevant hashtags is created and used to retain only depression-related tweets. An algorithm is then developed to annotate the data into three depression classes, ‘Mild,’ ‘Moderate,’ and ‘Severe,’ based on the International Classification of Diseases-10 (ICD-10) diagnostic criteria for depression. Several baseline classifiers are applied to the annotated dataset to obtain a preliminary estimate of classification performance on the corpus. Next, a FastText-based model is applied and tuned using different preprocessing techniques and hyperparameter settings; the tuned model significantly improves on the baselines, reaching an 84% F1 score and 90% accuracy. Finally, a FastText-based weighted soft voting ensemble (WSVE) is proposed that combines FastText with several other classifiers and weights each model according to its individual performance. The proposed WSVE outperforms all baselines as well as FastText alone, with an F1 score of 89% (5 percentage points higher than FastText alone) and an accuracy of 93% (3 percentage points higher than FastText alone). The proposed model better captures the contextual features of the relatively small class and supports early prediction of depression intensity from tweets.
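The weighted soft voting idea described in the abstract can be illustrated with a minimal sketch. It assumes each classifier outputs a probability vector over the three classes and that weights are obtained by normalizing each model's validation score. The model names, scores, and probabilities below are hypothetical placeholders, and the normalization rule is an assumption for illustration, not necessarily the weighting scheme used in the paper.

```python
from typing import Dict, List, Sequence

# The three intensity classes named in the abstract.
LABELS = ["Mild", "Moderate", "Severe"]


def performance_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-model validation scores into weights that sum to 1.

    Assumption: weights are simply proportional to the scores.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one model needs a positive score")
    return {name: score / total for name, score in scores.items()}


def weighted_soft_vote(
    probas: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> List[float]:
    """Return the weighted average of the models' class-probability vectors."""
    combined = [0.0] * len(LABELS)
    for name, proba in probas.items():
        if len(proba) != len(LABELS):
            raise ValueError(f"{name} must give one probability per class")
        for k, p in enumerate(proba):
            combined[k] += weights[name] * p
    # Renormalize in case only a subset of the weighted models voted.
    norm = sum(weights[name] for name in probas)
    return [c / norm for c in combined]


def predict_label(
    probas: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> str:
    """Pick the class with the highest combined probability."""
    combined = weighted_soft_vote(probas, weights)
    best = max(range(len(LABELS)), key=lambda k: combined[k])
    return LABELS[best]


# Hypothetical validation scores and per-tweet probabilities (illustration only).
example_scores = {"fasttext": 0.8, "logreg": 0.6, "svm": 0.6}
example_probas = {
    "fasttext": [0.2, 0.7, 0.1],
    "logreg": [0.5, 0.3, 0.2],
    "svm": [0.6, 0.3, 0.1],
}
example_weights = performance_weights(example_scores)
print(predict_label(example_probas, example_weights))
```

In this made-up example two of the three models rank ‘Mild’ highest, so a hard majority vote would choose ‘Mild’, whereas the weighted soft vote selects ‘Moderate’ because the higher-weighted model is confident in that class.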

Document Type:	Article
Keywords:	Depression classification; deep learning; FastText; machine learning
Subject classification:	Subjects > Engineering; Subjects > Psychology
Divisions:	Universidad Europea del Atlántico > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Universidad Internacional Iberoamericana México > Research > Scientific Production; Universidad Internacional Iberoamericana Puerto Rico > Research > Articles and books; Universidad Internacional do Cuanza > Research > Scientific Production
Deposited:	14 Mar 2024 23:30
Last Modified:	14 Mar 2024 23:30
URI:	https://repositorio.unib.org/id/eprint/11264

Evaluating the impact of deep learning approaches on solar and photovoltaic power forecasting: A systematic review

Accurate solar and photovoltaic (PV) power forecasting is essential for optimizing grid integration, managing energy storage, and maximizing the efficiency of solar power systems. Deep learning (DL) models have shown promise in this area due to their ability to learn complex, non-linear relationships within large datasets. This study presents a systematic literature review (SLR) of deep learning applications for solar PV forecasting, addressing a gap in the existing literature, which often focuses on traditional ML or broader renewable energy applications. This review specifically aims to identify the DL architectures employed, preprocessing and feature engineering techniques used, the input features leveraged, evaluation metrics applied, and the persistent challenges in this field. Through a rigorous analysis of 26 selected papers from an initial set of 155 articles retrieved from the Web of Science database, we found that Long Short-Term Memory (LSTM) networks were the most frequently used algorithm (appearing in 32.69% of the papers), closely followed by Convolutional Neural Networks (CNNs) at 28.85%. Furthermore, Wavelet Transform (WT) was found to be the most prominent data decomposition technique, while Pearson Correlation was the most used for feature selection. We also found that ambient temperature, pressure, and humidity are the most common input features. Our systematic evaluation provides critical insights into state-of-the-art DL-based solar forecasting and identifies key areas for upcoming research. Future research should prioritize the development of more robust and interpretable models, as well as explore the integration of multi-source data to further enhance forecasting accuracy. Such advancements are crucial for the effective integration of solar energy into future power grids.

Authors: Oussama Khouili, Mohamed Hanine, Mohamed Louzazni, Miguel Ángel López Flores (miguelangel.lopez@uneatlantico.es), Eduardo García Villena (eduardo.garcia@uneatlantico.es), Imran Ashraf

Novel hybrid transfer neural network for wheat crop growth stages recognition using field images

Wheat is one of the world’s most widely cultivated cereal crops and a primary food source for a significant portion of the population. It passes through several distinct developmental phases, and accurately identifying these growth stages is essential for precision farming and for improving yield. Preliminary research identified obstacles in distinguishing between these stages, which negatively impact crop yields. To address this, the study introduces MobDenNet, an approach based on data collection and real-time recognition of wheat crop stages. The collected image dataset comprises 4496 images covering seven growth phases: ‘Crown Root’, ‘Tillering’, ‘Mid Vegetative’, ‘Booting’, ‘Heading’, ‘Anthesis’, and ‘Milking’. The dataset underwent preprocessing and data augmentation to reduce bias. The study compared deep and transfer learning models, including MobileNetV2, DenseNet-121, NASNet-Large, InceptionV3, and a convolutional neural network (CNN). In the experiments, MobileNetV2 achieved 95% accuracy, DenseNet-121 94%, NASNet-Large 76%, InceptionV3 74%, and the CNN 68%. The proposed hybrid approach, MobDenNet, which merges the MobileNetV2 and DenseNet-121 architectures, achieves a precision, recall, and F1 score of 99%. The robustness of the approach was validated using k-fold cross-validation. The approach shows promise for improving agricultural productivity and management practices, helping farmers optimize resource distribution and make informed decisions.

Authors: Aisha Naseer, Madiha Amjad, Ali Raza, Kashif Munir, Aseel Smerat, Henry Fabian Gongora (henry.gongora@uneatlantico.es), Carlos Eduardo Uc Ríos (carlos.uc@unini.edu.mx), Imran Ashraf

Client engagement solution for post implementation issues in software industry using blockchain

In the rapidly evolving information technology industry, client engagement plays a critical role: it is essential to understand the client’s concerns and requirements, to keep records of authorizations and go-aheads for previously agreed requirements, and to provide a feasible solution accordingly. Multiple solutions have been proposed to make client engagement more efficient, but they lack traceability, trust, and transparency, and they lead to conflicts over previously agreed contracts. Because of these shortcomings, client requirements are delayed, which causes client escalations, integrity issues, project failures, and penalties. This study proposes the UniferCollab framework, which uses blockchain technology to address collaboration between teams, transparency, the recording of client authorizations, and go-aheads on previous developments. In the proposed approach, the data is stored on a permissible network. All requirements and information shared by clients are compiled on the permissible blockchain, which secures a large amount of data and makes every requirement traceable. Client authorizations for any change to their current system generate push notifications, executed through smart contracts. This removes ambiguity between development teams when the client has shared a requirement with only one team. Because the data is stored on, and retrieved from, a decentralized network, the traceability, transparency, and trust issues are resolved. The evaluation involved a total of 800 hypertext transfer protocol (HTTP) requests tested using Postman, with blockchain block sizes ranging from 0.568 KB to 550 KB; an average size increase of 280 KB was observed as new blocks were added. The longest chain in the network was observed during 800 repetitions of blockchain operations. Latency analysis across various experiments showed that delays in processing HTTP requests were influenced by decentralized node processing, local machine response times, and internet bandwidth. The results show that the proposed framework resolves the client engagement issues among all stakeholders during implementation, which enhances trust and transparency, improves the client experience, and helps manage disputes effectively.

Authors: Muhammad Shoaib Farooq, Khurram Irshad, Danish Riaz, Nagwan Abdel Samee, Ernesto Bautista Thompson (ernesto.bautista@unini.edu.mx), Daniel Gavilanes Aray (daniel.gavilanes@uneatlantico.es), Imran Ashraf

Advancing Nutritional Science: Contemporary Perspectives on Diet’s Role in Metabolic Health and Disease Prevention

This Special Issue of Diet and Nutrition: Metabolic Diseases showcases cutting-edge research exploring the intersection between nutrition, dietary patterns, and public health. The contributions in this collection involve both fundamental and applied research, offering new insights into how nutrition can combat the growing global burden of non-communicable diseases [1]. The studies in this issue emphasize the critical role that diet plays in promoting metabolic health, preventing chronic diseases, and improving overall quality of life. In recent years, nutrition has become a central focus in global health efforts, with a growing body of evidence demonstrating its impact on both individual and population-level outcomes [2,3]. This Special Issue encompasses several key themes, including the role of dietary interventions in managing metabolic disorders, the importance of nutrient timing and quality, and the broader implications of sustainable dietary practices.

Author: Iñaki Elío Pascual (inaki.elio@uneatlantico.es)

Ensemble stacked model for enhanced identification of sentiments from IMDB reviews

The emergence of social media platforms has led to the sharing of ideas, thoughts, events, and reviews. These shared views and comments carry people’s sentiments, and sentiment analysis has emerged as one of the most popular fields of study. Sentiment analysis in Urdu is as important a research problem as in other languages, but it has not been investigated thoroughly. On social media platforms such as X (Twitter), millions of native Urdu speakers write in the Urdu script, which makes Urdu sentiment analysis important. In this regard, an ensemble model, RRLS, is proposed that stacks random forest, recurrent neural network, logistic regression (LR), and support vector machine (SVM) classifiers. Internet Movie Database (IMDB) movie reviews and Urdu tweets are examined in this study for Urdu sentiment analysis. The Urduhack library was used to preprocess the Urdu data; its preprocessing operations include normalizing individual characters, merging combined characters, and adjusting the spacing around punctuation. Its normalization module fixes the problem of correctly encoding Urdu characters and replaces Arabic letters with their Urdu equivalents. Several models are evaluated extensively for their accuracy on Urdu sentiment analysis. The results are promising: among machine learning models, SVM and LR attained an accuracy of 87%, assessed with performance criteria such as F-measure, accuracy, recall, and precision. The long short-term memory (LSTM) and bidirectional LSTM (BiLSTM) models reached 84% accuracy. The proposed RRLS ensemble performs better than the other learning algorithms, achieving 90% accuracy and outperforming existing methods. Applying the synthetic minority oversampling technique (SMOTE) further improves performance, leading to 92.77% accuracy.

Authors: Komal Azim, Alishba Tahir, Mobeen Shahroz, Hanen Karamti, Annia A. Vázquez (annia.almeyda@uneatlantico.es), Angel Olider Rojas Vistorte (angel.rojas@uneatlantico.es), Imran Ashraf
