Human Activity Recognition in Domestic Settings Based on Optical Techniques and Ensemble Models
Article
Open
English
Human activity recognition (HAR) is essential in many applications, such as smart homes, assisted
living, healthcare monitoring, rehabilitation, physiotherapy, and geriatric care. Conventional HAR methods
rely on wearable sensors, e.g., accelerometers and gyroscopes. However, they are limited by issues
such as sensitivity to position, user inconvenience, and potential health risks with long-term use.
Vision-based optical camera systems provide a non-intrusive alternative; however, they are
susceptible to variations in lighting, intrusions, and privacy issues. This paper presents an optical method for
recognizing human domestic activities based on pose estimation and deep learning ensemble models. In the
proposed methodology, skeletal keypoint features are extracted from video data using PoseNet
to generate a privacy-preserving representation that captures key motion dynamics without being sensitive to
changes in appearance. The dataset comprises 2734 activity samples from 30 subjects (15 male and 15 female),
covering nine daily domestic activities. Six deep learning architectures were evaluated: the
Transformer, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Multilayer Perceptron
(MLP), One-Dimensional Convolutional Neural Network (1D CNN), and a hybrid Convolutional Neural Network–Long
Short-Term Memory (CNN–LSTM) architecture. The results on the hold-out test set show that the CNN–LSTM
architecture achieves an accuracy of 98.78% within our experimental setting. Leave-One-Subject-Out
cross-validation further confirms robust generalization across unseen individuals, with CNN–LSTM achieving a
mean accuracy of 97.21% ± 1.84% across 30 subjects. The results demonstrate that vision-based pose
estimation with deep learning is a useful, precise, and non-intrusive approach to HAR in smart healthcare
and home automation systems.
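The Leave-One-Subject-Out protocol mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy subject IDs, and the use of the sample standard deviation for the "±" figure are all assumptions made for illustration.

```python
from statistics import mean, stdev


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    Every sample from the held-out subject goes to the test fold, so the
    model is always scored on a person it has never seen during training.
    """
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def summarize(fold_accuracies):
    """Return (mean, sample standard deviation) of per-subject accuracies.

    Assumption: the abstract does not say whether the sample or the
    population standard deviation was reported; this uses the sample one.
    """
    return mean(fold_accuracies), stdev(fold_accuracies)


# Toy data: six samples from three hypothetical subjects.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))
```

With 30 subjects this procedure yields 30 folds, and the reported mean and spread would be computed over the 30 per-subject accuracies.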
Raza, Muhammad Amjad; Mehmood, Nasir; Siddiqui, Hafeez Ur Rehman; Saleem, Adil Ali; Álvarez, Roberto Marcelo (roberto.alvarez@uneatlantico.es); Miró Vera, Yini Airet (yini.miro@uneatlantico.es) and Díez, Isabel de la Torre (2026). Human Activity Recognition in Domestic Settings Based on Optical Techniques and Ensemble Models. Sensors, 26 (5), p. 1516. ISSN 1424-8220
Text: sensors-26-01516-v2.pdf (4MB), available under License Creative Commons Attribution.
| Document Type: | Article |
|---|---|
| Keywords: | deep learning; human activity recognition; LSTM; PoseNet; skeleton-based recognition; smart home; Transformer |
| Subject classification: | Subjects > Engineering |
| Divisions: | Europe University of Atlantic > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Ibero-american International University > Research > Scientific Production; Ibero-american International University > Research > Articles and Books; Universidad Internacional do Cuanza > Research > Scientific Production; University of La Romana > Research > Scientific Production |
| Deposited: | 30 Mar 2026 21:55 |
| Last Modified: | 30 Mar 2026 21:55 |
| URI: | https://repositorio.unib.org/id/eprint/27968 |
A novel approach for disease and pests detection in potato production system based on deep learning
Vulnerability of potato crops to diseases and pest infestation can affect their quality and lead to significant yield losses. Timely detection of such diseases can support effective decisions. For this purpose, a deep learning-based object detection framework is designed in this study to identify and classify major potato diseases and pests under real-world field conditions. A total of 2,688 field images were collected from two research farms in Punjab, Pakistan, across multiple growth stages in various seasonal conditions. Excluding 285 symptom-free images from the earliest collection left 2,403 images, which were annotated into four biotic-stress classes: blight disease (n = 630), leaf spot disease (n = 370), leafroll virus (viral symptom complex; n = 888), and Colorado potato beetle (larvae/adults; n = 515), indicating class imbalance. Several state-of-the-art models were used, including YOLOv8 variants (n/s/m), YOLOv7, YOLOv5, and Faster R-CNN, and the results are discussed in relation to recent potato disease classification studies involving cropped leaf images. Stratified splitting (70% training, 20% validation, 10% testing) was applied to preserve class distribution across all subsets. YOLOv8-medium achieves the best performance, with a mean average precision (mAP)@0.5 of 98% on the held-out test images. Five-fold cross-validation gives a stable mean mAP@0.5 of 97.8%, and the model offers a balance between accuracy and inference time. Model robustness was evaluated using 5-fold cross-validation and repeated training with different random seeds, showing a low variance of ±0.4% mAP. Results demonstrate promising outcomes under real-world field conditions, while broader cross-region and cross-season validation is planned for future work.
Ahmed Abbas, Saif Ur Rehman, Khalid Mahmood, Santos Gracia Villar (santos.gracia@uneatlantico.es), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Aseel Smerat, Imran Ashraf
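The 70/20/10 stratified split described in the abstract above can be sketched as follows. This is an illustration only: it treats each image as carrying exactly one class label, which is an assumption (the study is an object detection task), and the function name, class keys, and seed are hypothetical. Only the per-class counts come from the abstract.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.2, 0.1), seed=0):
    """Split sample indices into train/val/test, class by class.

    Splitting within each class keeps the class proportions roughly equal
    across the three subsets; the test subset takes whatever remains after
    rounding the train and validation sizes.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


# Per-class image counts as reported in the abstract (2,403 images in total).
class_counts = {
    "blight": 630,
    "leaf_spot": 370,
    "leafroll_virus": 888,
    "colorado_potato_beetle": 515,
}
labels = [name for name, n in class_counts.items() for _ in range(n)]
train, val, test = stratified_split(labels)
```

Because the split is done per class, the minority class (leaf spot, n = 370) keeps about the same share in each subset as in the full dataset.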
The Polyphagous Shot Hole Borer (PSHB) is a highly invasive beetle that has been spreading like an epidemic across agricultural and forestry landscapes in recent years. Its rapid and destructive spread has turned it into a major global threat, causing widespread damage that continues to grow with time. Countries like South Africa, the United States, and Australia have implemented extensive measures to control the spread of PSHB, including the establishment of specialized agricultural support centers for early detection. However, there is still a strong need to make PSHB detection more accessible, allowing even non-experts to easily identify infections at an early stage. Artificial Intelligence (AI) has shown great promise in plant disease detection, but a major challenge in the case of PSHB was the lack of a suitable dataset for training AI models. In the proposed work, we first created a dedicated dataset by collecting images of trees infected with PSHB. We applied a range of preprocessing techniques to refine the dataset and prepare it for AI applications. Building on this, we developed a novel AI-based method, where we trained a deep learning model using a multi-convolutional layer network combined with a Fourier transformation layer. Additionally, an attention mechanism and advanced feature extraction techniques were incorporated to further boost model performance. As a result, the proposed approach achieved an impressive top accuracy of 92.3% in detecting PSHB infections, showing the potential of AI to offer a simple, efficient, and highly accurate solution for early disease detection.
Rabbiya Younas, Hafiz Muhammad Raza ur Rehman, Gyu Sang Choi, Ángel Gabriel Kuc Castilla (angel.kuc@uneatlantico.es), Carlos Eduardo Uc Ríos (carlos.uc@unini.edu.mx), Imran Ashraf
Correction: Enhancing fault detection in new energy vehicles via novel ensemble approach
In the original version of this Article, Umair Shahid was incorrectly listed as a corresponding author. The correct corresponding authors for this Article are Imran Ashraf and Kashif Munir. Correspondence and requests for materials should be addressed to ashrafimran@live.com and kashif.munir@kfueit.edu.pk.
Iqra Akhtar, Mahnoor Nabeel, Umair Shahid, Kashif Munir, Ali Raza, Irene Delgado Noya (irene.delgado@uneatlantico.es), Santos Gracia Villar (santos.gracia@uneatlantico.es), Imran Ashraf
Histopathological evaluation is necessary for the diagnosis and grading of prostate cancer, which is still one of the most common cancers in men globally. Traditional evaluation is time-consuming, prone to inter-observer variability, and challenging to scale. The clinical usefulness of current AI systems is limited by the need for comprehensive pixel-level annotations. The objective of this research is to develop and evaluate, through a large-scale benchmarking study, a weakly supervised deep learning framework that minimizes the need for annotation and ensures interpretability for automated prostate cancer diagnosis and International Society of Urological Pathology (ISUP) grading using whole slide images (WSIs). This study rigorously tested six cutting-edge multiple instance learning (MIL) architectures (CLAM-MB, CLAM-SB, ILRA-MIL, AC-MIL, AMD-MIL, WiKG-MIL), three feature encoders (ResNet50, CTransPath, UNI2), and four patch extraction techniques (varying sizes and overlap) using the PANDA dataset (10,616 WSIs), yielding 72 experimental configurations. The methodology used distributed cloud computing to process over 31 million tissue patches, implementing advanced attention mechanisms to ensure clinical interpretability through Grad-CAM visualizations. The optimum configuration (UNI2 encoder with ILRA-MIL, 256 × 256 patches, 50% overlap) achieved 78.75% accuracy and 90.12% quadratic weighted kappa (QWK), outperforming traditional methods and approaching expert pathologist-level diagnostic capability. Overlapping smaller patches offered the best balance of spatial resolution and contextual information, while domain-specific foundation models performed noticeably better than generic encoders. This work is the first large-scale, comprehensive comparison of weakly supervised MIL methods for prostate cancer diagnosis and grading. The proposed approach has excellent clinical diagnostic performance, scalability, practical feasibility through cloud computing, and interpretability using visualization tools.
Naveed Anwer Butt, Dilawaiz Sarwat, Irene Delgado Noya (irene.delgado@uneatlantico.es), Kilian Tutusaus (kilian.tutusaus@uneatlantico.es), Nagwan Abdel Samee, Imran Ashraf
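The quadratic weighted kappa (QWK) reported in the abstract above is a standard agreement measure for ordinal labels. The sketch below shows the usual definition in plain Python; it is not taken from the study, and the function name and integer encoding of grades are assumptions. For ISUP grading one would presumably call it with six ordinal classes (grades 0 to 5), which is likewise an assumption here.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Quadratic weighted kappa for integer labels in range(n_classes).

    Disagreements are penalized by the squared distance between grades, so
    confusing grade 1 with grade 5 costs far more than confusing it with
    grade 2. The result is 1.0 for perfect agreement and about 0.0 for
    chance-level agreement. The function is undefined (division by zero)
    when every label and every prediction fall in the same single class.
    """
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            # Expected count under independence of the two label marginals.
            expected = true_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected
```

Because the weights grow quadratically with grade distance, a model can have moderate exact-match accuracy yet a high QWK if most of its errors fall on adjacent grades.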
This systematic literature review (SLR) investigates the integration of deep learning (DL), vision-language models (VLMs), and multi-agent systems in the analysis of pathology images and automated report generation. The rapid advancement of whole-slide imaging (WSI) technologies has posed new challenges in pathology, especially due to the scale and complexity of the data. DL techniques in general and convolutional neural networks (CNNs) and transformers in particular have significantly enhanced image analysis tasks including segmentation, classification, and detection. However, these models often lack generalizability to generate coherent, clinically relevant text, thus necessitating the integration of VLMs and large language models (LLMs). This review examines the effectiveness of VLMs and LLMs in bridging the gap between visual data and clinical text, focusing on their potential for automating the generation of pathology reports. Additionally, multi-agent systems, which leverage specialized artificial intelligence (AI) agents to collaboratively perform diagnostic tasks, are explored for their contributions to improving diagnostic accuracy and scalability. Through a synthesis of recent studies, this review highlights the successes, challenges, and future directions of these AI technologies in pathology diagnostics, offering a comprehensive foundation for the development of integrated, AI-driven diagnostic workflows.
Usama Ali, Imran Shafi, Jamil Ahmad, Arlette Zárate Cáceres, Thania Chio Montero, Hafiz Muhammad Raza ur Rehman, Imran Ashraf
