Real Word Spelling Error Detection and Correction for Urdu Language

Aziz, Romila; Anwar, Muhammad Waqas; Jamal, Muhammad Hasan; Bajwa, Usama Ijaz; Kuc Castilla, Ángel Gabriel; Uc-Rios, Carlos; Bautista Thompson, Ernesto and Ashraf, Imran (2023) Real Word Spelling Error Detection and Correction for Urdu Language. IEEE Access. p. 1. ISSN 2169-3536
Contact: Uc-Rios, Carlos (carlos.uc@unini.edu.mx); Bautista Thompson, Ernesto (ernesto.bautista@unini.edu.mx)

Text
Real_Word_Spelling_Error_Detection_and_Correction_for_Urdu_Language.pdf
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Official URL: http://doi.org/10.1109/ACCESS.2023.3312730

Abstract

Spelling errors are generally of two types: non-word errors and real-word errors. Non-word errors are misspelled words that do not exist in the lexicon, while real-word errors are misspelled words that exist in the lexicon but are used out of context in a sentence. The lexicon-based lookup approach is widely used for non-word errors, but it cannot handle real-word errors because they require contextual information. In contrast to English, real-word error detection and correction for low-resourced languages like Urdu is an unexplored area. This paper presents a real-word spelling error detection and correction approach for the Urdu language. We develop an extensive lexicon of 593,738 words and use this lexicon to develop a dataset for real-word errors comprising 125,562 sentences and 2,552,735 words. Based on the developed lexicon and dataset, we then develop a contextual spell checker that detects and corrects real-word errors. For the real-word error detection phase, word-gram features are used along with five machine learning classifiers, achieving a precision, recall, and F1-score of 0.84, 0.79, and 0.81, respectively. We also test the proposed approach with a 40% error density. For real-word error correction, the Damerau-Levenshtein distance is used along with the n-gram model for further ranking of the suggested candidate words, achieving an accuracy of up to 83.67%.
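The correction step described in the abstract (edit-distance candidates re-ranked by an n-gram model) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy English corpus, the use of the restricted (optimal string alignment) variant of Damerau-Levenshtein distance, the distance threshold of 2, and the simple bigram-count score are all assumptions made for illustration.

```python
from collections import Counter


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def bigram_counts(sentences):
    """Count word bigrams, padding each sentence with boundary markers."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts


def rank_candidates(tokens, index, lexicon, counts, max_distance=2):
    """Suggest replacements for tokens[index], best first.

    Candidates are lexicon words within max_distance edits of the target;
    they are ordered by bigram support from the left and right neighbours,
    then by edit distance (hypothetical scoring, for illustration only).
    """
    padded = ["<s>"] + tokens + ["</s>"]
    left, right = padded[index], padded[index + 2]
    target = tokens[index]
    candidates = [
        w for w in lexicon
        if w != target and osa_distance(target, w) <= max_distance
    ]

    def score(word):
        return counts[(left, word)] + counts[(word, right)]

    return sorted(candidates, key=lambda w: (-score(w), osa_distance(target, w), w))


# Toy data (illustrative only; the paper works with an Urdu lexicon and corpus).
corpus = [
    "a piece of cake",
    "peace of mind",
    "a piece of paper",
    "keep the pace steady",
]
lexicon = {word for sentence in corpus for word in sentence.split()}
counts = bigram_counts(corpus)

# "peace" is a valid word used out of context, i.e. a real-word error.
suggestions = rank_candidates("a peace of cake".split(), 1, lexicon, counts)
print(suggestions)  # ['piece', 'pace']
```

In this toy case "pace" is closer to "peace" by edit distance (1 versus 2), but "piece" ranks first because the surrounding bigrams support it, which is the role the abstract assigns to the n-gram model.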

Document Type:	Article
Keywords:	Real-word errors, spelling correction, spelling detection, spell checker
Subject classification:	Subjects > Engineering
Divisions:	Europe University of Atlantic > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Ibero-american International University > Research > Scientific Production; Ibero-american International University > Research > Articles and Books; Universidad Internacional do Cuanza > Research > Scientific Production
Deposited:	14 Sep 2023 23:30
Last Modified:	14 Sep 2023 23:30
URI:	https://repositorio.unib.org/id/eprint/8800


Infrared thermography to assess fatigue, injury risk factors and recovery in soccer: a systematic review of original studies

Background: Recovery after a training session or match is a key factor in injury prevention and sports performance. The purpose of this systematic review was to analyze and consolidate the available scientific evidence from the main databases on the use of infrared thermography in the assessment of fatigue, injury risk factors, and recovery in soccer players. Methods: The literature search was conducted following the PRISMA guidelines and the PICOS model until June 30, 2025, in the main scientific databases (ScienceDirect, EMBASE, Web of Science (WOS), Cochrane Library, SciELO, MEDLINE/PubMed, SPORTDiscus, and Scopus). The risk of bias and methodological quality were assessed using the Cochrane Handbook guidelines and the PEDro scale. Results: The initial literature search yielded a total of 510 records. After applying the inclusion and exclusion criteria, the final sample consisted of 20 studies, which were of high methodological quality. The results showed the effects of infrared thermography in assessing fatigue, identifying injury risk factors, and monitoring recovery processes in soccer players. The studies also systematically reported the characterization of the population, the assessment methods used, the variables analyzed, the methodological design, the main results, and the effects of the intervention. Conclusions: Infrared thermography shows promise as a valid, reliable, and non-invasive tool for assessing skin temperature, reflecting temperature changes in response to physiological processes. It allows for the analysis of structural or metabolic fatigue and thermal asymmetries. Therefore, thermography could be used to design individualized recovery protocols.

Scientific Production

Yehinson Barajas Ramón, Julio Calleja-González, José Luaces-Carreño, Álvaro Velarde-Sotres (alvaro.velarde@uneatlantico.es)

Environmental burden of fish in healthy and sustainable diets

Fish is widely promoted as part of healthy dietary patterns. The aim of this review was to summarise current literature on the environmental footprint of fish and its role within sustainable diets. Fish generally represents a minor share of total dietary environmental impacts, contributing a smaller proportion of greenhouse-gas emissions (GHGe), land use, and water use than meat and other animal products. Several modelling studies showed that substituting meat with fish or increasing fish intake within optimised dietary patterns can reduce environmental impacts, although the magnitude varies by country, diet type, and fish species. However, some analyses reported increased GHGe associated with higher fish intake, especially in models ensuring nutritional quality. Overall, fish consumption is compatible with achieving nutritionally adequate diets with lower environmental impacts, although an optimal match between environmental boundaries and nutritional needs is not always possible. These findings suggest that fish can play a constructive role in sustainable diets when integrated thoughtfully within broader dietary shifts.

Scientific Production

Alberto Dolci, Alessandro Scuderi, Evelyn Frias-Toral, Leonardo de Jesús Hernández Cruz (leonardo.hernandez@unib.org), Andrea Di Mauro, Fabrizio Furnari, Alice Rosi, Francesca Scazzina, Giuseppe Grosso

A novel approach for disease and pests detection in potato production system based on deep learning

The vulnerability of potato crops to diseases and pest infestation can affect their quality and lead to significant yield losses. Timely detection of such diseases can support effective decisions. For this purpose, a deep learning-based object detection framework is designed in this study to identify and classify major potato diseases and pests under real-world field conditions. A total of 2,688 field images were collected from two research farms in Punjab, Pakistan, across multiple growth stages in various seasonal conditions. Excluding 285 symptom-free images from the earliest collection led to 2,403 images, which were annotated into four biotic-stress classes: blight disease (n = 630), leaf spot disease (n = 370), leafroll virus (viral symptom complex; n = 888), and Colorado potato beetle (larvae/adults; n = 515), indicating class imbalance. Several state-of-the-art models were used, including YOLOv8 variants (n/s/m), YOLOv7, YOLOv5, and Faster R-CNN, and the results are discussed in relation to recent potato disease classification studies involving cropped leaf images. Stratified splitting (70% training, 20% validation, 10% testing) was applied to preserve class distribution across all subsets. YOLOv8-medium achieved the best performance, with a mean average precision (mAP)@0.5 of 98% on the held-out test images. Results for stable 5-fold cross-validation show a mean mAP@0.5 of 97.8%, which offers a balance between accuracy and inference time. Model robustness was evaluated using 5-fold cross-validation and repeated training with different random seeds, showing a low variance of ±0.4% mAP. Results demonstrate promising outcomes under real-world field conditions, while broader cross-region and cross-season validation is intended for the future.

Scientific Production

Ahmed Abbas, Saif Ur Rehman, Khalid Mahmood, Santos Gracia Villar (santos.gracia@uneatlantico.es), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Aseel Smerat, Imran Ashraf

An attention-based deep learning model for early detection of polyphagous shot hole borer infestations in plants

The Polyphagous Shot Hole Borer (PSHB) is a highly invasive beetle that has been spreading like an epidemic across agricultural and forestry landscapes in recent years. Its rapid and destructive spread has turned it into a major global threat, causing widespread damage that continues to grow with time. Countries like South Africa, the United States, and Australia have implemented extensive measures to control the spread of PSHB, including the establishment of specialized agricultural support centers for early detection. However, there is still a strong need to make PSHB detection more accessible, allowing even non-experts to easily identify infections at an early stage. Artificial Intelligence (AI) has shown great promise in plant disease detection, but a major challenge in the case of PSHB was the lack of a suitable dataset for training AI models. In the proposed work, we first created a dedicated dataset by collecting images of trees infected with PSHB. We applied a range of preprocessing techniques to refine the dataset and prepare it for AI applications. Building on this, we developed a novel AI-based method, where we trained a deep learning model using a multi-convolutional layer network combined with a Fourier transformation layer. Additionally, an attention mechanism and advanced feature extraction techniques were incorporated to further boost model performance. As a result, the proposed approach achieved an impressive top accuracy of 92.3% in detecting PSHB infections, showing the potential of AI to offer a simple, efficient, and highly accurate solution for early disease detection.

Scientific Production

Rabbiya Younas, Hafiz Muhammad Raza ur Rehman, Gyu Sang Choi, Ángel Gabriel Kuc Castilla (angel.kuc@uneatlantico.es), Carlos Eduardo Uc Ríos (carlos.uc@unini.edu.mx), Imran Ashraf

Correction: Enhancing fault detection in new energy vehicles via novel ensemble approach

In the original version of this Article, Umair Shahid was incorrectly listed as a corresponding author. The correct corresponding authors for this Article are Imran Ashraf and Kashif Munir. Correspondence and request for materials should be addressed to ashrafimran@live.com and kashif.munir@kfueit.edu.pk.

Scientific Production

Iqra Akhtar, Mahnoor Nabeel, Umair Shahid, Kashif Munir, Ali Raza, Irene Delgado Noya (irene.delgado@uneatlantico.es), Santos Gracia Villar (santos.gracia@uneatlantico.es), Imran Ashraf
