Using Web Scraping for Automatic Generation of Structured Arabic Lexicon

Aya Mohammed Abdul-Samad Albachari; Salma Abdul-Baqi

doi:10.52940/ijici.v2i2.50

Vol. 2 No. 2 (2023), Articles

Published 2024-01-12

Aya Mohammed Abdul-Samad Albachari

Southern Technical University

Salma Abdul-Baqi

University of Basrah

Keywords

Automatic Arabic Lexicon Generation
Web scraping
Web information extraction

How to Cite

Albachari, Aya Mohammed Abdul-Samad, & Salma Abdul-Baqi. (2024). Using Web Scraping for Automatic Generation of Structured Arabic Lexicon. Iraqi Journal of Intelligent Computing and Informatics (IJICI), 2(2), 146–152. https://doi.org/10.52940/ijici.v2i2.50

Abstract

Technological development is continuously increasing the amount of text data, especially Arabic text on the internet. This Arabic data is massive, but because it is unstructured it cannot be used directly for natural language processing (NLP) and its applications. The growth of Arabic text on the internet has also led to an increase in Arabic lexicon web pages, but these lexicons are semi-structured or even unstructured, so they are not ready for use by NLP applications. This study uses web scraping to extract data from the internet and convert it from unstructured to structured form. It aims to automatically build a structured Arabic lexicon that is ready for NLP and its applications, which increases the opportunity to use the Arabic language more widely in natural language processing.
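The abstract does not describe the scraping pipeline in detail, so the following is only a minimal sketch of the general idea: taking semi-structured lexicon markup and turning it into structured records. It uses Python's standard `html.parser` module. The HTML layout, the class names (`entry`, `word`, `pos`, `definition`), the sample entries, and the function `scrape_lexicon` are all assumptions made for illustration and are not taken from the paper.

```python
from html.parser import HTMLParser
import json

# Hypothetical markup: the structure of the real lexicon pages is not given
# in the abstract, so this sample and its class names are illustrative only.
SAMPLE_HTML = """
<div class="entry"><span class="word">كتاب</span><span class="pos">اسم</span><span class="definition">مجموعة من الصحف المكتوبة</span></div>
<div class="entry"><span class="word">كتب</span><span class="pos">فعل</span><span class="definition">خط بالقلم</span></div>
"""

# Fields we try to pull out of each entry (assumed, not from the paper).
FIELDS = {"word", "pos", "definition"}


class LexiconParser(HTMLParser):
    """Collects one dict per <div class="entry"> block."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._current = None  # entry being built, or None outside an entry
        self._field = None    # field currently being read, or None

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "entry":
            self._current = {}
        elif tag == "span" and cls in FIELDS and self._current is not None:
            self._field = cls

    def handle_data(self, data):
        # Text may arrive in several chunks, so append rather than assign.
        if self._current is not None and self._field is not None:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            self.entries.append(self._current)
            self._current = None


def scrape_lexicon(html):
    """Return a list of structured lexicon records found in `html`."""
    parser = LexiconParser()
    parser.feed(html)
    parser.close()
    return [{k: v.strip() for k, v in entry.items()} for entry in parser.entries]


if __name__ == "__main__":
    # ensure_ascii=False keeps the Arabic text readable in the JSON output.
    print(json.dumps(scrape_lexicon(SAMPLE_HTML), ensure_ascii=False, indent=2))
```

In a real setting the HTML would be fetched from the lexicon web pages rather than hard-coded, and the field extraction would have to follow whatever markup those pages actually use.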
