Automatic POS tagging of Arabic words using the YAMCHA machine learning tool

Elnily, A.; Abdelghany, A.

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/4868

DC Field	Value	Language
dc.contributor.author	Elnily, A.	en_US
dc.contributor.author	Abdelghany, A.	en_US
dc.date.accessioned	2023-09-06T03:03:19Z	-
dc.date.available	2023-09-06T03:03:19Z	-
dc.date.issued	2022	-
dc.identifier.isbn	Institute of Electrical and Electronics Engineers Inc	-
dc.identifier.uri	http://hdl.handle.net/123456789/4868	-
dc.description	Scopus	en_US
dc.description.abstract	Automatic POS tagging is the process of automatically assigning the proper POS tag to each word in a text based on its context. Most NLP applications require this process as a crucial step. This study proposes a machine learning-based Arabic POS tagger. The machine learning system employed is the YAMCHA tool, which uses Support Vector Machines (SVM) as its learning algorithm. SVM classifies data with high accuracy because it makes use of part of the data in the training process. As a result, training the system requires a substantial amount of data annotated at the POS level. A corpus of 100,039 words is used in this study. It was divided into training and testing parts of 64,608 and 35,431 words, respectively. A tag set of 48 morphological tags was used in training and testing. To reach the best result in automatic POS tagging, the system was trained multiple times while varying the range of linguistic information used in the training process, and then new texts were tested and evaluated. The lowest error rate achieved was 11.4%. This rate was reached when the word preceding the target word was considered in the training process without considering its POS tag (F:-10: 0).	en_US
dc.publisher	Institute of Electrical and Electronics Engineers Inc	en_US
dc.subject	machine learning	en_US
dc.subject	POS tagging	en_US
dc.subject	support vector machine	en_US
dc.title	Automatic POS tagging of Arabic words using the YAMCHA machine learning tool	en_US
dc.type	Printed	en_US
dc.relation.conference	Proceedings of the 20th Conference on Language Engineering, ESOLEC 2022	en_US
dc.identifier.doi	10.1109/ESOLEC54569.2022.10009473	-
dc.description.page	72-77	en_US
dc.relation.seminar	20th International Conference on Language Engineering, ESOLEC 2022	en_US
dc.date.seminarstartdate	2022-10-12	-
dc.date.seminarenddate	2022-10-13	-
dc.description.placeofseminar	Cairo	en_US
dc.description.type	Indexed Proceedings	en_US
dc.contributor.correspondingauthor	abdelghany.ma@umk.edu.my	en_US
item.fulltext	No Fulltext	-
item.grantfulltext	none	-
item.openairetype	Printed	-
Appears in Collections:	Faculty of Language Studies and Human Development - Proceedings
