Applying Deep Learning for Morphological Analysis in the Sinhala Language

Yasas D Ekanayaka; Randil Pushpananda; Viraj Welgama; Chamila Liyanage


Published Dec 12, 2023

Yasas D Ekanayaka

University of Colombo School of Computing, Sri Lanka

Randil Pushpananda

University of Colombo School of Computing, Sri Lanka

Viraj Welgama

University of Colombo School of Computing, Sri Lanka

Chamila Liyanage

Language Technology Research Laboratory, University of Colombo School of Computing, Sri Lanka

Abstract

This research analyzes the morphology of the Sinhala language using deep learning. Six deep learning architectures were used in the study: RNN, LSTM, and GRU, each with and without bidirectional processing. Two datasets, one in Sinhala script and one in Roman script, were considered, each consisting of 644k unique entries. The results were compared to identify the best-performing architecture. Among all the approaches, the model trained on the Sinhala script dataset with a bidirectional Gated Recurrent Unit (BiGRU) architecture provided the highest accuracy (87.96%). Several other experiments, such as predicting morphemes and definitions separately, were also conducted to assess the behavior of deep learning in morphological analysis of the Sinhala language. All these experiments yielded more than 88% accuracy. These positive results demonstrate the promising potential of deep learning approaches for morphological analysis of Sinhala. Using the best-performing model, we developed an application for users who are interested in learning and analyzing Sinhala words and their morphology.
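
For readers unfamiliar with the BiGRU named in the abstract, the following is a minimal sketch of how a bidirectional GRU encodes a word character by character. It is not the authors' model: the weights are random and untrained, and the hidden size, the one-hot character encoding, the placeholder input string, and all names (`GRUCell`, `bigru`, `one_hot`) are assumptions made only for illustration. It uses the standard GRU update with update gate z, reset gate r, and a tanh candidate state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    # Multiply matrix m (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Single GRU cell with random (untrained) weights; illustrative only."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One (W, U, b) triple per gate: update z, reset r, candidate n.
        self.params = {
            gate: (rand_matrix(rng, hidden_size, input_size),
                   rand_matrix(rng, hidden_size, hidden_size),
                   [0.0] * hidden_size)
            for gate in ("z", "r", "n")
        }

    def _affine(self, gate, x, h):
        w, u, b = self.params[gate]
        return [a + c + d for a, c, d in zip(matvec(w, x), matvec(u, h), b)]

    def step(self, x, h):
        z = [sigmoid(v) for v in self._affine("z", x, h)]
        r = [sigmoid(v) for v in self._affine("r", x, h)]
        gated = [ri * hi for ri, hi in zip(r, h)]  # reset gate scales old state
        n = [math.tanh(v) for v in self._affine("n", x, gated)]
        # Blend previous state and candidate state using the update gate.
        return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def run(cell, inputs):
    # Unroll the cell over the sequence, starting from a zero state.
    h = [0.0] * cell.hidden_size
    states = []
    for x in inputs:
        h = cell.step(x, h)
        states.append(h)
    return states


def bigru(fwd, bwd, inputs):
    # Forward pass reads left to right; backward pass reads right to left.
    left = run(fwd, inputs)
    right = run(bwd, inputs[::-1])[::-1]
    # Concatenate both directions at every character position.
    return [l + r for l, r in zip(left, right)]


def one_hot(ch, alphabet):
    return [1.0 if ch == a else 0.0 for a in alphabet]


rng = random.Random(0)
word = "gasa"  # placeholder romanized string, chosen only for illustration
alphabet = sorted(set(word))
inputs = [one_hot(ch, alphabet) for ch in word]
fwd = GRUCell(len(alphabet), 3, rng)  # hidden size 3 is an arbitrary choice
bwd = GRUCell(len(alphabet), 3, rng)
encoded = bigru(fwd, bwd, inputs)
print(len(encoded), len(encoded[0]))  # prints "4 6"
```

Each character position ends up with a vector that combines context from both its left and its right. In a real system the weights would be learned from the dataset and a prediction layer would sit on top of these vectors; neither is shown here.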

Issue

Vol 16 No 2 (2023): 2023 Special Issue (June 2023)
