Applying Deep Learning for Morphological Analysis in the Sinhala Language

Yasas D Ekanayaka
Randil Pushpananda
Viraj Welgama
Chamila Liyanage

Abstract

This research investigates morphological analysis of the Sinhala language using deep learning. Six deep learning architectures were evaluated: RNN, LSTM, and GRU, each with and without bidirectional processing. Two datasets, one in Sinhala script and one in Roman script, were considered, each consisting of a total of 644k unique entries. The results were compared to identify the best-performing architecture. Among all the approaches, the model trained on the Sinhala script dataset with a bidirectional Gated Recurrent Unit (BiGRU) architecture achieved the highest accuracy (87.96%). Several other experiments, such as predicting morphemes and definitions separately, were also conducted to assess the behavior of deep learning in morphological analysis of the Sinhala language. All of these additional experiments yielded more than 88% accuracy. These positive results demonstrate the promising potential of deep learning approaches for morphological analysis of the Sinhala language. Using the best-performing model, we developed an application for users who are interested in learning and analyzing Sinhala words and their morphology.
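
The comparison described in the abstract can be illustrated with a minimal sketch. It enumerates the six architecture variants (three recurrent cell types, each unidirectional and bidirectional) and scores predictions by exact match. The function names `architecture_grid` and `exact_match_accuracy` are hypothetical, the exact-match metric is an assumption (the abstract does not say how accuracy was computed), and the analysis strings are placeholders rather than real Sinhala data.

```python
from itertools import product

# The three recurrent cell types named in the abstract.
CELL_TYPES = ("RNN", "LSTM", "GRU")


def architecture_grid():
    """Return the six variants: each cell type, unidirectional and bidirectional."""
    return [
        ("Bi" if bidirectional else "") + cell
        for cell, bidirectional in product(CELL_TYPES, (False, True))
    ]


def exact_match_accuracy(predictions, references):
    """Percentage of predictions identical to their reference analysis.

    Assumption: exact-match scoring. The abstract does not define the metric.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


if __name__ == "__main__":
    print(architecture_grid())
    # Placeholder analyses (hypothetical), not taken from the paper's datasets.
    refs = ["root1+TAG_A", "root2+TAG_B", "root3+TAG_A", "root4+TAG_C"]
    preds = ["root1+TAG_A", "root2+TAG_B", "root3+TAG_B", "root4+TAG_C"]
    print(f"{exact_match_accuracy(preds, refs):.2f}%")
```

In a real evaluation, each name in the grid would correspond to a trained model, and the same held-out set would be scored for every variant so that the accuracies are directly comparable.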
