Employability and Related Context Prediction Framework for University Graduands: A Machine Learning Approach

Manushi Prabhavi Wijayapala
Lalith Premaratne
Imali Jayamanne

Abstract

In Sri Lanka (SL), graduand employability remains a national issue because of the increasing number of graduates produced by higher education institutions each year. Predicting the employability of university graduands can help mitigate this issue, since graduands can identify, before they complete their degree, which qualifications or skills they need to strengthen in order to find a well-paid job in their desired field.
The main objective of the study is to assess the feasibility of applying a machine learning approach efficiently and effectively to predict the employability and related context of university graduands in Sri Lanka. It proposes an architectural framework consisting of four modules: employment status prediction, job salary prediction, job field prediction and job relevance prediction. The performance of classification algorithms is also compared under each prediction module. A series of machine learning algorithms, such as C4.5, Naïve Bayes and AODE, were tested on the Graduand Employment Census - 2014 data. A pre-processing step is proposed to overcome the challenges embedded in graduand employability data, and a feature selection process is proposed to reduce computational complexity. Parameter tuning is also performed to find the best-performing parameter values. More importantly, this study uses several sampling techniques (oversampling, undersampling) and ensemble techniques (Bagging, Boosting, Random Forest (RF)), as well as a newly proposed hybrid approach, to overcome the limitations caused by the class imbalance phenomenon. For validation, a wide range of evaluation measures was used to analyze the effectiveness of applying the classification algorithms and class imbalance mitigation techniques to the dataset. Experimental results indicate that Random Forest recorded the highest classification performance for three modules, where the best predictive models were selected under the hybrid approach and had an area under the ROC curve interpreted as ‘Excellent’, while a C4.5 Decision Tree model under the ensemble approach was selected as the best model for the remaining module (salary prediction).
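The sampling techniques and the ROC-based evaluation named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy labels and the choice of plain random resampling are assumptions made here for illustration, and the newly proposed hybrid approach is not reproduced because the abstract does not describe it.

```python
import random


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label (insertion order kept)."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of larger classes so that every class has as
    many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


def roc_auc(scores, labels, positive=1):
    """Area under the ROC curve, computed as the probability that a random
    positive example scores higher than a random negative one (ties count
    as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical, imbalanced toy data: 8 "employed" rows vs 2 "unemployed".
    rows = list(range(10))
    labels = ["employed"] * 8 + ["unemployed"] * 2
    print(random_oversample(rows, labels)[1])
    print(random_undersample(rows, labels)[1])
    print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

In practice the resampling would be applied to the training split only, so that the evaluation measures are computed on data with the original class distribution.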

Article Details

Author Biography

Manushi Prabhavi Wijayapala, University of Colombo, Faculty of Science

I hold a Bachelor of Science Special degree in Statistics with Computer Science, a joint degree offered by the University of Colombo Faculty of Science and UCSC (first class honors, GPA 3.89, rank 1st). I also hold a Bachelor of IT degree offered by UCSC (first class honors, GPA 3.93, rank 1st).