Music Genre Classification with Multi-Modal Properties of Lyrics and Spectrograms
Abstract
Music genre classification is widely used in online music streaming platforms. Deep learning has made it possible to extract musical information more effectively, and a number of studies have sought to improve classification accuracy using power spectrogram images and lyrical features. This paper evaluates how best to use multiple modalities, such as lyrics and spectrogram images, based on the richness of their features. Furthermore, it proposes a hybrid-fusion-based deep learning multi-modal, multi-class classifier that employs Mel spectrograms, Mel-Frequency Cepstral Coefficients (MFCCs), and lyrics to classify musical genres more accurately. Finally, the proposed model is benchmarked against three previous studies on a preprocessed dataset derived from the Music4All dataset, covering the country, jazz, metal, and pop genre classes. The proposed model obtained the highest F1-score, 0.72.
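To make the term "hybrid fusion" concrete, the following is a minimal sketch of the general idea, not the architecture evaluated in this paper. It assumes that hybrid fusion combines feature-level (early) fusion of the two audio representations with decision-level (late) fusion of the audio and lyric branches. The linear scoring function, the weight matrices, the function names, and the `audio_weight` parameter are all hypothetical stand-ins for trained deep learning branches.

```python
import math

# The four genre classes named in the abstract.
GENRES = ["country", "jazz", "metal", "pop"]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def early_fusion(mel_features, mfcc_features):
    """Feature-level fusion: concatenate the two audio feature vectors."""
    return list(mel_features) + list(mfcc_features)


def linear_scores(features, weights):
    """Hypothetical stand-in for a trained branch: one weight row per genre."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def late_fusion(audio_probs, lyric_probs, audio_weight=0.5):
    """Decision-level fusion: weighted average of the branch probabilities."""
    return [
        audio_weight * a + (1.0 - audio_weight) * l
        for a, l in zip(audio_probs, lyric_probs)
    ]


def classify(mel_features, mfcc_features, lyric_features,
             audio_weights, lyric_weights, audio_weight=0.5):
    """Run both branches and fuse their predictions (assumed pipeline)."""
    audio_input = early_fusion(mel_features, mfcc_features)
    audio_probs = softmax(linear_scores(audio_input, audio_weights))
    lyric_probs = softmax(linear_scores(lyric_features, lyric_weights))
    fused = late_fusion(audio_probs, lyric_probs, audio_weight)
    return GENRES[fused.index(max(fused))], fused


# Toy inputs and weights, chosen only to exercise the code path.
toy_audio_weights = [
    [0.0, 0.0, 0.0, 0.0],  # country
    [0.0, 0.0, 0.0, 0.0],  # jazz
    [2.0, 0.0, 0.0, 2.0],  # metal
    [0.0, 0.0, 0.0, 0.0],  # pop
]
toy_lyric_weights = [[0.0], [0.0], [1.0], [0.0]]

genre, probabilities = classify(
    mel_features=[1.0, 0.0],
    mfcc_features=[0.0, 1.0],
    lyric_features=[1.0],
    audio_weights=toy_audio_weights,
    lyric_weights=toy_lyric_weights,
)
print(genre, probabilities)
```

In a real system each branch would be a neural network operating on spectrogram images, MFCC matrices, and text embeddings rather than a linear scorer on short vectors. The sketch only illustrates where the two fusion steps sit relative to each other.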