Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/1996
DC FieldValueLanguage
dc.contributor.authorOnasoga O.A.en_US
dc.contributor.authorYusoff, Nen_US
dc.contributor.authorHarun N.H.en_US
dc.date.accessioned2021-12-15T01:47:25Z-
dc.date.available2021-12-15T01:47:25Z-
dc.date.issued2021-
dc.identifier.isbn978-303069220-9-
dc.identifier.issn23673370-
dc.identifier.urihttp://hdl.handle.net/123456789/1996-
dc.descriptionScopusen_US
dc.description.abstractAn audio signal is an analogue signal represented as a one-dimensional function x(t), where t is a continuous variable denoting time. Such signals, generated from diverse sources, can be perceived as music, speech, noise or any combination of these. For machines to interpret them, these signals must be represented in a suitable form, for example through extracted features that describe the composition of the signal and its behavior over time. Audio feature extraction can enhance the efficacy of audio processing and hence benefits numerous applications. We present an emotion classification analysis with respect to audio representation (one-dimensional and two-dimensional), focusing on the audio recordings in the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset; classification is based on eight (8) different emotions. We evaluate accuracy, averaged over five (5) iterations, for each audio signal representation (raw audio, normalized raw audio and spectrogram). Features are extracted from the 1D and 2D inputs using a Convolutional Neural Network (CNN). A single-factor analysis of variance (ANOVA) was performed on the obtained accuracy values to test whether the audio signal representations differ significantly. The obtained F-ratio is greater than the critical F-ratio and therefore lies in the critical region. Thus, there is evidence at the 0.05 significance level that the true mean accuracies of the different representations differ.en_US
dc.language.isoenen_US
dc.publisherSpringer Science and Business Media Deutschland GmbHen_US
dc.subjectANOVAen_US
dc.subjectDeep learningen_US
dc.subjectEmotion detectionen_US
dc.subjectFeature extractionen_US
dc.subjectRAVDESSen_US
dc.titleAudio Classification - Feature Dimensional Analysisen_US
dc.typeNationalen_US
dc.relation.conferenceLecture Notes in Networks and Systemsen_US
dc.identifier.doi10.1007/978-3-030-69221-6_59-
dc.description.page775 - 788en_US
dc.volume194en_US
dc.relation.seminarInternational Conference on Business and Technology, ICBT 2020en_US
dc.date.seminarstartdate2020-11-14-
dc.date.seminarenddate2020-11-15-
dc.description.placeofseminarIstanbulen_US
dc.description.typeIndexed Proceedingsen_US
item.languageiso639-1en-
item.grantfulltextnone-
item.openairetypeNational-
item.fulltextNo Fulltext-
crisitem.author.deptUniversiti Malaysia Kelantan-
crisitem.author.orcid0000-0003-2703-2531-
Appears in Collections:Faculty of Data Science and Computing - Proceedings
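
The abstract's decision rule (reject the null hypothesis when the F-ratio exceeds the critical F-ratio at the 0.05 level) can be illustrated with a minimal sketch. It assumes the ANOVA compares three groups of five accuracy values, one group per representation; the record does not confirm this layout. The accuracy values are made-up placeholders, not results from the paper, and the names `one_way_anova_f`, `accuracies` and `F_CRITICAL_2_12` are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F-ratio, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Placeholder accuracies for illustration only (NOT taken from the paper):
# five runs per audio representation.
accuracies = {
    "raw_audio": [0.50, 0.52, 0.51, 0.49, 0.53],
    "normalized_raw_audio": [0.55, 0.56, 0.54, 0.57, 0.53],
    "spectrogram": [0.62, 0.60, 0.63, 0.61, 0.64],
}

# Tabulated critical value of F for alpha = 0.05 with (2, 12) degrees of
# freedom, approximately 3.89 in standard F tables.
F_CRITICAL_2_12 = 3.89

f_ratio, df_b, df_w = one_way_anova_f(list(accuracies.values()))
reject_null = f_ratio > F_CRITICAL_2_12
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}, reject H0: {reject_null}")
```

With other group counts or sample sizes the degrees of freedom change, so the critical value would need to be looked up again (or computed with a statistics library) rather than reused.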
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.