Mohamed Osman, Gani (2024) Enhanced parts of speech (POS) weighted term frequency-inverse document frequency (TF-IDF) in question classification of examination question based on bloom’s taxonomy. Master dissertation/thesis, UTAR.
Abstract
In contemporary educational settings, examinations remain the traditional means of assessing students’ knowledge. A well-crafted question paper is considered an effective way to assess students’ understanding across various cognitive levels. The examination question classification (EQC) process plays a crucial role in producing such high-quality question papers: it determines the cognitive level of each question and assigns that level using the Bloom’s Taxonomy (BT) cognitive domain. However, manually assigning cognitive levels is time-consuming, and not all educators possess a thorough understanding of the BT cognitive domain. As a result, researchers have focused on automating EQC using machine learning (ML) and deep learning (DL). Numerous previous studies sought to improve the accuracy of BT-based EQC by enhancing term weighting schemes. However, these studies assigned equal weight to two distinct categories of verbs in the questions: BT action verbs and supporting verbs. BT action verbs have a greater influence on the cognitive level of a question than supporting verbs do. Consequently, the primary objective of this study is to introduce the ETFPOS-IDF term weighting model, which assigns a higher weight to BT action verbs than to supporting verbs. In addition, the effectiveness of supervised term weighting (STW) schemes, which had not previously been addressed in EQC, was investigated. Furthermore, the proposed term weighting model was compared with existing DL models proposed in past studies.
This study used three classifiers (support vector machine, artificial neural network, and random forest) and five datasets: three from past studies, one newly collected, and a fifth formed by merging the other four. Accuracy and F1 score were used as evaluation metrics. The experimental results showed that the proposed term weighting model outperformed both existing term weighting schemes and DL models, achieving an accuracy of 82.8% and an F1 score of 82.9% during cross-validation, and 87.1% in both metrics in the train-test split scenario. These outcomes indicate that differentiating between verb types significantly increases the classification accuracy of examination questions. Regarding STW schemes, this study found no superiority over unsupervised term weighting (USTW) schemes. Future work may involve identifying the optimal weight difference for verb types, hybridising STW and USTW schemes, and exploring the effectiveness of large language models.
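
The central idea of the abstract, weighting BT action verbs more heavily than supporting verbs inside a TF-IDF scheme, can be illustrated with a minimal Python sketch. It is not the ETFPOS-IDF formula from the thesis, which the abstract does not state. The verb lists, the weights (3.0, 2.0, 1.0), the weighted-frequency normalisation, the `log(N/df) + 1` form of IDF, and the function names are all assumptions made for illustration. Simple lookup sets stand in for the part-of-speech tagging a real implementation would need.

```python
import math
from collections import Counter

# Hypothetical word lists and weights, chosen only for illustration.
# They are NOT taken from the thesis.
BT_ACTION_VERBS = {"define", "explain", "apply", "analyse", "evaluate", "design"}
SUPPORTING_VERBS = {"is", "are", "given", "using", "used"}
WEIGHTS = {"bt_verb": 3.0, "supporting_verb": 2.0, "other": 1.0}


def term_weight(term):
    """Return the assumed weight for a term based on its (looked-up) verb type."""
    if term in BT_ACTION_VERBS:
        return WEIGHTS["bt_verb"]
    if term in SUPPORTING_VERBS:
        return WEIGHTS["supporting_verb"]
    return WEIGHTS["other"]


def tokenize(question):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip(".,?!").lower() for tok in question.split())
    return [tok for tok in tokens if tok]


def weighted_tfidf(questions):
    """Compute one {term: score} dict per question.

    Score = (weight * count / weighted total for the question) * (log(N / df) + 1).
    """
    docs = [tokenize(q) for q in questions]
    n_docs = len(docs)
    # Document frequency: number of questions containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(term_weight(t) * c for t, c in counts.items())
        vectors.append({
            t: (term_weight(t) * c / total) * (math.log(n_docs / df[t]) + 1.0)
            for t, c in counts.items()
        })
    return vectors


if __name__ == "__main__":
    sample = ["Explain the method used here.", "Define the term given below."]
    for question, vector in zip(sample, weighted_tfidf(sample)):
        print(question, vector)
```

In the first sample question, "explain", "used", and "method" each occur once and appear in only one question, so they share the same IDF and their scores differ only through the assumed verb-type weights. The resulting vectors could then be passed to any classifier, such as the three named in the abstract.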