Exploration of designing an automatic classifier for questions containing code snippets—A case study of Oracle SQL certification exam questions
Hung-Yi Chen,
Po-Chou Shih and
Yunsen Wang
PLOS ONE, 2025, vol. 20, issue 1, 1-24
Abstract:
This study uses the Oracle SQL certification exam questions to explore the design of automatic classifiers for exam questions containing code snippets. SQL’s question classification assigns a class label in the exam topics to a question. With this classification, questions can be selected from the test bank according to the testing scope to assemble a more suitable test paper. Classifying questions containing code snippets is more challenging than classifying questions with general text descriptions. In this study, we use factorial experiments to identify the effects of the factors of the feature representation scheme and the machine learning method on the performance of the question classifiers. Our experiment results showed the classifier with the TF-IDF scheme and Logistics Regression model performed best in the weighted macro-average AUC and F1 performance indices. The classifier with TF-IDF and Support Vector Machine performed best in weighted macro-average Precision. Moreover, the feature representation scheme was the main factor affecting the classifier’s performance, followed by the machine learning method, over all the performance indices.
Date: 2025
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0309050 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 09050&type=printable (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0309050
DOI: 10.1371/journal.pone.0309050
Access Statistics for this article
More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone ().