AlgoLabel: A Large Dataset for Multi-Label Classification of Algorithmic Challenges

Radu Cristian Alexandru Iacob, Vlad Cristian Monea, Dan Rădulescu, Andrei-Florin Ceapă, Traian Rebedea and Ștefan Trăușan-Matu
Additional contact information
Radu Cristian Alexandru Iacob: Department of Computer Science and Engineering, Faculty for Automatic Control and Computers, University Politehnica of Bucharest, Splaiul Independentei 313, Sector 6, 060042 Bucharest, Romania
Vlad Cristian Monea: Department of Computer Science and Engineering, Faculty for Automatic Control and Computers, University Politehnica of Bucharest, Splaiul Independentei 313, Sector 6, 060042 Bucharest, Romania
Dan Rădulescu: Department of Computer Science and Engineering, Faculty for Automatic Control and Computers, University Politehnica of Bucharest, Splaiul Independentei 313, Sector 6, 060042 Bucharest, Romania
Andrei-Florin Ceapă: Department of Computer Science and Engineering, Faculty for Automatic Control and Computers, University Politehnica of Bucharest, Splaiul Independentei 313, Sector 6, 060042 Bucharest, Romania
Traian Rebedea: Department of Computer Science and Engineering, Faculty for Automatic Control and Computers, University Politehnica of Bucharest, Splaiul Independentei 313, Sector 6, 060042 Bucharest, Romania
Ștefan Trăușan-Matu: Department of Computer Science and Engineering, Faculty for Automatic Control and Computers, University Politehnica of Bucharest, Splaiul Independentei 313, Sector 6, 060042 Bucharest, Romania

Mathematics, 2020, vol. 8, issue 11, 1-18

Abstract: While semantic parsing has been an important problem in natural language processing for decades, recent years have seen wide interest in the automatic generation of code from text. We propose an alternative problem to code generation: labelling the algorithmic solution for programming challenges. While this may seem an easier task, we highlight that current deep learning techniques are still far from offering a reliable solution. The contributions of the paper are twofold. First, we propose a large multi-modal dataset of text and code pairs consisting of algorithmic challenges and their solutions, called AlgoLabel. Second, we show that vanilla deep learning solutions need to be greatly improved to solve this task, and we propose a dual text-code neural model for detecting the algorithmic solution type for a programming challenge. While the proposed text-code model performs better than using the text or the code alone, the improvement is rather small, highlighting that we require better methods to combine text and code features.
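
The multi-label setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' dual text-code model: the label names, the per-label scores, and the averaging step (a simple late fusion) are assumptions chosen only to show how a challenge can carry several algorithmic labels at once and how text-based and code-based predictions might be combined and scored.

```python
from typing import List

# Hypothetical label set; the abstract does not enumerate the actual labels.
LABELS = ["graphs", "dynamic programming", "greedy", "math"]


def multi_hot(tags: List[str]) -> List[int]:
    """Encode a set of tags as a 0/1 vector over LABELS."""
    return [1 if label in tags else 0 for label in LABELS]


def fuse(text_scores: List[float], code_scores: List[float]) -> List[float]:
    """Average per-label scores from a text model and a code model.

    This is plain late fusion, used here as a stand-in; it is an
    assumption, not the combination method of the paper.
    """
    return [(t + c) / 2 for t, c in zip(text_scores, code_scores)]


def predict(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Assign every label whose score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def micro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Micro-averaged F1 over all (example, label) pairs."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# One made-up challenge tagged with two labels.
truth = [multi_hot(["graphs", "greedy"])]
text_scores = [0.9, 0.2, 0.4, 0.1]  # invented scores from a text-only model
code_scores = [0.7, 0.6, 0.8, 0.1]  # invented scores from a code-only model

text_only = [predict(text_scores)]
fused = [predict(fuse(text_scores, code_scores))]

print("text only F1:", micro_f1(truth, text_only))
print("fused F1:", micro_f1(truth, fused))
```

In this toy case the text-only scores miss one of the two labels, while the averaged scores recover both. The numbers are invented for illustration and say nothing about the results reported in the paper.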

Keywords: text classification; code labeling; multi-modal dataset; multi-label classification; deep learning
JEL-codes: C
Date: 2020

Downloads: (external link)
https://www.mdpi.com/2227-7390/8/11/1995/pdf (application/pdf)
https://www.mdpi.com/2227-7390/8/11/1995/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:8:y:2020:i:11:p:1995-:d:441879

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:8:y:2020:i:11:p:1995-:d:441879