Development of an algorithm for code clone detection in source code based on abstract syntax tree
Yevhenii Kubiuk and Gennadiy Kyselov
Additional contact information
Yevhenii Kubiuk: National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute»
Gennadiy Kyselov: National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute»
Technology audit and production reserves, 2023, vol. 4, issue 2(72), 33-36
Abstract:
The object of research is an algorithm for finding duplicates in program code based on the abstract syntax tree (AST). The main tasks addressed in this study are the detection of duplicate code and the search for vulnerabilities in program code.
The results show that the proposed algorithm handles type 1 and type 2 clones reliably, meaning that it is effective at detecting similar code fragments whose text is identical or varies only slightly. For type 3 and type 4 clones, however, the algorithm may be less effective, because these clone types change the structure of the AST.
Experimental studies showed that the algorithm can report matches between unrelated files, because typical AST chains occur in many programs. This can lead to a certain level of false positives in duplicate detection.
Testing the algorithm on the vulnerability search task showed that:
- the «SQL injection» vulnerability has the best recognition rate, but also the highest number of false positives;
- memory leak and null pointer dereferencing vulnerabilities are detected with equal effectiveness and equal numbers of false positives;
- «buffer overflow» has the lowest recognition rate, but fewer false positives than «SQL injection».
The study showed that the use of the AST allows effective detection of duplicate code and vulnerabilities in software code. The developed tool can help software developers reduce maintenance effort, improve code quality, and ensure the security of software products.
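The abstract does not spell out the steps of the algorithm, so the following is only a minimal sketch of the general idea of AST-based clone detection with hashing, not the authors' implementation. It assumes Python source code and uses the standard `ast` and `hashlib` modules; the names `_Normalizer`, `subtree_hash`, `find_clones`, and `SAMPLE` are hypothetical and chosen for illustration. Identifiers and literal values are erased before hashing, which is one plausible way to make renamed copies (type 2 clones) produce the same hash as exact copies (type 1 clones).

```python
import ast
import copy
import hashlib
from collections import defaultdict


class _Normalizer(ast.NodeTransformer):
    """Erase names and literal values so that renamed copies look alike."""

    def visit_FunctionDef(self, node):
        node.name = "_"
        self.generic_visit(node)  # descend into arguments and body
        return node

    def visit_Name(self, node):
        node.id = "_"
        return node

    def visit_arg(self, node):
        node.arg = "_"
        node.annotation = None
        return node

    def visit_Constant(self, node):
        # Keep only the literal's type, not its value.
        node.value = type(node.value).__name__
        return node


def subtree_hash(node):
    """Hash the normalized structure of one AST subtree."""
    normalized = _Normalizer().visit(copy.deepcopy(node))
    dumped = ast.dump(normalized)  # line numbers are not included by default
    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()


def find_clones(source):
    """Group function names whose normalized ASTs hash to the same value."""
    buckets = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            buckets[subtree_hash(node)].append(node.name)
    return [names for names in buckets.values() if len(names) > 1]


SAMPLE = '''
def total(items):
    result = 0
    for item in items:
        result += item
    return result

def sum_all(values):
    acc = 1
    for v in values:
        acc += v
    return acc

def product(values):
    acc = 1
    for v in values:
        acc *= v
    return acc
'''

if __name__ == "__main__":
    print(find_clones(SAMPLE))
```

In this sketch `total` and `sum_all` fall into the same group because they differ only in names and literal values, while `product` does not, since its `*=` operator changes the tree. A copy with an added or removed statement (a type 3 clone) would also hash differently, which is consistent with the abstract's remark that changes in AST structure reduce effectiveness for type 3 and type 4 clones.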
Keywords: clone detection; abstract syntax tree; AST; hashing; vulnerability search; false alarms
Date: 2023
Downloads:
https://journals.uran.ua/tarp/article/download/286472/280637 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:baq:taprar:v:4:y:2023:i:2:p:33-36
DOI: 10.15587/2706-5448.2023.286472
More articles in Technology audit and production reserves from PC TECHNOLOGY CENTER