Intelligent detection of vulnerable functions in software through neural embedding‐based code analysis
Peng Zeng, Guanjun Lin, Jun Zhang and Ying Zhang
International Journal of Network Management, 2023, vol. 33, issue 3
Abstract:
Software vulnerabilities are a fundamental problem in cybersecurity and pose severe threats to the secure operation of devices and systems. In this paper, we propose a new vulnerability detection framework that employs advanced neural embeddings. Specifically, we build on CodeBERT, a large‐scale pre‐trained embedding model for natural language and programming language. It achieves state‐of‐the‐art performance on various natural language processing and code analysis tasks, demonstrating improved generalization ability compared with conventional models. The proposed framework encapsulates CodeBERT as a code representation generator and combines it with transfer learning to conduct cross‐project vulnerability detection. Because code embedding models for C source code are lacking, we extract knowledge from C source code to fine‐tune the pre‐trained embedding model, so as to better facilitate the detection of function‐level vulnerabilities in C open‐source projects. To address the severe data imbalance found in real‐world scenarios, we introduce the idea of code augmentation and use a large amount of synthetic vulnerability data to further improve the robustness of the detection method. Experimental results show that the proposed vulnerability detection framework achieves better performance than existing methods.
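The abstract does not say how the synthetic vulnerability data is mixed with real project data. As a rough illustration only, the sketch below shows one simple way to reduce class imbalance by topping up the vulnerable class from a synthetic pool. The function name `balance_with_synthetic`, the `target_ratio` parameter, and the label convention (1 = vulnerable, 0 = non-vulnerable) are assumptions made for this example and are not taken from the paper.

```python
import random


def balance_with_synthetic(real, synthetic, target_ratio=0.5, seed=0):
    """Hypothetical helper (not from the paper).

    real:       list of (function_source, label) pairs, label 1 = vulnerable.
    synthetic:  list of synthetic vulnerable function sources (all label 1).
    Adds synthetic samples until the vulnerable fraction reaches
    target_ratio, or until the synthetic pool is exhausted.
    """
    if not 0.0 < target_ratio < 1.0:
        raise ValueError("target_ratio must be strictly between 0 and 1")

    rng = random.Random(seed)
    n_vulnerable = sum(1 for _, label in real if label == 1)
    n_total = len(real)

    # Find the smallest number k of synthetic samples that reaches the ratio.
    k = 0
    while (n_vulnerable + k) < target_ratio * (n_total + k) and k < len(synthetic):
        k += 1

    # Draw k distinct synthetic samples and label them as vulnerable.
    chosen = rng.sample(synthetic, k)
    combined = list(real) + [(source, 1) for source in chosen]
    rng.shuffle(combined)
    return combined


if __name__ == "__main__":
    # Toy data: 9 non-vulnerable functions and 1 vulnerable function.
    real = [(f"void safe_{i}(void) {{}}", 0) for i in range(9)]
    real.append(("void risky(char *s) { char b[8]; strcpy(b, s); }", 1))
    synthetic = [f"void synthetic_{i}(char *s) {{ char b[4]; strcpy(b, s); }}"
                 for i in range(20)]

    balanced = balance_with_synthetic(real, synthetic)
    n_vul = sum(label for _, label in balanced)
    print(len(balanced), n_vul)
```

In a full pipeline of the kind the abstract describes, the balanced samples would then be passed to the embedding model to produce representations for a downstream classifier; that step is omitted here because the abstract gives no details about it.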
Date: 2023
Download: https://doi.org/10.1002/nem.2198
Persistent link: https://EconPapers.repec.org/RePEc:wly:intnem:v:33:y:2023:i:3:n:e2198
More articles in International Journal of Network Management from John Wiley & Sons