BayesPDGImVD: Bayesian Hyperparameter-Optimized Image-Based Vulnerability Detection via Program Dependency Graph Representation
As software systems grow more complex and open-source components are adopted more widely, traditional vulnerability detection approaches face bottlenecks in both efficiency and accuracy, and recent advances in machine learning have opened new avenues for automated detection. This paper presents BayesPDGImVD, a Bayesian hyperparameter-optimized, image-based vulnerability detection method built on a program dependency graph (PDG) representation. The method combines a PDG image representation with Bayesian hyperparameter optimization to address limitations of conventional detection methods. The implemented system statically extracts PDGs from C/C++ source code using the Joern analyzer, then constructs multi-channel image features by integrating Sent2Vec semantic embeddings with three node centrality metrics (degree, closeness, and Katz centrality). A CNN classifier is tuned by Bayesian optimization, which automatically selects critical hyperparameters (learning rate, kernel size, dropout rate, etc.) and removes the need for manual adjustment. Experiments on the SARD benchmark dataset show 86.43% detection accuracy and an 80.38% F1-score, with 40% less performance fluctuation than non-optimized models, supporting the effectiveness of Bayesian optimization in improving model robustness and detection capability. Unlike existing approaches such as VulCNN, the key contribution is the integration of image-based representation with a hyperparameter optimization mechanism, providing a more interpretable and practical solution for source code vulnerability detection.
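The image-construction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how embeddings and centralities are combined, so the sketch assumes each channel row is a node's embedding scaled by one centrality score (a VulCNN-style choice). The function names, the toy graph, and the two-dimensional vectors standing in for Sent2Vec embeddings are all hypothetical, and the centralities are simplified pure-Python versions.

```python
from collections import deque


def _predecessors(nodes, edges):
    # Map each node to the list of nodes with an edge pointing into it.
    preds = {v: [] for v in nodes}
    for u, v in edges:
        preds[v].append(u)
    return preds


def degree_centrality(nodes, edges):
    # (in-degree + out-degree) divided by the number of other nodes.
    deg = {v: 0 for v in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {v: deg[v] / (len(nodes) - 1) for v in nodes}


def closeness_centrality(nodes, edges):
    # Closeness over incoming shortest paths, scaled by the fraction of
    # nodes that can reach v (an assumed convention for directed graphs).
    preds = _predecessors(nodes, edges)
    result = {}
    for v in nodes:
        dist = {v: 0}
        queue = deque([v])
        while queue:  # breadth-first search along reversed edges
            x = queue.popleft()
            for p in preds[x]:
                if p not in dist:
                    dist[p] = dist[x] + 1
                    queue.append(p)
        reached = len(dist) - 1
        total = sum(dist.values())
        result[v] = (reached / total) * (reached / (len(nodes) - 1)) if total else 0.0
    return result


def katz_centrality(nodes, edges, alpha=0.1, beta=1.0, iterations=100):
    # Fixed-point iteration x_v = alpha * sum(x_u for u -> v) + beta.
    # Unnormalized; alpha is assumed small enough for convergence.
    preds = _predecessors(nodes, edges)
    x = {v: 0.0 for v in nodes}
    for _ in range(iterations):
        x = {v: alpha * sum(x[u] for u in preds[v]) + beta for v in nodes}
    return x


def build_image(nodes, edges, embeddings):
    # One channel per centrality; each row is a node embedding scaled by
    # that node's centrality score (assumed combination rule).
    channels = []
    for metric in (degree_centrality, closeness_centrality, katz_centrality):
        scores = metric(nodes, edges)
        channels.append([[scores[v] * x for x in embeddings[v]] for v in nodes])
    return channels


if __name__ == "__main__":
    # Toy PDG with three statements and hypothetical 2-d embeddings.
    nodes = ["n1", "n2", "n3"]
    edges = [("n1", "n2"), ("n2", "n3")]
    embeddings = {"n1": [1.0, 0.0], "n2": [0.5, 0.5], "n3": [0.0, 1.0]}
    image = build_image(nodes, edges, embeddings)
    print(len(image), len(image[0]), len(image[0][0]))  # channels, rows, columns
```

In a real pipeline the node list and edges would come from the PDG exported by Joern and the vectors from a trained Sent2Vec model. The resulting three-channel array would then be padded or cropped to a fixed height before being passed to the CNN.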
Copyright (c) 2025 Xingquan Mao, Zhangpei Huang

This work is licensed under a Creative Commons Attribution 4.0 International License.