Network Embedding Techniques for Predicting Software Defects: A Review

Sweta Mehta; Pankaj K. Goswami; K. Sridhar Patnaik

doi:10.18535/ijsrm/v13i06.ec05

Abstract

Ensuring software quality is essential throughout the software development process. Software defect prediction (SDP) plays an important role in identifying the software modules most likely to contain defects. Several machine learning-based defect prediction models have been developed and implemented in recent years. Researchers have also applied network embedding to SDP, showcasing the adaptability of Natural Language Processing techniques within the domain of defect prediction. This study reviews, investigates, and discusses the use of network embedding in SDP. We examined defect prediction articles from the previous 15 years that use network embedding, the majority of which were published in notable conferences and software engineering journals. Each network embedding technique, its findings, and its particular role in SDP are described in detail. The reviewed papers are listed in order of publication along with a comparative assessment. We developed three research questions that emphasize the significance of analyzing network representations, particularly network embedding, for identifying potential software defects. To our knowledge, this review is the first to include a thorough analysis of both the transductive and inductive variants of network embedding, along with their potential in machine learning (ML) for predicting software defects. The article also explores the open challenges in depth and proposes potential research directions to address them, with the aim of guiding future research by academics and practitioners in the field of SDP.

Keywords

Software Defect Prediction
Network Embedding
Machine Learning
Software Dependency

References

Alharthi, Z. S., Alsaeedi, A., & Yafooz, W. M. S. (2021). Software defect prediction approaches: A review. In Proceedings of the 4th International Conference on Bio-Engineering for Smart Technologies (pp. 1-6). https://doi.org/10.1109/BioSMART54244.2021.9677869
Ali, Z., Qi, G., Muhammad, K., Ali, B., & Abro, W. A. (2020). Paper recommendation based on heterogeneous network embedding. Knowledge-Based Systems, 210, 106438. https://doi.org/10.1016/j.knosys.2020.106438
Bahaweres, R. B., Jumral, D., Hermadi, I., Suroso, A. I., & Arkeman, Y. (2021). Hybrid software defect prediction based on LSTM (Long Short Term Memory) and word embedding. In Proceedings of the 2nd International Conference On Smart Cities, Automation & Intelligent Computing Systems (pp. 70-75). https://doi.org/10.1109/ICON-SONICS53103.2021.9617182
Hossain, M., & Chen, H. (2022). Application of Machine Learning on Software Quality Assurance and Testing: A Chronological Survey. International Journal of Computers and their Applications, 29(3), 150-157.
Cai, H., Zheng, V., & Chang, K. (2018). A comprehensive survey of graph embedding: Problems, Techniques, and Applications. IEEE Transactions on Knowledge & Data Engineering, 30(9), 1616-1637. https://doi.org/10.1109/TKDE.2018.2807452
Cao, S., Lu, W., & Xu, Q. (2015). Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (pp. 891-900). ACM. https://doi.org/10.1145/2806416.2806512
Cao, S., Lu, W., & Xu, Q. (2016). Deep neural networks for learning graph representations. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (pp. 1145-1152). AAAI Press.
Chen, H., Su, X., Tian, Y., Perozzi, B., Chen, M., & Skiena, S. (2018). Enhanced network embeddings via exploiting edge labels. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (pp. 4 pages). https://doi.org/10.1145/3269206.3269270
Chen, L., Ma, W., Zhou, Y., Xu, L., Wang, Z., Chen, Z., & Xu, B. (2016). Empirical analysis of network measures for predicting high severity software faults. Science China Information Sciences, 59, Article 122901. https://doi.org/10.1007/s11432-015-5426-3
Coscia, J. L. O., Crasso, M., Mateos, C., & Zunino, A. (2012). Estimating Web service interface complexity and quality through conventional object-oriented metrics. In 15th Ibero-American Conference on Software Engineering. https://doi.org/10.19153/cleiej.16.1.4
Coscia, J. L. O., Crasso, M., Mateos, C., Zunino, A., & Misra, S. (2012). Predicting web service maintainability via object-oriented metrics: A statistics-based approach. Computational Science and Its Applications, Lecture Notes in Computer Science, 7336. https://doi.org/10.1007/978-3-642-31128-4_3
Dai, Q., Shen, X., Zhang, L., Li, Q., & Wang, D. (2019). Adversarial Training Methods for Network Embedding. In Proceedings of the World Wide Web Conference (pp. 329-339). https://doi.org/10.1145/3308558.3313445
Dong, T., Shi, H., Zhu, Y., Li, K., Chai, F., & Wang, Y. (2019). Embedded software reliability prediction based on software life cycle. In Proceedings of the IEEE 14th International Conference on Intelligent Systems and Knowledge Engineering (pp. 725-729). https://doi.org/10.1109/ISKE47853.2019.9170437
Dong, Y., Chawla, N. V., & Swami, A. (2017). metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 135-144). https://doi.org/10.1145/3097983.3098036
Dong, Y., Tang, Y., Cheng, X., Yang, Y., & Wang, S. (2023). SedSVD: Statement-level software vulnerability detection based on Relational Graph Convolutional Network with subgraph embedding. Information and Software Technology, 158. https://doi.org/10.1016/j.infsof.2023.107168
Du, X., Wang, T., Wang, L., Pan, W., Chai, C., Xu, X., Jiang, B., & Wang, J. (2022). CoreBug: Improving effort-aware bug prediction in software systems using generalized k-core decomposition in class dependency networks. Axioms, 11(5), 205. https://doi.org/10.3390/axioms11050205
Du, X., Yan, J., Zhang, R., & Zha, H. (2022). Cross-Network Skip-Gram Embedding for Joint Network Alignment and Link Prediction. IEEE Transactions on Knowledge and Data Engineering, 34(3), 1080-1095. https://doi.org/10.1109/TKDE.2020.2997861
Fan, G., Diao, X., Yu, H., Yang, K., & Chen, L. (2019). Deep semantic feature learning with embedded static metrics for software defect prediction. In Proceedings of the 26th Asia-Pacific Software Engineering Conference (pp. 244-251). https://doi.org/10.1109/APSEC48747.2019.00041
Gao, H., Lu, M., Pan, C., & Xu, B. (2019). Empirical Study: Are complex network features suitable for cross-version software defect prediction? In Proceedings of the IEEE 10th International Conference on Software Engineering and Service Science (pp. 1-5). https://doi.org/10.1109/ICSESS47205.2019.9040793
Gong, L., Rajbahadur, G. K. K., Hassan, A. E., & Jiang, S. (2021). Revisiting the impact of dependency network metrics on software defect prediction. IEEE Transactions on Software Engineering. https://doi.org/10.1109/TSE.2021.3131950
Goyal, P., & Ferrara, E. (2018). Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems, 151, 78-94. https://doi.org/10.1016/j.knosys.2018.03.022
Grover, A., & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd International Conference on Knowledge Discovery & Data Mining (pp. 855-864). https://doi.org/10.1145/2939672.2939754
Gurung, S. (2022). Performing software defect prediction using deep learning. Computer and Information Science, 1697. Springer. https://doi.org/10.1007/978-3-031-22405-8_25
Halstead, M. H. (1977). Elements of software science (Operating and programming systems series).
Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Representation learning on graphs: Methods and Applications. IEEE Data Engineering, 40(3), 52-74. arXiv:1709.05584
Hamilton, W. L., Ying, Z., & Leskovec, J. (2017). Inductive representation learning on large graphs. In Proceedings of the 28th International Conference on Neural Information Processing Systems (pp. 1025-1035). https://doi.org/10.48550/arXiv.1706.02216
Harrison, R., Counsell, S. J., & Nithi, R. V. (1998). An evaluation of the mood set of object-oriented software metrics. IEEE Transactions on Software Engineering, 24(6), 491-496. https://doi.org/10.1109/32.689404
Hou, M., Ren, J., Zhang, D., Kong, X., Zhang, D., & Xia, F. (2020). Network embedding: Taxonomies, frameworks and applications. Computer Science Review, 38, 100296. https://doi.org/10.1016/j.cosrev.2020.100296
Huo, X., Yang, Y., Li, M., & Zhan, D. (2018). Learning semantic features for software defect prediction by code comments embedding. In Proceedings of the IEEE International Conference on Data Mining (pp. 1049-1054). https://doi.org/10.1109/ICDM.2018.00133
Jureczko, M., & Spinellis, D. (2010). Using object-oriented design metrics to predict software defects. Models and Methods of System Dependability (pp. 69-81). Oficyna Wydawnicza Politechniki Wrocławskiej.
Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. In Proceedings of the International Conference on Learning Representations (pp. 1-14). arXiv:1609.02907
Li, N., Liu, J., He, Z., Zhang, C., & Xie, J. (2022). Network Embedding with dual generation tasks. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2022.3187851
Li, T., Zhang, J., Yu, P. S., Zhang, Y., & Yan, Y. (2018). Deep dynamic network embedding for link prediction. IEEE Access, 6, 29219-29230. https://doi.org/10.1109/ACCESS.2018.2839770
Ma, W., Chen, L., Yang, Y., Zhou, Y., & Xu, B. (2016). Empirical analysis of network measures for effort-aware fault-proneness prediction. Information and Software Technology, 69, 50-70. https://doi.org/10.1016/j.infsof.2015.09.001
McCabe, T. J. (1976). A complexity measure. IEEE Transactions on Software Engineering, 2(4), 308-320. https://doi.org/10.1109/TSE.1976.233837
Narayana, A., Chandramohan, M., Venkatesan, R., Chen, L., Liu, Y., & Jaiswal, S. (2017). graph2vec: Learning distributed representations of graphs. arXiv:1707.05005
Nguyen, T. H. D., Adams, B., & Hassan, A. E. (2010). Studying the impact of dependency network measures on software quality. In Proceedings of the IEEE International Conference on Software Maintenance (pp. 1-10). https://doi.org/10.1109/ICSM.2010.5609560
Ou, M., Cui, P., Pei, J., Zhang, Z., & Zhu, W. (2016). Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1105-1114). https://doi.org/10.1145/2939672.2939751
Pan, W., Ming, H., Yang, Z., & Wang, T. (2022). Comments on using k-core decomposition on class dependency networks to improve bug prediction model's practical performance. IEEE Transactions on Software Engineering. https://doi.org/10.1109/TSE.2022.3140599
Pereira, J., Groen, A. K., Stroes, E. S. G., & Levin, E. (2019). Graph space embedding. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (pp. 3253-3259). https://doi.org/10.24963/ijcai.2019/451
Perozzi, B., Kulkarni, V., & Skiena, S. (2016). Walklets: Multiscale graph embeddings for interpretable network classification. ArXiv:abs/1605.02115.
Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge discovery and data mining (pp. 701-710). https://doi.org/10.1145/2623330.2623732
Pinzger, M., Nagappan, N., & Murphy, B. (2008). Can developer-module networks predict failures? In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering (pp. 2-12). https://doi.org/10.1145/1453101.1453105
Premraj, R., & Herzig, K. (2011). Network versus code metrics to predict defects: A replication study. In International Symposium on Empirical Software Engineering and Measurement (pp. 215-224). https://doi.org/10.1109/ESEM.2011.30
Qiu, J., Yuxiao, D., Ma, H., Li, J., Wang, K., & Tang, J. (2018). Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In Proceedings of the 11th ACM Int. Conf. on Web Search and Data Mining (pp. 459-467). https://doi.org/10.1145/3159652.3159706
Qu, Y., Liu, T., Chi, J., Jin, Y., Cui, D., He, A., Zheng, Q. (2018). Node2defect: using network embedding to improve software defect prediction. In Proceedings of the 33rd ACM/IEEE Int. Conf. on Automated Software Engineering (pp. 844-849). https://doi.org/10.1145/3238147.3240469
Qu, Y., & Yin, H. (2021). Evaluating network embedding techniques' performances in software bug prediction. Empirical Software Engineering, 26, 60. https://doi.org/10.1007/s10664-021-09965-5
Qu, Y., Zheng, Q., Chi, J., Jin, Y., He, A., Cui, D., Zhang, H., & Liu. (2021). Using K-core Decomposition on Class Dependency Networks to improve bug prediction model's practical performance. IEEE Transactions on Software Engineering, 47, 348-366. https://doi.org/10.1109/TSE.2019.2892959
Ribeiro, L. F. R., Saverese, P. H., & Figueiredo, D. R. (2017). Struc2vec: Learning node representations from structural identity. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 385-394). https://doi.org/10.1145/3097983.3098061
Shen, X., Pan, S., Liu, W., Ong, Y., & Sun, Q. (2018). Discrete network embedding. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (pp. 3549-3555).
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). LINE: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web (pp. 1067-1077). https://doi.org/10.1145/2736277.2741093
Tang, S., Meng, Z., & Liang, S. (2022). Dynamic Co-Embedding Model for temporal attributed networks. IEEE Transactions on Neural Networks and Learning Systems. https://doi.org/10.1109/TNNLS.2022.3193564
Tang, W., Tang, M., Ban, M., Zhao, Z., & Feng, M. (2023). CSGVD: A deep learning approach combining sequence and graph embedding for source code vulnerability detection. Journal of Systems and Software, 199. https://doi.org/10.1016/j.jss.2023.111623
Tong, H., Liu, B., & Wang, S. (2019). Kernel spectral embedding transfer ensemble for heterogeneous defect prediction. IEEE Transactions on Software Engineering, 47(9), 1886-1906. https://doi.org/10.1109/TSE.2019.2939303
Tosun, A., Turhan, B., & Bener, A. (2009). Validation of network measures as indicators of defective modules in software systems. In Proceedings of the 5th International Conference on Predictor Models in Software Engineering (pp. 1-9). https://doi.org/10.1145/1540438.1540446
Wang, D., Cui, P., & Zhu, W. (2016). Structural deep network embedding. In Proceedings of the 22nd Int. Conf. on Knowledge Discovery and Data Mining (pp. 1225-1234). https://doi.org/10.1145/2939672.2939753
Wang, X., Lu, L., Wang, B., Shang, Y., & Yang, H. (2022). Software defect prediction via GIN with hybrid graphical features. In IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion, 411-416. https://doi.org/10.1109/QRS-C57518.2022.00066
Wang, Z., Ye, X., Wang, C., Cui, J., & Yu, P. S. (2021). Network embedding with completely-imbalanced labels. IEEE Transactions on Knowledge and Data Engineering, 33(11), 3634-3647. https://doi.org/10.1109/TKDE.2020.2971490
Xie, Y., Yu, B., Lv, S., Zhang, C., Wang, G., & Gong, G. (2021). A survey on heterogeneous network representation learning. Pattern Recognition, 116, 107936. https://doi.org/10.1016/j.patcog.2021.107936
Xu, J., Ai, J., & Shi, T. (2021). Software Defect Prediction for Specific Defect Types based on Augmented Code Graph Representation. In Proceedings of the Conference on Dependable Systems and Their Applications (pp. 669-678). https://doi.org/10.1109/DSA52907.2021.00097
Yang, C., Shi, C., Liu, Z., Tu, C., & Sun, M. (2021). Network Embedding: Theories, methods, and applications. Springer Cham.
Yang, F., Huang, Y., Xu, H., Xiao, P., & Zheng, W. (2022). Fine-Grained software defect prediction based on the method-call sequence. Computational Intelligence and Neuroscience, 4311548. https://doi.org/10.1155/2022/4311548
Yang, F., Xu, H., Xiao, P., Zhong, F., & Zeng, G. (2023). A Method-Level defect prediction approach based on structural features of method-calling network. IEEE Access, 11, 7933-7946. https://doi.org/10.1109/ACCESS.2023.3239266
Yang, Y., Ai, J., & Wang, F. (2018). Defect prediction based on the characteristics of multilayer structure of software network. In Proceedings of the IEEE International Conference on Software Quality, Reliability and Security Companion (pp. 27-34). https://doi.org/10.1109/QRS-C.2018.00019
Yang, Y., Harman, M., Krinke, J., Islam, S., Binkley, D., Zhou, Y., & Xu, B. (2016). An empirical study on dependence clusters for effort-aware fault-proneness prediction. In Proceedings of the 31st IEEE/ACM Int. Conf. on Automated Software Engineering (pp. 296-307).
Yang, Z., Cohen, W. W., & Salakhutdinov, R. (2016). Revisiting semi-supervised learning with graph embeddings. In Proceedings of the 33rd Int. Conf. on Int. Conf. on Machine Learning (pp. 40-48). https://doi.org/10.48550/arXiv.1603.08861
Zeng, C., Zhou, C. Y., Lv, S. K., He, P., & Huang, J. (2021). GCN2defect: Graph Convolutional Networks for SMOTETomek-based software defect prediction. In IEEE 32nd International Symposium on Software Reliability Engineering (pp. 69-79). https://doi.org/10.1109/ISSRE52982.2021.00020
Zhang, D., Yin, J., Zhu, X., & Zhang, C. (2021). Search efficient binary network embedding. ACM Transactions on Knowledge Discovery and Data, 15(4), Article 61, 1-27. https://doi.org/10.1145/3436892
Zhang, J., Dong, Y., Wang, Y., Tang, J., & Ding, M. (2019). ProNE: Fast and scalable network representation learning. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (pp. 4278-4284). https://doi.org/10.24963/ijcai.2019/594
Zhu, W., Wang, X., & Cui, P. (2020). Deep Learning for learning graph representations. W. Pedrycz & S. M. Chen (Eds.), Deep Learning: Concepts and Architectures. Studies in Computational Intelligence, 866, 99-115. https://doi.org/10.1007/978-3-030-31756-0_6
Zimmermann, T., & Nagappan, N. (2008). Predicting defects using network analysis on dependency graphs. In Proceedings of the ACM/IEEE 30th Int. Conf. on Software Engineering (pp. 531-540). https://doi.org/10.1145/1368088.1368161

[refR-1] Alharthi, Z. S., Alsaeedi, A., & Yafooz, W. M. S. (2021). Software defect prediction approaches: A review. In Proceedings of the 4th International Conference on Bio-Engineering for Smart Technologies (pp. 1-6). https://doi.org/10.1109/BioSMART54244.2021.9677869

[refR-2] Ali, Z., Qi, G., Muhammad, K., Ali, B., & Abro, W. A. (2020). Paper recommendation based on heterogeneous network embedding. Knowledge-Based Systems, 210, 106438. https://doi.org/10.1016/j.knosys.2020.106438

[refR-3] Bahaweres, R. B., Jumral, D., Hermadi, I., Suroso, A. I., & Arkeman, Y. (2021). Hybrid software defect prediction based on LSTM (Long Short Term Memory) and word embedding. In Proceedings of the 2nd International Conference On Smart Cities, Automation & Intelligent Computing Systems (pp. 70-75). https://doi.org/10.1109/ICON-SONICS53103.2021.9617182

[refR-4] Hossain, M., & Chen, H. (2022). Application of Machine Learning on Software Quality Assurance and Testing: A Chronological Survey. International Journal of Computers and their Applications, 29(3), 150-157.

[refR-5] Cai, H., Zheng, V., & Chang, K. (2018). A comprehensive survey of graph embedding: Problems, Techniques, and Applications. IEEE Transactions on Knowledge & Data Engineering, 30(9), 1616-1637. https://doi.org/10.1109/TKDE.2018.2807452

[refR-6] Cao, S., Lu, W., & Xu, Q. (2015). Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (pp. 891-900). ACM. https://doi.org/10.1145/2806416.2806512

[refR-7] Cao, S., Lu, W., & Xu, Q. (2016). Deep neural networks for learning graph representations. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (pp. 1145-1152). AAAI Press.

[refR-8] Chen, H., Su, X., Tian, Y., Perozzi, B., Chen, M., & Skiena, S. (2018). Enhanced network embeddings via exploiting edge labels. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (pp. 4 pages). https://doi.org/10.1145/3269206.3269270

[refR-9] Chen, L., Ma, W., Zhou, Y., Xu, L., Wang, Z., Chen, Z., & Xu, B. (2016). Empirical analysis of network measures for predicting high severity software faults. Science China Information Sciences, 59, Article 122901. https://doi.org/10.1007/s11432-015-5426-3

[refR-10] Coscia, J. L. O., Crasso, M., Mateos, C., & Zunino, A. (2012). Estimating Web service interface complexity and quality through conventional object-oriented metrics. In 15th Ibero-American Conference on Software Engineering. https://doi.org/10.19153/cleiej.16.1.4

[refR-11] Coscia, J. L. O., Crasso, M., Mateos, C., Zunino, A., & Misra, S. (2012). Predicting web service maintainability via object-oriented metrics: A statistics-based approach. Computational Science and Its Applications, Lecture Notes in Computer Science, 7336. https://doi.org/10.1007/978-3-642-31128-4_3

[refR-12] Dai, Q., Shen, X., Zhang, L., Li, Q., & Wang, D. (2019). Adversarial Training Methods for Network Embedding. In Proceedings of the World Wide Web Conference (pp. 329-339). https://doi.org/10.1145/3308558.3313445

[refR-13] Dong, T., Shi, H., Zhu, Y., Li, K., Chai, F., & Wang, Y. (2019). Embedded software reliability prediction based on software life cycle. In Proceedings of the IEEE 14th International Conference on Intelligent Systems and Knowledge Engineering (pp. 725-729). https://doi.org/10.1109/ISKE47853.2019.9170437

[refR-14] Dong, Y., Chawla, N. V., & Swami, A. (2017). metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 135-144). https://doi.org/10.1145/3097983.3098036

[refR-15] Dong, Y., Tang, Y., Cheng, X., Yang, Y., & Wang, S. (2023). SedSVD: Statement-level software vulnerability detection based on Relational Graph Convolutional Network with subgraph embedding. Information and Software Technology, 158. https://doi.org/10.1016/j.infsof.2023.107168

[refR-16] Du, X., Wang, T., Wang, L., Pan, W., Chai, C., Xu, X., Jiang, B., & Wang, J. (2022). CoreBug: Improving effort-aware bug prediction in software systems using generalized k-core decomposition in class dependency networks. Axioms, 11(5), 205. https://doi.org/10.3390/axioms11050205

[refR-17] Du, X., Yan, J., Zhang, R., & Zha, H. (2022). Cross-Network Skip-Gram Embedding for Joint Network Alignment and Link Prediction. IEEE Transactions on Knowledge and Data Engineering, 34(3), 1080-1095. https://doi.org/10.1109/TKDE.2020.2997861

[refR-18] Fan, G., Diao, X., Yu, H., Yang, K., & Chen, L. (2019). Deep semantic feature learning with embedded static metrics for software defect prediction. In Proceedings of the 26th Asia-Pacific Software Engineering Conference (pp. 244-251). https://doi.org/10.1109/APSEC48747.2019.00041

[refR-19] Gao, H., Lu, M., Pan, C., & Xu, B. (2019). Empirical Study: Are complex network features suitable for cross-version software defect prediction? In Proceedings of the IEEE 10th International Conference on Software Engineering and Service Science (pp. 1-5). https://doi.org/10.1109/ICSESS47205.2019.9040793

[refR-20] Gong, L., Rajbahadur, G. K. K., Hassan, A. E., & Jiang, S. (2021). Revisiting the impact of dependency network metrics on software defect prediction. IEEE Transactions on Software Engineering. https://doi.org/10.1109/TSE.2021.3131950

[refR-21] Goyal, P., & Ferrara, E. (2018). Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems, 151, 78-94. https://doi.org/10.1016/j.knosys.2018.03.022

[refR-22] Grover, A., & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd International Conference on Knowledge Discovery & Data Mining (pp. 855-864). https://doi.org/10.1145/2939672.2939754

[refR-23] Gurung, S. (2022). Performing software defect prediction using deep learning. Computer and Information Science, 1697. Springer. https://doi.org/10.1007/978-3-031-22405-8_25

[refR-24] Halstead, M. H. (1977). Elements of software science (Operating and programming systems series).

[refR-25] Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Representation learning on graphs: Methods and Applications. IEEE Data Engineering, 40(3), 52-74. arXiv:1709.05584

[refR-26] Hamilton, W. L., Ying, Z., & Leskovec, J. (2017). Inductive representation learning on large graphs. In Proceedings of the 28th International Conference on Neural Information Processing Systems (pp. 1025-1035). https://doi.org/10.48550/arXiv.1706.02216

[refR-27] Harrison, R., Counsell, S. J., & Nithi, R. V. (1998). An evaluation of the mood set of object-oriented software metrics. IEEE Transactions on Software Engineering, 24(6), 491-496. https://doi.org/10.1109/32.689404

[refR-28] Hou, M., Ren, J., Zhang, D., Kong, X., Zhang, D., & Xia, F. (2020). Network embedding: Taxonomies, frameworks and applications. Computer Science Review, 38, 100296. https://doi.org/10.1016/j.cosrev.2020.100296

[refR-29] Huo, X., Yang, Y., Li, M., & Zhan, D. (2018). Learning semantic features for software defect prediction by code comments embedding. In Proceedings of the IEEE International Conference on Data Mining (pp. 1049-1054). https://doi.org/10.1109/ICDM.2018.00133

[refR-30] Jureczko, M., & Spinellis, D. (2010). Using object-oriented design metrics to predict software defects. Models and Methods of System Dependability (pp. 69-81). Oficyna Wydawnicza Politechniki Wrocławskiej.

[refR-31] Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. In Proceedings of the International Conference on Learning Representations (pp. 1-14). arXiv:1609.02907

[refR-32] Li, N., Liu, J., He, Z., Zhang, C., & Xie, J. (2022). Network Embedding with dual generation tasks. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2022.3187851

[refR-33] Li, T., Zhang, J., Yu, P. S., Zhang, Y., & Yan, Y. (2018). Deep dynamic network embedding for link prediction. IEEE Access, 6, 29219-29230. https://doi.org/10.1109/ACCESS.2018.2839770

[refR-34] Ma, W., Chen, L., Yang, Y., Zhou, Y., & Xu, B. (2016). Empirical analysis of network measures for effort-aware fault-proneness prediction. Information and Software Technology, 69, 50-70. https://doi.org/10.1016/j.infsof.2015.09.001

[refR-35] McCabe, T. J. (1976). A complexity measure. IEEE Transactions on Software Engineering, 2(4), 308-320. https://doi.org/10.1109/TSE.1976.233837

[refR-36] Narayana, A., Chandramohan, M., Venkatesan, R., Chen, L., Liu, Y., & Jaiswal, S. (2017). graph2vec: Learning distributed representations of graphs. arXiv:1707.05005

[refR-37] Nguyen, T. H. D., Adams, B., & Hassan, A. E. (2010). Studying the impact of dependency network measures on software quality. In Proceedings of the IEEE International Conference on Software Maintenance (pp. 1-10). https://doi.org/10.1109/ICSM.2010.5609560

[refR-38] Ou, M., Cui, P., Pei, J., Zhang, Z., & Zhu, W. (2016). Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1105-1114). https://doi.org/10.1145/2939672.2939751

[refR-39] Pan, W., Ming, H., Yang, Z., & Wang, T. (2022). Comments on using k-core decomposition on class dependency networks to improve bug prediction model's practical performance. IEEE Transactions on Software Engineering. https://doi.org/10.1109/TSE.2022.3140599

[refR-40] Pereira, J., Groen, A. K., Stroes, E. S. G., & Levin, E. (2019). Graph space embedding. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (pp. 3253-3259). https://doi.org/10.24963/ijcai.2019/451

[refR-41] Perozzi, B., Kulkarni, V., & Skiena, S. (2016). Walklets: Multiscale graph embeddings for interpretable network classification. ArXiv:abs/1605.02115.

[refR-42] Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge discovery and data mining (pp. 701-710). https://doi.org/10.1145/2623330.2623732

[refR-43] Pinzger, M., Nagappan, N., & Murphy, B. (2008). Can developer-module networks predict failures? In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering (pp. 2-12). https://doi.org/10.1145/1453101.1453105

[refR-44] Premraj, R., & Herzig, K. (2011). Network versus code metrics to predict defects: A replication study. In International Symposium on Empirical Software Engineering and Measurement (pp. 215-224). https://doi.org/10.1109/ESEM.2011.30

[refR-45] Qiu, J., Yuxiao, D., Ma, H., Li, J., Wang, K., & Tang, J. (2018). Network embedding as matrix factorization: Unifying DeepWalk, LINE, PTE, and node2vec. In Proceedings of the 11th ACM Int. Conf. on Web Search and Data Mining (pp. 459-467). https://doi.org/10.1145/3159652.3159706

[refR-46] Qu, Y., Liu, T., Chi, J., Jin, Y., Cui, D., He, A., Zheng, Q. (2018). Node2defect: using network embedding to improve software defect prediction. In Proceedings of the 33rd ACM/IEEE Int. Conf. on Automated Software Engineering (pp. 844-849). https://doi.org/10.1145/3238147.3240469

[refR-47] Qu, Y., & Yin, H. (2021). Evaluating network embedding techniques' performances in software bug prediction. Empirical Software Engineering, 26, 60. https://doi.org/10.1007/s10664-021-09965-5

[refR-48] Qu, Y., Zheng, Q., Chi, J., Jin, Y., He, A., Cui, D., Zhang, H., & Liu. (2021). Using K-core Decomposition on Class Dependency Networks to improve bug prediction model's practical performance. IEEE Transactions on Software Engineering, 47, 348-366. https://doi.org/10.1109/TSE.2019.2892959

[refR-49] Ribeiro, L. F. R., Saverese, P. H., & Figueiredo, D. R. (2017). Struc2vec: Learning node representations from structural identity. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 385-394). https://doi.org/10.1145/3097983.3098061

[refR-50] Shen, X., Pan, S., Liu, W., Ong, Y., & Sun, Q. (2018). Discrete network embedding. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (pp. 3549-3555).

[refR-51] Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). LINE: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web (pp. 1067-1077). https://doi.org/10.1145/2736277.2741093

[refR-52] Tang, S., Meng, Z., & Liang, S. (2022). Dynamic Co-Embedding Model for temporal attributed networks. IEEE Transactions on Neural Networks and Learning Systems. https://doi.org/10.1109/TNNLS.2022.3193564

[refR-53] Tang, W., Tang, M., Ban, M., Zhao, Z., & Feng, M. (2023). CSGVD: A deep learning approach combining sequence and graph embedding for source code vulnerability detection. Journal of Systems and Software, 199. https://doi.org/10.1016/j.jss.2023.111623

[refR-54] Tong, H., Liu, B., & Wang, S. (2019). Kernel spectral embedding transfer ensemble for heterogeneous defect prediction. IEEE Transactions on Software Engineering, 47(9), 1886-1906. https://doi.org/10.1109/TSE.2019.2939303

[refR-55] Tosun, A., Turhan, B., & Bener, A. (2009). Validation of network measures as indicators of defective modules in software systems. In Proceedings of the 5th International Conference on Predictor Models in Software Engineering (pp. 1-9). https://doi.org/10.1145/1540438.1540446

[refR-56] Wang, D., Cui, P., & Zhu, W. (2016). Structural deep network embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1225-1234). https://doi.org/10.1145/2939672.2939753

[refR-57] Wang, X., Lu, L., Wang, B., Shang, Y., & Yang, H. (2022). Software defect prediction via GIN with hybrid graphical features. In IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (pp. 411-416). https://doi.org/10.1109/QRS-C57518.2022.00066

[refR-58] Wang, Z., Ye, X., Wang, C., Cui, J., & Yu, P. S. (2021). Network embedding with completely-imbalanced labels. IEEE Transactions on Knowledge and Data Engineering, 33(11), 3634-3647. https://doi.org/10.1109/TKDE.2020.2971490

[refR-59] Xie, Y., Yu, B., Lv, S., Zhang, C., Wang, G., & Gong, G. (2021). A survey on heterogeneous network representation learning. Pattern Recognition, 116, 107936. https://doi.org/10.1016/j.patcog.2021.107936

[refR-60] Xu, J., Ai, J., & Shi, T. (2021). Software Defect Prediction for Specific Defect Types based on Augmented Code Graph Representation. In Proceedings of the Conference on Dependable Systems and Their Applications (pp. 669-678). https://doi.org/10.1109/DSA52907.2021.00097

[refR-61] Yang, C., Shi, C., Liu, Z., Tu, C., & Sun, M. (2021). Network Embedding: Theories, methods, and applications. Springer Cham.

[refR-62] Yang, F., Huang, Y., Xu, H., Xiao, P., & Zheng, W. (2022). Fine-Grained software defect prediction based on the method-call sequence. Computational Intelligence and Neuroscience, 4311548. https://doi.org/10.1155/2022/4311548

[refR-63] Yang, F., Xu, H., Xiao, P., Zhong, F., & Zeng, G. (2023). A Method-Level defect prediction approach based on structural features of method-calling network. IEEE Access, 11, 7933-7946. https://doi.org/10.1109/ACCESS.2023.3239266

[refR-64] Yang, Y., Ai, J., & Wang, F. (2018). Defect prediction based on the characteristics of multilayer structure of software network. In Proceedings of the IEEE International Conference on Software Quality, Reliability and Security Companion (pp. 27-34). https://doi.org/10.1109/QRS-C.2018.00019

[refR-65] Yang, Y., Harman, M., Krinke, J., Islam, S., Binkley, D., Zhou, Y., & Xu, B. (2016). An empirical study on dependence clusters for effort-aware fault-proneness prediction. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (pp. 296-307).

[refR-66] Yang, Z., Cohen, W. W., & Salakhutdinov, R. (2016). Revisiting semi-supervised learning with graph embeddings. In Proceedings of the 33rd International Conference on Machine Learning (pp. 40-48). https://doi.org/10.48550/arXiv.1603.08861

[refR-67] Zeng, C., Zhou, C. Y., Lv, S. K., He, P., & Huang, J. (2021). GCN2defect: Graph Convolutional Networks for SMOTETomek-based software defect prediction. In IEEE 32nd International Symposium on Software Reliability Engineering (pp. 69-79). https://doi.org/10.1109/ISSRE52982.2021.00020

[refR-68] Zhang, D., Yin, J., Zhu, X., & Zhang, C. (2021). Search efficient binary network embedding. ACM Transactions on Knowledge Discovery from Data, 15(4), Article 61, 1-27. https://doi.org/10.1145/3436892

[refR-69] Zhang, J., Dong, Y., Wang, Y., Tang, J., & Ding, M. (2019). ProNE: Fast and scalable network representation learning. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (pp. 4278-4284). https://doi.org/10.24963/ijcai.2019/594

[refR-70] Zhu, W., Wang, X., & Cui, P. (2020). Deep Learning for learning graph representations. In W. Pedrycz & S. M. Chen (Eds.), Deep Learning: Concepts and Architectures (Studies in Computational Intelligence, Vol. 866, pp. 99-115). https://doi.org/10.1007/978-3-030-31756-0_6

[refR-71] Zimmermann, T., & Nagappan, N. (2008). Predicting defects using network analysis on dependency graphs. In Proceedings of the ACM/IEEE 30th International Conference on Software Engineering (pp. 531-540). https://doi.org/10.1145/1368088.1368161
