CodeSage-GNN: Cross-Modal Graph Neural Network for Intelligent Software Defect Prediction
Abstract
This study proposes CodeSage-GNN, a cross-modal graph neural network framework for predicting defect-prone software modules. CodeSage-GNN represents each file or class as a heterogeneous graph that combines structural metrics, dependency relations, and semantic embeddings of source code tokens. A dual-channel message-passing network learns jointly from the structural graph and the textual semantics, and an attention-based fusion layer highlights the features that contribute most to each defect prediction. The model is evaluated on multiple open software defect benchmarks, reporting F1-score, Matthews correlation coefficient (MCC), and AUC, as well as cross-project generalization. By integrating graph representation learning with interpretable attention, CodeSage-GNN aims to support early fault localization and quality assurance in large-scale software systems.
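The dual-channel idea can be illustrated with a minimal sketch. It is not the CodeSage-GNN implementation: it assumes both channels share one dependency graph and one feature dimension, uses parameter-free mean aggregation in place of learned message passing, and replaces the learned attention parameters with a fixed, hypothetical `query` vector. All function names and toy values are illustrative assumptions.

```python
import math

def mean_aggregate(features, adjacency):
    """One round of mean-neighbor message passing (each node includes itself)."""
    updated = []
    for node, neighbors in enumerate(adjacency):
        group = [node] + list(neighbors)
        dim = len(features[node])
        updated.append(
            [sum(features[n][d] for n in group) / len(group) for d in range(dim)]
        )
    return updated

def readout(features):
    """Mean-pool node vectors into a single graph-level vector."""
    count = len(features)
    return [sum(row[d] for row in features) / count for d in range(len(features[0]))]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(channel_vectors, query):
    """Weight each channel by softmax(dot(vector, query)) and sum the channels."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in channel_vectors]
    weights = softmax(scores)
    fused = [
        sum(w * vec[d] for w, vec in zip(weights, channel_vectors))
        for d in range(len(query))
    ]
    return fused, weights

# Toy module graph with three nodes; adjacency[i] lists the neighbors of node i.
adjacency = [[1], [0, 2], [1]]
# Hypothetical per-node inputs: normalized structural metrics and token embeddings.
structural = [[0.9, 0.2], [0.1, 0.4], [0.5, 0.5]]
semantic = [[0.3, 0.8], [0.7, 0.1], [0.2, 0.6]]
# Hypothetical attention query; in a trained model this would be learned.
query = [1.0, -0.5]

structural_vec = readout(mean_aggregate(structural, adjacency))
semantic_vec = readout(mean_aggregate(semantic, adjacency))
fused, weights = attention_fuse([structural_vec, semantic_vec], query)

print("channel weights (structural, semantic):", weights)
print("fused module representation:", fused)
```

The channel weights sum to one, so under these assumptions they can be read as the relative influence of the structural and semantic channels on the fused representation. A classifier head on top of `fused` would produce the defect score.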