Machine Learning-Based Fraud Detection in Banking Transactions Using Integrated Behavioral and Transactional Feature Engineering with Weak Labeling Approach
Fraud detection in banking transactions is a critical challenge because the data are imbalanced and labeled fraud instances are scarce. This study proposes a machine learning approach that detects fraudulent transactions by integrating behavioral and transactional features and uses a rule-based weak labeling strategy to generate fraud labels. The dataset consists of 2,512 banking transactions, of which 14.29% are labeled as fraud. Three models were evaluated (Logistic Regression, Random Forest, and XGBoost) using stratified cross-validation and multiple evaluation metrics, including accuracy, precision, recall, F1-score, and ROC-AUC. The ensemble-based models outperform Logistic Regression: Random Forest achieves the best balance between precision and recall, while XGBoost attains perfect recall and the highest ROC-AUC, indicating a strong ability to detect fraudulent transactions. Feature importance analysis identifies transaction amount and deviation from typical user behavior as key indicators of fraud. Despite these promising results, the study is limited by its reliance on rule-based labeling and by a relatively small dataset. Future work should validate the proposed approach on real-world labeled data and improve model robustness for practical deployment.
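As a rough illustration of the ideas named in the abstract, the sketch below shows one way a rule-based weak label, a stratified fold split, and precision/recall/F1 could be computed. It is not the study's pipeline: the abstract does not state the labeling rules, so the single rule used here (amount greater than three times the user's median amount), the `amount_ratio` parameter, the field names `user` and `amount`, and the toy transactions are all assumptions made for illustration.

```python
from statistics import median


def weak_label(transactions, amount_ratio=3.0):
    """Hypothetical weak-labeling rule: mark a transaction as suspected
    fraud (1) when its amount exceeds amount_ratio times the median
    amount of the same user's transactions, else 0."""
    amounts_by_user = {}
    for t in transactions:
        amounts_by_user.setdefault(t["user"], []).append(t["amount"])
    typical = {u: median(a) for u, a in amounts_by_user.items()}
    return [
        1 if t["amount"] > amount_ratio * typical[t["user"]] else 0
        for t in transactions
    ]


def stratified_folds(labels, k):
    """Assign indices to k folds round-robin within each class, so every
    fold keeps roughly the same fraud rate as the full dataset."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        class_indices = [i for i, y in enumerate(labels) if y == cls]
        for position, index in enumerate(class_indices):
            folds[position % k].append(index)
    return folds


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive (fraud) class."""
    tp = sum(1 for a, b in zip(y_true, y_pred) if a == 1 and b == 1)
    fp = sum(1 for a, b in zip(y_true, y_pred) if a == 0 and b == 1)
    fn = sum(1 for a, b in zip(y_true, y_pred) if a == 1 and b == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy transactions (invented for illustration only).
transactions = [
    {"user": "A", "amount": 20.0},
    {"user": "A", "amount": 25.0},
    {"user": "A", "amount": 22.0},
    {"user": "A", "amount": 400.0},  # far above user A's typical amount
    {"user": "B", "amount": 300.0},
    {"user": "B", "amount": 320.0},
    {"user": "B", "amount": 310.0},
]
labels = weak_label(transactions)

# Baseline that predicts "legitimate" for every transaction.
all_legit = [0] * len(labels)
accuracy = sum(1 for a, b in zip(labels, all_legit) if a == b) / len(labels)
precision, recall, f1 = precision_recall_f1(labels, all_legit)
print(f"accuracy={accuracy:.3f} precision={precision} recall={recall} f1={f1}")

# Stratified split on a separate toy label list: 2 fraud, 12 legitimate.
demo_labels = [1, 1] + [0] * 12
folds = stratified_folds(demo_labels, k=2)
print([sum(demo_labels[i] for i in fold) for fold in folds])
```

On this toy data only the 400.0 transaction is flagged, and the always-legitimate baseline reaches an accuracy of 6/7 while its recall is 0. This is why accuracy alone is a weak guide on imbalanced fraud data and why precision, recall, and F1 are reported alongside it. The round-robin split places one fraud case in each fold, which is the property that stratification is meant to preserve.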

This work is licensed under a Creative Commons Attribution 4.0 International License.