Vol. 2 No. 1 (2022): Journal of AI-Assisted Scientific Discovery

AI/ML Models for Mitigating False Positives in Large-Scale Security Alert Systems

Sayantan Bhattacharyya, Deloitte Consulting, USA
Manish Tomar, Citibank, USA
Vincent Kanka, Homesite, USA

Published 05-03-2022

Keywords

  • false positives
  • supervised learning
  • ensemble learning

How to Cite

[1] Sayantan Bhattacharyya, Manish Tomar, and Vincent Kanka, “AI/ML Models for Mitigating False Positives in Large-Scale Security Alert Systems”, Journal of AI-Assisted Scientific Discovery, vol. 2, no. 1, pp. 528–572, Mar. 2022, Accessed: Jan. 18, 2025. [Online]. Available: https://scienceacadpress.com/index.php/jaasd/article/view/279

Abstract

The proliferation of security alert systems in large-scale enterprise environments has made false positives a persistent problem, one that undermines the efficiency and effectiveness of Security Operations Centers (SOCs). This paper examines artificial intelligence (AI) and machine learning (ML) methods for addressing it. Specifically, it covers supervised machine learning models, including ensemble algorithms such as Random Forests, Gradient Boosting Machines (GBMs), and XGBoost, alongside deep neural networks (DNNs), with the aim of improving threat detection accuracy and reducing false positive rates in high-volume security alert systems. The study draws on practical implementations within SOC operations, particularly with tools such as Datadog and Chronicle Security AI.

The paper first describes the nature and scale of the false positive problem in modern security infrastructures, emphasizing its effect on resource allocation, analyst fatigue, and response prioritization. It then covers the theory and practical considerations behind supervised learning models that classify and filter alerts with higher precision. Ensemble learning methods are highlighted for their ability to combine multiple base learners into a more robust predictive model, while DNNs are explored for their capacity to learn complex patterns and correlations in multidimensional alert data.
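To make the ensemble idea concrete, the following is a minimal sketch of bagging: several one-split decision stumps are trained on bootstrap resamples of labeled alerts and combined by majority vote. It is a simplified stand-in for Random Forests or gradient boosting, not the models evaluated in the paper. The feature layout (failed logins, outbound megabytes), the toy data, and all function names are assumptions made for illustration.

```python
import random

def fit_stump(X, y):
    """Return the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for polarity in (1, -1):
                preds = [1 if polarity * (row[f] - thr) >= 0 else 0 for row in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, f, thr, polarity)
    return best[1:]

def predict_stump(stump, row):
    """Classify one alert with a single stump (1 = real threat, 0 = benign)."""
    f, thr, polarity = stump
    return 1 if polarity * (row[f] - thr) >= 0 else 0

def fit_bagged_stumps(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def predict_ensemble(stumps, row):
    """Majority vote over all stumps."""
    votes = sum(predict_stump(s, row) for s in stumps)
    return 1 if 2 * votes > len(stumps) else 0

# Hypothetical alerts: (failed_logins, outbound_mb); label 1 = real threat.
X = [(1, 5.0), (0, 2.0), (2, 8.0), (3, 4.0),
     (25, 300.0), (40, 650.0), (30, 120.0), (45, 900.0)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

stumps = fit_bagged_stumps(X, y, n_estimators=25, seed=7)
print(predict_ensemble(stumps, (60, 1000.0)), predict_ensemble(stumps, (0, 1.0)))
```

Each stump sees a slightly different resample, so individual thresholds vary; the vote smooths out that variation, which is the property the ensemble methods named above rely on at much larger scale.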

The paper analyzes dataset preprocessing techniques, including feature engineering, dimensionality reduction, and class imbalance management, as prerequisites for training ML models effectively. It discusses in detail the Synthetic Minority Over-sampling Technique (SMOTE) for handling imbalanced datasets and feature importance metrics for interpretability. Case studies illustrate the deployment of Datadog and Chronicle Security AI in SOC operations, showing how these platforms use AI/ML to filter, prioritize, and escalate alerts. Practical examples demonstrate a reduction in false positive rates while high true positive rates are maintained, underscoring the models' utility in real-world scenarios.
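The core step of SMOTE can be sketched in a few lines: pick a minority-class sample, pick one of its k nearest minority neighbours, and create a new point at a random position on the segment between them. The version below is a minimal illustration using only the standard library; the function name, parameters, and toy feature vectors are assumptions, and production work would normally rely on a maintained implementation rather than this sketch.

```python
import math
import random

def smote(minority, n_synthetic, k=3, seed=0):
    """Generate n_synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [minority[j] for j in range(len(minority)) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical minority class (real threats) in a two-feature space.
threats = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(threats, n_synthetic=6, k=2, seed=42)
print(len(new_points))
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region the minority class already occupies. Oversampling should be applied to the training split only, so that evaluation data still reflects the true class balance.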

The proposed solutions are evaluated using precision, recall, F1-score, and Receiver Operating Characteristic (ROC) curve analysis. A comparison with traditional rule-based systems indicates that AI/ML models adapt better to evolving threat landscapes. The paper also examines the limitations and challenges of deploying such models, including computational overhead, data privacy concerns, and adversarial attacks designed to exploit ML vulnerabilities. It proposes strategies to mitigate these challenges, such as model retraining, adversarial robustness techniques, and privacy-preserving federated learning.
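The metrics named above can be computed directly from labels and classifier scores. The sketch below uses only the standard library; the area under the ROC curve is obtained from its rank interpretation (the probability that a randomly chosen real threat scores higher than a randomly chosen benign alert, with ties counted as one half). The function names, the 0.5 threshold, and the toy scores are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = real threat."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision_recall_f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def false_positive_rate(y_true, y_pred):
    _, fp, _, tn = confusion_counts(y_true, y_pred)
    return fp / (fp + tn) if fp + tn else 0.0

def roc_auc(y_true, scores):
    """AUC as the fraction of (threat, benign) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical alerts: true labels and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(precision_recall_f1(y_true, y_pred))
print(false_positive_rate(y_true, y_pred), roc_auc(y_true, scores))
```

On this toy data, two of the three alerts flagged at the 0.5 threshold are real threats and one real threat is missed, so precision, recall, and F1 are all 2/3, the false positive rate is 1/5, and the AUC is 14/15. Precision and false positive rate are the quantities most directly tied to analyst workload, since both count benign alerts that were escalated.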

