AI-Based Test Case Generation for Continuous Integration Pipelines: Combining Large Language Models, Program Analysis, and Reinforcement Learning

ASISH BERA; BHARAT RICHHARIYA; RAGHUNATH REDDY MADIREDDY; AKANKSHA RATHORE

doi:10.64137/31078699/IJETET-V2I2P102

Authors

DR. ASISH BERA, Assistant Professor, Department of CS / CSIS, Shiv Nadar Institution of Eminence, Greater Noida, Uttar Pradesh.
DR. BHARAT RICHHARIYA, Assistant Professor, Department of CS / CSIS, Shiv Nadar Institution of Eminence, Greater Noida, Uttar Pradesh.
DR. RAGHUNATH REDDY MADIREDDY, Assistant Professor, Department of CS / CSIS, Shiv Nadar Institution of Eminence, Greater Noida, Uttar Pradesh.
DR. AKANKSHA RATHORE, Assistant Professor, Department of CS / CSIS, Shiv Nadar Institution of Eminence, Greater Noida, Uttar Pradesh.

Keywords:

AI-Based Testing, Continuous Integration, Large Language Models, Program Analysis, Reinforcement Learning, Test Generation, Mutation Testing, CI/CD, Software Quality Engineering

Abstract

Continuous Integration (CI) pipelines have become the operational backbone of modern software delivery, yet test generation inside CI remains largely reactive, manually maintained, and weakly aligned with rapidly changing code behavior. Conventional automated test generation approaches can improve coverage, but they often struggle with semantic intent, realistic input construction, dependency-aware test scaffolding, and unstable pipeline execution. Recent large language models (LLMs) introduce a new capability for synthesizing readable and context-aware tests, but LLM-only test generation is vulnerable to hallucinated APIs, shallow path exploration, non-executable assertions, and brittle or flaky outputs. This paper presents CI-GenRL, an AI-based test case generation framework for CI pipelines that combines LLM-based test synthesis, static and dynamic program analysis, and reinforcement learning-based policy optimization. The proposed framework observes code changes, extracts dependency and risk signals, generates candidate tests, executes them in isolated CI containers, and uses coverage, mutation score, failure reproduction, flakiness, and execution cost as reward signals. Unlike review-oriented approaches, this work formulates test generation as a sequential decision problem in which each generated test should maximize marginal verification value under CI time constraints. The architecture integrates risk-aware build selection, prompt grounding, path-targeted test synthesis, reward-driven test prioritization, and governance controls for enterprise deployment. A pilot-style evaluation protocol is described across Java and Python services, using branch coverage, mutation adequacy, defect detection, build latency, and flaky-test suppression as primary measures. The illustrative pilot results show that combining LLMs with program analysis and reinforcement learning can produce more executable and higher-value CI test suites than LLM-only or search-only baselines. The paper contributes a CI-native formulation, an end-to-end architecture, an RL reward model, and a deployment governance model for safe adoption of AI-generated tests in regulated software environments.
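
The abstract names five reward signals (coverage, mutation score, failure reproduction, flakiness, and execution cost) and a CI time constraint, but gives no formula. The following is a minimal illustrative sketch of how such signals could be folded into a scalar reward and used to choose tests within a time budget. It is not the CI-GenRL reward model: the `TestOutcome` fields, the linear form, the `WEIGHTS` values, and the greedy selection rule are all assumptions made for illustration, and the greedy rule stands in for the learned reinforcement learning policy the abstract describes. The gains are also treated as fixed per test, whereas a truly marginal value would be recomputed after each selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestOutcome:
    """Hypothetical record of one generated test's measured signals."""
    name: str
    coverage_gain: float      # newly covered branches, as a fraction in [0, 1]
    mutation_gain: float      # additional mutants killed, as a fraction in [0, 1]
    reproduces_failure: bool  # whether the test reproduces a known failure
    flaky_rate: float         # fraction of reruns with an inconsistent verdict
    runtime_s: float          # wall-clock execution time in seconds


# Assumed weights; the abstract does not specify any values.
WEIGHTS = {"coverage": 1.0, "mutation": 1.0, "failure": 0.5,
           "flaky": 1.0, "cost": 0.01}


def reward(t: TestOutcome, w: dict = WEIGHTS) -> float:
    """Linear reward: verification gains minus flakiness and cost penalties."""
    return (w["coverage"] * t.coverage_gain
            + w["mutation"] * t.mutation_gain
            + w["failure"] * float(t.reproduces_failure)
            - w["flaky"] * t.flaky_rate
            - w["cost"] * t.runtime_s)


def select_under_budget(candidates: list, budget_s: float) -> list:
    """Greedy stand-in for a policy: rank by reward per second of runtime,
    then keep positive-reward tests that still fit in the time budget."""
    ranked = sorted(candidates,
                    key=lambda t: reward(t) / max(t.runtime_s, 1e-9),
                    reverse=True)
    chosen, used = [], 0.0
    for t in ranked:
        if reward(t) > 0 and used + t.runtime_s <= budget_s:
            chosen.append(t)
            used += t.runtime_s
    return chosen


if __name__ == "__main__":
    tests = [
        TestOutcome("a", 0.2, 0.1, False, 0.0, 2.0),
        TestOutcome("b", 0.05, 0.0, False, 0.5, 1.0),
        TestOutcome("c", 0.1, 0.1, True, 0.0, 10.0),
    ]
    for budget in (5.0, 12.0):
        picked = select_under_budget(tests, budget)
        print(budget, [t.name for t in picked])
```

In this sketch a flaky test such as `b` receives a negative reward and is never selected, while a slow but valuable test such as `c` is admitted only when the budget allows it. A learned policy would replace the fixed weights and the greedy ranking with behavior optimized from observed CI outcomes.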
