Int J Performability Eng ›› 2026, Vol. 22 ›› Issue (5): 288-296.doi: 10.23940/ijpe.26.05.p6.288296


Strategic Management of Hybrid Retrieval-Augmented Microservices for Long-Horizon Cloud Machine Learning

Deepak Bansal^a, Yojna Arora^b, Hare Ram Singh^c, Rashmi Sharma^d, and Rekha Chaturvedi^e,*

  a. Finance Department, GNIOT Institute of Management Studies, Uttar Pradesh, India;
    b. Department of Computer Science & Engineering, School of Engineering and Technology, Sharda University, Uttar Pradesh, India;
    c. Department of Computer Science & Engineering, KCC Institute of Technology & Management, Uttar Pradesh, India;
    d. Department of Computer Engineering, MPSTME, SVKM's NMIMS University, Maharashtra, India;
    e. Department of Data Science and Engineering, School of Information Security and Data Science, Manipal University Jaipur, Rajasthan, India
  • Contact: * E-mail address: rekha.chaturvedi@jaipur.manipal.edu

Abstract: Long-horizon machine learning must process large volumes of contextual data and reason over extended sequences of time or knowledge. Traditional monolithic machine learning architectures have shown limitations in scalability, knowledge accessibility, and efficient resource utilization in cloud computing. This paper introduces a Hybrid Retrieval-Augmented Microservices Architecture (HRAMA) to improve the efficiency of machine learning systems in the cloud. The hybrid retrieval mechanism integrates semantic vector similarity search with metadata-based filtering to improve the relevance of the retrieved information. The architecture decomposes the machine learning pipeline into independent microservices, which are deployed in containers on cloud infrastructure. Experimental results validate the proposed architecture, showing improved retrieval accuracy, system throughput, and scalability, along with reduced inference latency. The proposed HRAMA framework is thus well suited to long-horizon cloud machine learning applications.
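The hybrid retrieval mechanism described in the abstract combines semantic vector similarity with metadata-based filtering. The paper does not give an implementation; the following is a minimal illustrative sketch, assuming a simple pre-filter-then-rank scheme in which documents are first filtered on exact metadata matches and the survivors are ranked by cosine similarity to the query embedding. All names (`hybrid_retrieve`, the toy corpus, the two-dimensional vectors) are hypothetical, not from the paper.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_retrieve(query_vec, query_meta, corpus, top_k=2):
    """Hybrid retrieval sketch: metadata filter first, then semantic ranking."""
    # 1. Metadata-based filtering: keep only documents whose metadata
    #    matches every key/value constraint in the query.
    candidates = [d for d in corpus
                  if all(d["meta"].get(k) == v for k, v in query_meta.items())]
    # 2. Semantic vector similarity search over the filtered candidates.
    candidates.sort(key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return candidates[:top_k]

# Toy corpus of documents with embeddings and metadata.
corpus = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"year": 2024}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"year": 2023}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"year": 2024}},
]

hits = hybrid_retrieve([1.0, 0.0], {"year": 2024}, corpus)
# Document "b" is semantically closest after "a", but it is excluded
# by the metadata filter; "a" and "c" are returned, ranked by similarity.
```

In a production deployment of the kind the paper targets, the similarity step would be served by a vector database and the filter pushed down into its query engine, with each stage exposed as its own containerized microservice.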

Key words: hybrid retrieval, microservices architecture, retrieval-augmented learning, cloud machine learning, long-horizon reasoning, vector databases, Kubernetes, distributed machine learning