[1] Lewis P., Perez E., Piktus A., Petroni F., Karpukhin V., Goyal N., Küttler H., Lewis M., Yih W.T., Rocktäschel T., andRiedel S., 2020. Retrieval-augmented generation for knowledge-intensive NLP tasks.Advances in Neural Information Processing Systems, 33, pp. 9459-9474. [2] Wu C., Peng Q., Xia Y., Jin Y., andHu Z., 2023. Towards cost-effective and robust AI microservice deployment in edge computing environments.Future Generation Computer Systems, 141, pp. 129-142. [3] Klesel M., andWittmann H.F., 2025. Retrieval-augmented generation (rag) m. klesel, hf wittmann. Business & Information Systems Engineering,67(4), pp. 551-561. [4] Brown A., Roman M., andDevereux B., 2025. A systematic literature review of retrieval-augmented generation: techniques, metrics, and challenges.Arxiv Preprint Arxiv:2508.06401. [5] Li Z., Wang Z., Wang W., Hung K., Xie H., andWang F.L., 2025. Retrieval-augmented generation for educational application: A systematic survey.Computers and Education: Artificial Intelligence, 8, 100417. [6] Aksakalli I.K., Çelik T., Can A.B., andTekinerdoğan B., 2021. Deployment and communication patterns in microservice architectures: A systematic literature review.Journal of Systems and Software, 180, 111014. [7] Li H., Rao W., Hu B., Tian Y., andShen J., 2025. Energy-aware elastic scaling algorithm for microservices in kubernetes clouds.Journal of Network and Computer Applications, 242, 104218. [8] Seo J., Jang S., Cha J., Choi H., Kim D., andKim S., 2023. MDED-framework: a distributed microservice deep-learning framework for object detection in edge computing.Sensors, 23(10), 4712. [9] Ferreira R.C., Trapmann R., andvan den Heuvel W.J., 2025. MLOps with microservices: A case study on the maritime domain. InSymposium and Summer School on Service-Oriented Computing, pp. 3-15. [10] Ahmad H., Treude C., Wagner M., andSzabo C., 2025. Resilient auto-scaling of microservice architectures with efficient resource management.Arxiv Preprint Arxiv:2506.05693. [11] Hossen M.R., Islam M.A., andAhmed K., 2022. Practical efficient microservice autoscaling with QoS assurance. InProceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, pp. 240-252. [12] Malhotra F.Y.S.,2020. A multi-cloud orchestration model using kubernetes for microservices. [13] Devarakonda R.R.,2017. A microservices-based approach for scalable deployment of machine learning models on a cloud-based platform.Available at SSRN 5234707. [14] Karpukhin V., Oguz B., Min S., Lewis P., Wu L., Edunov S., Chen D., andYih W.T., 2020. Dense passage retrieval for open-domain question answering. InProceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 6769-6781. [15] Johnson J., Douze M., andJégou H., 2019. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data,7(3), pp. 535-547. [16] Burns B., Grant B., Oppenheimer D., Brewer E., andWilkes J., 2016. Borg, omega, and kubernetes. Communications of the ACM,59(5), pp. 50-57. [17] Dragoni N., Giallorenzo S., Lafuente A.L., Mazzara M., Montesi F., Mustafin R., andSafina L., 2017. Microservices: yesterday, today, and tomorrow.Present and Ulterior Software Engineering, pp. 195-216. [18] Zaharia M., Chen A., Davidson A., Ghodsi A., Hong S.A., Konwinski A., Murching S., Nykodym T., Ogilvie P., Parkhe M., andXie F., 2018. Accelerating the machine learning lifecycle with MLflow. IEEE Data Eng. Bull.,41(4), pp. 39-45. [19] Trabelsi I., Mahmoudi B., Minani J.B., Moha N., andGuéhéneuc Y.G., 2025. A systematic literature review of machine learning approaches for migrating monolithic systems to microservices.IEEE Transactions on Software Engineering. [20] Amugongo L.M., Mascheroni P., Brooks S., Doering S., andSeidel J., 2025. Retrieval augmented generation for large language models in healthcare: A systematic review.PLOS Digital Health, 4(6), e0000877. |