Vol. 2 No. 1 (2022): Journal of AI-Assisted Scientific Discovery
Articles

Serverless AI: Building Scalable AI Applications without Infrastructure Overhead

Naresh Dulam
Vice President Sr Lead Software Engineer, JP Morgan Chase, USA
Babulal Shaik
Cloud Solutions Architect, Amazon Web Services, USA
Karthik Allam
Big Data Infrastructure Engineer, JP Morgan Chase, USA

Published 4 May 2021

Keywords

  • Serverless AI
  • Scalable AI applications

How to Cite

[1]
Naresh Dulam, Babulal Shaik, and Karthik Allam, “Serverless AI: Building Scalable AI Applications without Infrastructure Overhead”, Journal of AI-Assisted Scientific Discovery, vol. 2, no. 1, pp. 519–542, May 2021, Accessed: Jan. 07, 2025. [Online]. Available: https://scienceacadpress.com/index.php/jaasd/article/view/228

Abstract

Artificial intelligence (AI) has transformed industries such as healthcare, finance, and retail. Yet building and scaling AI applications often come with the burden of managing infrastructure, provisioning resources, and handling scalability. Serverless computing removes the need to manage servers while providing on-demand scalability and cost efficiency, so developers can focus on application logic rather than infrastructure. By combining serverless computing with AI, organizations can deploy intelligent, scalable applications faster and more economically. Serverless architectures operate on a pay-as-you-go model: businesses pay only for the resources consumed during AI tasks such as training, inference, or data processing. This reduces operational costs and lets applications scale with fluctuating workloads. Beyond cost benefits, serverless platforms simplify development through integrations with machine learning tools, pre-trained models, and real-time data pipelines. Practical use cases span many industries: automating customer service with AI-powered chatbots, enabling dynamic personalization in e-commerce, streamlining fraud detection in finance, and supporting predictive analytics. The serverless model makes current AI technologies accessible to smaller organizations without extensive infrastructure budgets, and it lets larger enterprises streamline operations, innovate faster, and improve customer experiences without being constrained by infrastructure complexity. By leveraging serverless AI, developers and organizations can focus on solving real-world problems and delivering value rather than on server management.
This convergence of serverless computing and AI not only simplifies the development lifecycle but also ensures that applications are resilient, scalable, and cost-effective. Ultimately, serverless AI empowers businesses to reimagine what’s possible, unlocking the full potential of intelligent applications while staying agile in an increasingly competitive and data-driven world.
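
The pay-as-you-go claim in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the article: the `PayPerUsePricing` class, the `monthly_cost` function, and both rates are hypothetical placeholders, and real provider prices differ and change over time. It assumes a common billing shape in which cost is the memory held multiplied by execution time, plus a flat fee per request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PayPerUsePricing:
    """Hypothetical pay-per-use rates (assumed values, not real provider prices)."""
    usd_per_gb_second: float         # price of holding 1 GB of memory for 1 second
    usd_per_million_requests: float  # flat fee per million invocations


def monthly_cost(invocations: int, avg_duration_ms: float, memory_mb: int,
                 pricing: PayPerUsePricing) -> float:
    """Estimate a monthly bill from compute time actually used plus request fees."""
    # Billed compute: invocations x seconds per invocation x GB of memory.
    gb_seconds = invocations * (avg_duration_ms / 1000.0) * (memory_mb / 1024.0)
    compute_usd = gb_seconds * pricing.usd_per_gb_second
    request_usd = invocations / 1_000_000 * pricing.usd_per_million_requests
    return compute_usd + request_usd


if __name__ == "__main__":
    # Illustrative numbers only: an inference function using 1 GB for 200 ms.
    pricing = PayPerUsePricing(usd_per_gb_second=0.00002,
                               usd_per_million_requests=0.20)
    quiet = monthly_cost(100_000, avg_duration_ms=200, memory_mb=1024, pricing=pricing)
    busy = monthly_cost(10_000_000, avg_duration_ms=200, memory_mb=1024, pricing=pricing)
    print(f"quiet month: ${quiet:.2f}, busy month: ${busy:.2f}")
```

Under these assumed rates the bill grows in direct proportion to the number of invocations and falls to zero when nothing runs. A provisioned server, by contrast, costs the same whether it is busy or idle.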
