Int J Performability Eng, 2021, Vol. 17, Issue (8): 703-710. doi: 10.23940/ijpe.21.08.p6.703710


A Survey on Challenges in Transforming No-SQL Data to SQL Data and Storing in Cloud Storage based on User Requirement

S.P. Shantharajah and E. Maruthavani   

  1. School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
  • Contact: * E-mail address: shantharajah.sp@vit.ac.in

Abstract: The Internet and modern computing have produced explosive growth in data volumes, along with new ways to store data. Storing petabytes of data for analysis and new insight requires decisions about how big data is stored and processed. Many organizations physically store their data on servers in their own data centers, an approach called on-premises (on-prem) storage. In the big data world, there is a growing need to derive useful insights from different kinds of data and to use that data to infer information vital to better decision making. The main challenge is to store such data, which arrives at high velocity, in a commercial data warehouse. An effective solution is to transform the unstructured data into structured data and maintain it using distributed clusters and cloud storage, according to user requirements. Structured data maintained in the cloud is cost-efficient and highly scalable. Querying these huge datasets from cloud storage requires distributed query engines such as Hive, Impala, Presto, and Drill. These open-source Structured Query Language (SQL) engines are capable of querying enormous datasets almost instantaneously. The present work emphasizes Hive and Impala, the most widely deployed of these engines, both of which can query extremely large datasets. The outcome of the research helps readers understand the challenges faced in providing and maintaining structured data in the cloud.

Key words: data warehouses, cloud storage, No-SQL, big data
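
The transformation the abstract describes, representing No-SQL documents as structured rows that a SQL engine can query, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sample documents, the `flatten` helper, the `events` table name, and the use of Python's built-in `sqlite3` module as a local stand-in for a distributed engine such as Hive or Impala. It is not the authors' method.

```python
import sqlite3


def flatten(doc, prefix=""):
    """Flatten nested dicts into one level, joining key names with '_'."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


# Hypothetical No-SQL-style documents: nested, and not all fields present.
documents = [
    {"id": 1, "user": {"name": "asha", "city": "Vellore"}, "clicks": 12},
    {"id": 2, "user": {"name": "ravi"}, "clicks": 7},
]

rows = [flatten(d) for d in documents]

# The relational schema is the union of all flattened keys.
columns = sorted({col for row in rows for col in row})

# sqlite3 stands in for a distributed SQL engine (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE events ({', '.join(columns)})")
placeholders = ", ".join("?" for _ in columns)
conn.executemany(
    f"INSERT INTO events VALUES ({placeholders})",
    # Fields missing from a document become NULL in the table.
    [tuple(row.get(c) for c in columns) for row in rows],
)

total_clicks = conn.execute("SELECT SUM(clicks) FROM events").fetchone()[0]
missing_city = conn.execute(
    "SELECT user_name FROM events WHERE user_city IS NULL"
).fetchall()
```

The sketch also hints at one of the challenges the survey title alludes to: documents with differing fields force the structured schema to carry NULLs, and a real pipeline would additionally need to handle arrays, type conflicts, and schema evolution.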