Big Query: Everything You Need to Jump Start Your Development

July 23, 2018 by varunshrivastava

Big Query is all about running a query on big data.

The name itself tells half the story: it is about Big Data, and about running queries on that data.

For small data you do not need a massive architecture or much computing power to fetch information in a reasonable time. But when data grows beyond a certain size, traditional technologies don’t cope well. A simple select query might take hours, or even more than a day, to produce a result. That makes it impractical to extract useful information from the data.

That is where Google BigQuery comes into the picture.

Google BigQuery is an enterprise data warehouse that solves this problem by enabling super-fast SQL queries using the processing power of Google’s infrastructure.

Google Website

Table of Contents

  • BI Performance Benchmarks with Big Query From Google
    • Benchmark Data Set Used
    • Large Query Performance
    • Small Query Performance
    • Concurrent Query Performance
  • Big Query is “SERVER-LESS”
    • Advantages of Serverless Model
      • Cost
      • Scalability
      • Productivity
  • Does BigQuery Replace the current Technology Stack for Organization
    • Big Query will be used to Generate Intelligent Reports with Artificial Intelligence
  • How to Get Started with Google BigQuery
  • Conclusion

BI Performance Benchmarks with Big Query From Google

I want to talk about the performance of Google BigQuery first, because performance is what matters most when querying massive data sets.

I personally do not have data sets massive enough to benchmark BigQuery, so I’m going to share the results published by AtScale. If you want to read the detailed benchmark, follow the link below:

  • BI Performance Benchmark for Big Query

Benchmark Data Set Used

The schema and row counts are given in the image below:

BI on Big Query Benchmark Schema and Row Counts

As you can see, the size of the data in each table is huge.

The Customer table has more than 1 billion rows and the LineOrder table has more than 5.8 billion rows. With traditional technology it would take a lot of time and money to query a data set of this size in a reasonable time. BigQuery is designed for exactly this scale.

Let’s take a look at the benchmark queries:

Large Query Performance

There was not much difference between BigQuery and the other technologies on large queries; in fact, BigQuery was slower in most cases.

The results in the above chart were achieved with no additional query tuning.

Small Query Performance

For small queries, too, BigQuery did not stand apart, although it was close.

Small Query Performance

But wait for it.

Concurrent Query Performance

This is where you run multiple queries at the same time, and here BigQuery excelled and easily outperformed the other technologies.

Take a look for yourself:

Concurrent Query Performance

Google BigQuery’s “serverless” model meant that concurrent query response times remained effectively flat, even past the 25 concurrent user mark.

Now, that is impressive.

BigQuery handles concurrent queries on large data sets without slowing down.
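If you want to get a feel for what a concurrent benchmark measures, here is a minimal sketch of a harness that fires the same query from several simulated users at once and records each latency. It is not the harness AtScale used. The `run_query` function is a hypothetical stand-in that just sleeps; you would swap in a real call to your warehouse.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(sql: str) -> float:
    """Hypothetical stand-in for a real warehouse call; returns elapsed seconds."""
    started = time.perf_counter()
    time.sleep(0.01)  # placeholder for network and execution time
    return time.perf_counter() - started


def measure_concurrent(sql: str, users: int) -> list[float]:
    """Run `users` copies of the query at the same time and collect each latency."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(run_query, [sql] * users))


def average_latency(sql: str, users: int) -> float:
    """Average latency across all simulated users."""
    latencies = measure_concurrent(sql, users)
    return sum(latencies) / len(latencies)
```

A "flat" profile means `average_latency(sql, 25)` stays close to `average_latency(sql, 1)` as the number of users grows.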

Big Query is “SERVER-LESS”

BigQuery runs on a serverless execution model. In this model, the cloud provider dynamically manages the allocation of the resources required to serve the user’s query.

There is no need to pay for idle compute. With on-demand pricing you pay for the data your queries process, plus a comparatively small charge for the data you store. There is no need to pre-purchase units of storage or compute.

Serverless does not actually mean the absence of servers. Servers are still required, so the name is a misnomer in that sense.

The name is used because the server configuration is completely hidden from the developer and the application. All you should care about is sending a request and getting the required data back. The complex resource allocation is taken care of by the service itself.
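To make “sending a request” concrete, here is a minimal sketch of what a query request could look like over BigQuery’s REST interface. It only builds the request and does not send it. The project ID, access token and SQL are placeholders I made up, and the endpoint and body fields reflect my reading of the REST API’s jobs.query method, so check them against the official reference before relying on them.

```python
import json
import urllib.request


def build_query_request(project_id: str, sql: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request that submits a SQL query.

    Assumption: the URL and the `query` / `useLegacySql` fields follow the
    jobs.query method of the BigQuery REST API.
    """
    url = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/queries"
    body = {"query": sql, "useLegacySql": False}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",  # placeholder OAuth token
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Notice what is missing: there is no cluster size, node count or memory setting anywhere in the request. That is the point of the serverless model.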

Advantages of Serverless Model

The main advantages of a serverless model are Cost, Scalability and Productivity.

Cost

Because you do not have to buy any hardware up front, you are not charged for idle compute. You only pay for the amount of resources you use.

Suppose you have a massive amount of data stored in BigQuery, and you process it at the end of every month to generate reports. For the rest of the month the data simply sits in BigQuery’s data centers, and you pay only a storage charge for it, with no servers kept running in the meantime.

Beyond that, you pay for processing, based on the amount of data your queries scan.

This is a much more efficient model from the analytics point of view, where you only use the data periodically to generate reports.

Storage itself is inexpensive rather than free: a small amount each month falls under the free tier, and tables that have not been modified for 90 consecutive days are billed at a reduced long-term rate. Your data is kept secure in Google’s data centers.

Other immediate cost benefits come from the absence of operating system costs, including licences, installation, dependencies, maintenance, support, and patching.
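Since on-demand pricing is driven by how much data a query processes, you can estimate the bill with simple arithmetic. Here is a rough sketch. The price per TiB is a parameter on purpose: I am not quoting a real price, so look up the current rate for your region, and treat the free allowance argument as optional.

```python
TIB = 1024 ** 4  # number of bytes in one tebibyte


def estimate_query_cost(bytes_scanned: int, price_per_tib: float, free_bytes: int = 0) -> float:
    """Estimate the on-demand cost of a query from the bytes it scans.

    `price_per_tib` and `free_bytes` are inputs you must look up yourself;
    nothing here encodes an actual BigQuery price.
    """
    billable = max(0, bytes_scanned - free_bytes)
    return billable / TIB * price_per_tib


# Example with a made-up price of 5.0 per TiB: scanning 2 TiB costs 10.0.
print(estimate_query_cost(2 * TIB, price_per_tib=5.0))
```

The practical lesson is that selecting only the columns you need matters, because fewer bytes scanned means a smaller bill.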

Scalability

A main concern for any organization is scalability.

As the firm grows, its data grows along with it. Or you can say that as the data grows, the firm’s systems need to grow to keep up. This is one of the biggest concerns in the IT world.

With BigQuery you do not have to worry about scalability. All the configuration and resource management is taken care of by Google engineers on their premises. As Google puts it: ‘from prototype to production to planet-scale.’

Separation of Compute and State makes BQ Scalable

All you need to do is use the service to retrieve your data. Developers’ lives become easier, because they only have to care about retrieving the data and not about how it is stored, tuned, or scaled.

Productivity

Developers are the source of productivity in any organization. If developers spend their time building features rather than worrying about configuration and performance, their productivity goes up.

BigQuery works much like a function-as-a-service offering in this respect: what is exposed to the outside world is a simple interface.

Developers only need to call that interface, and all the complex parts are taken care of behind the scenes.

This greatly simplifies a software developer’s job and increases the productivity of the organization.

Does BigQuery Replace the current Technology Stack for Organization

BigQuery is a relatively new technology built to deal with massive datasets. It should be seen as an addition to the existing technology stack rather than a replacement for it.

The basic mechanisms for producing and storing data are not going to change in the near future. Data will be produced in the same way, with some minor modifications. But the analytics on these huge data sets might well be handled by BigQuery or similar technologies.

As data grows, it becomes hard for the existing, traditional technology stack to process it and extract useful information. That is where BigQuery will prevail.

Big Query will be used to Generate Intelligent Reports with Artificial Intelligence

With the rise of artificial intelligence and chatbots, the more data an organization has, the better support it can provide. And to provide better support you need hardware and software power to analyse large data sets and find the relevant information. This is where BigQuery comes into the picture.

Generate Intelligent reports

BigQuery can process billions of rows in a matter of seconds to provide the required information. That is amazing technology.

Let me explain with a concrete scenario:

Suppose your organization deals with some kind of transactions. You are a popular organization with a large user base, and you generate massive amounts of transaction data each day.

Now you go to a tech guy (me) and ask him to build a chatbot that answers your users’ most frequently asked questions.

The questions could be anything, ranging from a user’s last transaction to their monthly or yearly transactions.

The information the user is looking for could be very small, like: How many transactions took place in the last quarter?

But to answer such a simple question, the computer has to process billions of rows in the backend. And trust me, on a data set of that size a traditional MySQL database could take hours, or even a day, to provide that information.
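A question like “how many transactions took place in the last quarter?” boils down to a date range plus a COUNT. Here is a small sketch of that translation. It uses Python’s built-in sqlite3 as a stand-in so it runs anywhere; the `transactions` table and `created_on` column are hypothetical names, and a real chatbot would send an equivalent query to the warehouse instead.

```python
import sqlite3
from datetime import date


def previous_quarter(today: date) -> tuple[date, date]:
    """Return the [start, end) range of the calendar quarter before `today`'s quarter."""
    current_start_month = 3 * ((today.month - 1) // 3) + 1
    end = date(today.year, current_start_month, 1)  # exclusive upper bound
    if current_start_month == 1:
        start = date(today.year - 1, 10, 1)  # previous quarter is Q4 of last year
    else:
        start = date(today.year, current_start_month - 3, 1)
    return start, end


def count_transactions(conn: sqlite3.Connection, today: date) -> int:
    """Count rows in the hypothetical `transactions` table for the previous quarter."""
    start, end = previous_quarter(today)
    row = conn.execute(
        "SELECT COUNT(*) FROM transactions WHERE created_on >= ? AND created_on < ?",
        (start.isoformat(), end.isoformat()),
    ).fetchone()
    return row[0]
```

The query is tiny. The cost lies in the number of rows the engine has to scan to answer it, and that is the part a warehouse takes off your hands.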

An efficient solution you can turn to is Google BigQuery. It processes billions of rows with complex filter logic in less than half a minute.

So your chatbot can provide an accurate answer quickly. To improve performance further, BigQuery has a built-in mechanism that caches query results. The next time a user runs the identical query against unchanged data, the response comes back a lot faster.
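A result cache of this kind can be pictured as a lookup keyed by the query text. The sketch below is only an illustration of the idea, not how BigQuery implements it internally; the `QueryCache` class and the `execute` callback are names I made up.

```python
from typing import Callable


class QueryCache:
    """Toy result cache: returns stored rows when the exact same SQL text repeats."""

    def __init__(self, execute: Callable[[str], list]):
        self._execute = execute  # the slow path, e.g. a real warehouse call
        self._results: dict[str, list] = {}
        self.hits = 0

    def run(self, sql: str) -> list:
        if sql in self._results:
            self.hits += 1  # served from cache, no rows scanned
            return self._results[sql]
        rows = self._execute(sql)
        self._results[sql] = rows
        return rows
```

In this sketch the match is on the exact text, so even a change in letter case counts as a different query and goes down the slow path. For a chatbot that means it pays to generate the SQL for a given question in one consistent form.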

BigQuery is not here to replace your current technology stack but to complement it, by taking the heavy processing and costly hardware maintenance off your hands.

How to Get Started with Google BigQuery

You will find everything you need to get started with BigQuery for your project at the following link: Google BigQuery Quickstart.

If you want to try out BigQuery on public data sets, then you can use the BigQuery web console to run queries against them and decide for yourself whether you want it for your project or not.

Here is a small demonstration of a query on a large data set. The table information is as follows:

table contains 3.91 GB of data
Simple query to fetch the count of stories grouped by author
  • Try out BigQuery on public data sets
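The screenshots above did not survive as text, so here is a sketch of what a “count of stories grouped by author” query looks like. It is my own reconstruction of the shape of the query, not the exact one from the demonstration. The `stories` table and its columns are hypothetical, and I run it against a tiny in-memory SQLite table so you can try it without a Google account; the same SELECT should carry over to the BigQuery console with the real table name substituted.

```python
import sqlite3

# Count stories per author, most prolific authors first.
STORIES_BY_AUTHOR = """
SELECT author, COUNT(*) AS story_count
FROM stories
GROUP BY author
ORDER BY story_count DESC, author
LIMIT 10
"""


def top_authors(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Run the grouped count and return (author, story_count) rows."""
    return conn.execute(STORIES_BY_AUTHOR).fetchall()


# Tiny hypothetical data set standing in for the real multi-gigabyte table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stories (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO stories (author, title) VALUES (?, ?)",
    [("alice", "a1"), ("bob", "b1"), ("alice", "a2"),
     ("carol", "c1"), ("alice", "a3"), ("bob", "b2")],
)
print(top_authors(conn))  # [('alice', 3), ('bob', 2), ('carol', 1)]
```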

Conclusion

BigQuery provides organizations with an affordable way to manage and process their data and extract useful insights.

BigQuery has a lot of real-world applications, mostly in analytics and data processing. In the future, I can see BigQuery becoming an integral part of every organization, big or small.

BigQuery is here to stay for a long, long time.

Please share your views on BigQuery and how it can revolutionize the IT industry.

Filed Under: Technology Tagged With: Big Query, BigQuery Benchmark, Get started, google, Serverless Architecture
