BigQuery performance: Is this correct?

Posted by 南楼画角 on 2019-12-11 04:10:55

Question


Folks, I'm using BigQuery as a superfast database for my analytics queries, but I'm very disappointed with its performance.

Let me show you the numbers:

  • Only one table in the FROM clause
  • About 15 fields selected, all of them in the GROUP BY, plus about 5 fields aggregated with SUM()
  • Total table rows: 3.7 million
  • Total rows returned: 830K
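
For readers who want to picture the query shape described above, here is a minimal sketch that builds an equivalent statement. The table name `mydataset.mytable` and the field names `dim_1`..`dim_15` and `metric_1`..`metric_5` are hypothetical placeholders, since the original post does not show the actual query.

```python
def build_aggregate_query(table, group_fields, sum_fields):
    """Build a single-table GROUP BY query with SUM() aggregates."""
    select_parts = list(group_fields) + [
        f"SUM({field}) AS total_{field}" for field in sum_fields
    ]
    return (
        f"SELECT {', '.join(select_parts)}\n"
        f"FROM {table}\n"
        f"GROUP BY {', '.join(group_fields)}"
    )


# Hypothetical names: these stand in for the real table and fields.
group_fields = [f"dim_{i}" for i in range(1, 16)]   # about 15 grouped fields
sum_fields = [f"metric_{i}" for i in range(1, 6)]   # about 5 summed fields
query = build_aggregate_query("mydataset.mytable", group_fields, sum_fields)
print(query)
```

With 15 grouping fields, the number of output groups can approach the number of input rows, which is consistent with 830K rows coming back from a 3.7 million row table.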

When I execute this query in BigQuery's console, it takes about 1 minute to process. Is this expected? I was expecting it to return in about 2 seconds. If I execute the same query on a columnar database like Sybase IQ, it takes less than 2 seconds.


Answer 1:


BigQuery is a highly scalable database first and a "super fast" database second. It's designed to process HUGE amounts of data by distributing the work across many machines, using a technology named Dremel. Because it relies on many machines and parallel processing, you should expect very high scalability with good performance, rather than the lowest possible latency.

For example: analyzing all the Wikipedia revisions in 5-10 seconds isn't bad, is it? But even a much smaller table would take about the same time.

Sybase IQ is often installed on a single server and it doesn't use Dremel. That said, it's going to be faster than BigQuery in many scenarios, as designed.
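
One way to read the point above is as a fixed per-query cost plus scan time that is divided among workers. The sketch below is a toy model under that assumption only; the function name and every number in it are made up for illustration and are not measurements of BigQuery.

```python
def estimated_latency_s(rows, workers, rows_per_s_per_worker, fixed_overhead_s):
    """Toy model: fixed job overhead plus scan time split evenly across workers."""
    scan_time_s = rows / (workers * rows_per_s_per_worker)
    return fixed_overhead_s + scan_time_s


# Made-up parameters, chosen only to show the shape of the curve.
OVERHEAD_S = 5.0
for rows in (1_000, 1_000_000, 1_000_000_000):
    t = estimated_latency_s(
        rows,
        workers=1_000,
        rows_per_s_per_worker=1_000_000,
        fixed_overhead_s=OVERHEAD_S,
    )
    print(f"{rows:>13,} rows -> {t:.3f} s")
```

In this model the fixed term dominates until the table becomes very large, which is why a small table and a huge one can finish in roughly the same time.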

Cheers!




Answer 2:


Since you are returning 830K rows and BigQuery always writes query results to a temporary result table, creating that table takes longer than it would for a small result.

Have you turned on large results?
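
As a rough sketch of what turning on large results looks like, the snippet below builds a query job request body of the kind sent to the BigQuery REST API. The field names (`allowLargeResults`, `destinationTable`, `useLegacySql`, `writeDisposition`) follow the query job configuration as I understand it, so check them against the current API reference; the project, dataset, and table names are hypothetical. Large results require a destination table.

```python
import json


def large_results_job_config(sql, project_id, dataset_id, table_id):
    """Return a query job body with large results enabled (legacy SQL)."""
    return {
        "configuration": {
            "query": {
                "query": sql,
                "useLegacySql": True,
                # Large results must be written to an explicit destination table.
                "allowLargeResults": True,
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                # Overwrite the destination table if it already exists.
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }


# Hypothetical identifiers; "SELECT ..." stands in for the real query text.
body = large_results_job_config(
    "SELECT ...", "my-project", "mydataset", "query_results"
)
print(json.dumps(body, indent=2))
```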

We are working in a shared environment, and sometimes loads (table creation) take a while. Performance certainly differs from a dedicated environment. You can get a dedicated environment for $20K a month.



Source: https://stackoverflow.com/questions/24455494/bigquery-performance-is-this-correct
