sql-execution-plan

Hive explain plan understanding

时间秒杀一切 posted on 2020-06-21 10:30:09
Question: Is there a proper resource that explains completely how to read the explain plan generated by Hive? I searched the wiki but could not find a complete guide. Here is the wiki page that briefly explains how the explain plan works, but I need more information on how to interpret it. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain Answer 1: I will try to explain a little of what I know. The execution plan is a description of the tasks

Why does MySQL not always use index for select query?

旧城冷巷雨未停 posted on 2020-06-01 06:23:00
Question: I have two tables in my database, users and articles. The records in the users and articles tables are given below:

+----+-------+
| id | name  |
+----+-------+
|  1 | user1 |
|  2 | user2 |
|  3 | user3 |
+----+-------+

+----+---------+----------+
| id | user_id | article  |
+----+---------+----------+
|  1 |       1 | article1 |
|  2 |       1 | article2 |
|  3 |       1 | article3 |
|  4 |       2 | article4 |
|  5 |       2 | article5 |
|  6 |       3 | article6 |
+----+---------+----------+

Given below are the queries and the respective

How to force evaluation of subquery before joining / pushing down to foreign server

断了今生、忘了曾经 posted on 2020-05-15 09:30:29
Question: Suppose I want to query a big table with a few WHERE filters. I am using Postgres 11 and a foreign table; the foreign data wrapper (FDW) is clickhouse_fdw, but I am also interested in a general solution. I can do so as follows: SELECT id,c1,c2,c3 from big_table where id=3 and c1=2 My FDW is able to do the filtering on the remote foreign data source, ensuring that the above query is quick and doesn't pull down too much data. The above works the same if I write: SELECT id,c1,c2,c3 from big_table

PostgreSQL Nested Loops - When does the planner decide to use a Nested Loop when doing an INNER JOIN?

一个人想着一个人 posted on 2020-01-30 08:14:23
Question: I am running a query with an INNER JOIN where the planner decides to use a Nested Loop. I've figured out that it has to do with the WHERE conditions: I rewrote the query with different WHERE conditions so that it returns the same result but does not use a Nested Loop. My question is, why does the planner make different decisions when the queries appear to be equivalent, as they both return the same result? The query runs in 77 seconds with the Nested Loop and in 13 seconds without,

Parameterized SQL - in / not in with fixed numbers of parameters, for query plan cache optimization?

血红的双手。 posted on 2020-01-16 14:30:44
Question: If SQL is written directly or generated by NHibernate, with possibly large "where in / not in ([1 to 100 parameters])" conditions, does it make sense to pad the parameter list up to certain fixed sizes, so that only a limited number of query plans is created? The parameters are int/number, and the DBMS is MSSQL or Oracle. The queries are called via sp_executesql / EXECUTE IMMEDIATE to enforce query plan caching. Normally, such a query would have up to 100 query plans for the same query. Several such queries might quickly fill up the cache,
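The padding idea raised in this question can be pictured with a minimal Python sketch. The bucket sizes, the helper names `pad_in_params` and `build_in_query`, and the `?` placeholder style are assumptions made for illustration; the question names none of them, and whether padding actually helps depends on the DBMS and workload.

```python
from typing import List, Sequence, Tuple

# Hypothetical bucket sizes; the question only mentions "1 to 100 parameters".
BUCKETS: Tuple[int, ...] = (1, 5, 10, 25, 50, 100)


def pad_in_params(values: Sequence[int], buckets: Sequence[int] = BUCKETS) -> List[int]:
    """Pad values up to the next bucket size by repeating the last value."""
    if not values:
        raise ValueError("IN list must contain at least one value")
    for size in buckets:
        if len(values) <= size:
            # Duplicates inside IN / NOT IN do not change the result set.
            return list(values) + [values[-1]] * (size - len(values))
    raise ValueError("too many values for the largest bucket")


def build_in_query(table: str, column: str, values: Sequence[int]) -> Tuple[str, List[int]]:
    """Return (sql_text, params); table and column must be trusted identifiers."""
    params = pad_in_params(values)
    placeholders = ", ".join("?" for _ in params)
    return f"SELECT * FROM {table} WHERE {column} IN ({placeholders})", params


sql, params = build_in_query("articles", "user_id", [1, 2, 3])
print(sql)     # statement text with 5 placeholders
print(params)  # three real values plus two repeats of the last one
```

With these buckets, lists of 1 to 100 values map to only six distinct statement texts, which is the effect the question asks about. Repeating an existing value is used for padding because it leaves the semantics of both `IN` and `NOT IN` unchanged.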

Table Valued Parameters with Estimated Number of Rows 1

依然范特西╮ posted on 2020-01-15 05:35:25
Question: I have been searching the internet for hours trying to figure out how to improve the performance of my query that uses table-valued parameters (TVP). I finally determined what I believe is the root of the problem. Upon examining the estimated execution plan of my query, I discovered that the estimated number of rows is 1 any time I use a TVP. If I exchange the TVP for a query that selects the data I am interested in, then the estimated number of rows is much

Why would the exact same SQL query result in a different execution plan when executed via the sp_executesql procedure?

好久不见. posted on 2020-01-14 08:41:28
Question: As the title states, I don't understand why sp_executesql would generate a completely different execution plan than running the query directly from SQL Server Management Studio. The query in question takes 3 seconds when run directly in Management Studio, whereas the same query run in Management Studio via sp_executesql takes 5 minutes. I've updated statistics and reviewed indexes, but the fact remained in my head that the execution plan from sp_executesql was FAR worse than running the SQL directly

How reliable is the cost measurement in PostgreSQL Explain Plan?

被刻印的时光 ゝ posted on 2020-01-13 10:13:33
Question: The queries are performed on a large table with 11 million rows. I have already run ANALYZE on the table prior to executing the queries.

Query 1:

    SELECT * FROM accounts t1
    LEFT OUTER JOIN accounts t2
      ON (t1.account_no = t2.account_no AND t1.effective_date < t2.effective_date)
    WHERE t2.account_no IS NULL;

Explain Analyze:

    Hash Anti Join (cost=480795.57..1201111.40 rows=7369854 width=292) (actual time=29619.499..115662.111 rows=1977871 loops=1)
      Hash Cond: ((t1.account_no)::text = (t2
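To make clear what Query 1 computes, here is a small sketch that runs the same LEFT OUTER JOIN ... IS NULL pattern on toy data. It uses Python's built-in `sqlite3` with an in-memory database as a stand-in, not PostgreSQL, so it says nothing about costs or plan choice; the sample rows and the `balance` column are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema: only account_no and effective_date appear in the question;
# balance is a hypothetical extra column.
conn.execute(
    "CREATE TABLE accounts (account_no TEXT, effective_date TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        ("A1", "2019-01-01", 10),
        ("A1", "2019-06-01", 20),
        ("A2", "2019-03-01", 5),
    ],
)

# Anti-join: keep a row only if no row with the same account_no
# has a later effective_date.
rows = conn.execute(
    """
    SELECT t1.account_no, t1.effective_date, t1.balance
    FROM accounts t1
    LEFT OUTER JOIN accounts t2
      ON t1.account_no = t2.account_no
     AND t1.effective_date < t2.effective_date
    WHERE t2.account_no IS NULL
    ORDER BY t1.account_no
    """
).fetchall()
print(rows)
```

The result contains one row per account, the one with the latest `effective_date`, which matches the `Hash Anti Join` node that PostgreSQL reports for this query shape.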