sql – Why does the following join significantly increase query time?
Published: 2021-01-27 18:33:41 | Category: MsSql Tutorials | Source: compiled from the web
I have a star schema here. I am querying the fact table and want to join one very small dimension table. I cannot explain the following:

    EXPLAIN ANALYZE
    SELECT COUNT(impression_id), imp.os_id
    FROM   bi.impressions imp
    GROUP  BY imp.os_id;

                                QUERY PLAN
    ----------------------------------------------------------------------
     HashAggregate  (cost=868719.08..868719.24 rows=16 width=10) (actual time=12559.462..12559.466 rows=26 loops=1)
       ->  Seq Scan on impressions imp  (cost=0.00..690306.72 rows=35682472 width=10) (actual time=0.009..3030.093 rows=35682474 loops=1)
     Total runtime: 12559.523 ms
    (3 rows)

This takes about 12,600 ms, but of course there is no joined data, so I cannot "resolve" imp.os_id to something meaningful. So I added a join:

    EXPLAIN ANALYZE
    SELECT COUNT(impression_id), imp.os_id, os.os_desc
    FROM   bi.impressions imp, bi.os_desc os
    WHERE  imp.os_id = os.os_id
    GROUP  BY imp.os_id, os.os_desc;

                                QUERY PLAN
    ----------------------------------------------------------------------
     HashAggregate  (cost=1448560.83..1448564.99 rows=416 width=22) (actual time=25565.124..25565.127 rows=26 loops=1)
       ->  Hash Join  (cost=1.58..1180942.29 rows=35682472 width=22) (actual time=0.046..15157.684 rows=35682474 loops=1)
             Hash Cond: (imp.os_id = os.os_id)
             ->  Seq Scan on impressions imp  (cost=0.00..690306.72 rows=35682472 width=10) (actual time=0.007..3705.647 rows=35682474 loops=1)
             ->  Hash  (cost=1.26..1.26 rows=26 width=14) (actual time=0.028..0.028 rows=26 loops=1)
                   Buckets: 1024  Batches: 1  Memory Usage: 2kB
                   ->  Seq Scan on os_desc os  (cost=0.00..1.26 rows=26 width=14) (actual time=0.003..0.010 rows=26 loops=1)
     Total runtime: 25565.199 ms
    (8 rows)

This effectively doubles the execution time of my query. My question is: what am I missing from the picture? I would not have expected such a small lookup to make such a large difference in execution time.

Solution

The query, rewritten with (recommended) explicit ANSI JOIN syntax:

    SELECT COUNT(impression_id), imp.os_id, os.os_desc
    FROM   bi.impressions imp
    JOIN   bi.os_desc os ON os.os_id = imp.os_id
    GROUP  BY imp.os_id, os.os_desc;

First of all, the second query might be wrong if more or fewer than exactly one match is found in os_desc for every row in impressions: rows without a match are dropped from the count, and rows with several matches are counted several times.
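The "exactly one match" condition can be checked directly before trusting the joined counts. The two queries below are a minimal sketch, assuming os_id is the only join key between bi.impressions and bi.os_desc, as in the question; the column aliases unmatched and copies are chosen here for illustration.

```sql
-- Impressions with no matching row in os_desc.
-- Rows with a NULL os_id are included, because they can never match.
SELECT count(*) AS unmatched
FROM   bi.impressions imp
WHERE  NOT EXISTS (
   SELECT 1
   FROM   bi.os_desc os
   WHERE  os.os_id = imp.os_id
   );

-- os_id values that occur more than once in os_desc.
-- Each such value would multiply the matching impressions in the join.
SELECT os_id, count(*) AS copies
FROM   bi.os_desc
GROUP  BY os_id
HAVING count(*) > 1;
```

If the first query returns 0 and the second returns no rows, every impression matches exactly one description and the join does not change the counts.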
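If exactly one match is guaranteed for every row (and impression_id is never NULL), the query can be simplified to:

    SELECT count(*) AS ct, imp.os_id, os.os_desc
    FROM   bi.impressions imp
    JOIN   bi.os_desc os USING (os_id)
    GROUP  BY imp.os_id, os.os_desc;

count(*) is slightly faster than count(column), and the count now has a column alias.

Faster still:

    SELECT os_id, os.os_desc, sub.ct
    FROM  (
       SELECT os_id, count(*) AS ct
       FROM   bi.impressions
       GROUP  BY 1
       ) sub
    JOIN   bi.os_desc os USING (os_id);

Aggregate first, join later. In the plan above, the hash join processes all 35,682,474 fact rows before they are grouped. With the subquery, the fact table is reduced to 26 rows first, so the join only has to handle those.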
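If some os_id values in impressions might have no row in os_desc, the same aggregate-first pattern also works with a LEFT JOIN, which keeps those groups instead of silently dropping them. This variant is a sketch and not part of the original answer; the ORDER BY is added only to make the output easier to read.

```sql
-- Aggregate the fact table first, then look up descriptions.
-- LEFT JOIN keeps groups whose os_id has no row in os_desc;
-- os_desc is NULL for those groups.
SELECT sub.os_id, os.os_desc, sub.ct
FROM  (
   SELECT os_id, count(*) AS ct
   FROM   bi.impressions
   GROUP  BY os_id
   ) sub
LEFT   JOIN bi.os_desc os USING (os_id)
ORDER  BY sub.ct DESC;
```

Prefixing this statement with EXPLAIN ANALYZE, as in the question, lets you compare its runtime with the join-first version on your own data.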