当前位置 博文首页 > Enmotech的博客:SQL优化:紧急情况下提高SQL性能竟是这样实现的
关注我们获得更多精彩
作者 | 黄堋 ,多年一线 Oracle DBA 经验,长期服务电信、电网、医院、政府等行业客户。擅长数据库优化、数据库迁移升级、数据库故障处理。
?在某运营商的优化经历中曾经遇到了一条比较有意思的 SQL,具体如下:
SQL> SELECT
??2 ???NVL(T.RELA_OFFER_SPEC_ID, SUBOS.SUB_OFFER_SPEC_ID) "offerSpecId"
??3 ???FROM OFFER_SPEC_RELA T
??4 ???LEFT JOIN OFFER_SPEC_GRP_RELA SUBOS
??5 ???ON T.RELA_GRP_ID ????= SUBOS.OFFER_SPEC_GRP_ID
??6 ???AND subos.start_dt ?<= SYSDATE
??7 ???AND subos.end_dt ???>= SYSDATE
??8 ???WHERE T.RELA_TYPE_CD = 2
??9 ???AND t.start_dt ?????<= SYSDATE
?10 ???AND t.end_dt ???????>= SYSDATE
?11 ???AND (T.OFFER_SPEC_ID = 109910000618
?12 ???OR EXISTS
?13 ?????(SELECT A.OFFER_SPEC_GRP_ID
?14 ?????FROM OFFER_SPEC_GRP_RELA A
?15 ?????WHERE A.SUB_OFFER_SPEC_ID = 109910000618
?16 ?????AND T.OFFER_SPEC_GRP_ID ??= A.OFFER_SPEC_GRP_ID
?17 ?????))
?18 ???AND rownum<500;
no rows selected
?
Execution Plan
----------------------------------------------------------
Plan hash value: 1350156609
Predicate Information (identified by operation id):
---------------------------------------------------
???1 - filter(ROWNUM<500)
???2 - filter("T"."OFFER_SPEC_ID"=109910000618 OR ?EXISTS (SELECT 0 FROM
??????????????"SPEC"."OFFER_SPEC_GRP_RELA" "A" WHERE "A"."OFFER_SPEC_GRP_ID"=:B1 AND
??????????????"A"."SUB_OFFER_SPEC_ID"=109910000618))
???3 - access("T"."RELA_GRP_ID"="SUBOS"."OFFER_SPEC_GRP_ID"(+))
???4 - filter("T"."RELA_TYPE_CD"=2 AND "T"."END_DT">=SYSDATE@! AND
??????????????"T"."START_DT"<=SYSDATE@!)
???5 - filter("SUBOS"."END_DT"(+)>=SYSDATE@! AND "SUBOS"."START_DT"(+)<=SYSDATE@!)
???6 - access("A"."SUB_OFFER_SPEC_ID"=109910000618 AND "A"."OFFER_SPEC_GRP_ID"=:B1)
?
Statistics
----------------------------------------------------------
??????????0 ?recursive calls
??????????0 ?db block gets
??????12444 ?consistent gets
??????????0 ?physical reads
??????????0 ?redo size
????????339 ?bytes sent via SQL*Net to client
????????509 ?bytes received via SQL*Net from client
??????????1 ?SQL*Net roundtrips to/from client
??????????0 ?sorts (memory)
??????????0 ?sorts (disk)
??????????0 ?rows processed
?
??????????????????PLAN ????????????????????GET ????DISK ???WRITE ?????????????ROWS ?????ROWS USER_IO(MS) ?ELA(MS) ?CPU(MS) CLUSTER(MS) ???PLSQL
END_TI I ???HASH VALUE EXEC ??????????PRE EXEC PRE EXEC PER EXEC ROW_P ???PRE EXEC PRE FETCH ???PER EXEC PRE EXEC PRE EXEC ???PER EXEC PER EXEC
Duang~
此处插播个小福利
本周四有一场你不得不看的讲座:由张甦老师开讲,带你走进不一样的 MySQL,为你讲解传统企业真正的痛点,帮你成为一名薪资在15k-20k的MySQL DBA工作者。
心动了没,扫描上方二维码把握薪资翻倍的机会,周四 20:00——21:00 不见不散!
此时应该有以下个地方值得注意
1) 该 sql 每天执行上千次,平均每次执行返回不到 10 行数据,但是平均逻辑读达到1.2W,可能存在性能问题。
2)ID 为 4,5 的执行计划路径中出现了两个全表扫描,看到这儿我们可以想到可能是没有合适的索引导致走了全表扫描从而执行效率低下。
3)ID 为 2 的执行计划路径出现了 FILTER,且? 3,和 6 为其子路径,如果FILTER有两个及两个以上的子路径,那么他的执行原理将类似于嵌套循环,id 号最小的子路径如果返回行数较多,可能会导致多次执行id号更小的子路径,导致性能低下。一般存在 “OR EXISTS” 的时候会出现此情况,可以根据情况避免。
4)存在条件“ rownum<500 ”,但是从历史的执行情况来看,返回行数都远小于 500 行,此处我们先予以忽略。
?
1)进过探查,发现存在两个表都有可用的索引,且两个表都只有几十 M 的大小。
2)去掉“OR EXISTS”子句查看执行效率
此处可用看到,去掉 “OR EXISTS” 之后两个表走了合适的索引,并且执行效率极高。
SQL> SELECT
??2 ???NVL(T.RELA_OFFER_SPEC_ID, SUBOS.SUB_OFFER_SPEC_ID) "offerSpecId"
??3 ???FROM OFFER_SPEC_RELA T
??4 ???LEFT JOIN OFFER_SPEC_GRP_RELA SUBOS
??5 ???ON T.RELA_GRP_ID ????= SUBOS.OFFER_SPEC_GRP_ID
??6 ???AND subos.start_dt ?<= SYSDATE
??7 ???AND subos.end_dt ???>= SYSDATE
??8 ???WHERE T.RELA_TYPE_CD = 2
??9 ???AND t.start_dt ?????<= SYSDATE
?10 ???AND t.end_dt ???????>= SYSDATE
?11 ???AND T.OFFER_SPEC_ID = 109910000618;
Elapsed: 00:00:00.00
?
Execution Plan
----------------------------------------------------------
Plan hash value: 510876366
Predicate Information (identified by operation id):
---------------------------------------------------
???2 - filter("T"."RELA_TYPE_CD"=2 AND "T"."END_DT">=SYSDATE@!?
AND "T"."START_DT"<=SYSDATE@!)
???3 - access("T"."OFFER_SPEC_ID"=109910000618)
???4 - filter("SUBOS"."END_DT"(+)>=SYSDATE@! AND "SUBOS"."START_DT"(+)<=SYSDATE@!)
???5 - access("T"."RELA_GRP_ID"="SUBOS"."OFFER_SPEC_GRP_ID"(+))
?
Statistics
----------------------------------------------------------
??????????0 ?recursive calls
??????????0 ?db block gets
??????????4 ?consistent gets
??????????0 ?physical reads
??????????0 ?redo size
????????339 ?bytes sent via SQL*Net to client
????????510 ?bytes received via SQL*Net from client
??????????1 ?SQL*Net roundtrips to/from client
??????????0 ?sorts (memory)
??????????0 ?sorts (disk)
??????????0 ?rows processed
3)去掉 “OR EXISTS” 中的子句查看执行效率。?
此处可用看到 “ OR EXISTS ” 中的子句单独执行返回行数并不多,且效率依旧很快。
SQL> SELECT A.OFFER_SPEC_GRP_ID
??2 ???FROM OFFER_SPEC_GRP_RELA A
??3 ???WHERE A.SUB_OFFER_SPEC_ID = 109910000618;
?
OFFER_SPEC_GRP_ID
-----------------
????????100000048
????????109090086
Elapsed: 00:00:00.01
?
Execution Plan
----------------------------------------------------------
Plan hash value: 4223340843
Predicate Information (identified by operation id):
---------------------------------------------------
???1 - access("A"."SUB_OFFER_SPEC_ID"=109910000618)
?
Statistics
----------------------------------------------------------
??????????1 ?recursive calls
??????????0 ?db block gets
??????????4 ?consistent gets
??????????0 ?physical reads
??????????0 ?redo size
????????597 ?bytes sent via SQL*Net to client
????????521 ?bytes received via SQL*Net from client
??????????2 ?SQL*Net roundtrips to/from client
??????????0 ?sorts (memory)
??????????0 ?sorts (disk)
??????????2 ?rows processed
4)我们把该条 sql 语句分为 “OR EXISTS” 的子句和其他部分两块,到此我们可以看到,两块的执行效率都很高,但是合在一起就低了很多。在这种情况下,几乎可以确认,将该存在 “OR EXISTS” 的子句改写为 union 必将提升效率。
SQL> SELECT *
??2 ?FROM
??3 ???(SELECT NVL(T.RELA_OFFER_SPEC_ID, SUBOS.SUB_OFFER_SPEC_ID) "offerSpecId"
??4 ???FROM OFFER_SPEC_RELA T
??5 ???LEFT JOIN OFFER_SPEC_GRP_RELA SUBOS
??6 ???ON T.RELA_GRP_ID ????= SUBOS.OFFER_SPEC_GRP_ID
??7 ???AND subos.start_dt ?<= SYSDATE
??8 ???AND subos.end_dt ???>= SYSDATE
??9 ???WHERE T.RELA_TYPE_CD = 2
?10 ???AND t.start_dt ?????<= SYSDATE
?11 ???AND t.end_dt ???????>= SYSDATE
?12 ???AND T.OFFER_SPEC_ID ?= 109910000618
?13 ???UNION
?14 ???SELECT NVL(T.RELA_OFFER_SPEC_ID, SUBOS.SUB_OFFER_SPEC_ID) "offerSpecId"
?15 ???FROM OFFER_SPEC_RELA T
?16 ???LEFT JOIN OFFER_SPEC_GRP_RELA SUBOS
?17 ???ON T.RELA_GRP_ID ????= SUBOS.OFFER_SPEC_GRP_ID
?18 ???AND subos.start_dt ?<= SYSDATE
?19 ???AND subos.end_dt ???>= SYSDATE
?20 ???WHERE T.RELA_TYPE_CD = 2
?21 ???AND t.start_dt ?????<= SYSDATE
?22 ???AND t.end_dt ???????>= SYSDATE
?23 ???AND EXISTS
?24 ?????(SELECT A.OFFER_SPEC_GRP_ID
?25 ?????FROM OFFER_SPEC_GRP_RELA A
?26 ?????WHERE A.SUB_OFFER_SPEC_ID = 109910000618
?27 ?????AND T.OFFER_SPEC_GRP_ID ??= A.OFFER_SPEC_GRP_ID
?28 ?????)
?29 ???)
?30 ?WHERE rownum<500;
Elapsed: 00:00:00.01
?
Execution Plan
----------------------------------------------------------
Plan hash value: 3072450155
?Predicate Information (identified by operation id):
---------------------------------------------------
???1 - filter(ROWNUM<500)
???3 - filter(ROWNUM<500)
???6 - filter("T"."RELA_TYPE_CD"=2 AND "T"."END_DT">=SYSDATE@! AND "T"."START_DT"<=SYSDATE@!)
???7 - access("T"."OFFER_SPEC_ID"=109910000618)
???8 - filter("SUBOS"."END_DT"(+)>=SYSDATE@! AND "SUBOS"."START_DT"(+)<=SYSDATE@!)
???9 - access("T"."RELA_GRP_ID"="SUBOS"."OFFER_SPEC_GRP_ID"(+))
??13 - access("A"."SUB_OFFER_SPEC_ID"=109910000618)
??14 - filter("T"."RELA_TYPE_CD"=2 AND "T"."END_DT">=SYSDATE@! AND "T"."START_DT"<=SYSDATE@!)
??15 - access("T"."OFFER_SPEC_GRP_ID"="A"."OFFER_SPEC_GRP_ID")
???????filter("T"."OFFER_SPEC_GRP_ID" IS NOT NULL)
??16 - filter("SUBOS"."END_DT"(+)>=SYSDATE@! AND "SUBOS"."START_DT"(+)<=SYSDATE@!)
??17 - access("T"."RELA_GRP_ID"="SUBOS"."OFFER_SPEC_GRP_ID"(+))
Statistics
----------------------------------------------------------
??????????0 ?recursive calls
??????????0 ?db block gets
?????????11 ?consistent gets
??????????0 ?physical reads
??????????0 ?redo size
????????339 ?bytes sent via SQL*Net to client
????????510 ?bytes received via SQL*Net from client
??????????1 ?SQL*Net roundtrips to/from client
??????????2 ?sorts (memory)
??????????0 ?sorts (disk)
??????????0 ?rows processed
?
此处我们可以看到,改写之后逻辑读仅仅 11,较优化前提升了上千倍。到了此处,我们已经将 sql 优化到几乎最快的效率了。
?
??1)我们再来回顾一下最开始的执行计划路径。
我们可以看到 “ OR EXISTS ” 中的子句是在 ID 为 6 的路径才开始执行的,这儿有一个知识点即为一个 sql 中的子句,一般情况下默认会将其放到最后执行。
?
??2)ID 为 4 , 5 的执行计划路径中在有高效索引的情况下却出现了两个全表扫描,可以推断 CBO 可能没有正常评估执行的 cost。
?
??3)“OR EXISTS” 中的子句执行效率很快,返回行数并不多,我们可以考虑提升 CBO 将其提前执行,看能否影响 CBO 选择出更高效的执行计划。
??SQL> SELECT NVL(T.RELA_OFFER_SPEC_ID, SUBOS.SUB_OFFER_SPEC_ID) "offerSpecId"
??2 ?FROM OFFER_SPEC_RELA T
??3 ?LEFT JOIN OFFER_SPEC_GRP_RELA SUBOS
??4 ?ON T.RELA_GRP_ID ????= SUBOS.OFFER_SPEC_GRP_ID
??5 ?AND subos.start_dt ?<= SYSDATE
??6 ?AND subos.end_dt ???>= SYSDATE
??7 ?WHERE T.RELA_TYPE_CD = 2
??8 ?AND t.start_dt ?????<= SYSDATE
??9 ?AND t.end_dt ???????>= SYSDATE
?10 ?AND (T.OFFER_SPEC_ID = 109910000618
?11 ?OR EXISTS
?12 ???(SELECT /*+ push_subq */ ?A.OFFER_SPEC_GRP_ID
?13 ???FROM OFFER_SPEC_GRP_RELA A
?14 ???WHERE A.SUB_OFFER_SPEC_ID = 109910000618
?15 ???AND T.OFFER_SPEC_GRP_ID ??= A.OFFER_SPEC_GRP_ID
?16 ???))
?17 ?AND rownum<500;
Elapsed: 00:00:00.03
?
Execution Plan
----------------------------------------------------------