暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

个帐平台数据库一条复杂sql的分析和优化

bestpaydata 2021-04-18
1052

案例分析总结:

遇到复杂sql执行缓慢的问题,在确认数据库层面无异常后,一般处理逻辑是对执行计划进行分析看是否有调优的空间。首先通过简化文本来分析sql的表连接关系和业务逻辑等,在充分了解sql的前提下再对执行计划进行分析并找到问题点,最后对症下药,通过语句等价改写、语句中加入hint、绑定执行计划、统计信息收集等等方式达到优化执行计划的目的


案例分析过程:

本次分享的是个帐平台数据库生产环境中一句复杂sql的分析和优化, sql的文本如下,看到这样的sql先不能头晕眼花哦,要通过一些方法进行简化然后着手分析:


select a.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG,

SUM(a.SUPPLY_FEE_AMT)

from (selecta.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS as AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG,

b.SUPPLY_FEE_AMT

from(select msg_type,

PROCESSING_CODE,

POS_COND_CODE,

PAN,

PAN_LEN,

AMT_TRANS,

RETRIVL_REF,

BAT_NO,

CUP_SSN,

DATE_SETTLMT,

CARD_ACCP_TERM_ID,

CARD_ACCP_ID,

TIME_LOCAL_TRANS,

DATE_LOCAL_TRANS,

REVSAL_FLAG,

CANCEL_FLAG,

txn_num,

resp_code,

trans_state

from xx_posp_txn_his a

where a.trans_state = '1'

and a.DATE_SETTLMT = '20151118'

and a.REVSAL_FLAG = '0'

and a.CANCEL_FLAG = '0'

and a.processing_code in ('000000', '990000')

and a.resp_code = '00'

and a.txn_num in ('1105', '1125')

and a.CARD_ACCP_ID in

(select m_org_code

from xx_UMS_MERCHANT_REL_INFO

whereF_UMS_CODE = '100600'

andm_org_code <> '111999949000000'

andm_org_code not in

(select mer_code

from T_INFO_UNIONPAY_MERCHANT

where agent_code = '100600'))) a

join(select SUPPLY_FEE_AMT, TERMINAL_COMM_SEQ_NO

from xx_log_offline_payment_his

wheresupply_org_code in

(select m_org_code

fromT_UMS_MERCHANT_REL_INFO

whereF_UMS_CODE = '100600'

andm_org_code <> '111999949000000'

and m_org_code not in

(select mer_code

from xx_INFO_UNIONPAY_MERCHANT

where agent_code = '100600'))

and settle_date = '20151118'

union all

select SUPPLY_FEE_AMT, TERMINAL_COMM_SEQ_NO

from xx_log_online_payment_his

where supply_org_code in

(select m_org_code

fromT_UMS_MERCHANT_REL_INFO

where F_UMS_CODE ='100600'

andm_org_code <> '111999949000000'

andm_org_code not in

(select mer_code

from XX_INFO_UNIONPAY_MERCHANT

where agent_code = '100600'))

and settle_date = '20151118') b

ona.RETRIVL_REF = b.TERMINAL_COMM_SEQ_NO) a

group bya.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG

这句sql语句美化后也有120行样子了,算是比较繁杂的语句了,这该怎么入手分析呢,我一般先把无关紧要的那些处于select和from之间的输出列去掉,然后将不涉及表关联的谓词和非索引列谓词去掉,这里还可以把三处标红的相同子查询简化掉,另外,原语句中表的别名重复使用了,这会使得sql可读性变差,也顺便修改一下,经过简化后本文就变得很简单了:

select cc.XXX,sum(cc.SUPPLY_FEE_AMT)

from (selectaa.XXX,bb.XXX

from

(selecta.XXX

from t_posp_txn_his a

where a.DATE_SETTLMT = '20151118'

anda.CARD_ACCP_ID in 子查询) aa

join (select XXX

from XX_log_offline_payment_his

where supply_org_code in 子查询

and settle_date = '20151118'

union all

select XXX

from xx_log_online_payment_his

wheresupply_org_code in 子查询

and settle_date ='20151118') bb

onaa.RETRIVL_REF = bb.TERMINAL_COMM_SEQ_NO) cc

group by cc.XXXX……;

简化后,很容易就能看懂这句sql的意思,我用不同颜色将主体区分成三部分,后面对执行计划的分析中也用相同的颜色对应。

三部分都涉及到的相同子查询,单独运行一下发现数据量比较少,只有671条数据:

selectm_org_code

from XX_UMS_MERCHANT_REL_INFO

where F_UMS_CODE ='100600'

and m_org_code<> '111999949000000'

and m_org_code not in

(select mer_code

from XX_INFO_UNIONPAY_MERCHANT

whereagent_code = '100600')

在数据库中运行的这条sql是有性能问题的,平均执行一次需要748S:



我们来看看执行计划:

看下蓝色的部分,XX_LOG_ONLINE_PAYMENT_HIS表是走了SETTLEDATE列的索引

这部分过滤出来3378128多条数据然后和子查询做hash连接显然这个不好

select count(*) from xx_log_online_payment_his where settle_date = '20151118'

--3378128

前面说到那个子查询结果只有671条数据所以用子查询做驱动表supply_org_code列上走嵌套连接会比较好XX_LOG_ONLINE_PAYMENT_HIS表的supply_org_code列有索引的

再看绿色部分先通过XX_LOG_OFFLINE_PAYMENT_HIS表的SETTLEDATE列索引过滤了数据基本没有数据)然后和XX_UMS_MERCHANT_REL_INFO 做嵌套最后和T_INFO_UNIONPAY_MERCHANTHASH

最后看下紫色部分XX_posp_txn_his表是走了DATE_SETTLMT列的索引来过滤数据过滤效率一般,CARD_ACCP_ID列上没有索引

select count(*) from XX_posp_txn_his where DATE_SETTLMT ='20151118'--256582

看来问题都有子查询展开后的表连接顺序有关

那我们可以考虑,将三个相同的子查询独立成一个临时表,一来简化了sql文本,二来结构更清晰,三来将子查询作为一个整体视图使得连接关系更清楚

改写后语句如下:

withtmp_m_org_code as (

select m_org_code

from XX_UMS_MERCHANT_REL_INFO

where F_UMS_CODE = '100600'

and m_org_code<> '111999949000000'

and m_org_code not in

(select mer_code

from XX_INFO_UNIONPAY_MERCHANT

whereagent_code = '100600'))

select a.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG,

SUM(a.SUPPLY_FEE_AMT)

from (selecta.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS as AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG,

b.SUPPLY_FEE_AMT

from(select msg_type,

PROCESSING_CODE,

POS_COND_CODE,

PAN,

PAN_LEN,

AMT_TRANS,

RETRIVL_REF,

BAT_NO,

CUP_SSN,

DATE_SETTLMT,

CARD_ACCP_TERM_ID,

CARD_ACCP_ID,

TIME_LOCAL_TRANS,

DATE_LOCAL_TRANS,

REVSAL_FLAG,

CANCEL_FLAG,

txn_num,

resp_code,

trans_state

from xx_posp_txn_his a,tmp_m_org_code mm

where a.trans_state = '1'

and a.DATE_SETTLMT = '20151118'

and a.REVSAL_FLAG = '0'

and a.CANCEL_FLAG = '0'

and a.processing_code in ('000000', '990000')

and a.resp_code = '00'

and a.txn_num in ('1105', '1125')

and a.CARD_ACCP_ID=mm.m_org_code) a

join(select off.SUPPLY_FEE_AMT, off.TERMINAL_COMM_SEQ_NO

from xx_log_offline_payment_his off,tmp_m_org_codemm

where off.supply_org_code =mm.m_org_code

and off.settle_date = '20151118'

union all

select onl.SUPPLY_FEE_AMT, onl.TERMINAL_COMM_SEQ_NO

from xx_log_online_payment_his onl,tmp_m_org_codemm

where onl.supply_org_code=mm.m_org_code

and onl.settle_date = '20151118')b

ona.RETRIVL_REF = b.TERMINAL_COMM_SEQ_NO) a

group bya.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG;

来看一下新的执行计划,执行计划更加清晰了,表的访问也更合理了,20s内可以出结果

2jhu1wkw42gjd


运行后发现执行计划会突变成走XX_LOG_OFFLINE_PAYMENT_HIS表的SETTLEDATE列索引



我们发现,走SETTLEDATE列上的索引性能就会恶化变,注意这里显示和cardinality feedback有关看来这里这个特性帮了倒忙


接下来,我们可以绑定执行计划或者是使用hint的方法来强制走plan_hash_value3685817487的执行计划当然建议绑定计划,方便省事,如果要使用hint的话,那么又要改动一下语句了,我尝试用hint关闭cardinality feedback特性后运行正常,

hint关闭CF特性的语句如下:


withtmp_m_org_code as (

selectm_org_code

from XX_UMS_MERCHANT_REL_INFO

where F_UMS_CODE ='100600'

and m_org_code<> '111999949000000'

and m_org_code not in

(select mer_code

from XX_INFO_UNIONPAY_MERCHANT

whereagent_code = '100600'))

select a.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG,

SUM(a.SUPPLY_FEE_AMT)

from (selecta.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS as AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG,

b.SUPPLY_FEE_AMT

from(select msg_type,

PROCESSING_CODE,

POS_COND_CODE,

PAN,

PAN_LEN,

AMT_TRANS,

RETRIVL_REF,

BAT_NO,

CUP_SSN,

DATE_SETTLMT,

CARD_ACCP_TERM_ID,

CARD_ACCP_ID,

TIME_LOCAL_TRANS,

DATE_LOCAL_TRANS,

REVSAL_FLAG,

CANCEL_FLAG,

txn_num,

resp_code,

trans_state

from x_posp_txn_his a,tmp_m_org_code mm

where a.trans_state = '1'

and a.DATE_SETTLMT = '20151118'

and a.REVSAL_FLAG = '0'

and a.CANCEL_FLAG = '0'

and a.processing_code in ('000000', '990000')

and a.resp_code = '00'

and a.txn_num in ('1105', '1125')

and a.CARD_ACCP_ID=mm.m_org_code) a

join(select off.SUPPLY_FEE_AMT, off.TERMINAL_COMM_SEQ_NO

from XX_log_offline_payment_his off,tmp_m_org_codemm

where off.supply_org_code =mm.m_org_code

and off.settle_date = '20151118'

union all

select *+opt_param('_optimizer_use_feedback' 'false')*/ onl.SUPPLY_FEE_AMT, onl.TERMINAL_COMM_SEQ_NO

from x_log_online_payment_his onl,tmp_m_org_codemm

where onl.supply_org_code=mm.m_org_code

and onl.settle_date = '20151118') b

ona.RETRIVL_REF = b.TERMINAL_COMM_SEQ_NO) a

group bya.msg_type,

a.PROCESSING_CODE,

a.POS_COND_CODE,

a.PAN,

a.PAN_LEN,

a.AMT_TRANS,

a.RETRIVL_REF,

a.BAT_NO,

a.CUP_SSN,

a.DATE_SETTLMT,

a.CARD_ACCP_TERM_ID,

a.CARD_ACCP_ID,

a.TIME_LOCAL_TRANS,

a.DATE_LOCAL_TRANS,

a.REVSAL_FLAG,

a.CANCEL_FLAG;


文章转载自bestpaydata,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论