where条件中使用case when是什么鬼？

小灯数据 2021-06-08

20755

在我们在优化sql的过程中遇到开发同学写的sql把case when放在where条件里面，一般我们写sql时都会把case when放在select后面做一些判断取值、行列转换等，这个很好理解，但把case when放在where条件里就比较见了。

为模拟业务场景，我们先构造测试表：



#建测试表
drop table if exists t;
CREATE TABLE t (
                id int NOT NULL AUTO_INCREMENT PRIMARY KEY comment '自增主键', 
                dept tinyint not null comment '部门id',
                age tinyint not null comment '年龄',
                name varchar(30) comment '用户名称',
                create_time datetime not null comment '注册时间', 
                last_login_time datetime comment '最后登录时间'
               ) comment '测试表';
 
INSERT INTO t VALUES(1,1,25,'user_1','2018-01-01 00:00:00','2018-03-01 12:00:00');
INSERT INTO t VALUES(2,2,45,'user_2','2018-01-01 00:02:40','2018-01-04 02:37:46');
INSERT INTO t VALUES(3,2,26,'user_3','2018-01-01 00:02:09','2018-01-06 07:34:46');
INSERT INTO t VALUES(4,2,22,'user_4','2018-01-01 20:05:56','2018-01-01 20:48:21');
INSERT INTO t VALUES(5,2,23,'user_5','2018-01-01 00:05:55','2018-01-05 18:19:01');
INSERT INTO t VALUES(6,5,46,'user_6','2018-01-01 12:11:58','2018-01-09 23:31:44');
INSERT INTO t VALUES(7,5,31,'user_7','2018-01-01 00:11:15','2018-01-11 03:46:36');
INSERT INTO t VALUES(8,5,25,'user_8','2018-01-01 00:11:56','2018-01-12 01:16:02');
INSERT INTO t VALUES(9,5,49,'user_9','2018-01-01 00:03:09','2018-01-05 21:09:50');
INSERT INTO t VALUES(10,6,24,'user_10','2018-01-01 00:14:20','2018-01-03 03:30:44');
复制

表初始化后是长这样子的：

我们先看下开发同学写的sql：

select * from t 
where case when dept=2 
                and create_time>='2018-01-01 00:00:00' 
                and create_time<'2018-01-01 03:00:00' 
             then age<25 
           when dept=5 
                and create_time>='2018-01-01 00:00:00' 
                and create_time<'2018-01-01 03:00:00' 
             then age>45  
           else 1
      end;
复制

其执行结果为：

这个sql理解起来比较羞涩，甚致在Oracle和SQL Server里是不支持这样的这写法，我们就这个测试表分析下其执行过程。

因为这个表没有索引，因此只能走全表扫描（下面再分析这种写法能否用上索引）；

1、当遍历第一条记录时（id=1），dept=1，不符合两个case when条件，因此走到“else 1”，收集该记录进入结果集；

2、当遍历第二条记录时（id=2），记录的dept=2，create_time='2018-01-01 00:02:40'，符合第一个case when条件，因此进入第一个“then age<25”的判断发现不符合条件要求，因此丢弃该记录；

3、遍历第三条记录（id=3）情况与遍历第二条一样；

4、遍历第四条记录时（id=4），记录的create_time='2018-01-01 20:05:56'不符合case when的条件，因此走到“else 1”，收集该记录进入结果集；

5、遍历第五条记录时（id=５），记录的dept和create_time字段都符合第一个case when的要求，因此进入“then age<25”，发现也符合条件，因此收集该记录进入结果集；

剩下5条记录与上述类似，因此不再阐述。

明白了其执行过程，我们可以我们把sql改写为如下：

select * from t 
where (case when dept=2 
                 and create_time>='2018-01-01 00:00:00' 
                 and create_time<'2018-01-01 03:00:00' 
                 and age>=25 
              then 0
            when dept=5 
                 and create_time>='2018-01-01 00:00:00' 
                 and create_time<'2018-01-01 03:00:00' 
                 and age<=45 
              then 0
           else 1
       end) = 1;
复制

　　这样就很好理解了，最重要的是，该语句在Oracle和SQL Server也兼容。

虽然在where中使用case when很灵活，但我们并不推荐这样写，因为这种写法并用不上索引。

因为此时测试表t记录数太少，因此我们继续生成更多的测试数据：

#初始化序列变量
set @i=1;


#========此处拷贝反复执行15次，生成32万+的测试数据==========
insert into t(dept, age, name, create_time, last_login_time) 
select left(rand()*10,1) as dept,               #随机生成1~10的整数
       FLOOR(20+RAND() *(50 - 20 + 1)) as age,  #随机生成20~50的整数
        concat('user_',@i:=@i+1),               #按序列生成不同的name
        date_add(create_time,interval +@i*cast(rand()*100 as signed) SECOND), #生成有时间大顺序随机注册时间
        date_add(date_add(create_time,interval +@i*cast(rand()*100 as signed) SECOND), interval + cast(rand()*1000000 as signed) SECOND) #生成有时间大顺序的随机的最后登录时间
from t;
#=====================此处结束反复执行=====================


#创建索引以供测试
create index idx_create_time on t(create_time);
复制

上面测试的sql里找出符合的结果接近整个表的所有记录数，因此不可能用得上索引，为了验证在where条件中使用case when能否用上索引，我们把sql语句做一些改造，使其返回的结果集比较小：

select * from t 
where (case when dept=2 
                 and create_time>='2018-01-01 00:00:00' 
                 and create_time<'2018-01-01 03:00:00' 
                 and age<25 
              then 1
            when dept=5 
                 and create_time>='2018-01-01 00:00:00' 
                 and create_time<'2018-01-01 03:00:00' 
                 and age>45 
              then 1
           else 0
       end) = 1;
复制

执行出来的结果如下：

我们查看执行计划发现也是全表扫描的：

然后我们使用or进行改造：

select * from t 
where (  dept=2 
         and create_time>='2018-01-01 00:00:00' 
         and create_time<'2018-01-01 03:00:00' 
         and age<25
       )
      or 
      (  dept=5 
         and create_time>='2018-01-01 00:00:00' 
         and create_time<'2018-01-01 03:00:00' 
         and age>45
      )
复制