SQL之exists、in - 数据库编程

tips ：
1.外表为大表，内表为小表时，使用exist
2.外表为小表，内表为大表时，使用in

示例：外表大，内表小
create table outTable (id1 int);
insert into outtable select generate_series(1,1000000);

create table inTable (id1 int);
insert into inTable values(1),(10000),(100000),(1000000);

test=# explain select count(*) from outtable big where exists (select id1 from inTable small where small.id1=big.id1);
QUERY PLAN
-----------------------------------------------------------------------------------------
Aggregate (cost=18968.50..18968.51 rows=1 width=0)
-> Hash Join (cost=44.50..17718.50 rows=500000 width=0)
Hash Cond: (big.id1 = small.id1)
-> Seq Scan on outtable big (cost=0.00..13922.00 rows=1000000 width=4)
-> Hash (cost=42.00..42.00 rows=200 width=4)
-> HashAggregate (cost=40.00..42.00 rows=200 width=4)
-> Seq Scan on intable small (cost=0.00..34.00 rows=2400 width=4)
(7 rows)

Time: 3.743 ms
test=# explain select count(*) from outtable big where id1 in (select id1 from inTable small where small.id1=big.id1);
QUERY PLAN
------------------------------------------------------------------------------
Aggregate (cost=20032672.00..20032672.01 rows=1 width=0)
-> Seq Scan on outtable big (cost=0.00..20031422.00 rows=500000 width=0)
Filter: (SubPlan 1)
SubPlan 1
-> Seq Scan on intable small (cost=0.00..40.00 rows=12 width=4)
Filter: (id1 = big.id1)
(6 rows)

Time: 1.286 ms
test=# select count(*) from outtable big where exists (select id1 from inTable small where small.id1=big.id1);
count
-------
4
(1 row)

Time: 272.027 ms
test=# select count(*) from outtable big where id1 in (select id1 from inTable small where small.id1=big.id1);
count
-------
4
(1 row)

Time: 4021.244 ms

外表小，内表大
test=# select count(*) from intable small where exists (select id1 from outtable big where big.id1=small.id1);
count
-------
4
(1 row)

Time: 4792.643 ms
test=# select count(*) from intable small where id1 in (select id1 from outtable big where big.id1=small.id1);
count
-------
4
(1 row)

Time: 223.778 ms
test=# explain select count(*) from intable small where exists (select id1 from outtable big where big.id1=small.id1);
QUERY PLAN
----------------------------------------------------------------------------------------
Aggregate (cost=33336.10..33336.11 rows=1 width=0)
-> Hash Semi Join (cost=29840.00..33333.10 rows=1200 width=0)
Hash Cond: (small.id1 = big.id1)
-> Seq Scan on intable small (cost=0.00..34.00 rows=2400 width=4)
-> Hash (cost=13922.00..13922.00 rows=1000000 width=4)
-> Seq Scan on outtable big (cost=0.00..13922.00 rows=1000000 width=4)
(6 rows)

Time: 1.021 ms
test=# explain select count(*) from intable small where id1 in (select id1 from outtable big where big.id1=small.id1);
QUERY PLAN
------------------------------------------------------------------------------
Aggregate (cost=19706446.00..19706446.01 rows=1 width=0)
-> Seq Scan on intable small (cost=0.00..19706443.00 rows=1200 width=0)
Filter: (SubPlan 1)
SubPlan 1
-> Seq Scan on outtable big (cost=0.00..16422.00 rows=1 width=4)
Filter: (id1 = small.id1)
(6 rows)

Time: 3.578 ms

1、exists 与 in 的区别
可以看出，in是一个集合运算符， a in {a1,a2,a3}
in前面是一个元素，后面是一个集合，就是判断元素是否在集合里面，是则成立。
从上面可以看出，id1 in (select id1 from outtable big where big.id1=small.id1); --如果id1 在 in 后面的集合中，则该行算在count（*）中，否则不算在count(*)中。

而exists 则为一个存在判断，如果exists后查询有结果，则为真，否则为假。
看下面的示例：
t

SQL之exists、in(一)