SQL String comparison speed 'like' vs 'patindex'


无人共我 2020-12-05 15:37

I had a query as follows (simplified)...

SELECT     *
FROM       table1 AS a
INNER JOIN table2 AS b ON (a.name LIKE '%' + b.name + '%')

3 Answers

北荒 (OP)

2020-12-05 15:51
I'm not at all convinced by the thesis that it is the extra overhead of the LikeRangeStart, LikeRangeEnd, LikeRangeInfo functions that is responsible for the time discrepancy.

It is simply not reproducible (at least in my test, with the default collation etc.). When I try the following:
```
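-- Keep the setup step out of the IO/time measurements.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

-- Build 2,500 distinct test names in a table variable.
DECLARE @T TABLE (name sysname);
INSERT INTO @T
SELECT TOP 2500 name + '...' + 
   CAST(ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS VARCHAR)
FROM sys.all_columns;

-- Measure only the two queries being compared.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
PRINT '***';
-- Version 1: LIKE with leading and trailing wildcards.
SELECT     COUNT(*)
FROM       @T AS a
INNER JOIN @T AS b ON (a.name LIKE '%' + b.name + '%');

PRINT '***';
-- Version 2: the equivalent PATINDEX test.
SELECT     COUNT(*)
FROM       @T AS a
INNER JOIN @T AS b ON (PATINDEX('%' + b.name + '%', a.name) > 0);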
```
This gives essentially the same plan for both queries, although the plan also contains these various internal functions. I get the following:

LIKE
```
Table '#5DB5E0CB'. Scan count 2, logical reads 40016
CPU time = 26953 ms,  elapsed time = 28083 ms.
```
PATINDEX
```
Table '#5DB5E0CB'. Scan count 2, logical reads 40016
CPU time = 28329 ms,  elapsed time = 29458 ms.
```
I do notice, however, that if I substitute a #temp table for the table variable, the estimated number of rows going into the stream aggregate differs significantly between the two versions.

The LIKE version has an estimated 330,596 rows and the PATINDEX version an estimated 1,875,000.
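The #temp table variant of the test could look roughly like the sketch below. It reuses the data-generation query from the script above. The table name `#T` and the explicit `VARCHAR(10)` length are my own choices for illustration, and the row estimates you see may differ by version and statistics.

```sql
-- Sketch only: same test data as before, but in a #temp table
-- (which has statistics) instead of a table variable.
CREATE TABLE #T (name sysname);

INSERT INTO #T (name)
SELECT TOP (2500) name + '...' +
       CAST(ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS VARCHAR(10))
FROM sys.all_columns;

-- Capture the execution plan for each query and compare the
-- estimated number of rows flowing into the Stream Aggregate.
SELECT     COUNT(*)
FROM       #T AS a
INNER JOIN #T AS b ON (a.name LIKE '%' + b.name + '%');

SELECT     COUNT(*)
FROM       #T AS a
INNER JOIN #T AS b ON (PATINDEX('%' + b.name + '%', a.name) > 0);

DROP TABLE #T;
```

For what it's worth, 1,875,000 is exactly 30% of the 2,500 × 2,500 = 6,250,000 possible row pairs, which looks like a fixed selectivity guess for the `PATINDEX(...) > 0` predicate rather than an estimate derived from the data.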

I notice you also have a hash join in your plan. Possibly, because the PATINDEX version seems to estimate a greater number of rows than the LIKE version, that query gets a larger memory grant and so doesn't have to spill the hash operation to disk. Try tracing the hash warnings in Profiler to see if this is the case.
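If Profiler is not convenient, an Extended Events session is an alternative way to watch for the same thing. This is a minimal sketch, not something I ran as part of the test above: it assumes SQL Server 2012 or later, permission to create server-level event sessions, and the session name `HashWarnings` is made up for the example.

```sql
-- Sketch only: capture hash spill warnings with Extended Events
-- instead of Profiler. The session name is hypothetical.
CREATE EVENT SESSION HashWarnings ON SERVER
ADD EVENT sqlserver.hash_warning
    (ACTION (sqlserver.session_id, sqlserver.sql_text))
ADD TARGET package0.ring_buffer;
GO

ALTER EVENT SESSION HashWarnings ON SERVER STATE = START;
GO

-- ... run the LIKE query and the PATINDEX query here ...

-- Read back whatever the session captured, as XML.
SELECT CAST(t.target_data AS XML) AS captured_events
FROM sys.dm_xe_sessions AS s
INNER JOIN sys.dm_xe_session_targets AS t
        ON t.event_session_address = s.address
WHERE s.name = N'HashWarnings'
  AND t.target_name = N'ring_buffer';
GO

-- Clean up.
ALTER EVENT SESSION HashWarnings ON SERVER STATE = STOP;
DROP EVENT SESSION HashWarnings ON SERVER;
GO
```

If hash warnings appear only for the LIKE version, that would support the memory-grant explanation. If neither query raises any, the discrepancy has to come from somewhere else.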