Question
I am returning a single value from a scalar function, as below:
CREATE FUNCTION [dbo].[GetNoOfAssignedCases]
(
@UserID INT,
@FromD DATETIME,
@ToD DATETIME
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
DECLARE @CaseCount INT = 0
SELECT @CaseCount = COUNT(1) FROM Cases
WHERE
CaseAssignedToAssessor = @UserID AND
CAST(ActionDateTime AS DATE) >= @FromD AND
CAST(ActionDateTime AS DATE) <= @ToD
RETURN @CaseCount
END
And I use it as below:
SELECT [Name], [DBO].[GetNoOfAssignedCases](UserID, GETDATE()-30, GETDATE()) FROM Users
Can it be replaced with a table-valued function? Will that have any performance impact? Which will be faster?
Answer 1:
My discussion with Jonathan Dickinson (see his answer) prompted me to run some quick tests.
It turns out that the pure embedded scalar sub-select is not that bad. When querying just one value, it is even the fastest. As expected, the scalar function is bad. The more fields the TVF returns, the better its relative performance gain.
The only sure answer is: the scalar function is the worst, and a multi-line (multi-statement) TVF is, most of the time, slower than an inline TVF. Any ad-hoc approach tends to be faster.
But I could construct special cases in which each approach (except the scalar function) was the fastest.
Conclusion (as always :-) ): it depends...
Hint: it is best to run this against a big database with many tables and columns.
CREATE FUNCTION dbo.CountColumnScalar(@TableSchema AS VARCHAR(100),@TableName AS VARCHAR(100))
RETURNS INT
AS
BEGIN
RETURN(SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS AS c WHERE c.TABLE_SCHEMA=@TableSchema AND c.TABLE_NAME=@TableName);
END
GO
CREATE FUNCTION dbo.CountConstraintScalar(@TableSchema AS VARCHAR(100),@TableName AS VARCHAR(100))
RETURNS INT
AS
BEGIN
RETURN(SELECT COUNT(*) FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS c WHERE c.TABLE_SCHEMA=@TableSchema AND c.TABLE_NAME=@TableName);
END
GO
CREATE FUNCTION dbo.CountAllTVF(@TableSchema AS VARCHAR(100),@TableName AS VARCHAR(100))
RETURNS TABLE
RETURN SELECT (SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS AS c WHERE c.TABLE_SCHEMA=@TableSchema AND c.TABLE_NAME=@TableName GROUP BY c.TABLE_SCHEMA,c.TABLE_NAME) AS ColCounter
,(SELECT COUNT(*) FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS c WHERE c.TABLE_SCHEMA=@TableSchema AND c.TABLE_NAME=@TableName GROUP BY c.TABLE_SCHEMA,c.TABLE_NAME) AS ConstraintCounter ;
GO
CREATE FUNCTION dbo.CountAllTVF_multiline(@TableSchema AS VARCHAR(100),@TableName AS VARCHAR(100))
RETURNS @tbl TABLE (ColCounter INT,ConstraintCounter INT)
AS
BEGIN
INSERT INTO @tbl
SELECT (SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS AS c WHERE c.TABLE_SCHEMA=@TableSchema AND c.TABLE_NAME=@TableName GROUP BY c.TABLE_SCHEMA,c.TABLE_NAME) AS ColCounter
,(SELECT COUNT(*) FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS c WHERE c.TABLE_SCHEMA=@TableSchema AND c.TABLE_NAME=@TableName GROUP BY c.TABLE_SCHEMA,c.TABLE_NAME) AS ConstraintCounter;
RETURN;
END
GO
DECLARE @time DATETIME=GETDATE();
SELECT TABLE_SCHEMA,TABLE_NAME
,(SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS AS c WHERE c.TABLE_SCHEMA=t.TABLE_SCHEMA AND c.TABLE_NAME=t.TABLE_NAME ) AS ColCounter
,(SELECT COUNT(*) FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS c WHERE c.TABLE_SCHEMA=t.TABLE_SCHEMA AND c.TABLE_NAME=t.TABLE_NAME ) AS ConstraintCounter
FROM INFORMATION_SCHEMA.TABLES AS t;
PRINT 'pure embedded scalar sub-select: ' + CAST(CAST(GETDATE()-@time AS TIME) AS VARCHAR(MAX)); SET @time=GETDATE();
SELECT TABLE_SCHEMA,TABLE_NAME
,dbo.CountColumnScalar(t.TABLE_SCHEMA,t.TABLE_NAME ) AS ColCounter
,dbo.CountConstraintScalar(t.TABLE_SCHEMA,t.TABLE_NAME ) AS ConstraintCount
FROM INFORMATION_SCHEMA.TABLES AS t
PRINT 'scalar function: ' + CAST(CAST(GETDATE()-@time AS TIME) AS VARCHAR(MAX)); SET @time=GETDATE();
SELECT t.TABLE_SCHEMA,t.TABLE_NAME
,colJoin.ColCount
,conJoin.ConstraintCount
FROM INFORMATION_SCHEMA.TABLES AS t
INNER JOIN (SELECT COUNT(*) As ColCount,c.TABLE_SCHEMA,c.TABLE_NAME
FROM INFORMATION_SCHEMA.COLUMNS AS c
GROUP BY c.TABLE_SCHEMA,c.TABLE_NAME) AS colJoin ON colJoin.TABLE_SCHEMA=t.TABLE_SCHEMA AND colJoin.TABLE_NAME=t.TABLE_NAME
INNER JOIN (SELECT COUNT(*) As ConstraintCount,c.TABLE_SCHEMA,c.TABLE_NAME
FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS c
GROUP BY c.TABLE_SCHEMA,c.TABLE_NAME) AS conJoin ON conJoin.TABLE_SCHEMA=t.TABLE_SCHEMA AND conJoin.TABLE_NAME=t.TABLE_NAME
PRINT 'JOINs on sub-selects: ' + CAST(CAST(GETDATE()-@time AS TIME) AS VARCHAR(MAX)); SET @time=GETDATE();
SELECT t.TABLE_SCHEMA,t.TABLE_NAME
,ColCounter.*
FROM INFORMATION_SCHEMA.TABLES AS t
CROSS APPLY dbo.CountAllTVF(t.TABLE_SCHEMA,t.TABLE_NAME) AS ColCounter
PRINT 'TVF inline: ' + CAST(CAST(GETDATE()-@time AS TIME) AS VARCHAR(MAX)); SET @time=GETDATE();
SELECT t.TABLE_SCHEMA,t.TABLE_NAME
,ColCounter.*
FROM INFORMATION_SCHEMA.TABLES AS t
CROSS APPLY dbo.CountAllTVF_multiline(t.TABLE_SCHEMA,t.TABLE_NAME) AS ColCounter
PRINT 'TVF multiline: ' + CAST(CAST(GETDATE()-@time AS TIME) AS VARCHAR(MAX)); SET @time=GETDATE();
GO
DROP FUNCTION dbo.CountColumnScalar;
DROP FUNCTION dbo.CountAllTVF;
DROP FUNCTION dbo.CountAllTVF_multiline;
DROP FUNCTION dbo.CountConstraintScalar;
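Applied to the query from the question, the two "ad-hoc" variants measured above might look roughly like the following. This is only a sketch: it assumes the Users and Cases tables and columns exactly as the question names them, and it has not been run against real data. The second variant uses a LEFT JOIN with ISNULL rather than the INNER JOIN of the test script, so that users without any cases still appear with a count of 0.

```sql
-- Assumption: Users(UserID, Name) and Cases(CaseAssignedToAssessor, ActionDateTime)
-- exist as described in the question.
DECLARE @FromD DATETIME = GETDATE() - 30;
DECLARE @ToD   DATETIME = GETDATE();

-- Variant 1: embedded (correlated) scalar sub-select
SELECT u.[Name],
       (SELECT COUNT(*)
        FROM Cases AS c
        WHERE c.CaseAssignedToAssessor = u.UserID
          AND CAST(c.ActionDateTime AS DATE) >= @FromD
          AND CAST(c.ActionDateTime AS DATE) <= @ToD) AS CaseCount
FROM Users AS u;

-- Variant 2: JOIN on a grouped sub-select
SELECT u.[Name],
       ISNULL(cc.CaseCount, 0) AS CaseCount
FROM Users AS u
LEFT JOIN (SELECT c.CaseAssignedToAssessor, COUNT(*) AS CaseCount
           FROM Cases AS c
           WHERE CAST(c.ActionDateTime AS DATE) >= @FromD
             AND CAST(c.ActionDateTime AS DATE) <= @ToD
           GROUP BY c.CaseAssignedToAssessor) AS cc
       ON cc.CaseAssignedToAssessor = u.UserID;
```

Which of the two is faster will depend on the data and indexes, in line with the "it depends" conclusion above.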
Answer 2:
Yes, you could do something like this (untested):
CREATE FUNCTION [dbo].[GetNoOfAssignedCases]
(
@UserID INT,
@FromD DATETIME,
@ToD DATETIME
)
RETURNS TABLE
AS
RETURN
SELECT COUNT(1) AS CaseCount
FROM Cases
WHERE
CaseAssignedToAssessor = @UserID AND
CAST(ActionDateTime AS DATE) >= @FromD AND
CAST(ActionDateTime AS DATE) <= @ToD;
GO
--And a call like this
SELECT [Name],CaseCounter.CaseCount
FROM Users
OUTER APPLY [DBO].[GetNoOfAssignedCases](UserID, GETDATE()-30, GETDATE()) AS CaseCounter
But if you really need nothing more than a single scalar value, I don't see why you would.
Reasons why a TVF is a good idea:
- You want to get the "scalar" value for many rows in one go
- You might later want to get more values out of this function
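To illustrate the second reason, an inline TVF can return several aggregates in one pass. The sketch below is untested and makes assumptions: the function name GetAssignedCaseStats and the FirstAction/LastAction columns are hypothetical, while the Cases columns are those from the question.

```sql
-- Hypothetical function: returns the count plus two extra values in one row.
CREATE FUNCTION dbo.GetAssignedCaseStats
(
    @UserID INT,
    @FromD  DATETIME,
    @ToD    DATETIME
)
RETURNS TABLE
AS
RETURN
    SELECT COUNT(*)              AS CaseCount,
           MIN(c.ActionDateTime) AS FirstAction,  -- NULL when no case matches
           MAX(c.ActionDateTime) AS LastAction    -- NULL when no case matches
    FROM Cases AS c
    WHERE c.CaseAssignedToAssessor = @UserID
      AND CAST(c.ActionDateTime AS DATE) >= @FromD
      AND CAST(c.ActionDateTime AS DATE) <= @ToD;
GO

-- All three values come from a single APPLY per user.
SELECT u.[Name], s.CaseCount, s.FirstAction, s.LastAction
FROM Users AS u
OUTER APPLY dbo.GetAssignedCaseStats(u.UserID, GETDATE() - 30, GETDATE()) AS s;
```

With scalar functions, the same result would need three separate functions, each called once per row.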
Answer 3:
Will it have any performance impact?
Yes. We use functions quite extensively. While optimizing a specific query, I found that SQL Server optimizes a scalar-valued function as a separate entity, whereas TVFs are "inlined" into the main query and then optimized as a whole. Exceptions may exist, but we have found TVFs to be universally faster (and only slightly slower than inlining the function body yourself).
Can it be replaced with a table-valued function?
Yes. If performance is your concern, this is the format you should use:
CREATE FUNCTION [dbo].[GetNoOfAssignedCases]
(
@UserID INT,
@FromD DATETIME,
@ToD DATETIME
)
RETURNS TABLE
AS RETURN
SELECT COUNT(1) AS Count FROM Cases
WHERE
CaseAssignedToAssessor = @UserID AND
CAST(ActionDateTime AS DATE) >= @FromD AND
CAST(ActionDateTime AS DATE) <= @ToD;
CROSS APPLY can then be used to execute the function:
SELECT [Name], [AC].[Count] FROM Users
CROSS APPLY [DBO].[GetNoOfAssignedCases](UserID, GETDATE()-30, GETDATE()) AS [AC]
You need to use RETURN SELECT for the "inlining" to occur. In addition, it is your job to ensure that these functions don't return more than one record (unless you actually want the CROSS APPLY behavior, which you almost never do).
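The row count matters in the other direction too. An aggregate without GROUP BY, as in the function above, always yields exactly one row, but a variant that can return zero rows makes CROSS APPLY silently drop the outer row. The following untested sketch shows the difference; the function name GetNoOfAssignedCasesGrouped is hypothetical and exists only for this illustration.

```sql
-- Hypothetical variant: because of the GROUP BY, no row is returned
-- for a user who has no matching cases.
CREATE FUNCTION dbo.GetNoOfAssignedCasesGrouped
(
    @UserID INT,
    @FromD  DATETIME,
    @ToD    DATETIME
)
RETURNS TABLE
AS RETURN
    SELECT COUNT(*) AS CaseCount
    FROM Cases AS c
    WHERE c.CaseAssignedToAssessor = @UserID
      AND CAST(c.ActionDateTime AS DATE) >= @FromD
      AND CAST(c.ActionDateTime AS DATE) <= @ToD
    GROUP BY c.CaseAssignedToAssessor;
GO

-- CROSS APPLY: users without matching cases are missing from the result.
SELECT u.[Name], ac.CaseCount
FROM Users AS u
CROSS APPLY dbo.GetNoOfAssignedCasesGrouped(u.UserID, GETDATE() - 30, GETDATE()) AS ac;

-- OUTER APPLY: every user is kept; a missing count arrives as NULL.
SELECT u.[Name], ISNULL(ac.CaseCount, 0) AS CaseCount
FROM Users AS u
OUTER APPLY dbo.GetNoOfAssignedCasesGrouped(u.UserID, GETDATE() - 30, GETDATE()) AS ac;
```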
Answer 4:
A table-valued function would look like this:
CREATE FUNCTION [dbo].[GETNOOFASSIGNEDCASES] (@UserID INT,
@FromD DATETIME,
@ToD DATETIME)
RETURNS @CaseCount TABLE (
cnt INT NULL )
AS
BEGIN
INSERT INTO @CaseCount
(cnt)
SELECT Count(1)
FROM Cases
WHERE CaseAssignedToAssessor = @UserID
AND Cast(ActionDateTime AS DATE) >= @FromD
AND Cast(ActionDateTime AS DATE) <= @ToD;
-- A multi-statement TVF must end with a RETURN statement
RETURN;
END
You should invoke the table-valued function defined above using CROSS APPLY or OUTER APPLY:
SELECT [Name],tmp.cnt
FROM Users
CROSS APPLY [DBO].[GETNOOFASSIGNEDCASES](UserID, Getdate() - 30, Getdate()) AS tmp
Source: https://stackoverflow.com/questions/33993832/table-valued-function-vs-scalar-values-function-for-single-return-value