T-SQL view SELECT optimization when a function is present

Question

I have this simple SQL as the source in an SSIS task:

Select * from budgetview

The view's definition is:

CREATE VIEW [dbo].[BudgetView] AS
SELECT   DISTINCT  Country, 
            SDCO AS Company, 
            SDAN8 AS Customer, 
            SDLITM AS PrintableItemNumber, 
            dbo.fn_DateFromJulian(SDIVD) AS Date, 
            SDPQOR/100.0 AS Quantity, 
            SDAEXP/100.0 AS Value, 
            SDITWT/10000.0 AS Weight
FROM         dbo.F553460

There are no index suggestions; everything seems optimized.

The source of the function fn_DateFromJulian is:

CREATE FUNCTION [dbo].[fn_DateFromJulian] 
(
    @JulianDate numeric(6,0)
)
RETURNS date
AS
BEGIN
    declare @resultdate date=dateadd(year,@JulianDate/1000,'1900-01-01')
    set @resultdate=dateadd(day,@JulianDate%1000 -1,@resultdate)
    return @resultdate

END

The problem is that I wait around 20 minutes before SSIS receives the first rows. Nothing happens during those 20 minutes; the data flow only starts afterwards.

Are there any suggestions to find the culprit?

Answer 1:

My assumption is that the time spent on the view is consumed by calculating the Julian date value. Without seeing the actual query plan, it seems a fair guess based on the articles below.
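One way to check that assumption before rewriting anything is to time the same DISTINCT with and without the function call. This is only a rough sketch against the table and function from the question; the variable names and the JDEDate alias are illustrative, and COUNT(*) is wrapped around each query so that client transfer time is not measured.

```sql
-- Hypothetical timing harness; @t0, @t1 and @t2 are illustrative names.
DECLARE @t0 datetime2 = SYSDATETIME();

-- Baseline: DISTINCT over the raw SDIVD column, no function call.
SELECT COUNT(*) AS RowsWithoutFunction
FROM (
    SELECT DISTINCT Country, SDCO, SDAN8, SDLITM, SDIVD,
           SDPQOR/100.0 AS Quantity,
           SDAEXP/100.0 AS Value,
           SDITWT/10000.0 AS Weight
    FROM dbo.F553460
) AS baseline;

DECLARE @t1 datetime2 = SYSDATETIME();

-- Same shape as the view: the scalar function is evaluated for every row.
SELECT COUNT(*) AS RowsWithFunction
FROM (
    SELECT DISTINCT Country, SDCO, SDAN8, SDLITM,
           dbo.fn_DateFromJulian(SDIVD) AS JDEDate,
           SDPQOR/100.0 AS Quantity,
           SDAEXP/100.0 AS Value,
           SDITWT/10000.0 AS Weight
    FROM dbo.F553460
) AS with_function;

DECLARE @t2 datetime2 = SYSDATETIME();

-- Elapsed milliseconds for each variant.
SELECT DATEDIFF(millisecond, @t0, @t1) AS MsWithoutFunction,
       DATEDIFF(millisecond, @t1, @t2) AS MsWithFunction;
```

If the two timings are close, the function is probably not the main cost, and the DISTINCT or the table scan deserves a closer look instead.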

Rewrite the original function as an inline table-valued function, as below (I've simply mashed your code together; there are likely opportunities for improvement):

CREATE FUNCTION dbo.fn_DateFromJulianTVF
(
    @JulianDate numeric(6,0)
)
RETURNS TABLE AS
RETURN
(
    SELECT dateadd(day,@JulianDate%1000 -1,dateadd(year,@JulianDate/1000,CAST('1900-01-01' AS date))) AS JDEDate
)

Usage would be

CREATE VIEW [dbo].[BudgetView] AS
SELECT   DISTINCT  Country, 
            SDCO AS Company, 
            SDAN8 AS Customer, 
            SDLITM AS PrintableItemNumber, 
            J.JDEDate AS [Date], 
            SDPQOR/100.0 AS Quantity, 
            SDAEXP/100.0 AS Value, 
            SDITWT/10000.0 AS Weight
FROM         dbo.F553460 AS T
    CROSS APPLY
        dbo.fn_DateFromJulianTVF(T.SDIVD) AS J
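Before swapping the view over, it may be worth a quick sanity check that the inline function returns the dates you expect. The sketch below assumes dbo.fn_DateFromJulianTVF has been created as shown above; the sample values are made up for illustration and do not come from the real table.

```sql
-- Illustrative sample values in the CYYDDD layout the function expects.
SELECT v.JulianDate, J.JDEDate
FROM (VALUES (CAST(115012 AS numeric(6,0))),
             (CAST(99365  AS numeric(6,0)))) AS v(JulianDate)
CROSS APPLY dbo.fn_DateFromJulianTVF(v.JulianDate) AS J;
-- 115012: 1900 + 115 years, day 12  -> 2015-01-12
-- 99365:  1900 + 99 years,  day 365 -> 1999-12-31
```

DATEADD truncates the fractional part of @JulianDate/1000, which is why the year offset comes out as a whole number.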

Scalar-valued functions: they smell like code reuse, but perform like a reused disposable diaper.

https://sql.kiwi/2012/09/compute-scalars-expressions-and-execution-plan-performance.html
http://blogs.lobsterpot.com.au/2011/11/08/when-is-a-sql-function-not-a-function/

Answer 2:

Just checking, but am I right to understand that every unique value of T.SDIVD produces exactly one unique result from the function? In other words, no two different T.SDIVD values will return the same value from the function?

In that case, what is happening here (IMHO) is that you first scan the entire table, calculate the f(SDIVD) value for each and every record, and then send that entire result set through an aggregation (DISTINCT).

Since functions are far from optimal in MSSQL, I'd suggest limiting their use by reversing the order of operations: deduplicate first, then call the function only on the rows that remain:

CREATE VIEW [dbo].[BudgetView] AS
SELECT /* DISTINCT */
                Country, 
                Company, 
                Customer, 
                PrintableItemNumber, 
                dbo.fn_DateFromJulian(SDIVD) AS Date, 
                Quantity, 
                Value, 
                Weight
          FROM (

                SELECT DISTINCT Country, 
                                SDCO AS Company, 
                                SDAN8 AS Customer, 
                                SDLITM AS PrintableItemNumber, 
                                SDIVD, 
                                SDPQOR/100.0 AS Quantity, 
                                SDAEXP/100.0 AS Value, 
                                SDITWT/10000.0 AS Weight
                           FROM dbo.F553460 ) AS dist_F553460

If you have lots of duplicate rows, this should improve performance; if you have only a few, it won't make much of a difference, if any. If you know you have no duplicates at all, you should get rid of the DISTINCT in the first place, as that is what is causing the delay!
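To find out which of these cases applies, you could compare the total row count with the distinct row count. A minimal sketch, assuming the columns used by the view are the only ones that matter for uniqueness:

```sql
-- Compare total rows with distinct rows over the columns the view selects.
SELECT
    (SELECT COUNT(*) FROM dbo.F553460) AS TotalRows,
    (SELECT COUNT(*)
       FROM (SELECT DISTINCT Country, SDCO, SDAN8, SDLITM, SDIVD,
                             SDPQOR, SDAEXP, SDITWT
               FROM dbo.F553460) AS d) AS DistinctRows;
```

If the two numbers are equal, the DISTINCT removes nothing and can be dropped; the bigger the gap, the more function calls the rewritten view saves.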

Anyway, regarding the function you can add the following trick:

CREATE FUNCTION [dbo].[fn_DateFromJulian] 
(
    @JulianDate numeric(6,0)
)
RETURNS date
WITH SCHEMABINDING
AS
BEGIN
    declare @resultdate date=dateadd(year,@JulianDate/1000,'1900-01-01')
    set @resultdate=dateadd(day,@JulianDate%1000 -1,@resultdate)
    return @resultdate

END

WITH SCHEMABINDING enables some internal optimisations that can make the function's execution slightly faster; YMMV. It comes with limitations, but here it will work nicely.
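If you want to see what SQL Server has derived about the function before and after adding the option, OBJECTPROPERTYEX exposes the relevant flags. This is just a sketch for inspection; the values it returns depend on your server and on the function body, so no particular result is promised here.

```sql
-- Inspect the metadata SQL Server keeps for the scalar function.
SELECT
    OBJECTPROPERTYEX(OBJECT_ID(N'dbo.fn_DateFromJulian'), 'IsDeterministic')  AS IsDeterministic,
    OBJECTPROPERTYEX(OBJECT_ID(N'dbo.fn_DateFromJulian'), 'SystemDataAccess') AS SystemDataAccess,
    OBJECTPROPERTYEX(OBJECT_ID(N'dbo.fn_DateFromJulian'), 'UserDataAccess')   AS UserDataAccess;
```

Running it once against the original definition and once against the schema-bound one shows whether the flags actually changed.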

Edit: removed the 'outer' DISTINCT since it's (likely, cf. my first assumption) not needed.

Source: https://stackoverflow.com/questions/27906918/tsql-view-select-optimization-when-function-is-present

Tags

tsql

ssis

query-optimization