Use Boolean algebra in T-SQL to avoid a CASE statement or deal with complex WHERE conditions

Posted by 余生颓废 on 2019-12-06 08:42:30

Question


I came across a scenario that I will explain with some dummy data. See the table below.

Select * from LUEmployee

empId   name    joiningDate
1049    Jithin  3/9/2009
1017    Surya   1/2/2008
1089    Bineesh 8/24/2009
1090    Bless   7/15/2009
1014    Dennis  1/5/2008
1086    Sus     9/10/2009

I need to increment the year by 1, but only if the month is Jan, Mar, Jul or Dec.

empId   name    joiningDate derived Year
1049    Jithin  3/9/2009    2010
1017    Surya   1/2/2008    2009
1089    Bineesh 8/24/2009   2009
1090    Bless   7/15/2009   2010
1014    Dennis  1/5/2008    2009
1086    Sus     9/10/2009   2009

derived Year is the required column.

We were able to achieve this easily with a CASE statement like the one below:

Select *,
YEAR(joiningDate) + CASE WHEN MONTH(joiningDate) in (1,3,7,12) THEN 1 ELSE 0 END 
from LUEmployee

But then an added condition came from the onsite PM: don't use a CASE statement, because CASE is inefficient. In search of a solution, we arrived at the following one, which uses a binary K-map.


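If the numbers 1 to 12 represent the months Jan to Dec, the binary form of each month (bits named a to d from left to right) and the required result are:

month   a b c d   result
1       0 0 0 1   1
2       0 0 1 0   0
3       0 0 1 1   1
4       0 1 0 0   0
5       0 1 0 1   0
6       0 1 1 0   0
7       0 1 1 1   1
8       1 0 0 0   0
9       1 0 0 1   0
10      1 0 1 0   0
11      1 0 1 1   0
12      1 1 0 0   1

Simplifying this with a Karnaugh map gives the result:

result = (a AND b AND c inverse AND d inverse) OR (a inverse AND b inverse AND d) OR (a inverse AND c AND d)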

We need to realize this expression with SQL Server bitwise operations.

eg: binary of 12 = 1100
    in the k-map, a = 1, b = 1, c = 0, d = 0
    Similarly, binary of 7 = 0111
    in the k-map, a = 0, b = 1, c = 1, d = 1

To get the leftmost bit (a), we shift the value right by 3 positions (an integer division by 8) and then mask all the bits except the LSB.

eg: ((MONTH(joiningDate)/8)&1)

Similarly, for the second bit from the left (b), we shift the value right by 2 positions (an integer division by 4) and then mask all the bits except the LSB.

eg: ((MONTH(joiningDate)/4)&1)

Finally, each bit can be represented as

so  a = ((MONTH(joiningDate)/8)&1)
    b = ((MONTH(joiningDate)/4)&1)
    c = ((MONTH(joiningDate)/2)&1)
    d = (MONTH(joiningDate)&1)

a inverse = (((MONTH(joiningDate)/8)&1)^1)
b inverse = (((MONTH(joiningDate)/4)&1)^1)
c inverse = (((MONTH(joiningDate)/2)&1)^1)
d inverse = ((MONTH(joiningDate)&1)^1)
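
The bit extraction above can be checked outside the database. The following is a minimal sketch in Python rather than T-SQL; the names `month_bits` and `invert` are made up for illustration. It assumes that `/` on integers in the T-SQL behaves like Python's `//` for these positive values.

```python
def month_bits(month):
    """Split a month number (1-12) into four bits, a being the most significant."""
    a = (month // 8) & 1  # shift right by 3 positions, keep only the LSB
    b = (month // 4) & 1  # shift right by 2 positions, keep only the LSB
    c = (month // 2) & 1  # shift right by 1 position, keep only the LSB
    d = month & 1         # already the LSB
    return a, b, c, d


def invert(bit):
    """Flip a single bit, mirroring the ^1 used for the inverse terms."""
    return bit ^ 1


print(month_bits(12))  # 12 is 1100 in binary
print(month_bits(7))   # 7 is 0111 in binary
```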

The final code will be

SELECT  *,
        YEAR(joiningDate) + CAST(
        ((MONTH(joiningDate)/8)&1)*((MONTH(joiningDate)/4)&1)*(((MONTH(joiningDate)/2)&1)^1)*((MONTH(joiningDate)&1)^1) |
        (((MONTH(joiningDate)/8)&1)^1)*(((MONTH(joiningDate)/4)&1)^1)*(MONTH(joiningDate)&1) |
        (((MONTH(joiningDate)/8)&1)^1)*((MONTH(joiningDate)/2)&1)*(MONTH(joiningDate)&1) 
        AS INT) [derivedYear]
FROM    LUEmployee

The result matches the derived Year column shown earlier.
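
As a sanity check, the three product terms of the final query can be compared against the original CASE rule for every month. This is only a sketch in Python, not T-SQL, and the function names `kmap_increment` and `case_increment` are hypothetical.

```python
def kmap_increment(month):
    """Evaluate the K-map expression from the final query for one month number."""
    a = (month // 8) & 1
    b = (month // 4) & 1
    c = (month // 2) & 1
    d = month & 1
    term_dec = a * b * (c ^ 1) * (d ^ 1)   # matches 1100 (month 12)
    term_jan_mar = (a ^ 1) * (b ^ 1) * d   # matches 0001 and 0011 (months 1, 3)
    term_mar_jul = (a ^ 1) * c * d         # matches 0011 and 0111 (months 3, 7)
    return term_dec | term_jan_mar | term_mar_jul


def case_increment(month):
    """The rule the CASE statement expressed: add 1 for Jan, Mar, Jul, Dec."""
    return 1 if month in (1, 3, 7, 12) else 0


mismatches = [m for m in range(1, 13) if kmap_increment(m) != case_increment(m)]
print(mismatches)
```

An empty list means the two formulations agree for all twelve months.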


Question: there may be simpler and less complex ideas; please share them.

I would like to find a simpler approach, as well as share this idea. Here there are 12 possible conditions (12 months). A K-map can be used for an even bigger number of conditions; it feels convenient for up to 64 conditions.


Answer 1:


My first reaction would be to defend the use of CASE here. But if you are absolutely not allowed to use it, maybe you could simply add a table with the month and increment values:

LUMonthIncrement

Month   Increment
 1      1  
 2      0  
 3      1  
 4      0  
 5      0  
 6      0  
 7      1  
 8      0  
 9      0  
10      0  
11      0  
12      1  

Then you can join in that table and just add the increment:

Select LUEmployee.*,
    YEAR(joiningDate) + LUMonthIncrement.Increment as derivedYear
from LUEmployee
    join LUMonthIncrement on MONTH(LUEmployee.joiningDate) = LUMonthIncrement.Month

This is unlikely to be much more performant though, because in order to join to LUMonthIncrement the MONTH(LUEmployee.joiningDate) expression must be evaluated for each row in the LUEmployee table.
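
The lookup idea can be illustrated with a small sketch in Python rather than T-SQL. The dictionary `MONTH_INCREMENT` stands in for the LUMonthIncrement table and the list `employees` for the sample rows from the question; both names are assumptions made for this example.

```python
from datetime import date

# Stand-in for the LUMonthIncrement table: month number -> increment.
MONTH_INCREMENT = {1: 1, 2: 0, 3: 1, 4: 0, 5: 0, 6: 0,
                   7: 1, 8: 0, 9: 0, 10: 0, 11: 0, 12: 1}

# Sample rows from the question (dates read as month/day/year).
employees = [
    (1049, "Jithin", date(2009, 3, 9)),
    (1017, "Surya", date(2008, 1, 2)),
    (1089, "Bineesh", date(2009, 8, 24)),
    (1090, "Bless", date(2009, 7, 15)),
    (1014, "Dennis", date(2008, 1, 5)),
    (1086, "Sus", date(2009, 9, 10)),
]


def derived_year(joining_date):
    """Look the increment up by month instead of branching on it."""
    return joining_date.year + MONTH_INCREMENT[joining_date.month]


derived = [(emp_id, name, joined, derived_year(joined))
           for emp_id, name, joined in employees]
for row in derived:
    print(row)
```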




Answer 2:


In this specific case you could do a UNION, since you have 2 distinct subsets of your input set that don't depend on each other and the split criteria are well defined. So you could do something like:

Select *,
YEAR(joiningDate) + 1 as derived_year 
from LUEmployee
WHERE MONTH(joiningDate) = 1 OR MONTH(joiningDate) = 3 OR MONTH(joiningDate) = 7 OR MONTH(joiningDate) = 12

UNION 

Select *,
YEAR(joiningDate) as derived_year 
from LUEmployee
WHERE NOT (MONTH(joiningDate) = 1 OR MONTH(joiningDate) = 3 OR MONTH(joiningDate) = 7 OR MONTH(joiningDate) = 12)



Answer 3:


Take @user1429080's concept of a month table one step further and turn it into a range table; this allows the call to MONTH() in the join to be eliminated. Assuming you have a Calendar table (which are stupid useful), you can build the query like this:

WITH LUMonthIncrement AS (SELECT month, increment
                          FROM (VALUES (1, 1),
                                       (2, 0),
                                       (3, 1),
                                       (4, 0),
                                       (5, 0),
                                       (6, 0),
                                       (7, 1),
                                       (8, 0),
                                       (9, 0),
                                       (10, 0),
                                       (11, 0),
                                       (12, 1)) m(month, increment))  

SELECT LUEmployee.empId, LUEmployee.name, LUEmployee.joiningDate, IncrementRange.year
FROM LUEmployee
JOIN (SELECT Calendar.calendarDate AS rangeStart, 
             DATEADD(month, 1, Calendar.calendarDate) AS rangeEnd,
             Calendar.year + LUMonthIncrement.increment AS year
      FROM Calendar
      JOIN LUMonthIncrement
        ON LUMonthIncrement.month = Calendar.month
      WHERE Calendar.dayOfMonth = 1) IncrementRange
  ON LUEmployee.joiningDate >= IncrementRange.rangeStart
     AND LUEmployee.joiningDate < IncrementRange.rangeEnd

(Untested at the moment)

Yes, I'm still using an index-ignoring function (specifically, DATEADD(...)) - however, the subquery reference is likely to execute first, and will return 12 rows per-year, and the join to LUEmployee is free to use any index on that table (which is likely to be far larger than the result of the subselect). Assuming Calendar has an index starting with dayOfMonth (it's a dimension table, it should...), IncrementRange should be built instantaneously.

(Note that I'm using a general range form here, which will be useful when dealing with types with a time portion attached. This is handy for things like aggregating sales by month... If you're using 2012 with a strict date type, you could potentially just straight join to the Calendar table directly on the date value, and skip dealing with the range.)
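
The half-open range match used in the join (start inclusive, end exclusive) can be sketched in Python rather than T-SQL. The helper names `next_month_start`, `build_increment_ranges` and `derived_year` are hypothetical, and the generated list only stands in for the rows that the IncrementRange subquery is meant to produce from a Calendar table.

```python
from datetime import date

INCREMENT_MONTHS = {1, 3, 7, 12}


def next_month_start(day):
    """First day of the month after the given first-of-month date."""
    if day.month == 12:
        return date(day.year + 1, 1, 1)
    return date(day.year, day.month + 1, 1)


def build_increment_ranges(first_year, last_year):
    """One (range_start, range_end, derived_year) row per calendar month."""
    ranges = []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            start = date(year, month, 1)
            increment = 1 if month in INCREMENT_MONTHS else 0
            ranges.append((start, next_month_start(start), year + increment))
    return ranges


def derived_year(joining_date, ranges):
    """Find the range containing the date: start <= date < end."""
    for start, end, year in ranges:
        if start <= joining_date < end:
            return year
    raise ValueError("date outside the generated calendar range")


ranges = build_increment_ranges(2008, 2009)
print(derived_year(date(2009, 3, 9), ranges))
print(derived_year(date(2009, 8, 24), ranges))
```

Because the comparison is made directly on the date value, no month has to be extracted from the joining date, which is the point of the range form.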




Answer 4:


If you want to use bit logic, here is a way:

SELECT [empId], [name], [joiningDate]
     , [derived Year]
     = YEAR(joiningDate)
     + (1 - cast(MONTH(joiningDate) / 8 as bit)) * (MONTH(joiningDate) % 2)
     - (cast(MONTH(joiningDate) / 5 as bit))
     * (1 - cast(MONTH(joiningDate) / 6 as bit))
     + (cast(MONTH(joiningDate) / 12 as bit))
FROM   LUEmployee

SQLFiddle demo with data expanded to have every month available

Explaining the bits

  • (1 - cast(MONTH(joiningDate) / 8 as bit)) * (MONTH(joiningDate) % 2): the first part returns 1 for a month number less than 8, and the second part checks the parity, where 1 is odd; together they add 1 for 1, 3, 5 and 7. To remove the 5 we need the next term.
  • (cast(MONTH(joiningDate) / 5 as bit)) * (1 - cast(MONTH(joiningDate) / 6 as bit)): the first part returns 1 for every value greater than or equal to 5, and the second part returns 1 for every value less than 6; the only intersection is 5.
  • (cast(MONTH(joiningDate) / 12 as bit)): returns 1 only for December.
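
The three terms explained above can be checked for every month with a short sketch in Python rather than T-SQL. The helper `as_bit` is an assumption that mimics casting an integer to bit (any nonzero value becomes 1), and `arithmetic_increment` is a made-up name.

```python
def as_bit(value):
    """Mimic a cast to bit: 0 stays 0, any other value becomes 1."""
    return 1 if value != 0 else 0


def arithmetic_increment(month):
    """Combine the three terms from the query for one month number."""
    odd_below_8 = (1 - as_bit(month // 8)) * (month % 2)      # 1 for 1, 3, 5, 7
    is_5 = as_bit(month // 5) * (1 - as_bit(month // 6))      # 1 only for 5
    is_12 = as_bit(month // 12)                               # 1 only for 12
    return odd_below_8 - is_5 + is_12


print([m for m in range(1, 13) if arithmetic_increment(m) == 1])
```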

With all the options here, if I were in your position, I would check them all for performance and report back to my PM with the data. I'm quite sure there is a lesson to learn.



Source: https://stackoverflow.com/questions/23757036/use-boolean-algebra-in-tsql-to-avoid-case-statement-or-deal-complex-where-condit
