问题
Ok, i have a following code to mark records that have highest month_cd in tabl with binary flag:
Select t1.month_cd, t2.max_month_cd
,CASE WHEN t2.max_month_cd != null then 0 else 1 end test_1
,CASE WHEN t2.max_month_cd = null then 0 else 1 end test_2
from source t1
Left join (
Select
MAX(month_cd) as max_month_cd
From source
) t2
on t1.month_cd = t2.max_month_cd;
It seems straight forward to me, but result it return is:
month_cd max_month_cd test_1 test_2
201610 null 1 1
201611 201611 1 1
Makes zero sense to me, and seems to be way too obvious to be a bug in execution engine. What am i missing?
回答1:
This is all about NULL concept.
Since Null is not a member of any data domain, it is not considered a "value", but rather a marker (or placeholder) indicating the absence of value. Because of this, comparisons with Null can never result in either True or False, but always in a third logical result, Unknown. NULL is nothing, absence of object. So, nothing can NOT be equal to NULL or something else. In SQL there are IS NULL
and IS NOT NULL
conditions to be used for test for null values.
In your CASE the result of logical expression is unknown, so ELSE value is assigned.
Corrected version:
CASE WHEN t2.max_month_cd IS NOT null then 0 else 1 end test_1,
CASE WHEN t2.max_month_cd IS null then 0 else 1 end test_2
See this for reference: Null (SQL)
来源:https://stackoverflow.com/questions/41023835/case-statements-in-hive