问题
I asked this question before here, but the answer actually wasn't what I was looking for.
Suppose I have the following two tables in my SQL Server (2012) DB:
Tbl1:
ID: Col1: Col2: Col3:
1 Val11 Val21 Val31
2 <NULL> Val21 <NULL>
3 Val11 <NULL> Val31
4 Val11 <NULL> Val31
Tbl2:
ID: Col1: Col2: Col3:
1 Val12 Val22 Val32
2 <NULL> Val22 <NULL>
3 <NULL> <NULL> Val32
5 <NULL> <NULL> Val32
And, at this point, all I want to see is:
- Any rows that are in one table but not the other (based on
ID
pk) - If the
ID
is in both tables, then are the same columns populated (not caring specifically about the values yet).
I'm just trying to come up with a SQL to just let me know which ID
s have discrepancies.
My ideal output would look as follows:
Tbl1_ID: Tbl2_Id: Discrepancy:
1 1 0
2 2 0
3 3 1
4 <NULL> 1
<NULL> 5 1
My testing SQL so far is this:
DECLARE
@Tbl1 TABLE (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(10), Col3 VARCHAR(10))
;
DECLARE
@Tbl2 TABLE (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(10), Col3 VARCHAR(10))
;
INSERT INTO @Tbl1 (ID, Col1, Col2, Col3)
VALUES
(1, 'Val11', 'Val21', 'Val31')
,(2, NULL , 'Val21', NULL)
,(3, 'Val11', NULL, 'Val31')
,(4, 'Val11', NULL, 'Val31')
;
INSERT INTO @Tbl2 (ID, Col1, Col2, Col3)
VALUES
(1, 'Val12', 'Val22', 'Val32')
,(2, NULL , 'Val22', NULL)
,(3, NULL, NULL, 'Val32')
,(5, NULL, NULL, 'Val32')
;
SELECT
[@Tbl1].ID AS Tbl1_ID
,[@Tbl2].ID AS Tbl2_ID
, -- Some kind of comparison to let me know if all columns are populated the same
AS Discrepancy
FROM
@Tbl1
FULL JOIN @Tbl2
ON [@Tbl1].ID = [@Tbl2].ID
As you can see in the previous question's answer, the solution proposed (and I didn't check it well enough) was doing an ISNULL(Tbl1.Col1, xx) = ISNULL(Tbl2.Col1, xx)
kind of comparison for the columns I'm looking for. The problem there is that it is, in fact, checking the values of the two tables against each other. All I want to do is check if they are both populated or not, without any need to do a value-comparison.
I know I could accomplish this with something along the lines of CASE WHEN Tbl1.Col1 is NULL THEN 1 ELSE 0 END = CASE WHEN Tbl2.Col1 IS NULL THEN 1 ELSE 0 END
, but I'm hoping there is a nicer way to do this since I am checking a lot of columns.
Is there some good way to accomplish this?
Thanks!!
回答1:
You could use:
SELECT
t1.ID AS Tbl1_ID
,t2.ID AS Tbl2_ID
,CASE WHEN NOT EXISTS (
-- here use template and put column list from table 1
SELECT CASE WHEN t1.Col1 IS NOT NULL THEN -1 ELSE 0 END,
CASE WHEN t1.Col2 IS NOT NULL THEN -1 ELSE 0 END,
CASE WHEN t1.Col3 IS NOT NULL THEN -1 ELSE 0 END
INTERSECT
-- here use template and put column list from table 2
SELECT CASE WHEN t2.Col1 IS NOT NULL THEN -1 ELSE 0 END,
CASE WHEN t2.Col2 IS NOT NULL THEN -1 ELSE 0 END,
CASE WHEN t2.Col3 IS NOT NULL THEN -1 ELSE 0 END)
THEN 1 ELSE 0 END AS Discrepancy
FROM @Tbl1 t1
FULL JOIN @Tbl2 t2
ON t1.ID = t2.ID;
Rextester.com Demo
It is a variant of IS DISTINCT FROM
. If you care about actual values then:
SELECT
t1.ID AS Tbl1_ID
,t2.ID AS Tbl2_ID
,CASE WHEN NOT EXISTS (
SELECT t1.Col1, t1.Col2, t1.Col3
INTERSECT
SELECT t2.Col1, t2.Col2, t2.Col3)
THEN 1 ELSE 0 END AS Discrepancy
FROM @Tbl1 t1
FULL JOIN @Tbl2 t2
ON t1.ID = t2.ID;
回答2:
You mentioned "not checking values yet".
Here is one approach which will check both PK and values (assuming the tables have identical structures)
I forced ID 1 to be identical on both tables just for illustration
Example
DECLARE @Tbl1 TABLE (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(10), Col3 VARCHAR(10));
DECLARE @Tbl2 TABLE (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(10), Col3 VARCHAR(10));
INSERT INTO @Tbl1 (ID, Col1, Col2, Col3) VALUES
(1, 'Val11', 'Val21', 'Val31')
,(2, NULL , 'Val21', NULL)
,(3, 'Val11', NULL, 'Val31')
,(4, 'Val11', NULL, 'Val31')
INSERT INTO @Tbl2 (ID, Col1, Col2, Col3) VALUES
(1, 'Val11', 'Val21', 'Val31')
,(2, NULL , 'Val22', NULL)
,(3, NULL, NULL, 'Val32')
,(5, NULL, NULL, 'Val32')
Select T1=max(case when Src='T1' then ID end)
,T2=max(case when Src='T2' then ID end)
,ID_Discrepancy = case when max(case when Src='T1' then ID end)=max(case when Src='T2' then ID end) then 0 else 1 end
,Value_Discrepancy = case when max(case when Src='T1' then Test else '' end) = max(case when Src='T2' then Test else '' end) then 0 else 1 end
From (
Select Src='T1',ID,Test=(Select A1.* for XML Raw) From @Tbl1 A1
Union All
Select Src='T2',ID,Test=(Select A2.* for XML Raw) From @Tbl2 A2
) A
Group By ID
Returns
T1 T2 ID_Discrepancy Value_Discrepancy
1 1 0 0
2 2 0 1
3 3 0 1
4 NULL 1 1
NULL 5 1 1
回答3:
Not a whole new approach, but you could shorten the code with the help of a function which returns 0 and 1 depending on whether both arguments are null or both are not null
create function discr(@a varchar, @b varchar) returns int
as
begin
return iif((@a is null and not @b is null) or (@b is null and not @a is null),1,0)
end
and then
select sign(
dbo.discr(a.Id, b.Id) +
dbo.discr(a.col1, b.col1) +
dbo.discr(a.col2, b.col2) +
dbo.discr(a.col3, b.col3))
from @tbl1 a full outer join @tbl2 b on a.Id = b.Id
回答4:
Well, I like Lukasz's answer best, and also that John has an XML based solution (as usual), but it was a nice question that got me thinking about if I can come up with a new and creative way to solve this, (no magic strings allowed, of course, that's just cheating) - so I came up with this idea. I'll be the first one to admit it's not as elegant as other solutions - but it was fun writing:
SELECT T1.ID As Tbl1_ID,
T2.ID As Tbl2_ID,
CASE WHEN
T1.ID = T2.ID AND
ISNULL(SIGN(LEN(T1.COL1)),-1) = ISNULL(SIGN(LEN(T2.COL1)),-1) AND
ISNULL(SIGN(LEN(T1.COL2)),-1) = ISNULL(SIGN(LEN(T2.COL2)),-1) AND
ISNULL(SIGN(LEN(T1.COL3)),-1) = ISNULL(SIGN(LEN(T2.COL3)),-1)
THEN
1
ELSE
0
END
FROM @Tbl1 T1
FULL JOIN @Tbl2 T2 ON T1.ID = T2.ID
And just for fun, with no practical use - a version only using ISNULL, NULLIF, COALESCE and SIGN:
SELECT T1.ID As Tbl1_ID,
T2.ID As Tbl2_ID,
ISNULL(
NULLIF(
SIGN(
COALESCE(
ISNULL(NULLIF(T1.ID, T2.ID), NULLIF(T2.ID, T1.ID)), -- Ids
NULLIF(ISNULL(SIGN(LEN(T1.COL1)),-1), ISNULL(SIGN(LEN(T2.COL1)),-1)), -- Col1
NULLIF(ISNULL(SIGN(LEN(T1.COL2)),-1), ISNULL(SIGN(LEN(T2.COL2)),-1)), -- Col2
NULLIF(ISNULL(SIGN(LEN(T1.COL3)), -1), ISNULL(SIGN(LEN(T2.COL3)),-1)), -- Col3
-2)
),
-1),
0)
FROM @Tbl1 T1
FULL JOIN @Tbl2 T2 ON T1.ID = T2.ID
来源:https://stackoverflow.com/questions/51972449/sql-server-compare-2-tables-for-data-in-the-same-columns-without-checking-for