Question
I have the following table:
ID  Group  Col1  Col2  Col3  Col4  ...  ColN
------------------------------------------------------
1   AAA    foo   bar
2   AAA    123   far         baz
3   BBB
4   CCC    345   123
5   AAA                caz
For each Group, I need to find out in which columns it has a value. I do not care about the values themselves.
Example:
Group AAA has 3 IDs in it: 1, 2, 5.
ID 1 has a value in Col1, Col2.
ID 2 has a value in Col1, Col2.
ID 5 has a value in Col3.
So in total, Group AAA has values in Col1, Col2, Col3.
The output should then look like this (also listing Groups that have no value for any column as null):
Group  Cols
------------------------------
AAA    Col1; Col2; Col3; Col4
BBB    null
CCC    Col3
I have hundreds of columns, and hundreds of thousands of records.
Can anyone help me get started? I don't know how I can dynamically iterate through all the column names and list them.
Answer 1:
The OP's comment that they have hundreds of columns suggests that they need a dynamic solution. I finished this solution just as the OP commented that they are using SQL Server 2016, so it will not work there as written: STRING_AGG requires SQL Server 2017 or later. The OP will need to convert it to the older FOR XML PATH and STUFF method instead of using STRING_AGG.
Other than that, this works:
USE Sandbox;
GO
CREATE TABLE dbo.YourTable (ID int,
[Group] char(3),
Col1 char(3),
Col2 char(3),
Col3 char(3),
Col4 char(3));
GO
INSERT INTO dbo.YourTable
VALUES(1,'AAA','foo','bar',NULL,NULL),
(2,'AAA','123','far',NULL,'baz'),
(3,'BBB',NULL,NULL,NULL,NULL),
(4,'CCC','345','123',NULL,NULL),
(5,'AAA',NULL,NULL,'caz',NULL);
GO
--Hard coded example, to get the idea correct first
WITH UnPvt AS(
    --Unpivot each row into (column name, value) pairs and keep the non-NULL ones
    SELECT DISTINCT
           YT.[Group],
           V.ColumnName
    FROM dbo.YourTable YT
         CROSS APPLY (VALUES(N'Col1',Col1),
                            (N'Col2',Col2),
                            (N'Col3',Col3),
                            (N'Col4',Col4))V(ColumnName,ColumnValue)
    WHERE V.ColumnValue IS NOT NULL)
--LEFT JOIN from the full list of groups, so groups with no values still appear (as NULL)
SELECT YT.[Group],
       STRING_AGG(U.ColumnName,'; ') WITHIN GROUP (ORDER BY U.ColumnName) AS Cols
FROM (SELECT DISTINCT [Group] FROM dbo.YourTable) YT
     LEFT JOIN UnPvt U ON YT.[Group] = U.[Group]
GROUP BY YT.[Group];
GO
--Dynamic Solution
DECLARE @SchemaName sysname = N'dbo',
        @TableName sysname = N'YourTable';
DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);
DECLARE @Delimiter nvarchar(50) = N',' + @CRLF + N'                            ';
--The table name in the generated query comes from the variables above
SET @SQL = N'WITH UnPvt AS(' + @CRLF +
           N'    SELECT DISTINCT' + @CRLF +
           N'           YT.[Group],' + @CRLF +
           N'           V.ColumnName' + @CRLF +
           N'    FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N' YT' + @CRLF +
           N'         CROSS APPLY (VALUES' +
           --CONVERT to nvarchar(MAX) so STRING_AGG is not capped at 8,000 bytes with hundreds of columns
           (SELECT STRING_AGG(CONVERT(nvarchar(MAX),N'(N' + QUOTENAME(c.[name],'''') + N',' + QUOTENAME(c.[name]) + N')'),@Delimiter) WITHIN GROUP (ORDER BY c.[name])
            FROM sys.schemas s
                 JOIN sys.tables t ON s.schema_id = t.schema_id
                 JOIN sys.columns c ON t.object_id = c.object_id
            WHERE s.[name] = @SchemaName
              AND t.[name] = @TableName
              AND c.[name] NOT IN (N'ID',N'Group')) + N')V(ColumnName,ColumnValue)' + @CRLF +
           N'    WHERE V.ColumnValue IS NOT NULL)' + @CRLF +
           N'SELECT YT.[Group],' + @CRLF +
           N'       STRING_AGG(U.ColumnName,''; '') WITHIN GROUP (ORDER BY U.ColumnName) AS Cols' + @CRLF +
           N'FROM (SELECT DISTINCT [Group] FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N') YT' + @CRLF +
           N'     LEFT JOIN UnPvt U ON YT.[Group] = U.[Group]' + @CRLF +
           N'GROUP BY YT.[Group];';
PRINT @SQL;
EXEC sp_executesql @SQL;
GO
DROP TABLE dbo.YourTable;
Note that this also assumes that all columns (apart from ID and Group) have the same data type, because the VALUES constructor puts every one of them into the single ColumnValue column.
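If the columns do not all share one data type, one possible workaround is to keep the values out of the VALUES constructor and pass a flag instead. The following is only a sketch of the unpivot step, written as a variation on the hard-coded example; it assumes dbo.YourTable exists as created above, and the HasValue alias is made up for illustration.

```sql
-- Sketch: unpivot a "has a value" flag rather than the value itself,
-- so the data types of Col1..Col4 no longer need to match.
-- Assumes dbo.YourTable as created above; HasValue is an illustrative alias.
WITH UnPvt AS(
    SELECT DISTINCT
           YT.[Group],
           V.ColumnName
    FROM dbo.YourTable YT
         CROSS APPLY (VALUES(N'Col1',CASE WHEN YT.Col1 IS NOT NULL THEN 1 END),
                            (N'Col2',CASE WHEN YT.Col2 IS NOT NULL THEN 1 END),
                            (N'Col3',CASE WHEN YT.Col3 IS NOT NULL THEN 1 END),
                            (N'Col4',CASE WHEN YT.Col4 IS NOT NULL THEN 1 END))V(ColumnName,HasValue)
    WHERE V.HasValue = 1)
-- One row per (group, column) pair in which at least one record has a value
SELECT [Group],
       ColumnName
FROM UnPvt
ORDER BY [Group],
         ColumnName;
```

In the dynamic version, the same CASE expression would take the place of the second item in each generated VALUES row.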
Edit: Sigh... FOR XML PATH solution:
DECLARE @SchemaName sysname = N'dbo',
        @TableName sysname = N'YourTable';
DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);
DECLARE @Delimiter nvarchar(50) = N',' + @CRLF + N'                            ';
--The table name in the generated query comes from the variables above
SET @SQL = N'WITH UnPvt AS(' + @CRLF +
           N'    SELECT DISTINCT' + @CRLF +
           N'           YT.[Group],' + @CRLF +
           N'           V.ColumnName' + @CRLF +
           N'    FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N' YT' + @CRLF +
           N'         CROSS APPLY (VALUES' +
           --STUFF removes the leading delimiter from the concatenated list
           STUFF((SELECT @Delimiter + N'(N' + QUOTENAME(c.[name],'''') + N',' + QUOTENAME(c.[name]) + N')'
                  FROM sys.schemas s
                       JOIN sys.tables t ON s.schema_id = t.schema_id
                       JOIN sys.columns c ON t.object_id = c.object_id
                  WHERE s.[name] = @SchemaName
                    AND t.[name] = @TableName
                    AND c.[name] NOT IN (N'ID',N'Group')
                  ORDER BY c.[name]
                  FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,DATALENGTH(@Delimiter)/2,N'') + N')V(ColumnName,ColumnValue)' + @CRLF +
           N'    WHERE V.ColumnValue IS NOT NULL)' + @CRLF +
           N'SELECT YT.[Group],' + @CRLF +
           N'       STUFF((SELECT N''; '' + U.ColumnName' + @CRLF +
           N'              FROM UnPvt U' + @CRLF +
           N'              WHERE U.[Group] = YT.[Group]' + @CRLF +
           N'              ORDER BY U.ColumnName' + @CRLF +
           N'              FOR XML PATH(''''),TYPE).value(''.'',''nvarchar(MAX)''),1,2,N'''') AS Cols' + @CRLF +
           N'FROM (SELECT DISTINCT [Group] FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N') YT' + @CRLF +
           N'GROUP BY YT.[Group];';
PRINT @SQL;
EXEC sp_executesql @SQL;
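For reference, the statement that this dynamic SQL generates for the four-column sample table should look roughly like the hard-coded sketch below. It assumes dbo.YourTable has been (re)created with the sample data from earlier in this answer.

```sql
-- Hard-coded FOR XML PATH / STUFF version of the query (a sketch for SQL Server 2016 and older).
-- Assumes dbo.YourTable exists with columns ID, [Group], Col1..Col4.
WITH UnPvt AS(
    SELECT DISTINCT
           YT.[Group],
           V.ColumnName
    FROM dbo.YourTable YT
         CROSS APPLY (VALUES(N'Col1',YT.Col1),
                            (N'Col2',YT.Col2),
                            (N'Col3',YT.Col3),
                            (N'Col4',YT.Col4))V(ColumnName,ColumnValue)
    WHERE V.ColumnValue IS NOT NULL)
SELECT YT.[Group],
       -- Concatenate '; ' + name for each column of the group, then strip the leading '; '
       STUFF((SELECT N'; ' + U.ColumnName
              FROM UnPvt U
              WHERE U.[Group] = YT.[Group]
              ORDER BY U.ColumnName
              FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,2,N'') AS Cols
FROM (SELECT DISTINCT [Group] FROM dbo.YourTable) YT;
```

With the sample data this should return Col1; Col2; Col3; Col4 for AAA, NULL for BBB (the subquery finds no rows, so STUFF receives NULL), and Col1; Col2 for CCC.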
Answer 2:
One method uses concat_ws():
select t.grp,
       concat_ws(',',
                 (case when max(col1) is not null then 'col1' end),
                 (case when max(col2) is not null then 'col2' end),
                 . . . -- fill in the logic for the rest of the columns
                ) as columns
from t
group by grp;
Note: concat_ws() was introduced in SQL Server 2017. You can do something similar in older versions.
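As a rough sketch of what "something similar" could look like before SQL Server 2017, CONCAT() (available from SQL Server 2012) can be combined with STUFF(): each CASE carries its own leading separator, and STUFF() removes the first one. The table variable @t and its three columns are made up here purely so that the example runs on its own.

```sql
-- Hypothetical sample data, for illustration only
DECLARE @t TABLE (grp char(3), col1 char(3), col2 char(3), col3 char(3));
INSERT INTO @t
VALUES ('AAA','foo',NULL,NULL),
       ('AAA',NULL,'bar',NULL),
       ('BBB',NULL,NULL,NULL);

select t.grp,
       -- concat() treats NULL as an empty string, so only columns with a value contribute;
       -- stuff() strips the leading comma; nullif() makes sure an empty list shows as NULL
       nullif(stuff(concat(case when max(t.col1) is not null then ',col1' end,
                           case when max(t.col2) is not null then ',col2' end,
                           case when max(t.col3) is not null then ',col3' end),
                    1, 1, ''), '') as cols
from @t t
group by t.grp;
```

For this sample data the query should return col1,col2 for AAA and NULL for BBB.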
Answer 3:
Performance will be bad, but this works generically:
DECLARE @mockupTable TABLE (ID INT,[Group] VARCHAR(10),Col1 VARCHAR(10),Col2 VARCHAR(10),Col3 VARCHAR(10),Col4 VARCHAR(10));
INSERT INTO @mockupTable VALUES
(1,'AAA','foo','bar',NULL,NULL)
,(2,'AAA','123','far',NULL,'baz')
,(3,'BBB',NULL,NULL,NULL,NULL)
,(4,'CCC','345','123',NULL,NULL)
,(5,'AAA',NULL,NULL,'caz',NULL);
--the query
SELECT rw.query('for $n in * return concat(",",local-name($n))').value('.','nvarchar(max)')
FROM
(
SELECT *
FROM @mockupTable t
FOR XML PATH('row'),TYPE
) A(x)
CROSS APPLY A.x.nodes('/row') B(rw);
The idea in short:
We transform the table to XML. Using SELECT * results in XML with one named element per column. That is the hook for getting at the column names generically.
The second thing to know: by default, NULL values are omitted from the XML.
So the result reflects only the "filled" columns.
The intermediate XML looks like this (just non-NULL values are represented):
<row>
<ID>1</ID>
<Group>AAA</Group>
<Col1>foo</Col1>
<Col2>bar</Col2>
</row>
<row>
<ID>2</ID>
<Group>AAA</Group>
<Col1>123</Col1>
<Col2>far</Col2>
<Col4>baz</Col4>
</row>
<row>
<ID>3</ID>
<Group>BBB</Group>
</row>
<row>
<ID>4</ID>
<Group>CCC</Group>
<Col1>345</Col1>
<Col2>123</Col2>
</row>
<row>
<ID>5</ID>
<Group>AAA</Group>
<Col3>caz</Col3>
</row>
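The NULL handling that this approach relies on can be checked in isolation. The two statements below are a minimal sketch with made-up column aliases a and b; the second one shows the ELEMENTS XSINIL directive, which changes the default and would therefore defeat the trick.

```sql
-- By default a NULL column produces no element, so only <a> appears in the row
SELECT 1 AS a,
       CAST(NULL AS varchar(10)) AS b
FOR XML PATH('row');

-- With ELEMENTS XSINIL the NULL column is emitted as an empty element marked xsi:nil="true"
SELECT 1 AS a,
       CAST(NULL AS varchar(10)) AS b
FOR XML PATH('row'), ELEMENTS XSINIL;
```

The first statement should return <row><a>1</a></row>.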
UPDATE: Closer to your expected output
Try this to get ID and Group in your result:
SELECT rw.value('(ID/text())[1]','int') ID
,rw.value('(Group/text())[1]','varchar(10)') [Group]
,rw.query('for $n in *[local-name() ne "ID" and local-name() ne "Group"] return concat(",",local-name($n))').value('.','nvarchar(max)') usedColumns
FROM
(
SELECT *
FROM @mockupTable t
FOR XML PATH('row'),TYPE
) A(x)
CROSS APPLY A.x.nodes('/row') B(rw);
The result
ID  Group  usedColumns
1   AAA    ,Col1 ,Col2
2   AAA    ,Col1 ,Col2 ,Col4
3   BBB
4   CCC    ,Col1 ,Col2
5   AAA    ,Col3
UPDATE 2: Your grouped result
You can try this to get a grouped result fully generically:
SELECT C.gr.value('text()[1]','varchar(100)')
,C.gr.query('for $n in row/*[local-name() ne "ID" and local-name() ne "Group"]
return <n>{local-name($n)}</n>')
.query('for $n in distinct-values(n/text())
return concat(",",$n)')
.value('.','nvarchar(max)')
FROM
(
SELECT *
FROM @mockupTable t
FOR XML PATH('row'),TYPE
) A(x)
CROSS APPLY (SELECT A.x.query('for $gr in distinct-values(/row/Group/text())
return <gr>{$gr}{/row[Group=$gr]}</gr>
')) B(gr)
CROSS APPLY B.gr.nodes('/gr') C(gr);
The result
AAA  ,Col1 ,Col2 ,Col3 ,Col4
BBB
CCC  ,Col1 ,Col2
Source: https://stackoverflow.com/questions/60756756/iterate-through-columns-and-list-all-columns-where-a-record-has-a-value