Iterate through columns and list all columns where a record has a value

Question

I have the following table:

ID    Group   Col1    Col2    Col3    Col4    ... ColN
------------------------------------------------------
1     AAA     foo     bar
2     AAA     123     far             baz
3     BBB
4     CCC     345     123
5     AAA                     caz

For each Group, I need to find out which columns have a value. I do not care about the values themselves.

Example:

Group AAA has 3 IDs in it: 1, 2, 5. ID 1 has a value in Col1 and Col2. ID 2 has a value in Col1, Col2 and Col4. ID 5 has a value in Col3. So in total, Group AAA has values in Col1, Col2, Col3 and Col4.

The output should then look like this (also listing Groups that have no value in any column as null):

Group    Cols
------------------------------
AAA      Col1; Col2; Col3; Col4
BBB      null
CCC      Col1; Col2

I have hundreds of columns, and hundreds of thousands of records.

Can anyone help me get started? I don't know how I can dynamically iterate through all the column names and list them.

Answer 1:

The OP's comment that they have hundreds of columns suggests that they need a dynamic solution. I finished this solution just as the OP commented that they are using SQL Server 2016. It relies on STRING_AGG, which was only introduced in SQL Server 2017, so it will not work on 2016 as written. The OP will need to convert it to the older FOR XML PATH and STUFF method instead of using STRING_AGG.

Other than that, this works:

USE Sandbox;
GO


CREATE TABLE dbo.YourTable (ID int,
                            [Group] char(3),
                            Col1 char(3),
                            Col2 char(3),
                            Col3 char(3),
                            Col4 char(3));
GO

INSERT INTO dbo.YourTable
VALUES(1,'AAA','foo','bar',NULL,NULL),
      (2,'AAA','123','far',NULL,'baz'),
      (3,'BBB',NULL,NULL,NULL,NULL),
      (4,'CCC','345','123',NULL,NULL),
      (5,'AAA',NULL,NULL,'caz',NULL);
GO

--Hard coded example, to get the idea correct first
WITH UnPvt AS(
    SELECT DISTINCT
           YT.[Group],
           V.ColumnName
    FROM dbo.YourTable YT
         CROSS APPLY (VALUES(N'Col1',Col1),
                            (N'Col2',Col2),
                            (N'Col3',Col3),
                            (N'Col4',Col4))V(ColumnName,ColumnValue)
    WHERE V.ColumnValue IS NOT NULL)
SELECT YT.[Group],
       STRING_AGG(U.ColumnName,'; ') WITHIN GROUP (ORDER BY U.ColumnName) AS Cols
FROM (SELECT DISTINCT [Group] FROM dbo.YourTable) YT
      LEFT JOIN UnPvt U ON YT.[Group] = U.[Group]
GROUP BY YT.[Group];

GO

--Dynamic Solution
DECLARE @SchemaName sysname = N'dbo',
        @TableName sysname = N'YourTable';

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);
DECLARE @Delimiter nvarchar(50) = N',' + @CRLF + N'                           ';

--The table name is built from @SchemaName and @TableName rather than hard coded.
--The CONVERT to nvarchar(MAX) stops STRING_AGG being limited to 4,000 characters,
--which hundreds of columns would easily exceed.
SET @SQL = N'WITH UnPvt AS(' + @CRLF +
           N'   SELECT DISTINCT' + @CRLF +
           N'          YT.[Group],' + @CRLF +
           N'          V.ColumnName' + @CRLF +
           N'   FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N' YT' + @CRLF +
           N'        CROSS APPLY (VALUES' + 
           (SELECT STRING_AGG(CONVERT(nvarchar(MAX),N'(N' + QUOTENAME(c.[name],'''') + N',' + QUOTENAME(c.[name]) + N')'),@Delimiter) WITHIN GROUP (ORDER BY c.[name])
            FROM sys.schemas s
                 JOIN sys.tables t ON s.schema_id = t.schema_id
                 JOIN sys.columns c ON t.object_id = c.object_id
            WHERE s.[name] = @SchemaName
              AND t.[name] = @TableName
              AND c.[name] NOT IN (N'ID',N'Group')) + N')V(ColumnName,ColumnValue)' + @CRLF +
           N'    WHERE V.ColumnValue IS NOT NULL)' + @CRLF +
           N'SELECT YT.[Group],' + @CRLF +
           N'       STRING_AGG(U.ColumnName,''; '') WITHIN GROUP (ORDER BY U.ColumnName) AS Cols' + @CRLF +
           N'FROM (SELECT DISTINCT [Group] FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N') YT' + @CRLF +
           N'      LEFT JOIN UnPvt U ON YT.[Group] = U.[Group]' + @CRLF +
           N'GROUP BY YT.[Group];';

PRINT @SQL;

EXEC sp_executesql @SQL;

GO

DROP TABLE dbo.YourTable;

Note that this also assumes that all columns (apart from ID and Group) have the same data type, because every row of the VALUES constructor feeds the same ColumnValue column.
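If the columns do not all share one data type, one possible workaround is to unpivot a flag instead of the value itself, since only the presence of a value matters here. The following is a minimal sketch, not part of the original solution: the table variable @t, its mixed column types and the HasValue alias are made up for illustration.

```sql
-- Hypothetical sample table with mixed data types (varchar, int, date)
DECLARE @t TABLE (ID int, [Group] char(3), Col1 varchar(10), Col2 int, Col3 date);
INSERT INTO @t VALUES
 (1,'AAA','foo',NULL,NULL),
 (2,'AAA',NULL,42,NULL),
 (3,'BBB',NULL,NULL,NULL);

-- Each CASE yields an int flag (1 or NULL), so every VALUES row has the same type
-- regardless of the type of the underlying column
SELECT DISTINCT
       YT.[Group],
       V.ColumnName
FROM @t YT
     CROSS APPLY (VALUES(N'Col1',CASE WHEN YT.Col1 IS NOT NULL THEN 1 END),
                        (N'Col2',CASE WHEN YT.Col2 IS NOT NULL THEN 1 END),
                        (N'Col3',CASE WHEN YT.Col3 IS NOT NULL THEN 1 END))V(ColumnName,HasValue)
WHERE V.HasValue = 1;
```

In the dynamic version, the same idea would mean generating a CASE expression around QUOTENAME(c.[name]) for each column, rather than the bare column name.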

Edit: Sigh... FOR XML PATH solution, for SQL Server 2016 and older:

DECLARE @SchemaName sysname = N'dbo',
        @TableName sysname = N'YourTable';

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);
DECLARE @Delimiter nvarchar(50) = N',' + @CRLF + N'                           ';

--The table name is built from @SchemaName and @TableName rather than hard coded
SET @SQL = N'WITH UnPvt AS(' + @CRLF +
           N'   SELECT DISTINCT' + @CRLF +
           N'          YT.[Group],' + @CRLF +
           N'          V.ColumnName' + @CRLF +
           N'   FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N' YT' + @CRLF +
           N'        CROSS APPLY (VALUES' + 
           STUFF((SELECT @Delimiter + N'(N' + QUOTENAME(c.[name],'''') + N',' + QUOTENAME(c.[name]) + N')'
                  FROM sys.schemas s
                       JOIN sys.tables t ON s.schema_id = t.schema_id
                       JOIN sys.columns c ON t.object_id = c.object_id
                  WHERE s.[name] = @SchemaName
                    AND t.[name] = @TableName
                    AND c.[name] NOT IN (N'ID',N'Group')
                  ORDER BY c.[name]
                  FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,DATALENGTH(@Delimiter)/2,N'') + N')V(ColumnName,ColumnValue)' + @CRLF +
           N'    WHERE V.ColumnValue IS NOT NULL)' + @CRLF +
           N'SELECT YT.[Group],' + @CRLF +
           N'       STUFF((SELECT N''; '' + ColumnName' + @CRLF +
           N'              FROM UnPvt U' + @CRLF +
           N'              WHERE U.[Group] = YT.[Group]' + @CRLF +
           N'              ORDER BY U.ColumnName' + @CRLF +
           N'              FOR XML PATH(''''),TYPE).value(''.'',''nvarchar(MAX)''),1,2,N'''') AS Cols' + @CRLF +
           N'FROM (SELECT DISTINCT [Group] FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N') YT' + @CRLF +
           N'GROUP BY YT.[Group];';

PRINT @SQL;

EXEC sp_executesql @SQL;

Answer 2:

One method uses concat_ws():

select t.grp,
       concat_ws(',',
                 (case when max(col1) is not null then 'col1' end),
                 (case when max(col2) is not null then 'col2' end),
                 . . .  -- fill in the logic for the rest of the columns
                ) as columns
from t
group by grp;

Note: concat_ws() was introduced in SQL Server 2017. You can do something similar in older versions.
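As a rough sketch of what "something similar" could look like on SQL Server 2016, where concat_ws() is not available, plain string concatenation with ISNULL and STUFF gives a comparable result. This is only an illustration under my own assumptions: the table variable @t and its sample rows are copied from the question, and with hundreds of columns the list of terms would have to be generated rather than typed by hand.

```sql
-- Sample data from the question; @t is a hypothetical stand-in for the real table
DECLARE @t TABLE (ID int, [Group] char(3), Col1 char(3), Col2 char(3), Col3 char(3), Col4 char(3));
INSERT INTO @t VALUES
 (1,'AAA','foo','bar',NULL,NULL),
 (2,'AAA','123','far',NULL,'baz'),
 (3,'BBB',NULL,NULL,NULL,NULL),
 (4,'CCC','345','123',NULL,NULL),
 (5,'AAA',NULL,NULL,'caz',NULL);

SELECT t.[Group],
       -- Each term is '; ColN' when any row in the group has a value, otherwise ''.
       -- STUFF then removes the leading '; ' from the concatenated string.
       STUFF(  ISNULL(MAX(CASE WHEN t.Col1 IS NOT NULL THEN '; Col1' END),'')
             + ISNULL(MAX(CASE WHEN t.Col2 IS NOT NULL THEN '; Col2' END),'')
             + ISNULL(MAX(CASE WHEN t.Col3 IS NOT NULL THEN '; Col3' END),'')
             + ISNULL(MAX(CASE WHEN t.Col4 IS NOT NULL THEN '; Col4' END),''),
             1,2,'') AS Cols
FROM @t t
GROUP BY t.[Group];
```

For a group with no values at all the concatenation is an empty string, and STUFF should then return NULL because the start position lies beyond the end of the string, which matches the null the question asks for.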

Answer 3:

Performance will be bad, but this works generically:

DECLARE @mockupTable TABLE (ID INT,[Group] VARCHAR(10),Col1 VARCHAR(10),Col2 VARCHAR(10),Col3 VARCHAR(10),Col4 VARCHAR(10));
INSERT INTO @mockupTable VALUES
 (1,'AAA','foo','bar',NULL,NULL)
,(2,'AAA','123','far',NULL,'baz')
,(3,'BBB',NULL,NULL,NULL,NULL)
,(4,'CCC','345','123',NULL,NULL)
,(5,'AAA',NULL,NULL,'caz',NULL);

--the query

SELECT rw.query('for $n in * return concat(",",local-name($n))').value('.','nvarchar(max)')
FROM
(
SELECT * 
FROM @mockupTable t
FOR XML PATH('row'),TYPE
) A(x)
CROSS APPLY A.x.nodes('/row') B(rw);

The idea in short:

We transform the table to XML. Using SELECT * results in XML with one named element per column. That is the hook for getting at the column names generically.

The second thing to know: by default, NULL values are omitted from the XML.

The result reflects the "filled" columns.

The intermediate XML looks like this (just non-NULL values are represented):

<row>
  <ID>1</ID>
  <Group>AAA</Group>
  <Col1>foo</Col1>
  <Col2>bar</Col2>
</row>
<row>
  <ID>2</ID>
  <Group>AAA</Group>
  <Col1>123</Col1>
  <Col2>far</Col2>
  <Col4>baz</Col4>
</row>
<row>
  <ID>3</ID>
  <Group>BBB</Group>
</row>
<row>
  <ID>4</ID>
  <Group>CCC</Group>
  <Col1>345</Col1>
  <Col2>123</Col2>
</row>
<row>
  <ID>5</ID>
  <Group>AAA</Group>
  <Col3>caz</Col3>
</row>

UPDATE: Closer to your expected output

Try this to get ID and Group in your result

SELECT  rw.value('(ID/text())[1]','int') ID
       ,rw.value('(Group/text())[1]','varchar(10)') [Group]
       ,rw.query('for $n in *[local-name() ne "ID" and local-name() ne "Group"] return concat(",",local-name($n))').value('.','nvarchar(max)') usedColumns
FROM
(
SELECT * 
FROM @mockupTable t
FOR XML PATH('row'),TYPE
) A(x)
CROSS APPLY A.x.nodes('/row') B(rw);

The result

ID  Group   usedColumns
1   AAA     ,Col1 ,Col2
2   AAA     ,Col1 ,Col2 ,Col4
3   BBB 
4   CCC     ,Col1 ,Col2
5   AAA     ,Col3

UPDATE 2: Your grouped result

You can try this to get a grouped result fully generically

SELECT C.gr.value('text()[1]','varchar(100)')
      ,C.gr.query('for $n in row/*[local-name() ne "ID" and local-name() ne "Group"] 
                   return <n>{local-name($n)}</n>')
           .query('for $n in distinct-values(n/text()) 
                   return concat(",",$n)')
           .value('.','nvarchar(max)')
FROM
(
SELECT * 
FROM @mockupTable t
FOR XML PATH('row'),TYPE
) A(x)
CROSS APPLY (SELECT A.x.query('for $gr in distinct-values(/row/Group/text())
                               return <gr>{$gr}{/row[Group=$gr]}</gr>
                               ')) B(gr)
CROSS APPLY B.gr.nodes('/gr') C(gr);

The result

AAA     ,Col1 ,Col2 ,Col3 ,Col4
BBB     
CCC     ,Col1 ,Col2
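The leading comma and the " ," separators in this result can be tidied up with ordinary string functions. A minimal sketch, where @raw is a hypothetical variable standing in for the string returned by the .value('.','nvarchar(max)') call in the query:

```sql
-- @raw is a hypothetical stand-in for one row of the grouped result above
DECLARE @raw nvarchar(max) = N',Col1 ,Col2 ,Col3 ,Col4';

-- REPLACE turns each ' ,' separator into '; ',
-- STUFF then drops the one leading comma that is left
SELECT STUFF(REPLACE(@raw,N' ,',N'; '),1,1,N'') AS Cols;
```

For the AAA row this gives Col1; Col2; Col3; Col4. In the real query the STUFF(REPLACE(...)) wrapper would go around the whole .value(...) expression. For a group without any filled column the input is presumably an empty string, in which case STUFF should return NULL.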

Source: https://stackoverflow.com/questions/60756756/iterate-through-columns-and-list-all-columns-where-a-record-has-a-value

Tags: sql, sql-server, tsql, sql-server-2016