MSSQL - Making multiple count distinct calls in a query runs slowly

Posted by 蓝咒 on 2019-12-13 01:27:59

Question


I have tables with the following schema:

Device

  • DeviceId
  • Name

Service

  • ServiceId
  • Name

Software

  • SoftwareId
  • Name

Device_Software

  • DeviceId
  • SoftwareId
  • DiscoveryDate

Device_Service

  • DeviceId
  • ServiceId
  • DiscoveryDate

Now, I'm trying to write a query that returns each device along with the number of distinct software and services that device has.

If I run the following query, I get a result back within 5 seconds. For testing purposes, device has 50,000 rows, software and service each have 200 rows, and the link tables contain a link from every device to every software and every service.

SELECT
  device.name
  ,COUNT(DISTINCT(device_software.softwareId))
FROM
  device
LEFT OUTER JOIN
  device_software ON device.deviceId = device_software.deviceId
GROUP BY device.name

But if I expand the query to include both counts, it takes much longer (~30 minutes and still going):

SELECT
  device.name
  ,COUNT(DISTINCT(device_software.softwareId))
  ,COUNT(DISTINCT(device_service.serviceId))
FROM
  device
LEFT OUTER JOIN
  device_service ON device.deviceId = device_service.deviceId
LEFT OUTER JOIN
  device_software ON device.deviceId = device_software.deviceId
GROUP BY device.name

Since this is in a stored procedure, I could just get the two counts individually and combine them, but that seems like a hack. Does anyone know of a better way to do this in a single query without a massive performance hit?


Answer 1:


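Joining both link tables to device in one pass multiplies the rows per device (200 × 200 = 40,000 with your test data), so I'd try aggregating each link table on its own first and see if it makes a difference: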

SELECT
  device.name,
  a.cntSft,
  b.cntSrv
FROM device
LEFT JOIN
  ( SELECT deviceId, COUNT(DISTINCT softwareId) AS cntSft FROM device_software
    GROUP BY deviceId ) a ON a.deviceId = device.deviceId
LEFT JOIN
  ( SELECT deviceId, COUNT(DISTINCT serviceId) AS cntSrv FROM device_service
    GROUP BY deviceId ) b ON b.deviceId = device.deviceId;

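You may also not need COUNT(DISTINCT ...) with this version of the query. A plain COUNT is enough if each (deviceId, softwareId) and (deviceId, serviceId) pair appears only once in its link table.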
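
Whether a plain COUNT is safe depends on the link tables containing no duplicate pairs. Here is a minimal sketch of that check and of the simplified query, assuming the table and column names from the question. The COALESCE calls are an addition here, so that devices without any links report 0 instead of NULL, matching what the original joined query would return.

```sql
-- Check: is any (deviceId, softwareId) pair stored more than once?
-- An empty result means COUNT(*) and COUNT(DISTINCT softwareId) agree.
SELECT deviceId, softwareId, COUNT(*) AS copies
FROM device_software
GROUP BY deviceId, softwareId
HAVING COUNT(*) > 1;

-- Pre-aggregate each link table, then join the per-device totals.
SELECT
  d.name,
  COALESCE(sw.cntSft, 0) AS cntSft,  -- 0 for devices with no software rows
  COALESCE(sv.cntSrv, 0) AS cntSrv   -- 0 for devices with no service rows
FROM device AS d
LEFT JOIN
  ( SELECT deviceId, COUNT(*) AS cntSft
    FROM device_software
    GROUP BY deviceId ) AS sw ON sw.deviceId = d.deviceId
LEFT JOIN
  ( SELECT deviceId, COUNT(*) AS cntSrv
    FROM device_service
    GROUP BY deviceId ) AS sv ON sv.deviceId = d.deviceId;
```

The same duplicate check applies to device_service, with serviceId in place of softwareId.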




Answer 2:


You could consider indexed views on Device_Software and Device_Service:

CREATE VIEW dbo.v_Device_Software
WITH SCHEMABINDING
AS
  SELECT DeviceId, SoftwareId, DeviceCount = COUNT_BIG(*)
    FROM dbo.Device_Software
    GROUP BY DeviceId, SoftwareId;
GO
CREATE UNIQUE CLUSTERED INDEX x ON dbo.v_Device_Software(DeviceId, SoftwareId);
GO

CREATE VIEW dbo.v_Device_Service
WITH SCHEMABINDING
AS
  SELECT DeviceId, ServiceId, DeviceCount = COUNT_BIG(*)
    FROM dbo.Device_Service
    GROUP BY DeviceId, ServiceId;
GO
CREATE UNIQUE CLUSTERED INDEX x ON dbo.v_Device_Service(DeviceId, ServiceId);
GO

Now your query becomes:

SELECT
  device.name
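  -- Joining both views yields one row per (software, service) pair
  -- for each device, so count the distinct ids rather than the rows.
  ,COUNT(DISTINCT vsoft.SoftwareId)
  ,COUNT(DISTINCT vserv.ServiceId)
FROM
  dbo.device
LEFT OUTER JOIN dbo.v_Device_Service AS vserv
  ON device.deviceId = vserv.DeviceId
LEFT OUTER JOIN dbo.v_Device_Software AS vsoft
  ON device.deviceId = vsoft.DeviceId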
GROUP BY device.name;

There are many restrictions, though, and you should be sure to test the impact this has on your entire workload, not just this one query.
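
Two of those restrictions come up quickly in practice. Indexed views require specific session SET options when the view and its index are created and when the base tables are modified. On editions other than Enterprise, the optimizer generally only uses the view's index when the query references the view with the NOEXPAND hint. Below is a rough sketch under those assumptions, using the view names defined above. It also aggregates each view separately, so the two joins do not multiply rows against each other. The aliases and the COALESCE defaults are illustrative choices, not part of the original answer.

```sql
-- Session settings that indexed views depend on.
SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT,
    CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;

-- Each view holds one row per (DeviceId, SoftwareId) or (DeviceId, ServiceId),
-- so counting view rows per device gives the distinct count directly.
SELECT
  d.name,
  COALESCE(sw.cntSft, 0) AS cntSft,
  COALESCE(sv.cntSrv, 0) AS cntSrv
FROM dbo.device AS d
LEFT JOIN
  ( SELECT DeviceId, COUNT_BIG(*) AS cntSft
    FROM dbo.v_Device_Software WITH (NOEXPAND)  -- read the view's index as-is
    GROUP BY DeviceId ) AS sw ON sw.DeviceId = d.deviceId
LEFT JOIN
  ( SELECT DeviceId, COUNT_BIG(*) AS cntSrv
    FROM dbo.v_Device_Service WITH (NOEXPAND)
    GROUP BY DeviceId ) AS sv ON sv.DeviceId = d.deviceId;
```

Comparing the actual execution plans with and without NOEXPAND is a reasonable way to confirm whether the view indexes are being used on a given edition.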



Source: https://stackoverflow.com/questions/12182704/mssql-making-multiple-count-distinct-calls-in-a-query-runs-slowly
