How to improve performance of this query?

Posted by 我是研究僧i on 2019-12-24 12:57:34

Question


Following the earlier question "SQL Query how to summarize students record by date?", I was able to get the report I wanted.

I was told that in the real world the student table will have 30 million records. I have an index on (StudentID, Date). Any suggestions to improve the performance, or is there a better way to build the report?

Right now I have the following query

;with cte as
(
  select id, 
    studentid,
    date,
    '#'+subject+';'+grade+';'+convert(varchar(10), date, 101) report
  from student
) 
-- insert into studentreport
select distinct 
  studentid,
  STUFF(
         (SELECT cast(t2.report as varchar(50))
          FROM cte t2
          where c.StudentId = t2.StudentId
          order by t2.date desc
          FOR XML PATH (''))
          , 1, 0, '')  AS report
from cte c;

Answer 1:


Without seeing the execution plan, it's not really possible to write an optimized SQL statement so I'll make suggestions instead.

Don't use a CTE here: in my experience, CTEs often don't handle queries with large memory requirements well. Instead, stage the CTE data in a real table, either with a materialized/indexed view or with a working table (maybe a large temp table). Then run the second select (the one that follows the CTE) against the staged data to combine it into an ordered list.
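
As a rough illustration of the staging idea, the sketch below materializes the formatted rows once in a temp table and then builds the ordered list from it. The names #report_stage and ix_report_stage are hypothetical, and the column names and types are assumed to match the student table in the question. It also takes the distinct student IDs first, so the correlated subquery runs once per student rather than once per row. Whether this beats the CTE on 30 million rows can only be confirmed by comparing execution plans.

```sql
-- Stage the formatted rows once. #report_stage is a hypothetical name.
select studentid,
       date,
       '#' + subject + ';' + grade + ';' + convert(varchar(10), date, 101) as report
into   #report_stage
from   student;

-- Index the staged rows in the order the report reads them.
create clustered index ix_report_stage on #report_stage (studentid, date desc);

-- One row per student. The unnamed expression ('' + r.report) makes
-- FOR XML PATH('') concatenate the values instead of wrapping them in elements.
select s.studentid,
       (select '' + r.report
        from   #report_stage r
        where  r.studentid = s.studentid
        order  by r.date desc
        for xml path(''), type).value('.', 'varchar(max)') as report
from   (select distinct studentid from #report_stage) s;
```

The .value('.', 'varchar(max)') call is there so that characters such as & in a subject name come back unescaped.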

The number of comments on your question indicates that you have a large problem (or problems). You're converting tall and skinny data (think integers and datetime2 types) into ordered lists within strings. Try instead to store the data in the smallest formats available and defer building strings until afterward (or never). Alternatively, give serious thought to creating an XML data field to replace the 'report' field.
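
To make the "smallest data formats" suggestion concrete, here is a minimal sketch. Every table and column name in it (subject, grade, student_compact, subjectid, gradeid, label) is an assumption for illustration and does not come from the question's schema. Subjects and grades become small integer keys, and the strings only appear when the report is read.

```sql
-- Hypothetical lookup tables: names and types are assumptions for illustration.
create table subject (subjectid smallint not null primary key, name  varchar(40) not null);
create table grade   (gradeid   tinyint  not null primary key, label varchar(4)  not null);

-- Narrow fact table: fixed-width keys instead of repeated strings.
create table student_compact (
    id        int      not null primary key,
    studentid int      not null,
    date      date     not null,
    subjectid smallint not null references subject (subjectid),
    gradeid   tinyint  not null references grade (gradeid)
);

-- Strings are produced only at read time, by joining to the lookups.
select sc.studentid, sc.date, su.name as subject, g.label as grade
from   student_compact sc
join   subject su on su.subjectid = sc.subjectid
join   grade   g  on g.gradeid    = sc.gradeid
order  by sc.studentid, sc.date desc;
```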

If you can make it work, this is what I would do (including a test case without indexes). Your mileage may vary, but give it a try:

create table #student (id int not null, studentid int not null, date datetime not null, subject varchar(40), grade varchar(40))

insert into #student (id,studentid,date,subject,grade)
select 1, 1, getdate(), 'history', 'A-' union all
select 2, 1, dateadd(d,1,getdate()), 'computer science', 'b' union all
select 3, 1, dateadd(d,2,getdate()), 'art', 'q' union all
--
select 1, 2, getdate() , 'something', 'F' union all
select 2, 2, dateadd(d,1,getdate()), 'genetics', 'e' union all
select 3, 2, dateadd(d,2,getdate()), 'art', 'D+' union all
--
select 1, 3, getdate() , 'memory loss', 'A-' union all
select 2, 3, dateadd(d,1,getdate()), 'creative writing', 'A-' union all
select 3, 3, dateadd(d,2,getdate()), 'history of asia 101', 'A-'

go

-- One row per student, with that student's reports as XML, newest first.
select      s1.studentid as studentid
            ,(select s2.date as '@date', s2.subject as '@subject', s2.grade as '@grade'
            from #student s2
            where s1.studentid = s2.studentid
            order by s2.date desc
            for xml path('report'), type) as 'reports'
from        (select distinct studentid from #student) s1;

I don't know how to make the output legible here, but the result set has two fields: field 1 is an integer (the studentid) and field 2 is XML with one node per report. This still isn't as good as just sending the raw result set, but it does at least give one row per studentid.



Source: https://stackoverflow.com/questions/16425954/how-to-improve-performance-of-this-query
