问题
I am querying a document management database where documents can have many versions. So my query below returns the same document multiple times because of the versioning.
SELECT distinct d.DOCNUM, d.DOCNAME,P.FolderName
FROM DOCUMENT as d with (nolock)
inner join Project_T as pt on d.docnum=pt.item_id
inner join Projects p on pt.prj_id=p.prj_id
where d.type = 'Personal' and d.owner like '%67360'
Results:
DOCNUM | DOCNAME | FOLDERNAME |
-----------+----------+---------------+
123 Article Jonathan
123 Article part1 Jonathan
256 Meeting Notes Jonathan
5697 Memo Jonathan
Ideally docnum 123 should only return once.
I have joined the table that holds the versions for each document and am selecting the latest version (MAX(h.version)
so my query should return unique document numbers per row.
SELECT distinct d.DOCNUM, h.version, d.DOCNAME,P.FolderName
FROM DOCUMENT as d with (nolock)
inner join DOCHISTORY as h on h.DOCNUM = d.docnum
inner join Project_T as pt on d.docnum=pt.item_id
inner join Projects p on pt.prj_id=p.prj_id
where d.type = 'Personal' and d.owner like '%67360'
and d.SECURITY = 'P'
AND h.VERSION = (SELECT MAX(x.version) FROM DOCHISTORY as X
where x.docnum = d.DOCNUM)
but this time I get the latest versions however, where docname is slightly different, it returns the same docnum multiple times. see docnum 123 below.
DOCNUM | Version | DOCNAME | FOLDERNAME |
-----------+--------------+-------------+------------+
123 9 Article Jonathan
123 9 Article part1 Jonathan
256 1 Meeting Notes Jonathan
5697 21 Memo Jonathan
I need to display the docname column in my report. Is there a way aorund this please?
回答1:
Yes, you can select the top 1 docname. Replace your field selection list with this:
SELECT distinct d.DOCNUM
,h.version
,DOCNAME = (SELECT TOP 1 d2.DOCNAME FROM DOCUMENT d2, DOCHISTORY h2
WHERE d2.DOCNUM = d.DOCNUM and h2.DOCNUM = d.docnum)
,P.FolderName
....(rest of code)
In the above output, doc 123 version 9 should appear as Article.
回答2:
What you are trying to do is much easier with window functions. Here is one way:
select DOCNUM, version, DOCNAME, FolderName
from (SELECT d.DOCNUM, h.version, d.DOCNAME, P.FolderName,
max(h.version) over (partition by d.docnum) as maxversion
FROM DOCUMENT d with (nolock)
inner join DOCHISTORY as h on h.DOCNUM = d.docnum
inner join Project_T as pt on d.docnum=pt.item_id
inner join Projects p on pt.prj_id=p.prj_id
where d.type = 'Personal' and d.owner like '%67360' and d.SECURITY = 'P'
) d
where version = maxversion;
The expression max(h.version) over (partition by d.docnum)
is a construct called a window function (or analytic function in some databases). What is does is return the maximum of h.version
for each value of d.docnum
(based on the partition by
clause). This is ANSI standard SQL and most databases support this functionality.
Your version is probably not working because the subquery that calculates the maximum value is not using the same filtering criteria as the outer query.
来源:https://stackoverflow.com/questions/18213088/select-distinct-rows-with-max-value-of-another-column