The query below is based on a complicated view and the view works as I want it to (I\'m not going to include the view because I don\'t think it will help with the question a
Doing a count(distinct)
as a windows function requires a trick. Several levels of tricks, actually.
Because your request is actually truly simple -- the value is always 1 because rx.drugClass is in the partitioning clause -- I will make an assumption. Let's say you want to count the number of unique drug classes per patid.
If so, do a row_number()
partitioned by patid and drugClass. When this is 1, within a patid, , then a new drugClass is starting. Create a flag that is 1 in this case and 0 in all other cases.
Then, you can simply do a sum
with a partitioning clause to get the number of distinct values.
The query (after formatting it so I can read it), looks like:
select rx.patid, d2.fillDate, d2.scriptEndDate, rx.drugName, rx.drugClass,
SUM(IsFirstRowInGroup) over (partition by rx.patid) as NumDrugCount
from (select distinct rx.patid, d2.fillDate, d2.scriptEndDate, rx.drugName, rx.drugClass,
(case when 1 = ROW_NUMBER() over (partition by rx.drugClass, rx.patid order by (select NULL))
then 1 else 0
end) as IsFirstRowInGroup
from (select ROW_NUMBER() over(partition by d.patid order by d.patid,d.uniquedrugsintimeframe desc) as rn,
d.patid, d.fillDate, d.scriptEndDate, d.uniqueDrugsInTimeFrame
from DrugsPerTimeFrame as d
) d2 inner join
rx
on rx.patid = d2.patid inner join
DrugTable dt
on dt.drugClass = rx.drugClass
where d2.rn=1 and rx.fillDate between d2.fillDate and d2.scriptEndDate and
dt.drugClass in ('h3a','h6h','h4b','h2f','h2s','j7c','h2e')
) t
order by patid
I came across this question in search for a solution to my problem of counting distinct values. In searching for an answer I came across this post. See last comment. I've tested it and used the SQL. It works really well for me and I figured that I would provide another solution here.
In summary, using DENSE_RANK()
, with PARTITION BY
the grouped columns, and ORDER BY
both ASC
and DESC
on the columns to count:
DENSE_RANK() OVER (PARTITION BY drugClass ORDER BY drugName ASC) +
DENSE_RANK() OVER (PARTITION BY drugClass ORDER BY drugName DESC) - 1 AS drugCountsInFamilies
I use this as a template for myself.
DENSE_RANK() OVER (PARTITION BY PartitionByFields ORDER BY OrderByFields ASC ) +
DENSE_RANK() OVER (PARTITION BY PartitionByFields ORDER BY OrderByFields DESC) - 1 AS DistinctCount
I hope this helps!
Why would something like this not work?
SELECT
IDCol_1
,IDCol_2
,Count(*) Over(Partition By IDCol_1, IDCol_2 order by IDCol_1) as numDistinct
FROM Table_1