A query that is used to loop through 17 millions records to remove duplicates has been running now for about 16 hours and I wanted to know if the query is sto
I think you need to seriously consider your methodolology. You need to start thinking in sets (although for performance you may need batch processing, but not row by row against a 17 million record table.)
First do all of your records have duplicates? I suspect not, so the first thing you wan to do is limit your processing to only those records which have duplicates. Since this is a large table and you may need to do the deletes in batches over time depending on what other processing is going on, you first pull the records you want to deal with into a table of their own that you then index. You can also use a temp table if you are going to be able to do this all at the same time without ever stopping it other wise create a table in your database and drop at the end.
Something like (Note I didn't write the create index statments, I figure you can look that up yourself):
SELECT min(m.RecordID), m.longitude, m.latitude, m.businessname, m.phone
into #RecordsToKeep
FROM MyTable m
join
(select longitude, latitude, businessname, phone
from MyTable
group by longitude, latitude, businessname, phone
having count(*) >1) a
on a.longitude = m.longitude and a.latitude = m.latitude and
a.businessname = b.businessname and a.phone = b.phone
group by m.longitude, m.latitude, m.businessname, m.phone
ORDER BY CASE WHEN m.webAddress is not null THEN 1 ELSE 2 END,
CASE WHEN m.caption1 is not null THEN 1 ELSE 2 END,
CASE WHEN m.caption2 is not null THEN 1 ELSE 2 END
while (select count(*) from #RecordsToKeep) > 0
begin
select top 1000 *
into #Batch
from #RecordsToKeep
Delete m
from mytable m
join #Batch b
on b.longitude = m.longitude and b.latitude = m.latitude and
b.businessname = b.businessname and b.phone = b.phone
where r.recordid <> b.recordID
Delete r
from #RecordsToKeep r
join #Batch b on r.recordid = b.recordid
end
Delete m
from mytable m
join #RecordsToKeep r
on r.longitude = m.longitude and r.latitude = m.latitude and
r.businessname = b.businessname and r.phone = b.phone
where r.recordid <> m.recordID