Question
I have a table of URL redirects in a SQL Server database. Each redirect has an ID, a FromURL and a ToURL field.
I've been asked to find chains of redirects in the table so that we can replace each chain with a single redirect, meaning users are redirected once rather than multiple times.
An example of the table is below:

As you can see, if a user visits URL A, they will be redirected to B, then from B to C, then from C to D. We'd like to replace this with a single redirect from A to D to speed up the page load.
I thought I might be able to do this without cursors by using a recursive CTE, but I got completely stuck. The best I managed to do was find the start of each chain with the following:
SELECT r.ID,
       r.FromURL,
       r.ToURL
FROM dbo.redirect r
WHERE r.FromURL NOT IN (SELECT r2.ToURL
                        FROM dbo.redirect r2)
This gives me the start of the chains (or the ones that aren't in a chain at all) by selecting any records where the FromURL hasn't been redirected by any other redirect. When I tried following through some of the recursive CTE examples, all I ended up with was junk data or hitting the recursion limit.
Ideally what I'd like to get out of this is data similar to the following:

As you can see, the chains of redirects have been replaced with a single one, so every level in the hierarchy now goes directly to the end of the chain.
I'm just a DBA who agreed to do something for our web team that I have now found completely out of my ability with T-SQL so if anyone can help me out that would be most appreciated.
Answer 1:
The general solution can be found searching for: "Directed Acyclic Graph", "Traversal", "SQL". hansolav.net/sql/graphs.html#topologicalsorting has some good info.
If you need a fast answer, here's a quick-and-dirty method. It's not efficient, and it needs acyclic input (if the redirects form a loop, the WHILE never finishes), but it's readable to someone not familiar with SQL.
SELECT id, FromURL, ToURL
INTO #temp
FROM dbo.redirect

WHILE @@ROWCOUNT > 0
BEGIN
    UPDATE cur
    SET ToURL = nxt.ToURL
    FROM #temp cur
    INNER JOIN #temp nxt ON (cur.ToURL = nxt.FromURL)
END

SELECT * FROM #temp
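Since the loop above depends on the data being acyclic, it may be worth capping the number of passes. The following is only a sketch of that idea, assuming the same dbo.redirect table; the variables @rows, @passes and @max_passes and the error text are made up for illustration. An acyclic table should settle in no more passes than it has rows, so anything still changing after that is treated as a suspected cycle.

```sql
-- Work on a copy so dbo.redirect itself is untouched.
SELECT id, FromURL, ToURL
INTO #temp
FROM dbo.redirect;

DECLARE @rows int = 1;     -- rows changed by the most recent pass
DECLARE @passes int = 0;   -- passes completed so far
-- Hypothetical cap: row count + 1 is more passes than an acyclic table needs.
DECLARE @max_passes int = (SELECT COUNT(*) FROM #temp) + 1;

WHILE @rows > 0 AND @passes < @max_passes
BEGIN
    -- Point each redirect at the target of the redirect it currently leads to.
    UPDATE cur
    SET ToURL = nxt.ToURL
    FROM #temp cur
    INNER JOIN #temp nxt ON cur.ToURL = nxt.FromURL;

    SET @rows = @@ROWCOUNT;
    SET @passes = @passes + 1;
END

IF @rows > 0
    -- Rows were still changing at the cap, which should only happen with a cycle.
    RAISERROR('Redirect cycle suspected: rows still changing after %d passes.', 16, 1, @passes);
ELSE
    SELECT id, FromURL, ToURL FROM #temp;
```

If the error fires, the contents of #temp should not be trusted for the rows involved in the cycle; those redirects would need fixing by hand first.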
Alternatively, with a recursive CTE:
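;WITH cte AS (
    -- Anchor: every redirect as it stands, counted as one hop
    SELECT 1 AS redirect_count, id, FromURL, ToURL
    FROM dbo.redirect
    UNION ALL
    -- Recursive step: follow the current target one hop further
    SELECT cur.redirect_count + 1, cur.id, cur.FromURL, nxt.ToURL
    FROM cte cur
    INNER JOIN dbo.redirect nxt ON (cur.ToURL = nxt.FromURL)
)
SELECT
    t1.id, t2.FromURL, t2.ToURL
FROM dbo.redirect t1
CROSS APPLY (
    -- The row with the most hops holds the end of the chain
    SELECT TOP 1 FromURL, ToURL
    FROM cte
    WHERE id = t1.id
    ORDER BY redirect_count DESC
) t2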
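To try the recursive CTE without touching real data, something like the following could be run as a single batch. It is a sketch under assumptions: the table variable @t, the ids, and the unrelated X to Y row are invented sample data standing in for the A, B, C, D chain described in the question, and @max_hops is a hypothetical guard. The guard stops the recursion after as many hops as there are rows, which no acyclic chain can exceed, so the default limit of 100 recursion levels can be lifted with MAXRECURSION 0 without risking an endless query.

```sql
-- Hypothetical sample data: the A -> B -> C -> D chain plus one standalone redirect.
DECLARE @t TABLE (id int PRIMARY KEY, FromURL varchar(200), ToURL varchar(200));
INSERT INTO @t (id, FromURL, ToURL)
VALUES (1, 'A', 'B'), (2, 'B', 'C'), (3, 'C', 'D'), (4, 'X', 'Y');

-- Hop cap: an acyclic chain cannot use more hops than there are rows.
DECLARE @max_hops int = (SELECT COUNT(*) FROM @t);

;WITH cte AS (
    SELECT 1 AS redirect_count, id, FromURL, ToURL
    FROM @t
    UNION ALL
    SELECT cur.redirect_count + 1, cur.id, cur.FromURL, nxt.ToURL
    FROM cte cur
    INNER JOIN @t nxt ON cur.ToURL = nxt.FromURL
    WHERE cur.redirect_count < @max_hops   -- stop following once the cap is reached
)
SELECT t1.id, t2.FromURL, t2.ToURL
FROM @t t1
CROSS APPLY (
    SELECT TOP 1 FromURL, ToURL
    FROM cte
    WHERE id = t1.id
    ORDER BY redirect_count DESC
) t2
ORDER BY t1.id
OPTION (MAXRECURSION 0);   -- safe here only because of the hop cap above
```

For this sample the result should be A to D, B to D, C to D and X to Y. If the data did contain a cycle, the cap would still end the query, but the ToURL reported for rows on the cycle would just be wherever the walk happened to stop.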
Source: https://stackoverflow.com/questions/21117854/find-the-start-and-end-of-a-redirect-chain