Create list of paths taken

问题

Given range (a,b) and lines (x,y), I want to construct all the possible ways to cover the range with the given lines.

For example with range (0,10) (if we filter list to be within range then we don't have to worry about it) and the following list (sorting it makes easier to pick next value),

list = [(0,1), (1,10), (1,4), (3,5), (5,10)]

I want to output list of paths taken to cover the range as follows,

[
[(0,1), (1,4), (3,5), (5,10)],
[(0,1), (1,10)]
]

I tried setting up function that would get list of next possible (x,y) values as follows, but it only prints a single path.

-- assume list is sorted based on first pair
nextpaths :: (Num a, Ord a) => [(a, a)] -> ([(a, a)], [(a, a)])
nextpaths ((start, end):xs) = go xs ([], [])
  where go [] acc = acc
        go (y:ys) (next, rest)| fst y <= end = go ys (y:next, rest)
                              | otherwise = (next, y:ys)

paths t@(x:xs) = case nextpaths t of
  ([],_) -> [[x]]
  (n:next, rest) -> map (x:) (paths (n:rest))

How would we make it so that paths functions applies to other next list values?

回答1:

We can generate a list of minimal paths: paths where we can not remove a single 2-tuple such that it is still a valid path.

Usually it is more efficient here to work with a sorted list of fragments, since we can scan the list and append items that are necessary. When we scan, we will need two things: the range we want to cover; and the last range, such that we guarantee minimality.

We will first construct a function where we assume we already selected a path. We thus can define a function with signature:

paths1 :: Ord a => (a, a) -> (a, a) -> [(a, a)] -> [[(a, a)]]

In case the last item selected is greater than or equal to the upperbound of the range, we are done. In that case, we return a singleton list with an empty list. The recursive call can then add the selected subpath to the list:

paths1 (a, f) (b, c) _ | c >= f = [[]]

In case the list of possible subranges is exhausted, we can not generate such path, we thus return an empty list in case the list of subranges is empty:

paths1 _ _ [] = []

In case we have not reached the end yet, we will need an extra subrange. Such subrange needs to satisfy two criteria: it should start after the previously selected subpath, and it should end after the previously selected subpath. We thus can skip suranges that do not satisfy that condition:

paths1 r s@(b, c) ((d, e):xs) | d > c = []
                              | d <= b || e <= c = paths1 r s xs

In case we can select the subrange, we thus can pick that one. In that case we thus update the last range selected and will the prepend all the paths that are returned:

paths1 r s@(_,sb) (x@(_, xb):xs) = map (x:) (paths1 r (sb,xb) xs) ++ paths1 r s xs

Now we thus have defined a complete implementation for paths1:

paths1 :: Ord a => (a, a) -> (a, a) -> [(a, a)] -> [[(a, a)]]
paths1 (a, f) (b, c) _ | c >= f = [[]]
paths1 _ _ [] = []
paths1 r s@(b, c) ((d, e):xs) | d > c = []
                              | d <= b || e <= c = paths1 r s xs
paths1 r s@(_,sb) (x@(_, xb):xs) = map (x:) (paths1 r (sb,xb) xs) ++ paths1 r s xs

We now need to implement a function that selects the first subrange. We can implement such function path0:

paths0 :: (a, a) -> [(a, a)] -> [[(a, a)]]

The first range we should select should start before on on the start of the range we want to generate, and after the start of the range. We thus can implement that as:

paths0 :: Ord a => (a, a) -> [(a, a)] -> [[(a, a)]]
paths0 (a, _) ((b, c):_) | b > a || c <= a = []
paths0 r@(a, _) ((_, c):xs) | c <= a = paths0 r xs
paths0 r (x:xs) = map (x:) (paths1 r x xs) ++ paths0 r xs

So now we can combine the two in a path function. We can first sort the list, or add this as a pre-condition:

import Data.List(sort)

paths :: (a, a) -> [(a, a)] -> [[(a, a)]]
paths = (. sort) . paths0

We then obtain the expected result:

Prelude Data.List> paths (0,10) [(0,1), (1,10), (1,4), (3,5), (5,10)]
[[(0,1),(1,4),(3,5),(5,10)],[(0,1),(1,10)]]

The above is not the most elegant solution. I leave "polishing" it further as an exercise.

回答2:

This is actually a problem of some depth.

Or, rather, the algorithm you ask for is simple (if approached with the right tools at hand); but checking whether it is correct is not, and it is very easy to make a slight mistake. This is because intervals are unlike numbers in that there is no such simple notion as the usual total order anymore, and what relations we have instead are tenfold more complex — too far so for the unarmed human mind to grasp.

Therefore, what our goals should be?

We need to understand how intervals relate to each other.
We need to be able to check if a given set of intervals is a solution to the problem.

In this writing, I will be saying "base" meaning the interval to be covered, and "chain" consisting of "links" meaning a set of intervals that may be covering it. (I will eventually justify this latter naming.)

So, let us arm ourselves.

With numbers (that is, single points), there are only 3 disjoint qualitative relations: a < b or a = b or a > b. What can we say about pairs of numbers (representing intervals) then? There are 5 places a point can be with respect to an interval:

             on the left end
             v
-- before -- * == inside == * -- after --
                            ^
                            on the right end

Considering that the left end of an interval is never to the right of its right end (duh), this gives us sum [5, 4.. 1] = 15 disjoint qualitative relations between two intervals. Disregarding the two relations where both ends of one interval are on the same end of another (meaning the interval is a point), that gives 13. And now there is a prior art discussing exactly 13 disjoint exhaustive relations on intervals. (Original article.)

Namely, there are defined these 6 relations:

precedes      =  \ i j  ->  right i < left j
meets         =  \ i j  ->  right i == left j && left i /= left j && right i /= right j
overlaps      =  \ i j  ->  left i < left j  && right i < right j && right i > left j
isFinishedBy  =  \ i j  ->  left i < left j  && right i == right j
contains      =  \ i j  ->  left i < left j  && right i > right j
starts        =  \ i j  ->  left i == left j && right i < right j

— Together with their inversions flip ... and the equality relation.

Whereas for numbers we can derive exactly 8 composite relations in terms of the 3 basic ones (considering relations as a vector space over the binary field), on intervals we can define about 8 thousand. Some of those will be of use to us within this problem:

absorbs         =  isFinishedBy `or` contains `or` flip starts `or` (==)
isDisjointWith  =  precedes     `or` flip precedes
joins           =  (fmap . fmap) not isDisjointWith
touches         =  meets        `or` overlaps
isRightwardsOf  =  flip (precedes `or` touches)
...

Given these relations, we can manipulate them to obtain all kinds of awesome devices, such as closures, equivalences and orders. I will presently use some to obtain a verifier of solutions to our problem.

A reflexive, symmetric and transitive closure of joins is an equivalence under which considered equivalent are those intervals that belong to a contiguous line. (While not necessarily being adjacent on that line.)
A normal set of intervals is such in which all intervals are disjoint.
- Any set may be normalized by gluing together intervals that join until none are left.
- Normalization preserves coverage: exactly when a point belongs to some of the intervals in a set, it will belong to some interval in its normalization.
A solution is a chain such that:
- Its normalization is a singleton set whose only member absorbs the base. (Sufficient.)
- With any link removed, this condition does not anymore hold. (Minimal.)

Therefore, normalize is a function that divides a set of intervals into classes of equivalence induced by joins and converts each class to an interval by taking the extrema of all the endpoints.

relation :: Ord a => Set a -> (a -> a -> Bool) -> Relation a

closure :: Relation a -> Relation a

classifyBy :: Ord a => (a -> a -> Bool) -> Set a -> Set (Set a)

(?) :: Eq a => Relation a -> (a, a) -> Bool

bounds :: Ord a => Set a -> Interval a

flatten :: Ord a => Set (Interval a) -> Set a

normalize :: Ord a => Set (Interval a) -> Set (Interval a)
normalize u | Set.null u = Set.empty
            | otherwise = let  rel = closure (relation u joins)
                               classes = classifyBy (curry (rel ?)) u
                          in Set.map (bounds . flatten) classes

In these terms, we can define the check:

isCovering :: Ord a => Interval a -> [Interval a] -> Bool
isCovering base xs = case (Set.toList . normalize . Set.fromList) xs of
                        [y] -> y `absorbs` base
                        _   -> False

isMinimalCovering :: Ord a => Interval a -> [Interval a] -> Bool
isMinimalCovering base xs = sufficient && minimal
    where sufficient = isCovering base xs
          minimal    = List.null . filter (isCovering base)
                                 . fmap (`deleteAt` xs) $ [0.. length xs - 1]

Not only that, we can define a filter:

bruteForceCoveringChains :: forall a. (Ord a, Num a)
                         => Interval a -> [Interval a] -> [[Interval a]]
bruteForceCoveringChains base xs = filter (isMinimalCovering base) (List.subsequences xs)

Time complexity of these devices is crazy. Empirically, this brute force solution can munch through a set of 10 intervals, but not 20. But this much is enough to check a candidate fast algorithm against.

Onwards now!

All the links in our chain must connect, like... links of a chain. One after the other. There is a relation for that: the one I named touches. If a series of intervals consecutively touch one another, we are certain that they cover the space from the beginning of the first to the ending of the last one. We can use this relation to consecutively filter more and more links into our chain until it subsumes the base completely.

Incidentally, touches is an antisymmetric relation, which makes its transitive and reflexive closure an ordering of intervals, and a chain in order theory is exactly a totally ordered set. So, our naming is justified: there is a relation that is not a total ordering for arbitrary sets of intervals, but is a total ordering for our chains.

This is not enough though: we must also ensure our chain is minimal. I claim that this condition holds exactly when touches is nowhere transitive on our chain. That means: when x `touches` y and y `touches` z, it is never that x `touches` z (Otherwise, we would not need y in our chain.). Observe that, like links in a real chain, our "links" must only overlap two at a time. This requirement may be paraphrased in terms of interval relations: a link must be touched by the interval between the end of the previous link and the one before the previous. It sounds a bit baroque, but I am sure the reader may depict this situation in their mind or on a piece of paper.

And this is all that is needed to give a recursive definition that we are looking for.

chainsFromTo :: Ord a => Interval a -> Interval a -> [Interval a] -> [[Interval a]]
chainsFromTo start end xs' = case base of
    Point _ -> (fmap pure . filter (`absorbs` base)) xs'
    _       -> baseCase ++ recursiveCase
  where
    base = right start ~~ left end

    xs = filter (not . isDisjointWith base) xs'

    baseCase = do
        x <- filter ((start `touches`) * (`touches` end)) xs
        return [x]

    recursiveCase = do
        x <- filter ((start `touches`) * not . (`touches` end)) xs
        xs <- chainsFromTo (right start ~~ right x) end (filter (`isRightwardsOf` x) xs)
        return $ x: xs

coveringChainsFromTo :: forall a. (Ord a, Num a)
                     => Interval a -> [Interval a] -> [[Interval a]]
coveringChainsFromTo _   [ ] = [ ]
coveringChainsFromTo base xs = chainsFromTo start end xs
  where
    start = (\z -> z - 1) (left reach) ~~ left base
    end = right base ~~ (\z -> z + 1) (right reach)
    reach = (bounds . flatten . Set.fromList) xs

Once you have it, it looks straightforward, but I tried like a dozen times to make it right, and only extensive checking helped me locate and fix all the corner cases. You can see the complete code in a repository.

That is it.

I hope it helps. Do comment if my presentation is not clear or if I missed something.

来源：https://stackoverflow.com/questions/57703774/create-list-of-paths-taken

标签

haskell