Random-Pivot Quicksort in Haskell

我们两清 提交于 2019-12-13 13:32:56

问题


Is it possible to implement a quicksort in Haskell (with RANDOM-PIVOT) that still has a simple Ord a => [a]->[a] signature?

I'm starting to understand Monads, and, for now, I'm kind of interpreting monads as somethink like a 'command pattern', which works great for IO.

So, I understand that a function that returns a random number should actually return a monadic value like IO, because, otherwise, it would break referential transparency. I also understand that there should be no way to 'extract' the random integer from the returned monadic value, because, otherwise, it would, again, break referential transparency.

But yet, I still think that it should be possible to implement a 'pure' [a]->[a] quicksort function, even if it uses random pivot, because, it IS referential transparent. From my point of view, the random pivot is just a implementation detail, and shouldn't change the function's signature

OBS: I'm not actually interested in the specific quicksort problem (so, I don't want to sound rude but I'm not looking for "use mergesort" or "random pivot doesn't increase performance in practice" kind of answers) I'm actually interested in how to implement a 'pure' function that uses 'impure' functions inside it, in cases like quicksort, where I can assure that the function actually is a pure one.

Quicksort is just a good example.


回答1:


In such cases, where you know that the function is referentially transparent, but you can't proof it to the compiler, you may use the function unsafePerformIO :: IO a -> a from the module Data.Unsafe.

For instance, you may use unsafePerformIO to get an initial random state and then do anything using just this state.

But please notice: Don't use it if it's not really needed. And even then, think twice about it. unsafePerformIO is somewhat the root of all evil, since it's consequences can be dramatical - anything is possible from coercing different types to crashing the RTS using this function.




回答2:


You are making a false assumption that picking the pivot point is just an implementation detail. Consider a partial ordering on a set. Like a quicksort on cards where

card a < card b if the face value is less but if you were to evaluate booleans:

  4 spades < 4 hearts (false)
  4 hearts < 4 spades (false)
  4 hearts = 4 spades (false)

In that case the choice of pivots would determine the final ordering of the cards. In precisely the same way

for a function like

a = get random integer  
b = a + 3
print b 

is determined by a. If you are randomly choosing something then your computation is or could be non deterministic.




回答3:


OK, check this out.

Select portions copied form the hashable package, and voodoo magic language pragmas

{-# LANGUAGE FlexibleInstances, UndecidableInstances, NoMonomorphismRestriction, OverlappingInstances #-}

import System.Random (mkStdGen, next, split)
import Data.List (foldl')
import Data.Bits (shiftL, xor)

class Hashable a where
    hash :: a -> Int

instance (Integral a) => Hashable a where
    hash = fromIntegral

instance Hashable Char where
    hash = fromEnum

instance (Hashable a) => Hashable [a] where
    hash = foldl' combine 0 . map hash

-- ask the authors of the hashable package about this if interested
combine h1 h2 = (h1 + h1 `shiftL` 5) `xor` h2

OK, so now we can take a list of anything Hashable and turn it into an Int. I've provided Char and Integral a instances here, more and better instances are in the hashable packge, which also allows salting and stuff.

This is all just so we can make a number generator.

genFromHashable = mkStdGen . hash

So now the fun part. Let's write a function that takes a random number generator, a comparator function, and a list. Then we'll sort the list by consulting the generator to select a pivot, and the comparator to partition the list.

qSortByGen _ _ [] = []
qSortByGen g f xs = qSortByGen g'' f l ++ mid ++ qSortByGen g''' f r
    where (l, mid, r) = partition (`f` pivot) xs
          pivot = xs !! (pivotLoc `mod` length xs)
          (pivotLoc, g') = next g
          (g'', g''') = split g'

partition f = foldl' step ([],[],[])
    where step (l,mid,r) x = case f x of
              LT -> (x:l,mid,r)
              EQ -> (l,x:mid,r)
              GT -> (l,mid,x:r)

Library functions: next grabs an Int from the generator, and produces a new generator. split forks the generator into two distinct generators.

My functions: partition uses f :: a -> Ordering to partition the list into three lists. If you know folds, it should be quite clear. (Note that it does not preserve the initial ordering of the elements in the sublists; it reverses them. Using a foldr could remedy this were it an issue.) qSortByGen works just like I said before: consult the generator for the pivot, partition the list, fork the generator for use in the two recursive calls, recursively sort the left and right sides, and concatenate it all together.

Convenience functions are easy to compose from here

qSortBy f xs = qSortByGen (genFromHashable xs) f xs
qSort = qSortBy compare

Notice the final function's signature.

ghci> :t qSort
qSort :: (Ord a, Hashable a) => [a] -> [a]

The type inside the list must implement both Hashable and Ord. There's the "pure" function you were asking for, with one logical added requirement. The more general functions are less restrictive in their requirements.

ghci> :t qSortBy
qSortBy :: (Hashable a) => (a -> a -> Ordering) -> [a] -> [a]
ghci> :t qSortByGen
qSortByGen
  :: (System.Random.RandomGen t) =>
     t -> (a -> a -> Ordering) -> [a] -> [a]

Final notes

qSort will behave exactly the same way for all inputs. The "random" pivot selection is. in fact, deterministic. But it is obscured by hashing the list and then seeding a random number generator, making it "random" enough for me. ;)

qSort also only works for lists with length less than maxBound :: Int, which ghci tells me is 9,223,372,036,854,775,807. I thought there would be an issue with negative indexes, but in my ad-hoc testing I haven't run into it yet.


Or, you can just live with the IO monad for "truer" randomness.

qSortIO xs = do g <- getStdGen -- add getStdGen to your imports
                return $ qSortByGen g compare xs


ghci> :t qSortIO
qSortIO :: (Ord a) => [a] -> IO [a]
ghci> qSortIO "Hello world"
" Hdellloorw"
ghci> qSort "Hello world"
" Hdellloorw"



回答4:


Haskell provides the ST monad to perform non-referentially-transparent actions with a referentially transparent result.

Note that it doesn't enforce referential transparency; it just insures that potentially non-referentially-transparent temporary state can't leak out. Nothing can prevent you from returning manipulated pure input data that was rearranged in a non-reproducible way. Best is to implement the same thing in both ST and pure ways and use QuickCheck to compare them on random inputs.



来源:https://stackoverflow.com/questions/5237775/random-pivot-quicksort-in-haskell

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!