问题
import Data.List
data Encoding = Multiple Int Char | Single Char deriving (Eq,Show,Ord)
Encoding of run length
encode :: String -> [Encoding]
encode inputString =encoding (group inputString) []
encoding :: [String] -> [Encoding] -> [Encoding]
encoding groupString xs=
if (length groupString == 0)
then xs
else
case (head groupString) of
([c]) ->encoding (tail groupString) (xs ++ [Single c])
(x) -> encoding (tail groupString) (xs ++ [Multiple (length x) (head x)])
Decoding of run length
decode :: [Encoding] -> String
decode listString = decoding listString []
decoding :: [Encoding] -> String -> String
decoding inputString xs=
if (length inputString == 0)
then xs
else
case (head inputString) of
(Single x) ->decoding (tail inputString) (xs++ [x])
(Multiple num x) ->decoding (tail inputString) (xs ++ (replicate num x) )
This is my run length encoding solution, can anyone suggest me a better way to write encoding and decoding functions
回答1:
A lot of your code is dedicated to updated an accumulator. You add elements to the tail of that accumulator, and this will have a drastic impact on performance.
The reason this is usually not very efficient is because a list in Haskell [a]
is implemented - at least conceptually - as a linked list. As a result concatenating two lists l1
and l2
together with l1 ++ l2
, will take O(|l1|) time (that is the number of elements in l1
). So that means that if the list already contains 100 elements, adding one extra at the end will require a lot of work.
Another anti-pattern is the use of head
and tail
. Yes these functions can be used, but unfortunately since you do not use pattern matching, it could happen that an empty list is passed, and then head
and tail
will error.
Another problem here is that you use length
on a list. Since sometimes lists in Haskell can have infinite length, the length
call will - if we need it - never ends.
In cases where you have to append at the end of the accumulator, usually we can write the recursion in the tail of a list "cons" we are building. So we can rewrite our program from:
encode :: String -> [Encoding]
encode [] = []
encode (h:t) = ...
The question is now how we can count these values. We can use span :: (a -> Bool) -> [a] -> ([a],[a]), a function that will - for a given predicate and a list - construct a 2-tuple where the first item contains the prefix of the list where all elements satisfy the predicate, and the second item contains the rest of the list, so we can use this like:
encode :: String -> [Encoding]
encode [] = []
encode (h:t) | nh > 1 = Multiple nh h : tl
| otherwise = Single h : tl
where (s1, s2) = span (h ==) t
nh = 1 + length s1
tl = encode s2
For example:
Prelude Data.List> encode "Foobaaar qqquuux"
[Single 'F',Multiple 2 'o',Single 'b',Multiple 3 'a',Single 'r',Multiple 3 ' ',Multiple 3 'q',Multiple 3 'u',Single 'x']
For the decoding, we again do not need an accumulator, and can make use of replicate :: Int -> a -> [a]:
decode :: [Encoding] -> String
decode [] = []
decode (Single h:t) = h : decode t
decode (Multiple nh h) = replicate nh h ++ decode t
Or we can use a list monad instead:
decode :: [Encoding] -> String
decode = (>>= f)
where f (Single h) = [h]
f (Multiple nh h) = replicate nh h
For example:
Prelude Data.List> decode [Single 'F',Multiple 2 'o',Single 'b',Multiple 3 'a',Single 'r',Multiple 3 ' ',Multiple 3 'q',Multiple 3 'u',Single 'x']
"Foobaaar qqquuux"
回答2:
As an extension to Willem Van Onsem's excellent answer, consider creating the run lengths separately, then combining them together with their letters with zipWith
.
Data.List
has the function group (which itself is a specialization of the generic groupBy
; group = groupby (==)
) which breaks strings into homogenous substrings. i.e.:
group "aaabbccccd"
= ["aaa", "bb", "cccc", "d"]
Calculating the length of each will give you run-length.
N.B. that group
is implemented exactly the same way as Willem's span
solution.
import Data.List (group)
data Encoding = Multiple Int Char
| Single Char
deriving (Eq, Show, Ord)
encode :: String -> [Encoding]
encode xs = zipWith op lengths letters
where
groups = group xs
lengths = map length groups
letters = map head groups
op :: Int -> Char -> Encoding
op 1 = Single
op n = Multiple n
This could also be done as one very ugly list comprehension.
encode xs = [ let (n, c) = (length g, head g)
in if n == 1
then Single c
else Multiple n c
| g <- group xs ]
回答3:
Your
encoding
function is a functional map. Instead of making your own primitive recursive function just usemap
.Your encoding is reversing the output (
xs ++ [Single c]
etc) which is both counter intuitive and computationally expensive. Stop it.Too many parenthesis, such as on
case (..) of
,if (...) then
and the arms of the case(...) ->
. All these are unneeded and clutter the code.If you type
head
chances are you should have pattern matched somewhere instead.
Consider:
encoding :: String -> [Encoding]
encoding = map enc . group -- Point 1, use map which also takes
-- care of point 2 and 3.
where
enc [x] = Single x
enc xs@(x:_) = Multiple (length xs) x -- Point 4, patterns not `head`
-- Here consider make pattern matches total either via an error call or Maybe type
来源:https://stackoverflow.com/questions/48669774/run-length-encoding-in-haskell