Identifying number of words in a paragraph using Haskell

北城以北 submitted on 2019-12-31 07:06:06

Question


I am new to Haskell and functional programming. I have a .txt file which contains some paragraphs. I want to count the number of words in each paragraph, using Haskell.

I have written the input/output code:

paragraph-words:: String -> int


no_of_words::IO()
no_of_words=
do
    putStrLn "enter the .txt file name:"
    fileName1<- getLine
    text<- readFile fileName1
            let wordscount= paragraph-words text

Can anyone help me write the function paragraph-words, which will calculate the number of words in each paragraph?


Answer 1:


A list of lines can be split into paragraphs if one takes all lines until at least one empty line ("") is reached or the list is exhausted (1). We ignore all consecutive empty lines (2) and apply the same method for the rest of our lines:

type Line      = String
type Paragraph = [Line]

parify :: [Line] -> [Paragraph]
parify [] = []
parify ls
  | null first = parify rest
  | otherwise  = first : parify rest
  where first = takeWhile (/= "") ls  -- (1) take until empty line or end
        rest  = dropWhile (== "") . drop (length first) $ ls
             -- ^ (2) drop all empty lines

To split a string into its lines, you can use lines. To get the number of words in a Paragraph, sum the number of words in each line:

singleParagraphCount :: Paragraph -> Int
singleParagraphCount = sum . map lineWordCount

The number of words in each line is simply length . words:

lineWordCount :: Line -> Int
lineWordCount = length . words

So all in all we get the following function:

wordsPerParagraph :: String -> [Int]
wordsPerParagraph = map singleParagraphCount . parify . lines
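To connect this with the input/output code from the question, something like the following sketch might work. It restates the functions above in condensed form so that it compiles on its own. The name noOfWords and the choice to print one count per line are assumptions for illustration, not part of the original answer.

```haskell
-- Group lines into paragraphs separated by one or more empty lines.
parify :: [String] -> [[String]]
parify [] = []
parify ls
  | null first = parify rest
  | otherwise  = first : parify rest
  where
    first = takeWhile (/= "") ls
    rest  = dropWhile (== "") (drop (length first) ls)

-- One word count per paragraph.
wordsPerParagraph :: String -> [Int]
wordsPerParagraph = map (sum . map (length . words)) . parify . lines

-- Hypothetical IO wrapper modelled on the question's code:
-- ask for a file name, read the file, print one count per paragraph.
noOfWords :: IO ()
noOfWords = do
  putStrLn "enter the .txt file name:"
  fileName <- getLine
  text <- readFile fileName
  mapM_ print (wordsPerParagraph text)
```

Note that a line containing only spaces is not equal to "" and therefore does not end a paragraph in this version.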



Answer 2:


First: you don't want to be bothered with dirty IO() any more than necessary, so the signature should be

wordsPerParagraph :: String -> [Int]

As for doing this: you should first split up the text into paragraphs. Counting the words in each of them is pretty trivial then.

What you basically need is to match on empty lines (two adjacent newline characters). So I'd first use the lines function, giving you a list of lines. Then you separate these at each empty line:

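paragraphs :: String -> [String]
paragraphs = split . lines
 where split [] = []
       -- an empty line ends the current paragraph
       split (ln : "" : lns) = ln : split lns
       -- otherwise join this line onto the paragraph that follows it
       split (ln : lns) = let (hd, tl) = splitAt 1 $ split lns
                          in unwords (ln : hd) : tl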
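Counting the words per paragraph could then look like the sketch below. It restates paragraphs so that it compiles on its own and joins the lines of a paragraph with unwords, so that the last word of one line and the first word of the next stay separate. The filter step is an addition for illustration: it drops paragraphs that contain no words, which this way of splitting can produce when several blank lines follow each other.

```haskell
-- Split a text into paragraphs at empty lines.
paragraphs :: String -> [String]
paragraphs = split . lines
  where
    split [] = []
    -- an empty line ends the current paragraph
    split (ln : "" : lns) = ln : split lns
    -- otherwise join this line onto the paragraph that follows it
    split (ln : lns) =
      let (hd, tl) = splitAt 1 (split lns)
      in unwords (ln : hd) : tl

-- Count the words of every paragraph that contains at least one word.
wordsPerParagraph :: String -> [Int]
wordsPerParagraph = map length . filter (not . null) . map words . paragraphs
```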



Answer 3:


First, you can't use - in a function name; you would have to use _ instead (or better, use camelCase as leftaroundabout suggests below).

Here is a function which satisfies your type signature:

paragraph_words :: String -> Int
paragraph_words = length . words

This first splits the text into a list of words, then counts them by returning the length of that list of words.

However, this does not completely solve the problem because you haven't written code to split your text into paragraphs.
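One possible way to fill that gap, sketched under the assumption that paragraphs are separated by completely empty lines, is to group consecutive lines by whether they are empty and then apply paragraph_words to each non-empty group. The names splitParagraphs and paragraphWordCounts are hypothetical and not part of the original answer.

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- Count the words in one piece of text.
paragraph_words :: String -> Int
paragraph_words = length . words

-- Group consecutive lines by emptiness, then keep only the groups
-- made of non-empty lines; each such group is one paragraph.
splitParagraphs :: String -> [String]
splitParagraphs =
  map unwords . filter (not . all null) . groupBy ((==) `on` null) . lines

-- One word count per paragraph.
paragraphWordCounts :: String -> [Int]
paragraphWordCounts = map paragraph_words . splitParagraphs
```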



Source: https://stackoverflow.com/questions/20826497/identifying-number-of-words-in-a-paragraph-using-haskell
