Question
I am new to Haskell and functional programming. I have a .txt file which contains some paragraphs. I want to count the number of words in each paragraph, using Haskell.
I have written the input/output code:
paragraph-words :: String -> int
no_of_words :: IO ()
no_of_words =
  do
    putStrLn "enter the .txt file name:"
    fileName1 <- getLine
    text <- readFile fileName1
    let wordscount = paragraph-words text
Can anyone help me write the function paragraph-words, which will calculate the number of words in each paragraph?
Answer 1:
A list of lines can be split into paragraphs by taking all lines until at least one empty line ("") is reached or the list is exhausted (1). We then ignore all consecutive empty lines (2) and apply the same method to the rest of the lines:
type Line = String
type Paragraph = [String]

parify :: [Line] -> [Paragraph]
parify [] = []
parify ls
  | null first = parify rest
  | otherwise  = first : parify rest
  where first = takeWhile (/= "") ls -- (1) take until empty line or end
        rest  = dropWhile (== "") . drop (length first) $ ls
        --      ^ (2) drop all empty lines
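One caveat with the rule above: only the exact string "" counts as a separator, so a line holding just spaces, tabs, or a trailing carriage return (as left behind by lines on a file with Windows line endings) would not end a paragraph. A minimal sketch of a more tolerant variant, assuming such whitespace-only lines should be treated as blank. The names isBlank and parifyBlank are hypothetical and not part of the original answer:

```haskell
import Data.Char (isSpace)

-- Hypothetical helper: a line is blank if it contains only whitespace
-- (this includes the empty string and a lone '\r').
isBlank :: String -> Bool
isBlank = all isSpace

-- Variant of parify that separates paragraphs at blank lines.
parifyBlank :: [String] -> [[String]]
parifyBlank ls =
  case dropWhile isBlank ls of        -- skip separator lines
    []   -> []                        -- nothing left: no more paragraphs
    rest ->
      let (para, rest') = break isBlank rest  -- lines up to the next blank one
      in para : parifyBlank rest'
```

For instance, parifyBlank (lines "a b\nc\n  \n\nd\n") groups the first two lines into one paragraph and the last line into another.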
To split a string into its lines, you can simply use lines. To get the number of words in a Paragraph, you sum the number of words in each line:
singleParagraphCount :: Paragraph -> Int
singleParagraphCount = sum . map lineWordCount
The number of words in each line is simply length . words:
lineWordCount :: Line -> Int
lineWordCount = length . words
So all in all we get the following function:
wordsPerParagraph :: String -> [Int]
wordsPerParagraph = map singleParagraphCount . parify . lines
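To connect this back to the input/output code in the question, the pure function can be called from the IO action once the file has been read. The following is a sketch under a few assumptions: the action is renamed noOfWords (camelCase), the splitting logic is repeated under the name paragraphsOf so the block compiles on its own, and report is a hypothetical formatting helper that is not in the original answer:

```haskell
-- Same splitting rule as parify, repeated so this sketch is self-contained.
paragraphsOf :: [String] -> [[String]]
paragraphsOf [] = []
paragraphsOf ls
  | null first = paragraphsOf rest
  | otherwise  = first : paragraphsOf rest
  where
    first = takeWhile (/= "") ls
    rest  = dropWhile (== "") (drop (length first) ls)

-- One word count per paragraph.
wordsPerParagraph :: String -> [Int]
wordsPerParagraph = map (sum . map (length . words)) . paragraphsOf . lines

-- Hypothetical helper: one human-readable line per paragraph.
report :: [Int] -> [String]
report counts =
  [ "paragraph " ++ show n ++ ": " ++ show c ++ " words"
  | (n, c) <- zip [1 :: Int ..] counts
  ]

-- The IO part only reads the file and prints; all counting stays pure.
noOfWords :: IO ()
noOfWords = do
  putStrLn "enter the .txt file name:"
  fileName <- getLine
  text <- readFile fileName
  mapM_ putStrLn (report (wordsPerParagraph text))
```

Keeping wordsPerParagraph free of IO means it can be tried directly in GHCi on a string literal before involving any file.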
Answer 2:
First: you don't want to be bothered with dirty IO() any more than necessary, so the signature should be
wordsPerParagraph :: String -> [Int]
As for doing this: you should first split up the text into paragraphs. Counting the words in each of them is pretty trivial then.
What you basically need is to match on empty lines (two adjacent newline characters). So I'd first use the lines function, giving you a list of lines. Then you separate these at each empty line:
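paragraphs :: String -> [String]
paragraphs = split . lines
  where split [] = []
        split (ln : "" : lns) = ln : split lns
        -- hd holds at most one element: the rest of the current paragraph.
        -- unwords joins with a space so words at line ends do not merge
        -- (ln ++ hd would not type-check, since hd is a list of strings).
        split (ln : lns) = let (hd, tl) = splitAt 1 $ split lns
                           in unwords (ln : hd) : tl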
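The counting step that this answer calls trivial can be sketched as follows. This is an illustration rather than the answerer's own code, and it makes two assumptions: the lines of a paragraph are joined with unwords so that the last word of one line and the first word of the next stay separate, and paragraphs without any words (which appear when several empty lines follow each other) are filtered out. The name paragraphTexts is hypothetical:

```haskell
-- Split a text into paragraphs, each flattened into a single string.
paragraphTexts :: String -> [String]
paragraphTexts = filter (not . null . words) . split . lines
  where
    split [] = []
    -- An empty line right after ln closes the current paragraph.
    split (ln : "" : lns) = ln : split lns
    -- Otherwise ln is glued onto the first paragraph of the remainder.
    split (ln : lns) =
      let (hd, tl) = splitAt 1 (split lns)
      in unwords (ln : hd) : tl

-- One word count per paragraph.
wordsPerParagraph :: String -> [Int]
wordsPerParagraph = map (length . words) . paragraphTexts
```

For example, wordsPerParagraph "a b\nc\n\nd e f\n" evaluates to [3,3].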
Answer 3:
First, you can't use - in a function name; you would have to use _ instead (or better, use camelCase as leftroundabout suggests below).
Here is a function which satisfies your type signature:
paragraph_words = length . words
This first splits the text into a list of words, then counts them by returning the length of that list of words.
However, this does not completely solve the problem, because you haven't written code to split your text into paragraphs.
Source: https://stackoverflow.com/questions/20826497/identifying-number-of-words-in-a-paragraph-using-haskell