Haskell - Returning the number of a-Z characters used in a string

Question

I've been using this page on the Haskell website all day and it's been really helpful for learning list functions: http://www.haskell.org/haskellwiki/How_to_work_on_lists

My task at the moment is to write a single-line statement that returns the number of distinct characters (a-Z) used in a string. I can't find any help on the page above or anywhere else on the internet.

I know how to count the characters in a string with length nameoflist, but I'm not sure how to count the number of distinct a-Z characters that have been used. For example, 'starT to' should return 6.

Any help is appreciated, thanks

Answer 1:

An alternative to @Sibi's perfectly fine answer is to use a combination of sort and group from Data.List:

import Data.List (group, sort)

numUnique :: Ord a => [a] -> Int
numUnique = length . group . sort

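This imposes the tighter restriction of Ord instead of just Eq, but it should be faster on longer inputs: sorting takes O(n log n) comparisons, whereas nub is quadratic in the worst case because it compares each element against every distinct element seen so far. You can also use a very similar function to count the occurrences of each unique element in the list: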

elemFrequency :: Ord a => [a] -> [(a, Int)]
elemFrequency = map (\s -> (head s, length s)) . group . sort

Or, if you prefer the more elegant Control.Arrow form (this needs import Control.Arrow ((&&&))):

elemFrequency = map (head &&& length) . group . sort

It can be used as

> elemFrequency "hello world"
[(' ',1),('d',1),('e',1),('h',1),('l',3),('o',2),('r',1),('w',1)]
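
Applied to the original question, the same sort and group idea might look like the sketch below. The name distinctLetters is hypothetical (it does not appear in the answer), and the use of isAlpha to decide what counts as a letter is an assumption about what "a-Z" should mean.

```haskell
import Data.Char (isAlpha)
import Data.List (group, sort)

-- Hypothetical helper, not part of the original answer.
-- filter isAlpha keeps only letters, sort brings equal characters
-- next to each other, group collects each run of equal characters,
-- and length counts the runs, i.e. the distinct letters.
distinctLetters :: String -> Int
distinctLetters = length . group . sort . filter isAlpha
```

With this definition, distinctLetters "starT to" evaluates to 6: the space is filtered out, and 't' and 'T' count as different characters.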

Answer 2:

You can remove the duplicate elements using nub and find the length of the resulting list.

import Data.List (nub)

numL :: Eq a => [a] -> Int
numL xs = length $ nub xs

Demo in ghci:

ghci > numL "starTto"
6

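Note that the demo string has no space. With the space from the question, numL "starT to" returns 7, because the space counts as a character too. If you don't want whitespace to be counted, remove it first with filter or another suitable function.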
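
One way to drop whitespace before counting is sketched below. The name numNonSpace is hypothetical, and using isSpace from Data.Char is just one possible choice of predicate. Digits and punctuation are still counted here.

```haskell
import Data.Char (isSpace)
import Data.List (nub)

-- Hypothetical variant of numL that ignores whitespace.
-- filter (not . isSpace) removes spaces, tabs and newlines;
-- nub then removes duplicates and length counts what is left.
numNonSpace :: String -> Int
numNonSpace = length . nub . filter (not . isSpace)
```

With this definition, numNonSpace "starT to" gives 6, matching the result expected in the question.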

Answer 3:

There are a few ways to do this, depending on what structure you want to use.

If you only want to rely on Eq, you can do this with nub. If the input contains only a small set of distinct characters, this works well. However, if there are many distinct alphabetic characters (remember that "Å" and "Ω" are both alphabetic, according to isAlpha), this technique performs poorly (quadratic running time).

import Data.Char (isAlpha)
import Data.List (nub)

distinctAlpha :: String -> Int
distinctAlpha = length . nub . filter isAlpha

You can increase performance for larger sets of alphabetic characters by using additional structure. Ord is the first choice, and allows you to use Data.Set, which gives O(N log N) asymptotic performance.

import Data.Char (isAlpha)
import Data.Set (size, fromList)

distinctAlpha :: String -> Int
distinctAlpha = size . fromList . filter isAlpha
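
If "a-Z" is meant literally, so that "Å" and "Ω" should not count, the predicate can be narrowed to ASCII letters. The sketch below is one way to do that with the same Data.Set approach. The names isAsciiLetter and distinctAsciiAlpha are hypothetical, and the code assumes the containers package (which ships with GHC) is available.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper)
import qualified Data.Set as Set

-- Hypothetical predicate: True only for 'a'..'z' and 'A'..'Z'.
isAsciiLetter :: Char -> Bool
isAsciiLetter c = isAsciiLower c || isAsciiUpper c

-- Hypothetical variant of distinctAlpha restricted to ASCII letters.
-- Set.fromList discards duplicates; Set.size counts the remainder.
distinctAsciiAlpha :: String -> Int
distinctAsciiAlpha = Set.size . Set.fromList . filter isAsciiLetter
```

With these definitions, distinctAsciiAlpha "Å starT to Ω" gives 6, because the two non-ASCII letters are filtered out along with the spaces.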

Answer 4:

First, filter the list to remove any non-alphabetic characters; second, remove duplicate elements; third, calculate the length of what remains.

import Data.Char (isAlpha)
import Data.List (nub)

count :: String -> Int
count = length . nub . filter isAlpha
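
To make the three steps easier to follow, they can also be written with named intermediate values. This is only an illustrative sketch; the names countSteps, lettersOnly and uniqueLetters are hypothetical.

```haskell
import Data.Char (isAlpha)
import Data.List (nub)

-- Hypothetical step-by-step version of the same pipeline.
countSteps :: String -> Int
countSteps input = length uniqueLetters      -- step 3: count
  where
    lettersOnly   = filter isAlpha input     -- step 1: keep letters only
    uniqueLetters = nub lettersOnly          -- step 2: drop duplicates
```

For the input "starT to", lettersOnly is "starTto", uniqueLetters is "starTo", and the result is 6.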

Answer 5:

-- Works as-is in GHCi; in a source file, add: import Data.List and import Data.Char
numberOfCharacters = length . Data.List.nub . filter Data.Char.isAlpha

Source: https://stackoverflow.com/questions/22464226/haskell-returning-the-number-of-a-z-characters-used-in-a-string

Tags: string, list, haskell