问题
Apologies for a rookie question. I'm trying to change my mental paradigm from procedural to functional.
For instance, suppose I have a list of names that I want to print like this "John, Paul, George, and Ringo." But this code does not satisfy:
let names = [ "John"; "Paul"; "George"; "Ringo" ]
names |> Seq.iter (fun s -> printf "%s, " s)
My procedural instinct is to seek a way to insinuate a predicate into that lambda so that it can branch between ", " or ", and " or ". " depending upon where we're at iterating the sequence. I think that's wrong, but I'm feeling around for what's right.
Would it be better to split the sequence in parts?
In this case it seems that we want to split the sequence into parts corresponding to distinct delimiter behaviors. We want to split it at the end, so we can't use Seq. But we can use List.splitAt instead.
let start, ending = List.splitAt (names.Length - 1) names
let penultimate, last = List.splitAt 1 ending
start |> Seq.iter (fun s -> printf "%s, " s)
penultimate |> Seq.iter (fun s -> printf "%s, and " s)
last |> Seq.iter (fun s -> printf "%s. " s)
Is this a righteous approach? Is there a better solution I've overlooked? Am I thinking along the right lines?
回答1:
The general approach I take to tackle these kind of problems is to split them into smaller parts and solve individually:
- an empty list
[]
results in""
- one element
["a"]
results in"a."
- two elements
[ "a"; "b" ]
result in"a and b."
- more elements (that is
a :: rest
) result in"a, " + takeCareOf rest
, wheretakeCareOf
follows above rules. Note that we don't need to know the length of the full list.
Above recipe directly translates to F# (and functional languages in general):
let rec commaAndDot' = function
| [] -> ()
| [ a ] -> printfn "%s." a
| a :: [ b ] -> printfn "%s and %s." a b
| a :: rest -> printf "%s, " a; commaAndDot' rest
Are we done yet? No, commaAndDot'
violates the Single Responsibility Principle because the function implements our 'business logic' and prints to the console. Let's fix that:
let rec commaAndDot'' = function
| [] -> ""
| [ a ] -> sprintf "%s." a
| a :: [ b ] -> sprintf "%s and %s." a b
| a :: rest -> sprintf "%s, " a + commaAndDot'' rest
As an additional benefit we can now call the function in parallel and the output does not get mixed up anymore.
Are we done yet? No, above function is not tail-recursive (we need to compute commaAndDot'' rest
before concatenating it to the current result) and would blow the stack for large lists. A standard approach to fixing this is to introduce an accumulator acc
:
let commaAndDot''' words =
let rec helper acc = function
| [] -> acc
| [ a ] -> sprintf "%s%s." acc a
| a :: [ b ] -> sprintf "%s%s and %s." acc a b
| a :: rest -> helper (acc + sprintf "%s, " a) rest
helper "" words
Are we done yet? No, commaAndDot'''
creates a lot of strings for intermediate results. Thanks to F# not being a pure language, we can leverage local (private, non-observable) mutation to optimize for memory and speed:
let commaAndDot words =
let sb = System.Text.StringBuilder()
let rec helper = function
| [] -> sb
| [ a ] -> sprintf "%s." a |> sb.Append
| a :: [ b ] -> sprintf "%s and %s." a b |> sb.Append
| a :: rest ->
sprintf "%s, " a |> sb.Append |> ignore
helper rest
helper words |> string
Are we done yet? Probably... at least this is something I would consider idiomatic F# and happily commit. For optimising further (e.g. Append
ing commas and dots separately or changing the order of the patterns) I'd first write micro-benchmarks before sacrificing readability.
All versions generate the same output:
commaAndDot [] // ""
commaAndDot [ "foo" ] // "foo."
commaAndDot [ "foo"; "bar" ] // "foo and bar."
commaAndDot [ "Hello"; "World"; "F#" ] // "Hello, World and F#."
Update: SCNR, created a benchmark... results are below as a HTML snippet (for nice tabular data).
BuilderOpt is the StringBuilder version with the []
case moved to the bottom,
BuilderChained is with chained Append calls, e.g. sb.Append(a).Append(" and ").Append(b)
and BuilderFormat is e.g. sb.AppendFormat("{0} and {1}", a, b)
. Full source code available.
As expected, 'simpler' versions perform better for small lists, the larger the list the better BuilderChained. Concat performs better than I expected but does not produce the right output (missing ".", lacking one case). Yield gets rather slow...
<!DOCTYPE html>
<html lang='en'>
<head>
<meta charset='utf-8' />
<title>Benchmark.CommaAndDot</title>
<style type="text/css">
table { border-collapse: collapse; display: block; width: 100%; overflow: auto; }
td, th { padding: 6px 13px; border: 1px solid #ddd; }
tr { background-color: #fff; border-top: 1px solid #ccc; }
tr:nth-child(even) { background: #f8f8f8; }
</style>
</head>
<body>
<pre><code>
BenchmarkDotNet=v0.11.1, OS=Windows 10.0.16299.726 (1709/FallCreatorsUpdate/Redstone3)
Intel Core i7 CPU 950 3.07GHz (Nehalem), 1 CPU, 8 logical and 4 physical cores
Frequency=2998521 Hz, Resolution=333.4977 ns, Timer=TSC
[Host] : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit LegacyJIT-v4.7.3190.0 DEBUG
DefaultJob : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3190.0
</code></pre>
<pre><code></code></pre>
<table>
<thead><tr><th> Method</th><th>Verbosity</th><th> Mean</th><th>Error</th><th>StdDev</th><th> Median</th><th>Scaled</th><th>ScaledSD</th>
</tr>
</thead><tbody><tr><td>Concat</td><td>0</td><td>39.905 ns</td><td>0.0592 ns</td><td>0.0494 ns</td><td>39.906 ns</td><td>1.02</td><td>0.11</td>
</tr><tr><td>Yield</td><td>0</td><td>27.235 ns</td><td>0.0772 ns</td><td>0.0603 ns</td><td>27.227 ns</td><td>0.69</td><td>0.07</td>
</tr><tr><td>Accumulator</td><td>0</td><td>1.956 ns</td><td>0.0109 ns</td><td>0.0096 ns</td><td>1.954 ns</td><td>0.05</td><td>0.01</td>
</tr><tr><td>Builder</td><td>0</td><td>32.384 ns</td><td>0.2986 ns</td><td>0.2331 ns</td><td>32.317 ns</td><td>0.82</td><td>0.09</td>
</tr><tr><td>BuilderOpt</td><td>0</td><td>33.664 ns</td><td>1.0371 ns</td><td>0.9194 ns</td><td>33.402 ns</td><td>0.86</td><td>0.09</td>
</tr><tr><td>BuilderChained</td><td>0</td><td>39.671 ns</td><td>1.2097 ns</td><td>3.5669 ns</td><td>41.339 ns</td><td>1.00</td><td>0.00</td>
</tr><tr><td>BuilderFormat</td><td>0</td><td>40.276 ns</td><td>0.8909 ns</td><td>1.8792 ns</td><td>39.494 ns</td><td>1.02</td><td>0.12</td>
</tr><tr><td>Concat</td><td>1</td><td>153.116 ns</td><td>1.1592 ns</td><td>0.9050 ns</td><td>152.706 ns</td><td>0.87</td><td>0.01</td>
</tr><tr><td>Yield</td><td>1</td><td>154.522 ns</td><td>0.2890 ns</td><td>0.2256 ns</td><td>154.479 ns</td><td>0.88</td><td>0.00</td>
</tr><tr><td>Accumulator</td><td>1</td><td>223.342 ns</td><td>0.3678 ns</td><td>0.2872 ns</td><td>223.412 ns</td><td>1.27</td><td>0.00</td>
</tr><tr><td>Builder</td><td>1</td><td>232.194 ns</td><td>0.2951 ns</td><td>0.2465 ns</td><td>232.265 ns</td><td>1.32</td><td>0.00</td>
</tr><tr><td>BuilderOpt</td><td>1</td><td>232.016 ns</td><td>0.5654 ns</td><td>0.4722 ns</td><td>232.170 ns</td><td>1.31</td><td>0.00</td>
</tr><tr><td>BuilderChained</td><td>1</td><td>176.473 ns</td><td>0.3918 ns</td><td>0.3272 ns</td><td>176.341 ns</td><td>1.00</td><td>0.00</td>
</tr><tr><td>BuilderFormat</td><td>1</td><td>219.262 ns</td><td>6.7995 ns</td><td>6.3603 ns</td><td>217.003 ns</td><td>1.24</td><td>0.03</td>
</tr><tr><td>Concat</td><td>10</td><td>1,284.042 ns</td><td>1.7035 ns</td><td>1.4225 ns</td><td>1,283.443 ns</td><td>1.68</td><td>0.05</td>
</tr><tr><td>Yield</td><td>10</td><td>6,532.667 ns</td><td>12.6169 ns</td><td>10.5357 ns</td><td>6,533.504 ns</td><td>8.55</td><td>0.24</td>
</tr><tr><td>Accumulator</td><td>10</td><td>2,701.483 ns</td><td>4.8509 ns</td><td>4.5376 ns</td><td>2,700.208 ns</td><td>3.54</td><td>0.10</td>
</tr><tr><td>Builder</td><td>10</td><td>1,865.668 ns</td><td>5.0275 ns</td><td>3.9252 ns</td><td>1,866.920 ns</td><td>2.44</td><td>0.07</td>
</tr><tr><td>BuilderOpt</td><td>10</td><td>1,820.402 ns</td><td>2.7853 ns</td><td>2.3258 ns</td><td>1,820.464 ns</td><td>2.38</td><td>0.07</td>
</tr><tr><td>BuilderChained</td><td>10</td><td>764.334 ns</td><td>19.8528 ns</td><td>23.6334 ns</td><td>756.988 ns</td><td>1.00</td><td>0.00</td>
</tr><tr><td>BuilderFormat</td><td>10</td><td>1,177.186 ns</td><td>1.9584 ns</td><td>1.6354 ns</td><td>1,177.897 ns</td><td>1.54</td><td>0.04</td>
</tr><tr><td>Concat</td><td>100</td><td>25,579.773 ns</td><td>824.1504 ns</td><td>688.2028 ns</td><td>25,288.873 ns</td><td>5.33</td><td>0.14</td>
</tr><tr><td>Yield</td><td>100</td><td>421,872.560 ns</td><td>902.5023 ns</td><td>753.6302 ns</td><td>421,782.071 ns</td><td>87.87</td><td>0.23</td>
</tr><tr><td>Accumulator</td><td>100</td><td>80,579.168 ns</td><td>227.7392 ns</td><td>177.8038 ns</td><td>80,547.868 ns</td><td>16.78</td><td>0.05</td>
</tr><tr><td>Builder</td><td>100</td><td>15,047.790 ns</td><td>26.2248 ns</td><td>21.8989 ns</td><td>15,048.903 ns</td><td>3.13</td><td>0.01</td>
</tr><tr><td>BuilderOpt</td><td>100</td><td>15,287.117 ns</td><td>39.8679 ns</td><td>31.1262 ns</td><td>15,293.739 ns</td><td>3.18</td><td>0.01</td>
</tr><tr><td>BuilderChained</td><td>100</td><td>4,800.966 ns</td><td>11.3614 ns</td><td>10.0716 ns</td><td>4,801.450 ns</td><td>1.00</td><td>0.00</td>
</tr><tr><td>BuilderFormat</td><td>100</td><td>8,382.896 ns</td><td>87.8963 ns</td><td>68.6236 ns</td><td>8,368.400 ns</td><td>1.75</td><td>0.01</td>
</tr></tbody></table>
</body>
</html>
回答2:
I prefer using String.concat
:
let names = [ "John"; "Paul"; "George"; "Ringo" ]
names
|> List.mapi (fun i n -> if i = names.Length - 1 && i > 0 then "and " + n else n)
|> String.concat ", "
|> printfn "%s"
回答3:
Basic techniques are mentioned in the accepted answer: problem deconstruction and separation of concerns. There is either no element, or there is an element followed by either .
, , and
, or ,
, depending on its position relative to the end of the input sequence.
Assuming that the input is of type string list
, this can be fairly well expressed by a recursive, pattern matching function definition, wrapped inside a list sequence expression to ensure tail recursion. The match does nothing if the input is empty, so it returns an empty list; it returns a sub-list for the other terminating case, otherwise it appends to the sub-list the results of the recursion.
The concatenation as the desired target type string
is a separate, final step, as proposed in another answer.
let rec seriesComma xs = [
match xs with
| [] -> ()
| [x] -> yield! [x; "."]
| x::[y] -> yield! [x; ", and "]; yield! seriesComma [y]
| x::xs -> yield! [x; ", "]; yield! seriesComma xs ]
["Chico"; "Harpo"; "Groucho"; "Gummo"; "Zeppo"]
|> seriesComma |> String.concat ""
// val it : string = "Chico, Harpo, Groucho, Gummo, and Zeppo."
回答4:
Seq.Reduce is the simplest way to make a delimited list, but including the "and" before the last item adds some complexity. Below I show a way to do it in two steps, but the recursive approach in the accepted answer is probably more true to the Functional Programming paradigm.
let names = [ "John"; "Paul"; "George"; "Ringo" ]
let delimitedNames = names |> Seq.reduce (fun x y -> sprintf "%s, %s" x y)
let replaceLastOccurrence (hayStack: string) (needle: string) (newNeedle: string) =
let idx = hayStack.LastIndexOf needle
match idx with
| -1 -> hayStack
| _ -> hayStack.Remove(idx, needle.Length).Insert(idx, newNeedle)
replaceLastOccurrence delimitedNames "," ", and"
See https://msdn.microsoft.com/en-us/visualfsharpdocs/conceptual/seq.reduce%5B%27t%5D-function-%5Bfsharp%5D?f=255&MSPPError=-2147217396
回答5:
Well, a more functional-looking solution could be something like this:
let names = [ "John"; "Paul"; "George"; "Ringo" ]
names
|> Seq.tailBack
|> Seq.iter (fun s -> printf "%s, " s)
names
|> Seq.last
|> fun s -> printf "and %s" s
Where tailBack
can be defined in some SequenceExtensions.fs
like
module Seq
let tailBack seq =
seq
|> Seq.rev
|> Seq.tail
|> Seq.rev
This way you do not deal much with indexes, variables and all that procedural stuff.
Ideally you would leverage options here, like
names
|> Seq.tryLast
|> Option.iter (fun s -> printf "and %s" s)
With this you would also avoid possible argument exceptions. But options in functional programming is another (nice) concept than sequences.
Also, here a particular task matters. I believe this solution is quite inefficient - we iterate the sequence too many times. Maybe in some cases fussing with indexes will be the way to go.
来源:https://stackoverflow.com/questions/53088411/f-how-should-i-think-about-delimiting-items-in-sequences