Can someone suggest a way to create batches of a certain size in linq?
Ideally I want to be able to perform operations in chunks of some configurable amount.
So with a functional hat on, this appears trivial....but in C#, there are some significant downsides.
you'd probably view this as an unfold of IEnumerable (google it and you'll probably end up in some Haskell docs, but there may be some F# stuff using unfold, if you know F#, squint at the Haskell docs and it will make sense).
Unfold is related to fold ("aggregate") except rather than iterating through the input IEnumerable, it iterates through the output data structures (its a similar relationship between IEnumerable and IObservable, in fact I think IObservable does implement an "unfold" called generate...)
anyway first you need an unfold method, I think this works (unfortunately it will eventually blow the stack for large "lists"...you can write this safely in F# using yield! rather than concat);
static IEnumerable Unfold(Func>> f, U seed)
{
var maybeNewSeedAndElement = f(seed);
return maybeNewSeedAndElement.SelectMany(x => new[] { x.Item2 }.Concat(Unfold(f, x.Item1)));
}
this is a bit obtuse because C# doesn't implement some of the things functional langauges take for granted...but it basically takes a seed and then generates a "Maybe" answer of the next element in the IEnumerable and the next seed (Maybe doesn't exist in C#, so we've used IEnumerable to fake it), and concatenates the rest of the answer (I can't vouch for the "O(n?)" complexity of this).
Once you've done that then;
static IEnumerable> Batch(IEnumerable xs, int n)
{
return Unfold(ys =>
{
var head = ys.Take(n);
var tail = ys.Skip(n);
return head.Take(1).Select(_ => Tuple.Create(tail, head));
},
xs);
}
it all looks quite clean...you take the "n" elements as the "next" element in the IEnumerable, and the "tail" is the rest of the unprocessed list.
if there is nothing in the head...you're over...you return "Nothing" (but faked as an empty IEnumerable>)...else you return the head element and the tail to process.
you probably can do this using IObservable, there's probably a "Batch" like method already there, and you can probably use that.
If the risk of stack overflows worries (it probably should), then you should implement in F# (and there's probably some F# library (FSharpX?) already with this).
(I have only done some rudimentary tests of this, so there may be the odd bugs in there).