I have two arrays built while parsing a text file. The first contains the column names, the second contains the values from the current row. I need to iterate over both list
If you're really using arrays, the best way is probably just to use the conventional for
loop with indices. Not as nice, granted, but as far as I know .NET doesn't offer a better way of doing this.
You could also encapsulate your code into a method called zip
– this is a common higher-order list function. However, C# lacking a suitable Tuple type, this is quite crufty. You'd end up returning an IEnumerable<KeyValuePair<T1, T2>>
which isn't very nice.
By the way, are you really using IEnumerable
instead of IEnumerable<T>
or why do you cast the Current
value?
In a functional language you would usually find a "zip" function which will hopefully be part of a C#4.0 . Bart de Smet provides a funny implementation of zip based on existing LINQ functions:
public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> func)
{
return first.Select((x, i) => new { X = x, I = i })
.Join(second.Select((x, i) => new { X = x, I = i }),
o => o.I,
i => i.I,
(o, i) => func(o.X, i.X));
}
Then you can do:
int[] s1 = new [] { 1, 2, 3 };
int[] s2 = new[] { 4, 5, 6 };
var result = s1.Zip(s2, (i1, i2) => new {Value1 = i1, Value2 = i2});
Instead of creating two seperate arrays you could make a two-dimensional array, or a dictionary (which would be better). But really, if it works I wouldn't try to change it.
Use IEnumerator for both would be nice
var currentValues = currentRow.Split(separatorChar);
using (IEnumerator<string> valueEnum = currentValues.GetEnumerator(), columnEnum = columnList.GetEnumerator()) {
while (valueEnum.MoveNext() && columnEnum.MoveNext())
valueMap.Add(columnEnum.Current, valueEnum.Current);
}
Or create an extension methods
public static IEnumerable<TResult> Zip<T1, T2, TResult>(this IEnumerable<T1> source, IEnumerable<T2> other, Func<T1, T2, TResult> selector) {
using (IEnumerator<T1> sourceEnum = source.GetEnumerator()) {
using (IEnumerator<T2> otherEnum = other.GetEnumerator()) {
while (sourceEnum.MoveNext() && columnEnum.MoveNext())
yield return selector(sourceEnum.Current, otherEnum.Current);
}
}
}
Usage
var currentValues = currentRow.Split(separatorChar);
foreach (var valueColumnPair in currentValues.Zip(columnList, (a, b) => new { Value = a, Column = b }) {
valueMap.Add(valueColumnPair.Column, valueColumnPair.Value);
}
You've got a non-obvious pseudo-bug in your initial code - IEnumerator<T>
extends IDisposable
so you should dispose it. This can be very important with iterator blocks! Not a problem for arrays, but would be with other IEnumerable<T>
implementations.
I'd do it like this:
public static IEnumerable<TResult> PairUp<TFirst,TSecond,TResult>
(this IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence,
Func<TFirst,TSecond,TResult> projection)
{
using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
{
foreach (TFirst first in source)
{
if (!secondIter.MoveNext())
{
throw new ArgumentException
("First sequence longer than second");
}
yield return projection(first, secondIter.Current);
}
if (secondIter.MoveNext())
{
throw new ArgumentException
("Second sequence longer than first");
}
}
}
Then you can reuse this whenever you have the need:
foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar),
(column, value) => new { column, value })
{
// Do something
}
Alternatively you could create a generic Pair type, and get rid of the projection parameter in the PairUp method.
EDIT:
With the Pair type, the calling code would look like this:
foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar))
{
// column = pair.First, value = pair.Second
}
That looks about as simple as you can get. Yes, you need to put the utility method somewhere, as reusable code. Hardly a problem in my view. Now for multiple arrays...
If the arrays are of different types, we have a problem. You can't express an arbitrary number of type parameters in a generic method/type declaration - you could write versions of PairUp for as many type parameters as you wanted, just like there are Action
and Func
delegates for up to 4 delegate parameters - but you can't make it arbitrary.
If the values will all be of the same type, however - and if you're happy to stick to arrays - it's easy. (Non-arrays is okay too, but you can't do the length checking ahead of time.) You could do this:
public static IEnumerable<T[]> Zip<T>(params T[][] sources)
{
// (Insert error checking code here for null or empty sources parameter)
int length = sources[0].Length;
if (!sources.All(array => array.Length == length))
{
throw new ArgumentException("Arrays must all be of the same length");
}
for (int i=0; i < length; i++)
{
// Could do this bit with LINQ if you wanted
T[] result = new T[sources.Length];
for (int j=0; j < result.Length; j++)
{
result[j] = sources[j][i];
}
yield return result;
}
}
Then the calling code would be:
foreach (var array in Zip(columns, row, whatevers))
{
// column = array[0]
// value = array[1]
// whatever = array[2]
}
This involves a certain amount of copying, of course - you're creating an array each time. You could change that by introducing another type like this:
public struct Snapshot<T>
{
readonly T[][] sources;
readonly int index;
public Snapshot(T[][] sources, int index)
{
this.sources = sources;
this.index = index;
}
public T this[int element]
{
return sources[element][index];
}
}
This would probably be regarded as overkill by most though ;)
I could keep coming up with all kinds of ideas, to be honest... but the basics are:
if there are the same number of column names as there are elements in each row, could you not use a for loop?
var currentValues = currentRow.Split(separatorChar);
for(var i=0;i<columnList.Length;i++){
// use i to index both (or all) arrays and build your map
}