Recommended behaviour of GetEnumerator() when implementing IEnumerable<T> and IEnumerator<T>

Submitted by 时光毁灭记忆、已成空白 on 2019-12-22 05:04:16

Question


I am implementing my own enumerable type. Something resembling this:

public class LineReaderEnumerable : IEnumerable<string>, IDisposable
{
    private readonly LineEnumerator enumerator;

    public LineReaderEnumerable(FileStream fileStream)
    {
        enumerator = new LineEnumerator(new StreamReader(fileStream, Encoding.Default));
    }

    public IEnumerator<string> GetEnumerator()
    {
        return enumerator;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    public void Dispose()
    {
       enumerator.Dispose();
    }
}

The enumerator class:

public class LineEnumerator : IEnumerator<string>
{
    private readonly StreamReader reader;
    private string current;

    public LineEnumerator(StreamReader reader)
    {
        this.reader = reader;
    }

    public void Dispose()
    {
        reader.Dispose();
    }

    public bool MoveNext()
    {
        if (reader.EndOfStream)
        {
            return false;
        }
        current = reader.ReadLine();
        return true;
    }

    public void Reset()
    {
        reader.DiscardBufferedData();
        reader.BaseStream.Seek(0, SeekOrigin.Begin);
        reader.BaseStream.Position = 0;
    }

    public string Current
    {
        get { return current; }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }
}

My question is this: should I call Reset() on the enumerator when GetEnumerator() is called, or is it the responsibility of the calling method (like foreach) to do it?

Should GetEnumerator() create a new one, or is it supposed to always return the same instance?


Answer 1:


The expectation of a user of your type would be that GetEnumerator() returns a new enumerator object.

As you have defined it, every call to GetEnumerator() returns the same enumerator, so code like:

var e1 = instance.GetEnumerator();
e1.MoveNext();
var first = e1.Current;

// Expected to start again from the first line, but e2 is the same object as e1
var e2 = instance.GetEnumerator();
e2.MoveNext();
var firstAgain = e2.Current;

Debug.Assert(first == firstAgain);

will not work as expected.

(An internal call to Reset would be an unusual design, but that's secondary here.)

Additional: if you want an enumerator over the lines of a file, use File.ReadLines, although it appears (see the comments on Jon Skeet's answer) that this suffers from the same problem as your code.




Answer 2:


Your model is fundamentally broken - you should create a new IEnumerator<T> each time GetEnumerator() is called. Iterators are meant to be independent of each other. For example, I ought to be able to write:

var lines = new LinesEnumerable(...);
foreach (var line1 in lines)
{
    foreach (var line2 in lines)
    {
        ...
    }
}

and basically get the cross-product of each line in the file against each of the other lines.

This means the enumerable class should not be given a FileStream - it should be given something which can be used to obtain a FileStream each time one is needed, e.g. a filename.

For example, you can do all of this in a single method call using iterator blocks:

// Like File.ReadLines in .NET 4 - except that's broken (see comments)
public IEnumerable<string> ReadLines(string filename)
{
    using (TextReader reader = File.OpenText(filename))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}

Then:

var lines = ReadLines(filename);
// foreach loops as before

... that will work fine.
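
If a dedicated class is preferred over a free-standing method, the same idea might be sketched as follows. This is only an illustration: it reuses the question's class name but assumes the constructor takes a file path (the `path` parameter is not in the original code), and it relies on the fact that an iterator block may return IEnumerator<T>, so each call to GetEnumerator() gets its own reader.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.IO;

// Hypothetical rework of the question's class: it stores a file path
// rather than an open stream, so every enumeration opens its own reader.
public sealed class LineReaderEnumerable : IEnumerable<string>
{
    private readonly string path; // assumed parameter, not in the original

    public LineReaderEnumerable(string path)
    {
        this.path = path;
    }

    // Because this is an iterator block, each call produces a fresh
    // enumerator; the reader is opened on the first MoveNext() and
    // disposed when the enumerator finishes or is disposed.
    public IEnumerator<string> GetEnumerator()
    {
        using (StreamReader reader = File.OpenText(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

With this shape the class no longer needs to implement IDisposable itself, since it holds no open resource between enumerations.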

EDIT: Note that certain sequences are naturally iterable only once - e.g. a network stream, or a sequence of random numbers from an unknown seed.

Such sequences are really better expressed as IEnumerator<T> rather than IEnumerable<T>, but that makes filtering etc. with LINQ harder. IMO such sequences should at least throw an exception on the second call to GetEnumerator() - returning the same iterator twice is a really bad idea.
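
One way to get that "throw on the second call" behaviour is a small wrapper. The sketch below is an assumption-laden illustration, not an existing framework type: `OnceOnlyEnumerable<T>` is a made-up name, and the choice of InvalidOperationException is mine.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: exposes a single-use IEnumerator<T> as an
// IEnumerable<T> that may be enumerated exactly once.
public sealed class OnceOnlyEnumerable<T> : IEnumerable<T>
{
    private IEnumerator<T> source;

    public OnceOnlyEnumerable(IEnumerator<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        this.source = source;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Atomically take the enumerator; any later caller finds null.
        IEnumerator<T> taken =
            Interlocked.Exchange<IEnumerator<T>>(ref source, null);
        if (taken == null)
        {
            throw new InvalidOperationException(
                "This sequence can only be enumerated once.");
        }
        return taken;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

The wrapped sequence can still be passed to LINQ operators, and a second enumeration fails loudly instead of silently resuming in the middle.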




Answer 3:


Should GetEnumerator() create a new one, or is it supposed to always return the same instance?

If you return the same instance, the second iteration will resume from wherever the first iteration currently is, and the two will interfere with each other if the code runs them alternately or in parallel. So no, you shouldn't return the same instance.

As for Reset(), the documentation linked below says:

An enumerator remains valid as long as the collection remains unchanged. If changes are made to the collection, such as adding, modifying, or deleting elements, the enumerator is irrecoverably invalidated and the next call to the MoveNext or Reset method throws an InvalidOperationException.

The Reset method is provided for COM interoperability. It does not necessarily need to be implemented; instead, the implementer can simply throw a NotSupportedException.

http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.reset.aspx
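
Applied to the question's enumerator, that advice could look roughly like this. It is a sketch rather than a drop-in replacement: it accepts a TextReader instead of a StreamReader (my assumption, to make it easier to try with a StringReader), and it detects the end of input by checking ReadLine() for null.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;

// Variant of the question's LineEnumerator that declines to support Reset().
public sealed class LineEnumerator : IEnumerator<string>
{
    private readonly TextReader reader; // assumption: TextReader, not StreamReader
    private string current;

    public LineEnumerator(TextReader reader)
    {
        this.reader = reader;
    }

    public string Current
    {
        get { return current; }
    }

    object IEnumerator.Current
    {
        get { return Current; }
    }

    public bool MoveNext()
    {
        // ReadLine() returns null once the input is exhausted.
        current = reader.ReadLine();
        return current != null;
    }

    public void Reset()
    {
        // Permitted by the documentation quoted above.
        throw new NotSupportedException();
    }

    public void Dispose()
    {
        reader.Dispose();
    }
}
```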




Answer 4:


My question is this: should I call Reset() on the enumerator when GetEnumerator() is called, or is it the responsibility of the calling method (like foreach) to do it?

That is the responsibility of the calling method. However, if your enumerator is invalid before a first call to Reset(), you should of course call it before returning it (that would be an implementation detail).

In normal operation, an enumerator is never actually reset. You can verify that by throwing NotSupportedException from within Reset().

Should GetEnumerator() create a new one, or is it supposed to always return the same instance?

It should always return a new instance. Think of it this way: an Enumerable is something that you can enumerate. An Enumerator is the thing that you enumerate with. If GetEnumerator() always returned the same instance, the containing class would not be 'enumerable' but would just know how to 'enumerate' (in other words, it would be IHasEnumerator instead of IEnumerable).




Answer 5:


As far as I'm concerned, it should be the responsibility of the caller. This follows from POLA (the principle of least astonishment, if you like). And indeed, you don't want your reader to control too much. Consider: what if the consumer wants to enumerate lines only from a certain point in the stream onwards?

And regarding the Reset method itself, you should really check to see if the stream is actually seekable before trying to seek -- many streams aren't (e.g. network streams).
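
A guarded rewind along those lines might look like the sketch below. The `Rewind` extension method and its class are hypothetical names introduced for illustration; the calls it makes (Stream.CanSeek, Stream.Seek and StreamReader.DiscardBufferedData) are the standard ones already used in the question.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the framework.
public static class StreamReaderExtensions
{
    public static void Rewind(this StreamReader reader)
    {
        if (reader == null) throw new ArgumentNullException("reader");

        // Network streams and similar report CanSeek == false.
        if (!reader.BaseStream.CanSeek)
        {
            throw new NotSupportedException(
                "The underlying stream does not support seeking.");
        }

        // Drop whatever the reader has buffered, then move the
        // underlying stream back to its start.
        reader.DiscardBufferedData();
        reader.BaseStream.Seek(0, SeekOrigin.Begin);
    }
}
```

An enumerator's Reset() could then delegate to `reader.Rewind()`, so that non-seekable sources fail with a clear exception rather than an error from deep inside the stream.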



Source: https://stackoverflow.com/questions/7673161/recommended-behaviour-of-getenumerator-when-implementing-ienumerablet-and-ie
