Reading a text file word by word

后端 未结 9 959
梦谈多话
梦谈多话 2020-12-10 18:58

I have a text file containing just lowercase letters and no punctuation except for spaces. I would like to know the best way of reading the file char by char, in a way that

9条回答
  •  谎友^
    谎友^ (楼主)
    2020-12-10 19:40

    If you're interested in good performance even on very large files, you should have a look at the new(4.0) MemoryMappedFile-Class.

    For example:

    using (var mappedFile1 = MemoryMappedFile.CreateFromFile(filePath))
    {
        using (Stream mmStream = mappedFile1.CreateViewStream())
        {
            using (StreamReader sr = new StreamReader(mmStream, ASCIIEncoding.ASCII))
            {
                while (!sr.EndOfStream)
                {
                    var line = sr.ReadLine();
                    var lineWords = line.Split(' ');
                }
            }  
        }
    }
    

    From MSDN:

    A memory-mapped file maps the contents of a file to an application’s logical address space. Memory-mapped files enable programmers to work with extremely large files because memory can be managed concurrently, and they allow complete, random access to a file without the need for seeking. Memory-mapped files can also be shared across multiple processes.

    The CreateFromFile methods create a memory-mapped file from a specified path or a FileStream of an existing file on disk. Changes are automatically propagated to disk when the file is unmapped.

    The CreateNew methods create a memory-mapped file that is not mapped to an existing file on disk; and are suitable for creating shared memory for interprocess communication (IPC).

    A memory-mapped file is associated with a name.

    You can create multiple views of the memory-mapped file, including views of parts of the file. You can map the same part of a file to more than one address to create concurrent memory. For two views to remain concurrent, they have to be created from the same memory-mapped file. Creating two file mappings of the same file with two views does not provide concurrency.

提交回复
热议问题