Alternative to Get-Content

再見小時候 2020-12-11 10:40

I currently have the following line of code.

(Get-Content 'file.txt') |
  ForEach-Object {$_ -replace '"', ''} |
  Set-Content 'file.txt'

3 answers
  • 2020-12-11 11:11

    This should be faster than line-by-line processing, and still keep your memory consumption under control:

    # Read 5000 lines per batch; -replace then runs on each batch (a string array) as a whole.
    Get-Content 'file.txt' -ReadCount 5000 |
      ForEach-Object { $_ -replace '"', '' |
        Add-Content 'newfile.txt' }
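
To make the effect of -ReadCount more concrete, here is a minimal sketch. The sample file name and the batch size of 2 are assumptions chosen only for illustration. It shows that each pipeline object becomes an array of lines, and that -replace works on every element of that array.

```powershell
# Hypothetical sample file in the temp directory (name is an assumption).
$sample = Join-Path ([IO.Path]::GetTempPath()) 'readcount-demo.txt'
Set-Content -Path $sample -Value '"a"', '"b"', '"c"', '"d"', '"e"'

# With -ReadCount 2, each object sent down the pipeline is an array of up to 2 lines.
$batchSizes = Get-Content $sample -ReadCount 2 | ForEach-Object { $_.Count }

# -replace applied to an array performs the replacement on every element.
$cleaned = Get-Content $sample -ReadCount 2 | ForEach-Object { $_ -replace '"', '' }

Remove-Item $sample
```

Fewer, larger pipeline objects mean less per-object overhead, which is presumably where the speed-up comes from; the best batch size likely depends on the file and the machine.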
    
  • 2020-12-11 11:13

    Your problem isn't caused by Get-Content, but by the fact that you're running the statement in an expression (i.e. in parentheses). Running Get-Content like that is a convenient way of allowing a pipeline to write data back to the same file. However, the downside of this approach is that the entire file is read into memory before the data is passed into the pipeline (otherwise the file would still be open for reading when Set-Content tries to write data back to it).

    For processing large files you must remove the parentheses and write the output to a temporary file that you rename afterwards.

    Get-Content 'C:\path\to\file.txt' |
      ForEach-Object {$_ -replace '"', ''} |
      Set-Content 'C:\path\to\temp.txt'
    
    Remove-Item 'C:\path\to\file.txt'
    Rename-Item 'C:\path\to\temp.txt' 'file.txt'
    

    Doing this avoids the memory exhaustion you observed. The processing can be sped up further by increasing the read count as @mjolinor suggested (cut execution time down to approximately 40% in my tests).
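
As a rough sketch of combining the two ideas (a temporary output file plus a larger read count), something like the following could work. The file names and the read count of 1000 are assumptions for the sake of a self-contained demo, not values from the tests mentioned above.

```powershell
# Hypothetical demo files in the temp directory (names are assumptions).
$source = Join-Path ([IO.Path]::GetTempPath()) 'quotes-demo.txt'
$temp   = "$source.tmp"
Set-Content -Path $source -Value 'say "hi"', '', '"done"'

# No parentheses around Get-Content: lines are streamed in batches of up to 1000
# instead of the whole file being collected in memory first.
Get-Content $source -ReadCount 1000 |
  ForEach-Object { $_ -replace '"', '' } |
  Set-Content $temp

# Swap the temporary file in for the original.
Remove-Item $source
Rename-Item $temp (Split-Path $source -Leaf)

$result = Get-Content $source
```

Because the output goes to a different file, the source is never open for reading and writing at the same time.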

    For even better performance use the approach with a StreamReader and a StreamWriter that @campbell.rw suggested:

    $reader = New-Object IO.StreamReader 'C:\path\to\file.txt'
    $writer = New-Object IO.StreamWriter 'C:\path\to\temp.txt'
    
    while ($reader.Peek() -ge 0) {
      $line = $reader.ReadLine().Replace('"', '')
      $writer.WriteLine($line)
    }
    
    $reader.Close(); $reader.Dispose()
    $writer.Close(); $writer.Dispose()
    
    Remove-Item 'C:\path\to\file.txt'
    Rename-Item 'C:\path\to\temp.txt' 'file.txt'
    
  • 2020-12-11 11:19

    Use a stream to read the file so that it is not loaded into memory all at once; you can also use a stream to write the output. This should perform pretty well and keep memory usage down:

    $file = New-Object System.IO.StreamReader -Arg "c:\test\file.txt"
    $outstream = [System.IO.StreamWriter] "c:\test\out.txt"
    
    # Compare against $null: an empty line is falsy and would otherwise end the loop early.
    while ($null -ne ($line = $file.ReadLine())) {
      $s = $line -replace '"', ''
      $outstream.WriteLine($s)
    }
    $file.Close()
    $outstream.Close()
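
One detail worth illustrating: ReadLine() returns an empty string for a blank line, and an empty string is falsy in PowerShell, so a loop condition that relies on the assignment alone would stop at the first blank line. The sketch below compares against $null instead and wraps the loop in try/finally so the streams are released even if something fails. The demo file names are assumptions.

```powershell
# Hypothetical demo files in the temp directory (names are assumptions).
$inPath  = Join-Path ([IO.Path]::GetTempPath()) 'stream-demo-in.txt'
$outPath = Join-Path ([IO.Path]::GetTempPath()) 'stream-demo-out.txt'
Set-Content -Path $inPath -Value '"first"', '', '"last"'

$reader = New-Object System.IO.StreamReader $inPath
$writer = New-Object System.IO.StreamWriter $outPath
try {
  # ReadLine() returns $null only at end of file; a blank line comes back as ''.
  while ($null -ne ($line = $reader.ReadLine())) {
    $writer.WriteLine($line.Replace('"', ''))
  }
}
finally {
  # Dispose flushes the writer and closes both underlying files.
  $writer.Dispose()
  $reader.Dispose()
}

$lines = Get-Content $outPath
Remove-Item $inPath, $outPath
```

The blank line in the middle of the input survives in the output, which is the behaviour the $null comparison is meant to guarantee.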
    