I have the following PowerShell script that parses some very large files for ETL purposes. For starters, my test file is ~30 MB; larger files, around 200 MB, will also need to be processed.
The Get-Content cmdlet does not perform as well as a StreamReader when dealing with very large files. You can read a file line by line using a StreamReader like this:
$path = 'C:\A-Very-Large-File.txt'
$r = [IO.File]::OpenText($path)
try {
    while ($r.Peek() -ge 0) {
        $line = $r.ReadLine()
        # Process $line here...
    }
}
finally {
    # Release the file handle even if processing throws
    $r.Dispose()
}
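Since the goal is ETL, the same pattern extends naturally to streaming output: pair the StreamReader with a StreamWriter so neither the input nor the output file is ever held in memory. A minimal sketch, where the paths and the tab-to-CSV transform are hypothetical placeholders:

# Stream lines in, transform, stream lines out.
# Paths and the split/join logic are placeholders for your actual ETL rules.
$inPath  = 'C:\A-Very-Large-File.txt'
$outPath = 'C:\Transformed.txt'
$reader = [IO.File]::OpenText($inPath)
$writer = [IO.File]::CreateText($outPath)
try {
    while ($reader.Peek() -ge 0) {
        $line = $reader.ReadLine()
        # Hypothetical transform: skip empty lines, split on tab, re-join with commas
        if ($line.Length -gt 0) {
            $fields = $line -split "`t"
            $writer.WriteLine(($fields -join ','))
        }
    }
}
finally {
    $reader.Dispose()
    $writer.Dispose()
}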
Some performance comparisons:
Measure-Command {Get-Content .\512MB.txt > $null}
Total Seconds: 49.4742533
Measure-Command {
    $r = [IO.File]::OpenText('512MB.txt')
    while ($r.Peek() -ge 0) {
        $r.ReadLine() > $null
    }
    $r.Dispose()
}
Total Seconds: 27.666803
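If you would rather keep Get-Content, its -ReadCount parameter emits lines in batches instead of one object per line, which reduces pipeline overhead; note that each object downstream is then an array of lines rather than a single string. A sketch for timing it on the same file, with the batch size of 1000 chosen arbitrarily:

Measure-Command {
    # Each pipeline object is now a 1000-line string array
    Get-Content .\512MB.txt -ReadCount 1000 > $null
}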