Can I make this script even faster?

Posted by 强颜欢笑 on 2019-12-10 11:31:32

Question


I wrote a simple script for an internship, which trawls through a provided directory and deletes any file older than a specified number of days. I've spent all my free time today attempting to tighten it up. Here's what I've got so far:

function delOld($dir, $numDays){
    $timespan = new-timespan -days $numDays
    $curTime = get-date
    get-childItem $dir -Recurse -file | 
    where-object {(($curTime)-($_.LastWriteTime)) -gt $timespan} | 
    remove-Item -whatif
}

Here is an example of a call of the function:

delOld -dir "C:\Users\me\Desktop\psproject" -numDays 5

Sorry if it's hard to read. I found that condensing the operations into a single pipeline was more efficient than reassigning them to legible variables on each iteration. The remove-item call has -whatif on it at the moment for testing purposes. I'm aware that I probably can't speed it up much more at this point, but I'm running it on over a TB of files, so every operation counts.

Thanks in advance for any advice you have to offer!


Answer 1:


Staying in the realm of PowerShell and .NET methods, here's how you can speed up your function:

  • Calculate the cut-off time stamp once, up front.

  • Use the [IO.DirectoryInfo] type's EnumerateFiles() method (PSv3+ / .NET4+) in combination with a foreach statement. Tip of the hat to wOxxOm.

    • EnumerateFiles() enumerates files one at a time, keeping memory use constant, similar to, but faster than Get-ChildItem.

      • Caveats:

        • EnumerateFiles() invariably includes hidden files, whereas Get-ChildItem excludes them by default, and only includes them if -Force is specified.

        • EnumerateFiles() is unsuitable if there's a chance of encountering inaccessible directories due to lack of permissions, because even if you enclose the entire foreach statement in a try / catch block, you'll only get partial output, given that the iteration stops on encountering the first inaccessible directory.

        • The enumeration order can differ from that of Get-ChildItem.

    • PowerShell's foreach statement is much faster than the ForEach-Object cmdlet, and also faster than the PSv4+ .ForEach() collection method.

  • Invoke the .Delete() method directly on each [System.IO.FileInfo] instance inside the loop body.

Note: For brevity, there are no error checks in the function below, such as for whether $numDays has a permissible value and whether $dir refers to an existing directory (if it's a path based on a custom PS drive, you'd have to resolve it with Convert-Path first).

function delOld($dir, $numDays) {
    $dtCutoff = [datetime]::now - [timespan]::FromDays($numDays)
    # Make sure that the .NET framework's current dir. is the same as PS's:
    [System.IO.Directory]::SetCurrentDirectory($PWD.ProviderPath)
    # Enumerate all files recursively.
    # Replace $file.FullName with $file.Delete() to perform actual deletion.
    foreach ($file in ([IO.DirectoryInfo] $dir).EnumerateFiles('*', 'AllDirectories')) {
        if ($file.LastWriteTime -lt $dtCutoff) { $file.FullName }
    }
}

Note: The above simply outputs the paths of the files to delete; replace $file.FullName with $file.Delete() to perform actual deletion.
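The two gaps mentioned above (no input checks, and the iteration stopping at the first inaccessible directory) could be addressed along these lines. This is only a sketch, not part of the original answer: Get-OldFile, its parameter names and the 0 to 36500 day range are hypothetical choices made for illustration. It walks the tree one directory at a time, so a directory that cannot be read is skipped with a warning instead of ending the enumeration.

```powershell
# Hypothetical helper (not from the answer above): returns the [IO.FileInfo]
# objects of files older than $NumDays, skipping unreadable directories.
function Get-OldFile {
    param(
        [Parameter(Mandatory)] [string] $Dir,
        # The upper bound is an arbitrary illustrative limit (about 100 years).
        [Parameter(Mandatory)] [ValidateRange(0, 36500)] [double] $NumDays
    )
    # Resolve relative and PS-drive-based paths to a full file-system path.
    $root = Convert-Path -LiteralPath $Dir -ErrorAction Stop
    if (-not [System.IO.Directory]::Exists($root)) { throw "Not a directory: $root" }

    # Calculate the cut-off time stamp once, up front.
    $dtCutoff = [datetime]::Now - [timespan]::FromDays($NumDays)

    # Explicit stack instead of 'AllDirectories', so each directory is opened
    # separately and a failure affects only that directory.
    $pending = New-Object 'System.Collections.Generic.Stack[string]'
    $pending.Push($root)

    while ($pending.Count -gt 0) {
        $current = [System.IO.DirectoryInfo] $pending.Pop()
        try {
            foreach ($file in $current.EnumerateFiles()) {
                if ($file.LastWriteTime -lt $dtCutoff) { $file }
            }
            foreach ($sub in $current.EnumerateDirectories()) {
                $pending.Push($sub.FullName)
            }
        }
        catch [UnauthorizedAccessException] {
            Write-Warning "Skipping inaccessible directory: $($current.FullName)"
        }
    }
}

# Usage: lists the paths; replace $f.FullName with $f.Delete() to delete.
# foreach ($f in Get-OldFile -Dir 'C:\Users\me\Desktop\psproject' -NumDays 5) { $f.FullName }
```

Like EnumerateFiles() itself, this includes hidden files, and it makes no attempt to detect directory junctions that point back into the tree, so treat it as a starting point rather than a drop-in replacement.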




Answer 2:


Many of the PowerShell cmdlets are slower than their .NET equivalents. You could, for example, make a call to [System.IO.File]::Delete($_.FullName) instead, and see if there is a performance difference. Same goes for Get-ChildItem => [System.IO.Directory]::GetFiles(...).

To do that, I would write a small script that creates two temp folders with, say, 100,000 empty test files in each. Then time each version of the function with a [System.Diagnostics.Stopwatch].

Some sample code:

$stopwatch = New-Object 'System.Diagnostics.StopWatch'
$stopwatch.Start()

Remove-OldItems1 ...

$stopwatch.Stop()
Write-Host $stopwatch.ElapsedMilliseconds

$stopwatch.Reset()
$stopwatch.Start()

Remove-OldItems2 ...

$stopwatch.Stop()
Write-Host $stopwatch.ElapsedMilliseconds
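To make the comparison concrete, here is one way the whole test could be put together. It is a sketch under assumptions: the bodies of Remove-OldItems1 and Remove-OldItems2, the New-TestFolder helper, the file count and the 30-day-old time stamps are all invented for illustration; only the Stopwatch pattern comes from the sample above.

```powershell
# Version 1 (hypothetical body): cmdlets only.
function Remove-OldItems1($dir, $cutoff) {
    Get-ChildItem -LiteralPath $dir -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $cutoff } |
        Remove-Item
}

# Version 2 (hypothetical body): direct .NET calls.
function Remove-OldItems2($dir, $cutoff) {
    foreach ($path in [System.IO.Directory]::GetFiles($dir, '*', 'AllDirectories')) {
        if ([System.IO.File]::GetLastWriteTime($path) -lt $cutoff) {
            [System.IO.File]::Delete($path)
        }
    }
}

# Hypothetical helper: creates a temp folder with $count empty, back-dated files.
function New-TestFolder([int] $count) {
    $folder = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
    [void] [System.IO.Directory]::CreateDirectory($folder)
    $stamp = (Get-Date).AddDays(-30)
    foreach ($i in 1..$count) {
        $path = Join-Path $folder "file$i.tmp"
        [System.IO.File]::WriteAllText($path, '')
        [System.IO.File]::SetLastWriteTime($path, $stamp)
    }
    $folder
}

$count   = 1000                      # raise towards 100,000 for a realistic run
$cutoff  = (Get-Date).AddDays(-5)
$folder1 = New-TestFolder $count
$folder2 = New-TestFolder $count

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
Remove-OldItems1 $folder1 $cutoff
$stopwatch.Stop()
Write-Host "Cmdlets: $($stopwatch.ElapsedMilliseconds) ms"

$stopwatch.Restart()
Remove-OldItems2 $folder2 $cutoff
$stopwatch.Stop()
Write-Host ".NET:    $($stopwatch.ElapsedMilliseconds) ms"

# Sanity check that both versions really deleted everything, then clean up.
$left1 = @([System.IO.Directory]::GetFiles($folder1)).Count
$left2 = @([System.IO.Directory]::GetFiles($folder2)).Count
Write-Host "Files left: $left1 / $left2"
Remove-Item -LiteralPath $folder1, $folder2 -Recurse -Force
```

Each version gets its own folder so that neither run benefits from the other having already deleted the files. Timings will vary with disk, file system cache and PowerShell version, so it is worth repeating the run a few times before drawing conclusions.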

For further PowerShell brownie points: run Get-Verb in a PowerShell window to see the list of approved verbs. The convention is to name functions Verb-Noun, so something like Remove-OldItems would fit the bill.
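As a rough illustration of the naming advice, the question's function might look like this under the suggested name. This is a sketch rather than a recommendation from the answer: the parameter names are assumptions, and the SupportsShouldProcess part is an optional extra that gives the function its own -WhatIf switch in place of the hard-coded -whatif in the question.

```powershell
# Confirm that 'Remove' is on the approved list.
Get-Verb | Where-Object { $_.Verb -eq 'Remove' }

function Remove-OldItems {
    # SupportsShouldProcess adds the -WhatIf and -Confirm switches.
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)] [string] $Dir,
        [Parameter(Mandatory)] [double] $NumDays
    )
    $dtCutoff = [datetime]::Now - [timespan]::FromDays($NumDays)
    foreach ($file in Get-ChildItem -LiteralPath $Dir -Recurse -File) {
        # ShouldProcess returns $false (and prints a "What if" line) under -WhatIf.
        if ($file.LastWriteTime -lt $dtCutoff -and
            $PSCmdlet.ShouldProcess($file.FullName, 'Delete')) {
            $file.Delete()
        }
    }
}

# Dry run first, then the real thing:
# Remove-OldItems -Dir 'C:\Users\me\Desktop\psproject' -NumDays 5 -WhatIf
# Remove-OldItems -Dir 'C:\Users\me\Desktop\psproject' -NumDays 5
```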




Answer 3:


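This deletes the files in parallel, using a PowerShell workflow. Note that workflows exist only in Windows PowerShell 3.0 through 5.1; they were removed in PowerShell 6 and later.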

workflow delOld([string]$dir, [int]$numDays){
    $timespan = new-timespan -days $numDays
    $curTime = get-date
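    # Workflows do not accept positional parameters, so -Path is named explicitly.
    $Files = get-childItem -Path $dir -Recurse -file | where-object {(($curTime)-($_.LastWriteTime)) -gt $timespan}
    foreach -parallel ($file in $files){
        Remove-Item -Path $file.FullName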
    }

}

delOld -dir "C:\Users\AndrewD\Downloads" -numDays 8
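On PowerShell 7 or later, where workflows are not available, a similar effect can probably be had with ForEach-Object -Parallel. The following is a sketch and not part of the original answer: the function name, parameter names and the throttle value are assumptions. Deleting files is mostly I/O-bound, so whether this beats a plain loop is something to measure rather than assume.

```powershell
# Assumes PowerShell 7+; ForEach-Object -Parallel does not exist in Windows PowerShell 5.1.
function Remove-OldItemsParallel {
    param(
        [Parameter(Mandatory)] [string] $Dir,
        [Parameter(Mandatory)] [double] $NumDays,
        [int] $ThrottleLimit = 5      # number of script blocks run at once
    )
    $dtCutoff = [datetime]::Now - [timespan]::FromDays($NumDays)
    Get-ChildItem -LiteralPath $Dir -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $dtCutoff } |
        ForEach-Object -Parallel {
            # Each file is deleted in one of the parallel runspaces.
            Remove-Item -LiteralPath $_.FullName
        } -ThrottleLimit $ThrottleLimit
}

# Usage:
# Remove-OldItemsParallel -Dir 'C:\Users\AndrewD\Downloads' -NumDays 8
```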

Now, if it's a lot of folders, try this:



Source: https://stackoverflow.com/questions/44378046/can-i-make-this-script-even-faster
