Recursively search a directory for files whose content matches a regex and collect the paths of matching files in an array

Submitted by 霸气de小男生 on 2020-05-28 06:37:26

Question


$locations = Get-ChildItem $readLoc -recurse | ? {!$_.psiscontainer} | select-object name | %{$e = $_.name; get-content $e}

$array = @()

for($i = 0; $i -lt $locations.length; $i++){
    #if($locations.name[$i].length -eq "9"){
        $paths = Resolve-Path $locations.fullname[$i]
        $paths.path
        get-content $locations.name[$i]
        #$array += $paths.path 
    #}
}

I need to iterate through each file in the file system and open each file. I am checking to see if a string within the file matches a regular expression and then output the full path to that file into an array.

However, the pipeline that populates $locations fails at the get-content call:

get-content : Cannot find path

'C:\Users\xxxxxx\Documents\files\powershell\OWASP_ApplicationThreatModeling.docx'
because it does not exist.
At line:1 char:89
+ ... .psiscontainer} | select-object name |%{$e = $_.name; get-content $e}
+                                                           ~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\Users\p61782...atModeling.docx:String) [Get-Content], ItemNotFoundException
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetContentCommand.

Answer 1:


As TheMadTechnician suggests, it's more efficient to use Select-String to perform the regex matching:

$locations = Get-ChildItem $readLoc -File -Recurse |
               Select-String -List -Pattern '^\d{3}-?\d{2}-?\d{4}$' | 
                 Select-Object -ExpandProperty Path

Note:

  • The regex passed to -Pattern is a simplified version of the one linked to in a comment.

  • The regex is enclosed in '...' rather than "..." so as to prevent inadvertent up-front interpretation of the string by PowerShell.

  • Get-ChildItem $readLoc -File -recurse recursively enumerates all files in the target directory's subtree. Switch -File (along with its counterpart, -Directory) is available in PSv3+ and makes your ? {!$_.psiscontainer} filter unnecessary.

  • Select-String can operate on the content of files piped via Get-ChildItem and performs regex matching by default:

    • -List tells Select-String to only return the first match from each input file (if any).
  • Select-String returns match-information objects whose .Path property contains the full path of the input file, so Select-Object -ExpandProperty Path is used to output just the path of any file that contains at least 1 match.

Overall, variable $locations therefore receives the array of full paths of those files in which at least 1 line matches the regex of interest.
Note that PowerShell automatically collects output from a command in an array, if the output comprises more than 1 element.
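
One consequence of that automatic collection is that a single matching file yields a plain string rather than a one-element array. The following is a minimal sketch, not part of the original answer, that wraps the same pipeline in @(...) so that $locations is always an array. The sandbox directory and the file names are hypothetical and exist only to make the example self-contained.

```powershell
# Hypothetical sandbox directory and files, created only for this demonstration.
$readLoc = Join-Path ([System.IO.Path]::GetTempPath()) ("regex-demo-" + [guid]::NewGuid())
$sub = Join-Path $readLoc 'sub'
New-Item -ItemType Directory -Path $sub -Force | Out-Null

Set-Content -LiteralPath (Join-Path $readLoc 'match.txt') -Value 'header', '123-45-6789'
Set-Content -LiteralPath (Join-Path $sub 'nomatch.txt') -Value 'nothing to see here'

# @(...) guarantees an array, even if zero or one file matches.
$locations = @(
    Get-ChildItem -LiteralPath $readLoc -File -Recurse |
        Select-String -List -Pattern '^\d{3}-?\d{2}-?\d{4}$' |
        Select-Object -ExpandProperty Path
)

$locations.Count   # number of files with at least one matching line
$locations         # full paths of those files

# Clean up the sandbox.
Remove-Item -LiteralPath $readLoc -Recurse -Force
```

Without @(...), a lone match would leave $locations holding a string, and $locations.Length would then report the number of characters in the path rather than the number of files.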


As for what you tried:

  • Your immediate problem was that you passed .Name - i.e., a mere file name - to Get-Content rather than .FullName.

  • Furthermore, your apparent intent was to collect file-info objects in array $locations, whereas your pipeline actually produced the contents of all files (as an array of lines).
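
To illustrate the first point, here is a small sketch of the difference between .Name and .FullName. It assumes a throwaway sandbox directory with a hypothetical file sub\note.txt; none of these names come from the question.

```powershell
# Hypothetical sandbox, created only for this demonstration.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("name-demo-" + [guid]::NewGuid())
$sub = Join-Path $root 'sub'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
Set-Content -LiteralPath (Join-Path $sub 'note.txt') -Value 'hello'

$file = Get-ChildItem -LiteralPath $root -File -Recurse | Select-Object -First 1

$file.Name       # the file name only
$file.FullName   # the full path, valid from any current location

# A bare file name is resolved against the current location, so this check
# only succeeds if the current directory happens to contain such a file.
$foundByName = Test-Path -LiteralPath $file.Name

# The full path does not depend on the current location.
$foundByFullName = Test-Path -LiteralPath $file.FullName
$content = Get-Content -LiteralPath $file.FullName

# Clean up the sandbox.
Remove-Item -LiteralPath $root -Recurse -Force
```

This is why the original command reported "Cannot find path": Get-Content looked for the bare file name in the current directory instead of in the subdirectory where Get-ChildItem had found the file.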




Answer 2:


You need to work with the FullName property. Right now you're stripping that with your Select-Object command.

$locations = Get-ChildItem $readLoc -recurse | ? {!$_.psiscontainer}

for($i = 0; $i -lt $locations.length; $i++){
    $locations[$i].fullname
    get-content $locations[$i].fullname
}
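
The loop above only prints each path and the file's content. As a rough sketch of how the same approach could be extended to the stated goal (collecting the full paths of matching files in an array), the following reuses the regex from the first answer and tests each file's lines with -match. The sandbox directory, the file names, and the variable $pattern are assumptions made for illustration.

```powershell
# Hypothetical sandbox directory and files, created only for this demonstration.
$readLoc = Join-Path ([System.IO.Path]::GetTempPath()) ("loop-demo-" + [guid]::NewGuid())
$sub = Join-Path $readLoc 'sub'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
Set-Content -LiteralPath (Join-Path $sub 'a.txt') -Value 'first line', '987-65-4321'
Set-Content -LiteralPath (Join-Path $readLoc 'b.txt') -Value 'no match in this file'

$pattern = '^\d{3}-?\d{2}-?\d{4}$'   # regex taken from the first answer

# Output from the foreach loop is collected by @(...), so no += is needed.
$array = @(
    foreach ($file in Get-ChildItem -LiteralPath $readLoc -File -Recurse) {
        # With an array of lines on the left-hand side, -match returns the
        # lines that match; a non-empty result is treated as $true.
        if ((Get-Content -LiteralPath $file.FullName) -match $pattern) {
            $file.FullName
        }
    }
)

$array   # full paths of the files that contain a matching line

# Clean up the sandbox.
Remove-Item -LiteralPath $readLoc -Recurse -Force
```

Unlike Select-String -List, this version reads every file in full before testing it, so it is likely to be slower on large trees.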


Source: https://stackoverflow.com/questions/44552151/recursively-search-a-directory-for-files-whose-content-matches-a-regex-and-colle
