PowerShell: how to capture the argument(s) of Select-String and include them with the matched output

旧街凉风 submitted on 2020-07-19 11:03:50

Question


Thanks to @mklement0 for the help in getting this far, via the answer given in "Powershell search directory for code files with text matching input a txt file".

The PowerShell below works well for finding the occurrences of a long list of database field names in a source code folder.

$inputFile = 'C:\DataColumnsNames.txt'
$outputFile = 'C:\DataColumnsUsages.txt'
Get-ChildItem C:\ProjectFolder -Filter *.cs -Recurse -Force -ea SilentlyContinue |
  Select-String -Pattern (Get-Content $inputFile) | 
    Select-Object Path, LineNumber, line | 
      Export-csv $outputfile

However, many lines of source code have multiple matches, especially ADO.NET SQL statements with a lot of field names on one line. If the field name argument were included with the matching output, the results would be more directly useful and need less additional massaging, such as lining everything up with the original field name list. For example, a source line "BatchId = NewId" will match the field name list item "BatchId". Is there an easy way to include both "BatchId" and "BatchId = NewId" in the output?

I played with the Matches object, but it doesn't seem to have the information. I also tried a pipeline variable, like here, but $x is null.

$inputFile = 'C:\DataColumnsNames.txt'
$outputFile = 'C:\DataColumnsUsages.txt'
Get-ChildItem C:\ProjectFolder -Filter *.cs -Recurse -Force -ea SilentlyContinue |
  Select-String -Pattern (Get-Content $inputFile -PipelineVariable x) | 
    Select-Object $x, Path, LineNumber, line | 
      Export-csv $outputFile

Thanks.


Answer 1:


The Microsoft.PowerShell.Commands.MatchInfo instances that Select-String outputs have a Pattern property that reflects the specific pattern among the (potential) array of patterns passed to -Pattern that matched on a given line.

The caveat is that if multiple patterns match on a line, .Pattern reports only the one that is listed first in the -Pattern argument.

Here's a simple example, using an array of strings to simulate lines from files as input:

'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' | 
  Select-String -Pattern ('bar', 'foo') | 
    Select-Object  Line, LineNumber, Pattern

The above yields:

Line                         LineNumber Pattern
----                         ---------- -------
A fool and                            1 foo
his barn                              2 bar
foo and bar on the same line          4 bar

Note how 'bar' is listed as the Pattern value for the last line, even though 'foo' appeared first in the input line, because 'bar' comes before 'foo' in the pattern array.


To reflect the actual pattern that appears first on the input line in a Pattern property, more work is needed:

  • Formulate your array of patterns as a single regex using alternation (|), wrapped as a whole in a capture group ((...)), e.g., '(bar|foo)'.

    • Note: The expression used below, '({0})' -f ('bar', 'foo' -join '|'), constructs this regex dynamically, from an array (the array literal 'bar', 'foo' here, but you can substitute any array variable or even (Get-Content $inputFile)); if you want to treat the input patterns as literals and they happen to contain regex metacharacters (such as .), you'll need to escape them with [regex]::Escape() first.
  • Use a calculated property to define a custom Pattern property that reports the capture group's value, which is the first among the values encountered on each input line:

'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' | 
  Select-String -AllMatches -Pattern ('({0})' -f ('bar', 'foo' -join '|')) | 
    Select-Object Line, LineNumber, 
                  @{ n='Pattern'; e={ $_.Matches[0].Groups[1].Value } }

This yields (abbreviated to show only the last match):

Line                         LineNumber Pattern
----                         ---------- -------
...

foo and bar on the same line          4 foo

Now, 'foo' is properly reported as the matching pattern.
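
The note above about [regex]::Escape() can be sketched as follows. This is a minimal illustration, not part of the original answer: the field names and input lines are hypothetical, and 'Order.Id' was chosen only because it contains the regex metacharacter '.'.

```powershell
# Hypothetical field names; 'Order.Id' contains the regex metacharacter '.'
$names = 'Order.Id', 'BatchId'

# Escape each name so it is matched literally, then join with alternation
# inside a single capture group.
$pattern = '({0})' -f (($names | ForEach-Object { [regex]::Escape($_) }) -join '|')

# Hypothetical input lines; the first must NOT match, because '.' is now literal.
$results = 'OrderXId = 1', 'Order.Id = BatchId' |
  Select-String -AllMatches -Pattern $pattern |
    Select-Object Line, LineNumber,
                  @{ n='Pattern'; e={ $_.Matches.ForEach({ $_.Groups[1].Value }) } }

$results
```

Without the escaping step, the unescaped '.' would match any character, so 'OrderXId' would also be reported as a hit for 'Order.Id'.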


To report all patterns found on each line:

  • Switch -AllMatches is required to tell Select-String to find all matches on each line, represented in the .Matches collection of the MatchInfo output objects.

  • The .Matches collection must then be enumerated (via the .ForEach() collection method) to extract the capture-group value from each match.

'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' | 
  Select-String -AllMatches -Pattern ('({0})' -f ('bar', 'foo' -join '|')) | 
    Select-Object Line, LineNumber, 
                  @{ n='Pattern'; e={ $_.Matches.ForEach({ $_.Groups[1].Value }) } }

This yields (abbreviated to show only the last match):

Line                         LineNumber Pattern
----                         ---------- -------
...

foo and bar on the same line          4 {foo, bar}

Note how both 'foo' and 'bar' are now reported in Pattern, in the order encountered on the line.
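
For the CSV export in the question, a collection-valued Pattern column is awkward, because Export-Csv stringifies a collection as its type name rather than its elements. One possible workaround, sketched here under assumptions, is to emit one output row per match instead. The scratch folder, file names, and sample content below are hypothetical stand-ins for C:\ProjectFolder and C:\DataColumnsNames.txt.

```powershell
# Assumption: a scratch folder stands in for the real project folder and names file.
$dir = Join-Path ([IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
$null = New-Item -ItemType Directory -Path $dir
Set-Content -Path (Join-Path $dir 'Sample.cs') -Value 'BatchId = NewId;', 'int x = 0;', 'SELECT BatchId, OrderId FROM T'
Set-Content -Path (Join-Path $dir 'names.txt') -Value 'BatchId', 'OrderId'

# Build one regex from the list of names, treating each name as a literal.
$names = Get-Content (Join-Path $dir 'names.txt')
$pattern = '({0})' -f (($names | ForEach-Object { [regex]::Escape($_) }) -join '|')

# Emit one row per match, so each row carries exactly one field name.
$rows = Get-ChildItem $dir -Filter *.cs -Recurse |
  Select-String -AllMatches -Pattern $pattern |
    ForEach-Object {
      $mi = $_   # the MatchInfo object for the current line
      foreach ($m in $mi.Matches) {
        [pscustomobject]@{
          Pattern    = $m.Groups[1].Value
          Path       = $mi.Path
          LineNumber = $mi.LineNumber
          Line       = $mi.Line
        }
      }
    }

$rows | Export-Csv (Join-Path $dir 'usages.csv') -NoTypeInformation
```

Note that Pattern holds the text as it appears in the source line; since Select-String matches case-insensitively by default, its casing may differ from the entry in the names file.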




Answer 2:


The solid information and examples from @mklement0 were enough to point me in the right direction for researching and understanding more about Powershell and the object pipeline and calculated properties.

I was finally able to achieve my goal of cross-referencing a list of table and field names against the C# code base. The input file is simply table and field names, pipe-delimited. (One of the glitches I had was not using the pipe in the split; it was a visual error that took a while to finally see, so check for that.) The output is the table name, field name, code file name, line number and actual line. It's not perfect but much better than manual effort for a few hundred fields! And now there are possibilities for further automation in the data mapping and conversion project. I thought about writing a C# utility program, but that might have taken just as long to figure out and implement, and would have been much more cumbersome than a working PowerShell script.

The key for me at this point is "working"! My first deeper dive into the abstruse world of Powershell. The key points of my solution are the use of the calculated property to get the table and field names in the output, realization that expressions can be used in certain places like to build a Pattern and that the pipeline is passing only certain specific objects after each command (maybe that is too restricted of a view but it's better than what I had before).

Hope this helps someone in the future. I could not find any examples close enough to get me over the hump, and so asked my first ever Stack Overflow question.

$inputFile = "C:\input.txt"    # one 'Table|Field' entry per line
$outputFile = "C:\output.csv"
$results = Get-Content $inputFile
foreach ($i in $results) {
  # Search for the field name (second element) and label each hit with 'Table|Field'.
  Get-ChildItem -Path "C:\ProjectFolder" -Filter *.cs -Recurse -ErrorAction SilentlyContinue -Force |
    Select-String -Pattern $i.Split('|')[1] |
      Select-Object @{ n='Pattern'; e={ $i.Split('|')[0], $i.Split('|')[1] -join '|' } }, Filename, LineNumber, Line |
        Export-Csv $outputFile -Append
}
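
The loop above rescans every file once per field name. A possible single-pass variant, combining it with the alternation technique from Answer 1, might look like the sketch below. It is an illustration under assumptions rather than a tested replacement: the 'Table|Field' entries and source lines are hypothetical, and each entry is assumed to contain exactly one pipe.

```powershell
# Assumption: hypothetical 'Table|Field' entries stand in for C:\input.txt.
$entries = 'Batch|BatchId', 'Orders|OrderId', 'Invoice|BatchId'

# Map each field name to the tables that declare it.
# (PowerShell hashtable keys are case-insensitive, like Select-String's default matching.)
$tablesByField = @{}
foreach ($entry in $entries) {
  $table, $field = $entry.Split('|')
  if (-not $tablesByField.ContainsKey($field)) { $tablesByField[$field] = @() }
  $tablesByField[$field] += $table
}

# One regex for all field names, each treated as a literal.
$pattern = '({0})' -f (($tablesByField.Keys | ForEach-Object { [regex]::Escape($_) }) -join '|')

# Hypothetical source lines stand in for the *.cs files found by Get-ChildItem.
$rows = 'BatchId = NewId;', 'var total = OrderId + 1;' |
  Select-String -AllMatches -Pattern $pattern |
    ForEach-Object {
      $mi = $_
      foreach ($m in $mi.Matches) {
        $field = $m.Groups[1].Value
        # Emit one row per table that declares the matched field.
        foreach ($table in $tablesByField[$field]) {
          [pscustomobject]@{ Pattern = "$table|$field"; LineNumber = $mi.LineNumber; Line = $mi.Line }
        }
      }
    }

$rows
```

To run this against real files, the array of sample lines would be replaced by the Get-ChildItem call from the loop above, and $rows could then be piped to Export-Csv once, without -Append.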


Source: https://stackoverflow.com/questions/62899273/powershell-how-to-capture-arguments-of-select-string-and-include-with-matched
