I have a shell command I'd like to extract data from using PowerShell. The data I need always sits between two keywords, and the number of lines captured can vary.
One option is to join the text with newlines, then use -split with a multi-line regex:
$text =
(@'
Sites:
System1:
RPAs: OK
Volumes:
WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK
System2:
RPAs: OK
Volumes:
WARNING: Storage group MA_UCS_1 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_2 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_3 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK
WAN: OK
System: OK
'@).split("`n") |
foreach {$_.trim()}
$text -join "`n" -split '(?ms)(?=^System\d+:\s*)' -match '^System\d+:'
System1:
RPAs: OK
Volumes:
WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK
System2:
RPAs: OK
Volumes:
WARNING: Storage group MA_UCS_1 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_2 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_3 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK
WAN: OK
System: OK
Edit: a more generic solution to just capturing the output between two specific keywords:
$regex = '(?ms)System1:(.+?)System2:'
$text = $text -join "`n"
$OutputText =
[regex]::Matches($text,$regex) |
foreach {$_.groups[1].value -split "`n"}
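As a rough sketch, the same idea can be wrapped in a small helper. The function name Get-TextBetween and its parameters are hypothetical (they are not part of the answer above), and it assumes the command output arrives as an array of lines:

```powershell
# Hypothetical helper: returns the lines found between two keywords.
function Get-TextBetween {
    param(
        [string[]]$Lines,  # command output, one element per line
        [string]$From,     # keyword that opens the block
        [string]$To        # keyword that closes the block
    )
    # Join into one string so the regex can span line breaks.
    $joined = $Lines -join "`n"
    # (?s) lets . match newlines; Escape makes the keywords literal text.
    $pattern = '(?s)' + [regex]::Escape($From) + '(.+?)' + [regex]::Escape($To)
    foreach ($m in [regex]::Matches($joined, $pattern)) {
        # Split the captured block back into trimmed, non-empty lines.
        $m.Groups[1].Value -split "`n" |
            ForEach-Object { $_.Trim() } |
            Where-Object { $_ }
    }
}

# Made-up sample data for illustration.
$sample  = 'System1:', 'RPAs: OK', 'Splitters: OK', 'System2:', 'RPAs: OK'
$between = Get-TextBetween -Lines $sample -From 'System1:' -To 'System2:'
```

Escaping the keywords matters when they contain regex metacharacters such as dots or brackets; the lazy `.+?` stops at the first occurrence of the closing keyword.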
Try this regex:
$result = ($text | Select-String 'System1:\s*\r\n((.*\r\n)*)\s*System2:' -AllMatches)
$result.Matches[0].Groups[1].Value
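Here $text is your original input as a single string. Select-String tests each input string separately, so if the output is an array of lines, join it first (for example with -join "`r`n"). You may have to change the line endings in the pattern from \r\n to \n depending on your input. There may also be more than one match; I can't tell from your sample.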
The regex starts matching with System1:\s*\r\n, which is System1: followed by any amount of whitespace and then a newline. It ends the match with the literal System2:. The inner middle, .*\r\n, matches all characters on a line followed by a newline. The outer middle, (.*\r\n)*, says to repeatedly match that pattern. Finally, that construct is grouped as ((.*\r\n)*) so that all the matching lines can be extracted as the result.
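To make the line-ending caveat concrete, here is a small sketch of a variant that accepts either convention. The sample string and variable names are made up for illustration, and the pattern is an adaptation of the one above rather than the exact original:

```powershell
# Sample input as ONE string. An array of lines would be tested line by line
# and could never match a pattern that spans several lines.
$raw = "System1:`r`nRPAs: OK`r`nVolumes:`r`nSplitters: OK`r`nSystem2:`r`nRPAs: OK`r`n"

# \r?\n accepts both CRLF and LF line endings. [^\r\n]* is used instead of .*
# so that a carriage return can only be consumed by the \r? part.
$pattern = 'System1:\s*\r?\n(([^\r\n]*\r?\n)*)\s*System2:'

$result = $raw | Select-String -Pattern $pattern -AllMatches
$block  = $result.Matches[0].Groups[1].Value

# Split the captured block into lines, dropping the trailing empty entry.
$lines = $block -split '\r?\n' | Where-Object { $_ }
```

With -AllMatches, any further System1: ... System2: blocks in the same string would appear as additional entries in $result.Matches.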