Question
This is similar to the question "Extract email:password".
However, in this case some files contain other data between the fields I want to parse, for example:
email:lastname:firstname:password or email:lastname:firstname:dob:password
So my question is: which command would let me ignore two segments such as "lastname:firstname", or even three segments such as "lastname:firstname:dob"? I am using the regex below to retrieve email:password from a large list.
$sw = [System.IO.StreamWriter]::new("$PWD/out.txt")
switch -regex -file in.txt {
'(?<=:)[^@:]+@[^:]+:.*' { $sw.WriteLine($Matches[0]) }
}
$sw.Close()
Answer 1:
You need to refine your regex:
# Create sample input file
@'
...:foo@example.org:password1
...:bar@example.org:lastname:firstname:password2
...:baz@example.org:lastname:firstname:dob:password3
'@ > in.txt
# Process the file line by line.
switch -regex -file in.txt {
'(?<=:)([^@:]+@[^:]+)(?:.*):(.*)' { '{0}:{1}' -f $Matches[1], $Matches[2] }
}
For brevity, saving the output to a file was omitted above, so the extracted email-password pairs print to the screen by default:
foo@example.org:password1
bar@example.org:password2
baz@example.org:password3
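To save the pairs to a file instead, the refined regex can be combined with the StreamWriter approach from the question. The following is only a sketch: the file names sample-in.txt and sample-out.txt are hypothetical, chosen so that existing in.txt and out.txt files are not overwritten.

```powershell
# Hypothetical file names, used so that existing in.txt / out.txt files are left alone.
$inFile  = Join-Path $PWD 'sample-in.txt'
$outFile = Join-Path $PWD 'sample-out.txt'

# Sample input with the same shape as the lines above.
@'
...:foo@example.org:password1
...:bar@example.org:lastname:firstname:password2
...:baz@example.org:lastname:firstname:dob:password3
'@ | Set-Content $inFile

$sw = [System.IO.StreamWriter]::new($outFile)
try {
  switch -regex -file $inFile {
    '(?<=:)([^@:]+@[^:]+)(?:.*):(.*)' {
      # The extra parentheses make the -f expression a single method argument.
      $sw.WriteLine(('{0}:{1}' -f $Matches[1], $Matches[2]))
    }
  }
}
finally {
  # Release the file handle even if an error occurs while processing.
  $sw.Close()
}

Get-Content $outFile
```

The try / finally wrapper is a design choice rather than a requirement: it ensures the output file is closed if processing fails partway through.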
Explanation of the regex:
- (?<=:) is a positive lookbehind assertion ensuring that matching starts right after a : character.
  - Note: I based this requirement on your original question and its sample data.
- ([^@:]+@[^:]+) uses a capture group (capturing subexpression, (...)) to match an email address up to but not including the next :.
- (?:.*): uses a non-capturing subexpression ((?:...)) that matches zero or more characters (.*), unconditionally followed by a :.
- (.*) uses a capture group to capture all remaining characters after what is effectively the last : on each line, assumed to be the password.
- $Matches[1] and $Matches[2] refer to the 1st and 2nd capture-group matches, i.e. the email address and the password.
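To see why the greedy .* matters, here is a small sketch that applies the regex to a single sample line with the -match operator and then repeats the test with a lazy quantifier (.*?) for contrast. The variable names are illustrative only.

```powershell
$line = '...:baz@example.org:lastname:firstname:dob:password3'

# Greedy: (?:.*) consumes as much as possible, so the following ':' is the LAST colon.
if ($line -match '(?<=:)([^@:]+@[^:]+)(?:.*):(.*)') {
  $email    = $Matches[1]
  $password = $Matches[2]
}

# Lazy, for contrast: (?:.*?) consumes as little as possible, so the following ':'
# is the FIRST colon after the email and the final group swallows all remaining fields.
if ($line -match '(?<=:)([^@:]+@[^:]+)(?:.*?):(.*)') {
  $lazyRest = $Matches[2]
}

"greedy: ${email}:${password}"   # greedy: baz@example.org:password3
"lazy:   ${email}:${lazyRest}"   # lazy:   baz@example.org:lastname:firstname:dob:password3
```

Note that $Matches is overwritten by every successful -match, which is why the captured values are copied into variables before the second test.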
Answer 2:
Assuming you had data like this:
"lastname:firstname"
"lastname:firstname:dob"
"lastname:firstname:password:somepassword"
"lastname:john:firstname:jacob:password:dingleheimershmit"
You can move through each row like this:
$items = gc .\stack.txt
ForEach($item in $items){
}
Then we can split each row on the : character and check each resulting token to see whether it matches the string password. If it does, we read the next token in the row, which should be the password itself.
This code will get you going; you'll just need to do something meaningful with $password.
$items = gc .\stack.txt
ForEach($item in $items){
"processing $item"
$tokens = $item.Split(":")
For($x=0; $x -lt $tokens.Count;$x++){
$token = $tokens[$x]
#"Checking if $token is like password"
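# Compare exactly (case-insensitively) so that a value such as "somepassword" is not mistaken for the label,
# and make sure a next token exists before reading it.
if ($token -eq "password" -and ($x + 1) -lt $tokens.Count){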
"since this token is like password, checking next token which should be a password"
$password = $tokens[$x+1]
Write-Host -ForegroundColor Yellow $password
}
}
}
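The same split-based idea can be adapted to the format in the question, where there is no password label. The sketch below assumes that the first token containing @ is the email address and that the last token on the line is the password, so a password that itself contains a : would be cut short (the regex in the other answer makes the same assumption). The function name Get-EmailPasswordPair is hypothetical.

```powershell
# Hypothetical helper: returns "email:password" for one line, or nothing if no pair is found.
function Get-EmailPasswordPair {
  param([string] $Line)

  $tokens = $Line.Split(':')

  # Find the first token containing '@'; it is assumed to be the email address.
  $emailIndex = -1
  for ($i = 0; $i -lt $tokens.Count; $i++) {
    if ($tokens[$i].Contains('@')) { $emailIndex = $i; break }
  }

  # Require an email token and at least one token after it (the assumed password).
  if ($emailIndex -ge 0 -and $emailIndex -lt ($tokens.Count - 1)) {
    '{0}:{1}' -f $tokens[$emailIndex], $tokens[-1]
  }
}

Get-EmailPasswordPair '...:foo@example.org:password1'
Get-EmailPasswordPair '...:bar@example.org:lastname:firstname:password2'
Get-EmailPasswordPair '...:baz@example.org:lastname:firstname:dob:password3'
```

Unlike the regex, this version does not require the email address to be preceded by a :, so it also accepts lines that start with the address.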
Source: https://stackoverflow.com/questions/65495920/powershell-specify-emailpassword-with-random-data-in-between