Question
I have some text from which I want to extract only the hexadecimal codes, for example: "thisissometextthisistext\x64\x6f\x6e\x74\x74\x72\x61\x6e\x73\x6c\x61\x74\x65somemoretextoverhere"
It's possible to match the hex codes with \x.., but it doesn't seem that I can write something like (^\x..) to select everything except the hex codes.
Any workarounds?
Answer 1:
You may use the regex shown under "Find What" below. It either matches and captures into Group 1 a run of one or more hex values, or simply matches any other single char, including a line break char. Replace with the conditional replacement pattern (?{1}$1\n:), which reinserts the captured hex run followed by a newline, or otherwise replaces the match with an empty string:
Find What: (?s)((?:\\x[a-fA-F0-9]{2})+)|.
Replace With: (?{1}$1\n:)
Regex Details:
(?s) - same as turning the ". matches newline" option ON
((?:\\x[a-fA-F0-9]{2})+) - Group 1, capturing one or more sequences of:
\\x - a literal \x
[a-fA-F0-9]{2} - 2 hex digits (letters from a to f, in either case, or digits)
| - or
. - any single char.
Replacement pattern:
(?{1} - if Group 1 matches:
$1\n - replace with its contents + a newline
: - else replace with an empty string
) - end of the replacement pattern.
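For readers working outside an editor with conditional replacements, the same idea can be sketched in Python. This is only an illustration under assumptions: Python's standard re module has no (?{1}...) replacement syntax, so a callback function stands in for it, and the names PATTERN, keep_hex_runs and sample are made up for this example.

```python
import re

# Group 1 captures a run of \xHH escapes; the second alternative consumes
# any other single character (re.DOTALL plays the role of (?s)).
PATTERN = re.compile(r"((?:\\x[a-fA-F0-9]{2})+)|.", re.DOTALL)


def keep_hex_runs(text: str) -> str:
    """Keep each run of hex escapes on its own line and drop everything else."""

    def replace(match):
        # Stand-in for (?{1}$1\n:): if Group 1 took part in the match,
        # emit its contents plus a newline, otherwise emit nothing.
        if match.group(1) is not None:
            return match.group(1) + "\n"
        return ""

    return PATTERN.sub(replace, text)


sample = r"thisissometextthisistext\x64\x6f\x6e\x74somemoretext\x41\x42end"
print(keep_hex_runs(sample))
```

Each run of hex codes ends up on its own line, and all other characters are removed.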
Answer 2:
Try ^.*?((\\x[a-f0-9]{2})+).*$ and replace with $1.
After the replacement, only the hex codes should be left.
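A quick way to see how this pattern behaves is to try it in Python. The sketch below assumes line-by-line matching (re.MULTILINE) and uses hypothetical names (LINE_PATTERN, first_hex_run_per_line, sample); \1 is Python's spelling of $1. Note that on each line only the first run of hex codes is kept, and a line without any hex codes does not match and is therefore left unchanged.

```python
import re

# Lazy prefix, then the first run of \xHH escapes in Group 1, then the rest
# of the line. The pattern is copied from the answer as written.
LINE_PATTERN = re.compile(r"^.*?((\\x[a-f0-9]{2})+).*$", re.MULTILINE)


def first_hex_run_per_line(text: str) -> str:
    """Replace every matching line with the first run of hex escapes on it."""
    return LINE_PATTERN.sub(r"\1", text)


sample = "\n".join([
    r"abc\x64\x6fdef\x41ghi",  # two runs: only the first one survives
    "plain line",              # no hex codes: the line is not touched
    r"\x7a",
])
print(first_hex_run_per_line(sample))
```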
Answer 3:
If your regex can already find the hex codes, couldn't you use it to delete all of them from the string (or from a copy of the string, if you need to preserve the original)? You would then be left with all of the text except the hex codes.
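A minimal Python sketch of this suggestion follows. The names HEX_ESCAPE, strip_hex_escapes and sample are illustrative only, and the hex pattern is borrowed from the other answers rather than from the question's \x.. attempt. Because strings are immutable in Python, the original is preserved automatically.

```python
import re

# One \xHH escape: a literal backslash, "x", and two hex digits.
HEX_ESCAPE = re.compile(r"\\x[a-fA-F0-9]{2}")


def strip_hex_escapes(text: str) -> str:
    """Return a copy of text with every hex escape removed."""
    return HEX_ESCAPE.sub("", text)


sample = r"thisissometext\x64\x6f\x6esomemoretext"
print(strip_hex_escapes(sample))  # thisissometextsomemoretext
print(sample)                     # the original string is unchanged
```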
Answer 4:
^ acts as a negation token only inside (and at the beginning of) a character class; you can't use it to negate a substring of several characters.
To select all that isn't \xhh you can use this pattern:
\G(?:\\x[a-f0-9]{2})*+\K(?=.|\n)[^\\]*(?:\\(?!x[a-f0-9]{2})[^\\]*)*
It matches the \xhh sequences first and removes them from the match result using the \K feature (which discards everything matched to its left). The other part of the pattern, [^\\]*(?:\\(?!x[a-f0-9]{2})[^\\]*)*, matches everything that isn't a \xhh. Since this subpattern can match the empty string at the end of the string, I added the lookahead (?=.|\n) to ensure there's at least one character.
\G forces all matches to be contiguous. In other words, it matches the position at the end of the previous match (or the start of the string for the first match).
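Python's standard re module supports neither \G nor \K, so the pattern above cannot be used there as written. As a rough equivalent, and explicitly a different technique from the one in this answer, the text can be split on runs of \xhh, which yields the same "everything that isn't a \xhh" pieces. The names HEX_RUN, non_hex_segments and sample are hypothetical, and the character class is widened to accept uppercase hex digits.

```python
import re

# A run of one or more \xHH escapes, used here as the separator.
HEX_RUN = re.compile(r"(?:\\x[a-fA-F0-9]{2})+")


def non_hex_segments(text: str) -> list:
    """Return the non-empty stretches of text between runs of hex escapes."""
    return [part for part in HEX_RUN.split(text) if part]


# A backslash that does not start a valid \xHH (as in \t or \xzz below)
# stays in the output, like the \\(?!x[a-f0-9]{2}) branch of the pattern.
sample = r"some\text\x64\x6fmore\xzz text\x41"
print(non_hex_segments(sample))
```

Dropping the empty strings mirrors the purpose of the (?=.|\n) lookahead, which prevents empty matches.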
Source: https://stackoverflow.com/questions/45001953/cant-use-to-say-all-but