regex

re.findall() where I want all unique instances of the regex on the page

拟墨画扇 posted on 2021-01-27 16:17:38
Question: As the title suggests, I want to run code like this (top_url_list is just a list of URLs I'm looping through to find instances of the filename conventions I'm looking for with regex):

    name_files = []
    for i in top_url_list:
        result = re.findall("\/([a-z]+[0-9][0-9]\W[a-z]+)", str(urlopen(i).read()))

The objective is to grab every instance where the regex matches, hence the findall() function. The problem is, it's important that I only get distinct/unique values of each instance
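One possible way to keep only distinct matches is sketched below. It reuses the pattern from the question as a raw string; the helper name unique_matches and the sample text are assumptions made for illustration, and the page download is left out so the example runs offline.

```python
import re

# Pattern from the question, written as a raw string ("/" needs no escaping in Python).
FILENAME_RE = re.compile(r"/([a-z]+[0-9][0-9]\W[a-z]+)")


def unique_matches(text):
    """Return each distinct match once, in order of first appearance.

    Hypothetical helper: dict.fromkeys() drops repeats while preserving order.
    """
    return list(dict.fromkeys(FILENAME_RE.findall(text)))


# Made-up sample standing in for a downloaded page.
sample = "/abc12.txt /abc12.txt /xyz34-csv"
print(unique_matches(sample))
```

If ordering does not matter, set(FILENAME_RE.findall(text)) gives the same de-duplication with less code.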

Negative lookbehind in a regex with an optional prefix

亡梦爱人 posted on 2021-01-27 15:51:21
Question: We are using the following regex to recognize URLs (derived from this gist by Jim Gruber). This is being executed in Scala using scala.util.matching, which in turn uses java.util.regex:

    (?i)\b((?:https?:(?:/{1,3}|[a-z0-9%])|[a-z0-9.\-]+[.](?!js)[a-z]{2,6}/)(?:[^\s()<>{}\[\]]+)(?:[^\s`!()\[\]{};:'".,<>?«»“”‘’])|(?:(?<!@)[a-z0-9]+(?:[.\-][a-z0-9]+)*[.](?!js)[a-z]{2,6}\b/?(?!@)))

This version has escaped forward slashes, for Rubular:

    (?i)\b(((?:https?:(?:\/{1,3}|[a-z0-9%])|[a-z0-9.\-]+[.](?!js)
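The question text is cut off, so the exact problem is not visible here. As a rough illustration of what the lookarounds in the bare-domain alternative of that pattern do, here is a sketch using Python's re module rather than java.util.regex (an assumption; these particular constructs are written the same way in both). The name find_domains and the sample strings are hypothetical.

```python
import re

# Only the second alternative of the question's pattern, with its leading \b:
#   (?<!@)  the match must not start right after "@" (skips the domain of an e-mail address)
#   (?!js)  the final label must not start with "js"
#   (?!@)   the match must not be followed by "@" (skips the local part of an e-mail address)
BARE_DOMAIN = re.compile(
    r"\b(?<!@)[a-z0-9]+(?:[.\-][a-z0-9]+)*[.](?!js)[a-z]{2,6}\b/?(?!@)",
    re.IGNORECASE,
)


def find_domains(text):
    """Return every bare-domain match found in text (hypothetical helper)."""
    return [m.group(0) for m in BARE_DOMAIN.finditer(text)]


print(find_domains("visit example.com today"))
print(find_domains("mail user@example.com"))
print(find_domains("load app.js"))
```

Because a lookbehind is checked at the position where the match attempt starts, it only guards the first character of this alternative; any optional prefix placed before it shifts that position.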

Getting port from URL string using Javascript [duplicate]

别来无恙 posted on 2021-01-27 15:50:58
Question: This question already has answers here: Get protocol, domain, and port from URL (18 answers). Closed 5 years ago. I would like a JavaScript function that takes a URL as a parameter and returns the port of that URL as follows: if the scheme is http or https (port 80 / 443), the port won't be shown in the URL, but I want it returned anyway. If there's another port, I want that one returned. Example:

    function myFunction(url){
        something here ...
        return port
    }

I've seen that this can
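The lookup logic described in the question (explicit port if present, otherwise the scheme's default) can be sketched as follows. This is written in Python rather than the JavaScript the question asks for, and the names DEFAULT_PORTS and get_port are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes named in the question.
DEFAULT_PORTS = {"http": 80, "https": 443}


def get_port(url):
    """Return the explicit port of url, or the scheme default if none is given.

    Returns None for schemes that are not in DEFAULT_PORTS.
    """
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


print(get_port("http://example.com/path"))
print(get_port("https://example.com"))
print(get_port("http://example.com:8080/x"))
```

The same fallback idea applies in JavaScript, where the port property of a URL object is an empty string when the URL uses the default port for its scheme.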

Regex match word followed by decimal from text

纵饮孤独 posted on 2021-01-27 15:12:39
Question: I want to be able to match the following examples and return an array of matches. Given the text:

    some word another 50.00 some-more 10.10 text another word

the matches should be a word, followed by a space and then a decimal number, optionally followed by another word:

    another 50.00 some-more 10.10 text

I have the following so far:

    string pat = @"\r\n[A-Za-z ]+\d+\.\d{1,2}([A-Za-z])?";
    Regex r = new Regex(pat, RegexOptions.IgnoreCase);
    Match m = r.Match(input);

but it only matches the first item:

    another 50.00

Answer 1:
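In .NET, Regex.Match returns only the first match, while Regex.Matches returns all of them, which would explain the single result. The sketch below shows the same "collect every match" idea in Python. It assumes the sample text originally had one item per line (the line breaks are lost in the snippet above), and the pattern and the name find_items are illustrative, not taken from the truncated answer.

```python
import re

# Assumed layout: one item per line.
TEXT = "some word\nanother 50.00\nsome-more 10.10 text\nanother word"

# A word (letters or hyphens), a space, a decimal with 1-2 fraction digits,
# then optionally one more word; anchored per line via re.MULTILINE.
ITEM_RE = re.compile(r"^[A-Za-z-]+ \d+\.\d{1,2}(?: [A-Za-z]+)?$", re.MULTILINE)


def find_items(text):
    """Return all matching lines, not just the first one."""
    return ITEM_RE.findall(text)


print(find_items(TEXT))
```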

0.01 to 99.99 in a regular expression

懵懂的女人 posted on 2021-01-27 14:40:52
Question: I'm trying to write a regular expression that allows numbers from 0.01 to 99.99, but not 0.0 or any zero value (00.00, 00.0, 0.00 or 0.0), and no negative values either. I've come quite close, but as usual something just isn't right: 0.0 shows as valid. Can you please help me fix this? Also, you don't need to keep the expression I've written :)

    <?php
    if (preg_match('/^[0-9]{1,2}[\.][0-9]{1,2}$/','0.0')) {echo "Valid";}else{echo "Invalid";}
    ?>

Answer 1: Here's my attempt: /^(?=.*[1-9])\d{0,2}(?:\.\d{0,2}
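The answer above is cut off, so the following is only a sketch of the general idea it appears to start with: a lookahead that demands at least one non-zero digit, combined here with the asker's own digit structure. It is written in Python rather than PHP, and the name is_valid_amount is an assumption. Note that this variant still requires a decimal point, as the asker's pattern does.

```python
import re

# (?=.*[1-9])  at least one non-zero digit somewhere, which rules out 0.0, 00.00, ...
# \d{1,2}\.\d{1,2}  the asker's structure: 1-2 digits, a dot, 1-2 digits
AMOUNT_RE = re.compile(r"(?=.*[1-9])\d{1,2}\.\d{1,2}")


def is_valid_amount(value):
    """True for strings from 0.01 to 99.99 in the d.d / dd.dd shapes."""
    return AMOUNT_RE.fullmatch(value) is not None


for candidate in ["0.01", "99.99", "0.0", "00.00", "-1.50", "100.00"]:
    print(candidate, is_valid_amount(candidate))
```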

Regex to match number with different digits and minimum length

*爱你&永不变心* posted on 2021-01-27 14:24:11
Question: I am trying to write a regex (to validate a property on a C# .NET Core model, which generates a JavaScript expression) that matches all numbers composed of at least two different digits, with a minimum length of 6 digits. For example:

    222222 - not valid
    122222 - valid
    1111125 - valid

I was trying the following expression: (\d)+((?!\1)(\d)), which matches the sequence if it has different digits, but how can I constrain the size of the whole pattern to {6,}? Many thanks

Answer 1: You may use ^(?=\d{6})(\d)\1*
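The answer is truncated, so the sketch below is not a completion of it. It shows one alternative way to express both constraints, using a negative lookahead that rejects strings made of a single repeated digit. It is written in Python, and the name has_mixed_digits is hypothetical.

```python
import re

# (?!(\d)\1*$)  reject input that is one digit repeated to the end
# \d{6,}        require at least six digits overall
MIXED_DIGITS_RE = re.compile(r"(?!(\d)\1*$)\d{6,}")


def has_mixed_digits(value):
    """True if value is 6+ digits long and contains at least two different digits."""
    return MIXED_DIGITS_RE.fullmatch(value) is not None


for candidate in ["222222", "122222", "1111125", "12345"]:
    print(candidate, has_mixed_digits(candidate))
```

For a JavaScript validator, the same pattern would need explicit ^ and $ anchors, since fullmatch supplies them implicitly here.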

Shell regex to end of line

若如初见. posted on 2021-01-27 14:20:54
Question: I have a file like this little example:

    # ...
    # mode=dev
    # ...

Somewhere in this file there is a "variable" within a comment, and I would like to get its value with a regex in a shell script. My code so far:

    #!/bin/bash
    conf=$(<"/etc/test.conf")          # Get the file content
    regex='mode=(.*)$'                 # Set a regex
    if [[ $conf =~ $regex ]]; then     # Search for the regex in the file
        # We found it, so ...
        echo "${BASH_REMATCH[1]}"      # ... here is the value
    fi

My big problem is that it will not find the value :( I
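For comparison, here is how the same "capture up to the end of the line" lookup can be sketched in Python, where re.MULTILINE makes ^ and $ work per line and . does not cross newlines. This is not a fix for the Bash script itself; the function name read_mode and the inline sample are assumptions, and the file read is replaced by a string so the example is self-contained.

```python
import re

# Stand-in for the contents of /etc/test.conf from the question.
CONF_TEXT = "# ...\n# mode=dev\n# ...\n"

# "#", optional spaces or tabs, then mode=<value> up to the end of that line.
MODE_RE = re.compile(r"^#[ \t]*mode=(.*)$", re.MULTILINE)


def read_mode(text):
    """Return the value after 'mode=' on a comment line, or None if absent."""
    match = MODE_RE.search(text)
    return match.group(1) if match else None


print(read_mode(CONF_TEXT))
```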

Iterate through XML nodes with Lua

核能气质少年 posted on 2021-01-27 14:20:21
Question: I'm trying to iterate through all the 'FindMe' nodes, but I'm struggling with the pattern matching. This is going to be used as a plugin in another piece of software, so I'm trying to avoid using a parsing library. Given the following XML:

    <?xml version="1.0" encoding="utf-8"?>
    <NodeA>
      <NodeB>
        <FindMe attr="1">
          <NodeC attr="1" />
        </FindMe>
        <FindMe attr="2">
          <NodeC attr="2" />
        </FindMe>
      </NodeB>
    </NodeA>

When I try this, it only prints the last match:

    for k, _ in src:gmatch(".+(<FindMe .+</FindMe>)
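A greedy .+ consumes as much as it can, which is the usual reason only the last node is reported. Lua's lazy counterpart is the - quantifier (as in .-). The sketch below shows the same greedy-versus-lazy idea in Python rather than Lua; the name find_me_nodes is an assumption, and like any regex-based approach it will not cope with nested FindMe elements.

```python
import re

SRC = """<?xml version="1.0" encoding="utf-8"?>
<NodeA>
  <NodeB>
    <FindMe attr="1">
      <NodeC attr="1" />
    </FindMe>
    <FindMe attr="2">
      <NodeC attr="2" />
    </FindMe>
  </NodeB>
</NodeA>"""

# .*? is lazy, so each match stops at the nearest closing tag;
# re.DOTALL lets "." span the line breaks inside a node.
FIND_ME_RE = re.compile(r"<FindMe\s.*?</FindMe>", re.DOTALL)


def find_me_nodes(src):
    """Return the text of every FindMe element, in document order."""
    return FIND_ME_RE.findall(src)


for node in find_me_nodes(SRC):
    print(node)
    print("---")
```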

In Bash regular expressions do `^` and `$` refer to lines, or to the entire string?

夙愿已清 posted on 2021-01-27 14:11:38
Question: In The Linux Documentation Project (I didn't find details about the regex metacharacters in the Bash manual), the metacharacters ^ and $ are defined as matching lines:

    ^ : Matches the empty string at the beginning of a line [...]
    $ : Matches the empty string at the end of a line

However, when I try it, this turns out not to be correct:

    $ string="a
    > b
    > c"
    $ [[ $string =~ ^a ]] && echo BOS match
    BOS match
    $ [[ $string =~ ^b ]] && echo BOL match
    # nothing

Are the manuals really wrong, or am I missing something?
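The two behaviours being contrasted (anchors tied to the whole string versus anchors tied to each line) can be reproduced side by side in Python, where the choice is an explicit flag. This sketch only illustrates the distinction and says nothing about how Bash implements it; the name anchored_match is hypothetical.

```python
import re

# Same three-line value as $string in the question.
STRING = "a\nb\nc"


def anchored_match(pattern, text, per_line=False):
    """Search text; with per_line=True, ^ and $ also match at line boundaries."""
    flags = re.MULTILINE if per_line else 0
    return re.search(pattern, text, flags) is not None


print(anchored_match(r"^a", STRING))                 # whole-string anchors
print(anchored_match(r"^b", STRING))                 # whole-string anchors
print(anchored_match(r"^b", STRING, per_line=True))  # per-line anchors
```

The session in the question behaves like the first two calls: ^ matched only at the start of the whole string.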