regex | 易学教程

Extract companies' register number in Python by getting the next word

Question: I am trying to extract the German Handelsregisternummer (company register number), which is usually written directly after the word HRB. However, there are exceptions that I would like to catch with my regex. The goal is to call the function with a keyword (in this case HRB) and have it return the number. Please see the regex demo. This is what I have so far, but it doesn't catch all cases: def get_company_register_number(string, keyword): reg_1 = fr'\b{keyword}\b[,:|\s]*(\w+)'
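
A minimal Python sketch of the keyword-then-number idea follows. It assumes the register number is a run of digits and that the keyword may be followed by separators such as ":", "-" or an optional "Nr."; those separator cases are assumptions for illustration, since the excerpt does not list the actual exceptions.

```python
import re


def get_company_register_number(text, keyword):
    """Return the first run of digits that follows `keyword`, or None."""
    # re.escape protects against keywords that contain regex metacharacters.
    # The separator class and the optional "Nr." are assumed variants.
    pattern = rf"\b{re.escape(keyword)}\b[\s,:|.\-]*(?:Nr\.?\s*)?(\d+)"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.group(1) if match else None


print(get_company_register_number("Amtsgericht München, HRB 12345", "HRB"))
```

Capturing `\d+` instead of `\w+` keeps the function from returning a stray word such as "Nr" when the number is not the very next token.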

How to split a sentence into words and punctuations in java

Question: I want to split a given sentence (a String) into words, and I also want the punctuation marks to be added to the list. For example, if the sentence is: "Sara's dog 'bit' the neighbor." I want the output to be: [Sara's, dog, ', bit, ', the, neighbor, .] With string.split(" ") I can split the sentence into words on spaces, but I also want the punctuation to be in the result list. String text = "Sara's dog 'bit' the neighbor."; String[] list = text.split(" "); The printed result is [Sara's, dog,'bit', the,
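
One way to get words and punctuation as separate tokens is to match tokens rather than split on spaces. The sketch below uses Python for illustration (the question itself is about Java); the pattern and the function name `tokenize` are assumptions, not taken from the original thread.

```python
import re

# A word may contain internal apostrophes (Sara's); any other
# non-space, non-word character becomes its own token.
TOKEN = re.compile(r"\w+(?:'\w+)*|[^\w\s]")


def tokenize(sentence):
    """Split a sentence into word and punctuation tokens."""
    return TOKEN.findall(sentence)


print(tokenize("Sara's dog 'bit' the neighbor."))
```

The same alternation should carry over to other regex engines with little change, although escaping rules differ between languages.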

Regex Match Domain Extension

Question: I need to confirm that the domain extension is present. So far I have not been able to get a match for the domain extension, where the domain name can have wild cards: gmail.com, msn.com, mac.com, comcast.net. DomainPartOfEmail = Right(temp, (Len(temp) - temp.LastIndexOf("@") - 1)) If Regex.IsMatch(DomainPartOfEmail, "*.edu? | *.com? | *.net? | *.org?", RegexOptions.IgnoreCase) Then ValidDomain = True End If

Answer 1: If the domains are only from these (edu, com, net, org), then use this one: ".*\.
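
A minimal sketch of the check, written in Python for illustration (the question uses VB.NET). The attempted pattern treats `*` as a shell-style wildcard, but in a regex `*` is a quantifier and the dot must be escaped. The function name `has_valid_extension` and the fixed list of four extensions are assumptions based on the excerpt.

```python
import re

# Anchored at the end so the extension must be the last label.
DOMAIN_EXT = re.compile(r"\.(?:edu|com|net|org)$", re.IGNORECASE)


def has_valid_extension(email):
    """Return True if the part after the last '@' ends in an allowed extension."""
    domain = email.rpartition("@")[2]
    return DOMAIN_EXT.search(domain) is not None


print(has_valid_extension("someone@gmail.com"))
```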

Matching Unicode letter characters in PCRE/PHP

Question: I'm trying to write a reasonably permissive validator for names in PHP, and my first attempt consists of the following pattern: // unicode letters, apostrophe, hyphen, space $namePattern = "/^([\\p{L}'\\- ])+$/"; This is eventually passed to a call to preg_match(). As far as I can tell, this works with the plain ASCII alphabet, but seems to trip up on spicier characters like Ă or 张. Is there something wrong with the pattern itself? Perhaps I'm expecting \p{L} to do more work than I think
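
For comparison, here is a sketch of the same permissive validator in Python. Python's built-in `re` module does not support `\p{L}`, so this uses the common approximation `[^\W\d_]` (word characters minus digits and underscore); it is not an exact equivalent of the Unicode letter class, and the function name `is_valid_name` is hypothetical.

```python
import re

# Letters (approximated), apostrophe, hyphen, space.
NAME = re.compile(r"(?:[^\W\d_]|['\- ])+")


def is_valid_name(name):
    """Return True if `name` consists only of letters, apostrophes, hyphens, spaces."""
    return NAME.fullmatch(name) is not None


print(is_valid_name("张伟"), is_valid_name("R2D2"))
```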

Regex syntax for replacing multiple strings: where have I gone wrong?

Question: I have a dataframe with a column 'purpose' that contains many string values that I want to standardize by finding a string and replacing it. For instance, some very similar values are: car purchase, buying a second-hand car, buying my own car, cars, second-hand car purchase, car, to own a car, purchase of a car, to buy a car. I used the following to make this change: #replace anything to do with buying a car with "Vehicle" credit_data['purpose'] = credit_data.purpose.str.replace(r'(^.*car.*$)'
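
The whole-value replacement can be sketched with the standard `re` module on a plain list, which keeps the example free of third-party dependencies. The sample values come from the question; the value "education" and the function name `standardize_purpose` are added here for illustration.

```python
import re

# Matches any value that contains "car" anywhere and replaces the whole value.
CAR_PATTERN = re.compile(r"^.*car.*$")


def standardize_purpose(value):
    """Map any purpose mentioning 'car' to the single label 'Vehicle'."""
    return CAR_PATTERN.sub("Vehicle", value)


purposes = ["car purchase", "buying a second-hand car", "cars", "to own a car", "education"]
standardized = [standardize_purpose(p) for p in purposes]
print(standardized)
```

With pandas, the same pattern would be passed to `Series.str.replace`; recent pandas versions treat the pattern as a literal string unless `regex=True` is given. Note that the bare substring `car` also matches words such as "scarf", so a word boundary (`\bcar`) may be safer.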

preg_match() returns 0 although regex testers work [duplicate]

Question: This question already has answers here: Matching Unicode letter characters in PCRE/PHP (5 answers). Closed 9 months ago. I'm trying to validate a string using /[\p{L}\s]{6,}/, matching letters only (Unicode ones as well). I used regex101 to test my regex and it works for the string Владимир Алексић. However, when I use that regex in preg_match() with the same string, it always returns 0. Yet it returns 1 if I avoid all characters except A-Za-z. Why is that so?

Answer 1: The \p

Regex to block url in nginx

Question: I want to block access to URLs that have excess characters at the end. E.g. I want nginx to block requests to https://www.example.com/url-pattern/amp/extra-chars/more-extra but allow https://www.example.com/url-pattern/amp or https://www.example.com/url-pattern/amp/ Will this work? location .*\/amp\/. { deny all } Please guide.

Answer 1: Solved it myself. If anyone is looking for the same solution: location ~* /amp/. { deny all; }

Source: https://stackoverflow.com/questions/62984495/regex-to
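
To see which request paths the pattern `/amp/.` would catch, the matching logic can be checked outside nginx. The sketch below uses Python's `re` purely to illustrate the regex; nginx uses PCRE, and `would_block` is a hypothetical helper name.

```python
import re

# Case-insensitive, unanchored search, mirroring "location ~* /amp/."
BLOCK = re.compile(r"/amp/.", re.IGNORECASE)


def would_block(path):
    """Return True if at least one character follows '/amp/' in the path."""
    return BLOCK.search(path) is not None


for path in ("/url-pattern/amp/extra-chars/more-extra", "/url-pattern/amp", "/url-pattern/amp/"):
    print(path, would_block(path))
```

The trailing `.` is what distinguishes `/amp/` (allowed) from `/amp/anything` (blocked), because it requires one more character after the slash.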

Is regex or replace method best to clean up list ? re Pandas environment

Question: From the list below I'm able to remove the non-alphabetic characters, but the result still falls short. I want the Draw entries eliminated without affecting the desired outcome. df=pd.DataFrame({'Teams': ['Lakefield United', '101002 Castle FC pk, +½ 1.81 o 3.05 o Un 2 1.92 o', '101003 Draw 3.00 o', 'Boms', '101005 Riverside FC pk 2.11 o 2.86 o Un 2, 2½ 1.78 o', '101006 Draw 3.10 o', 'Barmley', '101011 Colsely Lakers -1, -1½ 2.04 o 1.46 o Un 2½, 3 1.83 o', '101012 Draw 4.40 o',]}) Desired Elements :

IIS: Rewrite URL with regex but keep query strings

Question: I have the following link: https://example.com/myapp/green?&lang=en&instance=some%20instance I need to rewrite it to the following link: https://example.com/myapp?color=green&lang=en&instance=some%20instance The color in the link can be any color, but it needs to be rewritten as in the second link, so that the slash before the color is replaced with a ? followed by color=, and the ? after the color word is removed. /myapp/green? becomes /myapp?color=green , /myapp/blue? becomes
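
The string transformation itself can be sketched in Python as below. This only illustrates the capture-and-rebuild step; in IIS URL Rewrite the path and the query string are typically handled separately, so the pattern here is not a drop-in rule. The function name `rewrite_color_url` is hypothetical.

```python
import re

# Group 1: the color segment; group 2: whatever remains of the query string.
COLOR_URL = re.compile(r"^/myapp/([^/?]+)\?&?(.*)$")


def rewrite_color_url(path_and_query):
    """Turn /myapp/<color>?<query> into /myapp?color=<color>&<query>."""
    match = COLOR_URL.match(path_and_query)
    if match is None:
        return path_and_query  # leave unrelated URLs untouched
    color, rest = match.groups()
    rewritten = f"/myapp?color={color}"
    return f"{rewritten}&{rest}" if rest else rewritten


print(rewrite_color_url("/myapp/green?&lang=en&instance=some%20instance"))
```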

Replacing a block of text in powershell

Question: I have the following (sample) text: line1 line2 line3 I would like to use the PowerShell -replace operator to replace the whole block with: lineA lineB lineC I'm not sure how to format this to account for the carriage returns/line breaks. Just enclosing it in quotes like this doesn't work: {$_ -replace "line1 line2 line3", "lineA lineB lineC"} How would this be achieved? Many thanks!

Answer 1: There is nothing syntactically wrong with your command - it's fine to spread string literals and
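
A common reason a multi-line literal fails to match is a mismatch between CRLF and LF line endings. The sketch below shows the idea in Python rather than PowerShell: the pattern accepts either line ending between the lines. The helper name `replace_block` and its parameters are assumptions for illustration.

```python
import re


def replace_block(text, old_lines, new_lines, newline="\n"):
    """Replace a block of consecutive lines, tolerating \\r\\n or \\n between them."""
    # Escape each line literally, then allow an optional carriage return
    # before every line feed that separates them.
    pattern = r"\r?\n".join(re.escape(line) for line in old_lines)
    replacement = newline.join(new_lines)
    # A function replacement avoids backslash processing in the new text.
    return re.sub(pattern, lambda _match: replacement, text)


sample = "line1\r\nline2\r\nline3"
print(replace_block(sample, ["line1", "line2", "line3"], ["lineA", "lineB", "lineC"]))
```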