regex | 易学教程

Python beautifulsoup extract value without identifier

Question: I am facing a problem and don't know how to solve it properly. I want to extract the price (130€ in the first example, 130€ in the second). The problem is that the attributes change all the time, so I can't do something like this, because I am scraping hundreds of sites and on each site the first two characters of the "id" attribute may differ: tag = soup_expose_html.find('span', attrs={'id' : re.compile(r'(07_content$)')}) Even if I used something like this it won't work,

extract filename between last slash and question mark

Question: I want to extract the filename between the last slash and the question mark using regex. I read some related answers suggesting ([^/]*$), but I have several domain names, so I want to extract the filename only for specific ones and am looking for a regex that works for all of those domains. How can I limit it to certain domains? My goal is to replace the domain name: http://old.domain.com/asda/dsdasd/fsdfd/bvc/filename.mp4?fdgfsdgfgfsgf http://new.domain.com/filename.mp4 Answer 1: You could try preg_replace('/(.*:\/\/).*\/(.*?)(\?.*
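One way to restrict the rewrite to certain domains can be sketched in Python. This is not the truncated PHP answer above; the names `ALLOWED_HOSTS`, `URL_PATTERN`, and `rewrite_url` are hypothetical, and the sketch assumes plain `http`/`https` URLs whose last path segment is the filename.

```python
import re

# Hypothetical allow-list: only URLs on these hosts are rewritten.
ALLOWED_HOSTS = ("old.domain.com",)

hosts = "|".join(re.escape(host) for host in ALLOWED_HOSTS)
URL_PATTERN = re.compile(
    r"https?://(?:" + hosts + r")/"  # scheme plus one of the allowed hosts
    r"(?:[^/?#]*/)*"                 # any number of directory segments
    r"([^/?#]+)"                     # group 1: filename after the last slash
    r"(?:\?[^#]*)?"                  # optional query string (dropped)
)


def rewrite_url(url, new_base="http://new.domain.com/"):
    """Replace an allowed host and its path with new_base + filename."""
    return URL_PATTERN.sub(lambda m: new_base + m.group(1), url)
```

URLs on hosts outside the allow-list do not match the pattern and are returned unchanged.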

How to extract commas in between square brackets in notepad++?

Question: For example: [TEXT1,TEXT2,TEXT3]. My expression [\[].*,.*[\]] finds strings with commas between brackets, but I want to match only the commas inside the square brackets. I need to replace the commas with spaces, but only within the square brackets. I've tried [\[],[\]] but that doesn't work; \[(.*?)\] will find the text in between as well, but I do not want the entire string. Can anyone suggest what I need to do to find just the commas between the brackets? Answer 1: Find what
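A lookahead is one way to match only the commas that sit inside brackets. The Python sketch below is not the truncated answer above; `commas_to_spaces_in_brackets` is a hypothetical helper, and the pattern assumes brackets are balanced and not nested.

```python
import re

# A comma qualifies only if a closing "]" follows it before any other bracket.
COMMA_IN_BRACKETS = re.compile(r",(?=[^\[\]]*\])")


def commas_to_spaces_in_brackets(text):
    """Replace commas with spaces, but only between [ and ]."""
    return COMMA_IN_BRACKETS.sub(" ", text)
```

Commas outside the brackets are left alone because the lookahead meets an opening bracket or the end of the text before it finds a closing one.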

pandas extract regex allowing mismatches

Question: Pandas has a very fast and nice string method, extract(). This method works perfectly with a regex such as this one: strict_pattern = r"^(?P<pre_spacer>ACGAG)(?P<UMI>.{9,13})(?P<post_spacer>TGGAGTCT)" test_df R1 21 ACGAGTTTTCGTATTTTTGGAGTCTTGTGG 22 ACGAGTAGGGAGGGGGGTGGAGTCTCAGCG 23 ACGAGGGGGGGGAGGCTGGAGTCTCCGGGT 24 ACGAGAATAACGTTTGGTGGAGTCTACCAC 25 ACGAGGGGAATAAATATTGGAGTCTCCTCC 26 ACGAGATTGGGTATGCTGGAGTCTCTGTTC 27 ACGAGGTACCCGCGCCATGGAGTCTCTCTG 28 ACGAGTGGTTTTTGTCGTGGAGTCTCACCA 29

group name can't start with number?

Question: It looks like I can't use a regex like this one: (?P<74xxx>[0-9]+) With the re package it raises an error: sre_constants.error: bad character in group name u'74xxx' It looks like I can't use group names that start with a number. Why? P.S. Go does not have this problem, and neither do many other languages. Answer 1: According to the docs: Group names must be valid Python identifiers. As with variables, identifiers must not start with a number in Python. See more about identifiers here: identifier ::= (letter|"_")
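The identifier rule can be checked directly. In this small sketch, `compiles` is a hypothetical helper, and prefixing the name with a letter or underscore is just one possible workaround, not something the original answer prescribes.

```python
import re


def compiles(pattern):
    """Return True if the pattern compiles, False if re rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A group name starting with a digit is not a valid Python identifier.
digit_first_ok = compiles(r"(?P<74xxx>[0-9]+)")

# Prefixing the name (here with "n") makes it a valid identifier.
match = re.match(r"(?P<n74xxx>[0-9]+)", "123")
value = match.group("n74xxx")
```

Here `digit_first_ok` ends up `False`, while the prefixed name compiles and can be read back with `group()`.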

Regex for a number followed by a word

Question: In JavaScript, what would be the regular expression for a number followed by a word? I need to capture the number AND the word and replace them both after some calculation. Here are the conditions, by example: 123 dollars => Catch the '123' and the 'dollars'. foo bar 0.2 dollars => 0.2 and dollars foo bar.5 dollar => 5 and dollar (notice the dot before 5) foo bar.5.6 dollar => 5.6 and dollar foo bar.5.6.7 dollar => skip (could be only 0 or 1 dot) foo bar5 dollar => skip foo bar 5dollar
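The conditions visible in the excerpt can be approximated with two lookbehinds. The sketch below is in Python rather than JavaScript, covers only the cases that are fully listed above, and assumes the number and word are separated by whitespace and the word is made of ASCII letters; `NUMBER_WORD` and `find_amounts` are hypothetical names.

```python
import re

NUMBER_WORD = re.compile(
    r"(?<![0-9A-Za-z])"   # the number must not be glued to a letter or digit
    r"(?<!\d\.)"          # and must not continue an earlier "digit-dot" run
    r"(\d+(?:\.\d+)?)"    # group 1: digits with at most one dot
    r"\s+"
    r"([A-Za-z]+)"        # group 2: the word
)


def find_amounts(text):
    """Return (number, word) pairs found in text."""
    return NUMBER_WORD.findall(text)
```

The second lookbehind is what rejects `5.6.7`: every candidate start inside it is either followed by a second dot or preceded by a digit and a dot.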

How to use REGEX to split text to chunks, broken on specific chars?

Question: I wish to split a long text into chunks of at most 1000 characters, taking as many characters as I can in each chunk, but importantly I want each chunk to end at a line break in order to avoid splitting a word in the middle. If there is no line break at all within the 1000 characters, the regex should still capture them, splitting a word across two chunks. This regex, /.{1,1000}/gs, will split the text into chunks of 1000 characters, but it may break a word in the middle. What regex will give me the wanted result? Answer 1: You can use
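The requirement can be sketched with a greedy match that backtracks to the last line break, plus a fallback for runs with no line break at all. This is a Python illustration, not the truncated answer above; `chunk_text` and its `limit` parameter are hypothetical names.

```python
import re


def chunk_text(text, limit=1000):
    """Split text into chunks of at most `limit` characters.

    Each chunk ends right after a line break when one is available,
    or at the end of the text; otherwise it is cut at `limit`.
    """
    pattern = re.compile(
        # Greedy run that must end just after "\n" or at the end of text,
        # falling back to a plain run of up to `limit` characters.
        rf".{{1,{limit}}}(?:(?<=\n)|\Z)|.{{1,{limit}}}",
        re.DOTALL,
    )
    return pattern.findall(text)
```

The line break stays at the end of the chunk it closes, so joining the chunks reproduces the original text.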

Splitting a String by number of delimiters

Question: I am trying to split a string into a string array. There are a number of possible combinations. I tried: String strExample = "A, B"; //possible options are: 1. A,B 2. A, B 3. A , B 4. A ,B String[] parts; parts = strExample.split("/"); //Splits the string but doesn't remove the space between them, so the second item in the string array is a space and B ( B) parts = strExample.split("/| "); parts = strExample.split(",|\\s+"); Any guidance would be appreciated. Answer 1: To split with comma enclosed with optional
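Splitting on a comma surrounded by optional whitespace handles all four listed variants. The sketch below shows the idea in Python rather than Java; `split_on_comma` is a hypothetical helper.

```python
import re


def split_on_comma(text):
    """Split on commas, discarding whitespace around each comma."""
    return re.split(r"\s*,\s*", text.strip())


# The four variants from the question.
variants = ["A,B", "A, B", "A , B", "A ,B"]
results = [split_on_comma(v) for v in variants]
```

Because the whitespace is part of the delimiter, it is consumed by the split and never ends up in the resulting items.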

Regex for name extraction on text file

Question: I've got a plain text file containing a list of authors and abstracts, and I'm trying to extract just the author names to use for network analysis. My text follows this pattern and contains 500+ abstracts: 2010 - NUCLEAR FORENSICS OF SPECIAL NUCLEAR MATERIAL AT LOS ALAMOS: THREE RECENT STUDIES Purchase this article David L. Gallimore, Los Alamos National Laboratory Katherine Garduno, Los Alamos National Laboratory Russell C. Keller, Los Alamos National Laboratory Nuclear forensics of special