regex

Latin Regex with symbols

爷,独闯天下 submitted on 2021-02-05 05:51:06
Question: I need to split a text and get only words, numbers, and hyphenated compound words. I also need to get Latin words, so I used \p{L}, which gives me é, ú, ü, ã, and so forth. The example is: String myText = "Some latin text with symbols, ? 987 (A la pointe sud-est de l'île se dresse la cathédrale Notre-Dame qui fut lors de son achèvement en 1330 l'une des plus grandes cathédrales d'occident) : ! @ # $ % ^& * ( ) + - _ #$% " ' : ; > < / \ | , here some is wrong… * + () e -" Pattern pattern =
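The extraction described in the question can be sketched in Python. This is an illustration, not the poster's Java code: it assumes Python's built-in re module, which has no \p{L}, so the class [^\W_] (any Unicode letter or digit) stands in for it. The shortened sample string and the name TOKEN are hypothetical.

```python
import re

# [^\W_] = any Unicode word character except underscore (letters and digits).
# The optional "-part" repetition keeps hyphenated compounds in one token.
TOKEN = re.compile(r"[^\W_]+(?:-[^\W_]+)*")

# Shortened, illustrative sample (not the full string from the question).
text = "l'île (sud-est) ? 987 cathédrale Notre-Dame + - _ #$% e -"
tokens = TOKEN.findall(text)
print(tokens)
```

Stray hyphens, underscores, and symbols are skipped because a token must start and end with a letter or digit.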

JS Regex lookbehind not working in firefox and safari

丶灬走出姿态 submitted on 2021-02-05 05:50:13
Question: I have the following regex, which works in Chrome but causes an error in Firefox and Safari. I need to modify it to make it work. Can anybody help out a poor soul? Thanks in advance!
regex: /(?=<tag>)(.*?)(?<=<\/tag>)/
Basically, I have to match any characters between <tag> and </tag> and need to retain both tags. I used this expression as an argument to array.split.
input: "The quick brown <tag>fox</tag> jumps over the lazy <tag>dog</tag>"
operation: input.split(regex)
output: ["The quick
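One way to avoid lookbehind altogether is to split on a capturing group, since the captured delimiter is then kept in the result. The sketch below shows the idea in Python with re.split; JavaScript's String.prototype.split also includes captured groups, so a pattern such as /(<tag>.*?<\/tag>)/ should behave similarly there. The expected output in the question is truncated, so the exact result the poster wanted is an assumption.

```python
import re

# A capturing group makes re.split keep the matched delimiters in the result.
TAGGED = re.compile(r"(<tag>.*?</tag>)")

text = "The quick brown <tag>fox</tag> jumps over the lazy <tag>dog</tag>"
parts = [p for p in TAGGED.split(text) if p]  # drop empty strings
print(parts)
```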

Python regex's fuzzy search doesn't return all matches when using the or operator

对着背影说爱祢 submitted on 2021-02-05 05:50:09
Question: For example, when I use regex.findall(r"(?e)(mazda2 standard){e<=1}", "mazda 2 standard"), the answer is ['mazda 2 standard'] as usual. But when I use regex.findall(r"(?e)(mazda2 standard|mazda 2){e<=1}", "mazda 2 standard") or regex.findall(r"(?e)(mazda2 standard|mazda 2){e<=1}", "mazda 2 standard", overlapped=True), the output doesn't contain 'mazda 2 standard' at all. How can I make the output contain 'mazda 2 standard' too?
Answer 1: See the PyPI regex documentation: By default, fuzzy matching

Trouble escaping dollar sign in Perl

会有一股神秘感。 submitted on 2021-02-05 05:41:11
Question: I'm getting a bunch of text from an outside source, saving it in a variable, and then displaying that variable as part of a larger block of HTML. I need to display it as is, and dollar signs are giving me trouble. Here's the setup:
# get the incoming text
my $inputText = "This is a $-, as in $100. It is not a 0.";
print <<"OUTPUT";
before-regex: $inputText
OUTPUT
# this regex seems to have no effect
$inputText =~ s/\$/\$/g;
print <<"OUTPUT";
after-regex: $inputText
OUTPUT
In real life, those

Python RegEx for exact matches of brackets

人盡茶涼 submitted on 2021-02-05 05:37:25
Question: I am trying to parse a string which is of the following format: text="some random string <inAngle> <anotherInAngle> [-option text] [-anotherOption <text>] [-option (Y|N)]" I want to split the string into three parts:
1. Just the "some random string".
2. Everything that is ONLY in angle brackets, i.e. inAngle and anotherInAngle above.
3. Everything that is in square brackets.
If I use the RegEx re.findall(r'\[(.+?)\]', text) it gives me everything I need within square brackets. If I use the same RegEx with
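The three parts listed in the question can be pulled out in separate passes. The sketch below is one possible approach, not the accepted answer (which is not shown in the source): it removes the square-bracket groups first, so angle brackets nested inside them are not picked up. The variable names are illustrative.

```python
import re

text = ("some random string <inAngle> <anotherInAngle> "
        "[-option text] [-anotherOption <text>] [-option (Y|N)]")

# Part 3: contents of every square-bracket group (the poster's own regex).
square = re.findall(r"\[(.+?)\]", text)

# Part 2: strip the square-bracket groups, then collect what is left in <...>.
outside = re.sub(r"\[.+?\]", "", text)
angle = re.findall(r"<(.+?)>", outside)

# Part 1: everything before the first '<' or '['.
leading = re.split(r"[<\[]", text, maxsplit=1)[0].strip()

print(leading, angle, square)
```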

Match whole string in regex

ⅰ亾dé卋堺 submitted on 2021-02-05 05:36:46
Question: I have a string, for example: hasan عمرانی . I want to match Persian characters across the whole string. I mean that if the string is not entirely Persian, the regex should not match any character. I have this pattern so far: [\x{0600}-\x{06FF}\s]+ , but it matches عمرانی . It must not match any part of the string. Please help me find a pattern. Thanks.
Answer 1: You can add ^ at the beginning of your expression and $ at the end, so that it must match from the beginning to the end of the string being searched. ^[\x{0600}-
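The effect of anchoring can be sketched in Python, where re.fullmatch plays the role of wrapping the pattern in ^ and $. The character range is the one from the question (written with Python's \u escapes instead of \x{...}); the helper name is_all_persian is hypothetical.

```python
import re

# Same class as in the question: U+0600..U+06FF plus whitespace.
PERSIAN_ONLY = re.compile(r"[\u0600-\u06FF\s]+")

def is_all_persian(s: str) -> bool:
    """True only if the entire string consists of the allowed characters."""
    return PERSIAN_ONLY.fullmatch(s) is not None

print(is_all_persian("hasan عمرانی"))  # mixed Latin and Persian text
print(is_all_persian("عمرانی"))        # Persian text only
```

Because \s is part of the class, a string made only of whitespace would also pass this check.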

Split binary number into groups of zeros and ones

隐身守侯 submitted on 2021-02-05 05:35:05
Question: I have a binary number, for example 10000111000011 , and want to split it into groups of consecutive 1s and 0s: 1 0000 111 0000 11 . I thought that was a great opportunity to use look-arounds: my regex uses a positive look-behind for a digit (which it captures for later backreferencing), then a negative look-ahead for that same digit (using a backreference), so I should get a split whenever a digit is followed by a digit that is not the same. use strict; use warnings; use feature 'say'; my $bin
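The grouping the question asks for can also be produced by matching runs instead of splitting between them. This Python sketch swaps the look-around split for a backreference match, (\d)\1*, which consumes one digit and every immediate repetition of it; it is an alternative technique, not the poster's Perl code, which is truncated in the source.

```python
import re

bits = "10000111000011"

# (\d) captures one digit; \1* extends the match over repeats of that digit,
# so each match is one maximal run of identical digits.
groups = [m.group(0) for m in re.finditer(r"(\d)\1*", bits)]
print(" ".join(groups))
```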

regular expression confusion \s and “ ”

不问归期 submitted on 2021-02-05 04:56:07
Question: In regular expressions, I know \s is used to represent a space, but in the following case, would these be different? /a\sb/ (with \s) versus /a b/ (with a literal space). Thanks a lot if you can explain this to me.
Answer 1: The \s character class matches all "whitespace characters," not just spaces. This includes tabs (\t), carriage returns (\r), and newlines (\n), the last two mattering when the input spans multiple lines. Theoretically, if your regular expression engine handles Unicode, there are also Unicode whitespace
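The difference between the two patterns can be checked directly. The sketch below uses Python's re module, where \s on str patterns covers Unicode whitespace by default; other engines may differ, and the sample strings are illustrative.

```python
import re

# Space, tab, newline, and a no-break space (U+00A0) between "a" and "b".
samples = ["a b", "a\tb", "a\nb", "a\u00a0b"]

with_class = [bool(re.fullmatch(r"a\sb", s)) for s in samples]
with_space = [bool(re.fullmatch(r"a b", s)) for s in samples]

print(with_class)  # \s accepts every sample
print(with_space)  # the literal space accepts only the first
```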

grep lines that start with a specific string

你。 submitted on 2021-02-05 00:59:27
Question: I want to find all the lines in a file that start with a specific string. The problem is, I don't know what's in the string beforehand. The value is stored in a variable. The naïve solution would be the following: grep "^${my_string}" file.txt; But if the Bash variable my_string contains ANY regular expression special characters, grep will cry, and everyone will have a bad day. You don't want to make grep cry, do you?
Answer 1: You should use awk instead of grep for non-regex search using
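The underlying idea, comparing the prefix literally instead of interpreting it as a regex, can be sketched in Python. This is not the awk command from the answer (which is truncated in the source); the function name and sample data are hypothetical.

```python
def lines_starting_with(lines, prefix):
    """Return the lines that begin with prefix, compared as plain text."""
    return [line for line in lines if line.startswith(prefix)]

# A prefix full of regex metacharacters is harmless here.
my_string = "a.*[b"
sample = ["a.*[b is literal", "aXXXb would match a regex", "a.*[b again"]
print(lines_starting_with(sample, my_string))
```

If a regex is still needed for other reasons, re.escape(my_string) produces a pattern that matches the string literally.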