regex-lookarounds

Is this a bug in .NET's Regex.Split?

≯℡__Kan透↙ submitted on 2019-11-28 13:51:24
I have two regular expressions for use with Regex.Split: (?<=\G[^,],[^,],) and (?<=\G([^,],){2}). When splitting the string "A,B,C,D,E,F,G,", the first one results in: A,B, C,D, E,F, G, and the second results in: A,B, A, C,D, C, E,F, E, G, What is going on here? I thought that (X){2} was always equivalent to XX, but I'm not sure anymore. In my actual problem, I need to do something quite a bit more complex, and I need to do it sixty-nine times, so just repeating the pattern is less than ideal. From the documentation for Regex.Split: If capturing parentheses are used in a Regex.Split
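The extra items in the second result are consistent with captured text being returned alongside the split pieces. The sketch below illustrates that effect with Python's re.split, which documents the same behavior for capturing groups. It is an analogy rather than a reproduction: Python's re module has no \G anchor, so the patterns from the question are not used, and the sample string "A,B,C" is made up for illustration.

```python
import re

text = "A,B,C"

# Capturing group: the captured separators come back as extra list items.
with_capture = re.split(r"(,)", text)

# Non-capturing group: only the pieces between separators are returned.
without_capture = re.split(r"(?:,)", text)

print(with_capture)     # ['A', ',', 'B', ',', 'C']
print(without_capture)  # ['A', 'B', 'C']
```

If the same reasoning carries over to .NET, writing the repeated group as (?:[^,],){2} would be one thing to try, since a non-capturing group contributes nothing to the result array.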

replace characters in notepad++ BUT exclude characters inside single quotation marks(2nd)

不想你离开。 submitted on 2019-11-28 11:34:38
Question: Apologies to all users (especially Avinash Raj) who already answered my first, similar question: I simply forgot the second kind of string. (And, sadly, I'm not able to adapt the solution from the first question to the second kind of string...) I have TWO different kinds of string: SELECT column_name FROM table_name WHERE column_name IN ('A' , 'st9u' ,'Meyer', ....); WHERE a.object_type IN

Does regex lookahead affect subsequent match?

♀尐吖头ヾ submitted on 2019-11-28 11:15:57
Question: I was playing around with regular expression lookaheads and came across something I don't understand. I would expect this regular expression: (?=1)x to match this string: "x1". But it doesn't. In Ruby the code looks like: > "x1".match /(?=1)x/ => nil Here's what I would expect to happen: We start with the regular expression parser's cursor on "x". The regexp engine searches the string for "1" and gets a match. The cursor is still on "x". The regexp engine searches for "x" and finds it, since
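A lookahead is tested at the current position only; it does not scan ahead through the rest of the string. The sketch below reproduces the experiment with Python's re module instead of Ruby (the two engines agree on this point) and contrasts it with the pattern that places the lookahead after the x.

```python
import re

s = "x1"

# Lookahead first: at index 0 the next character is "x", not "1", so (?=1)
# fails there. At index 1 the lookahead succeeds, but then "x" has to match
# the character "1", which it cannot.
lookahead_first = re.search(r"(?=1)x", s)

# Lookahead after the x: "x" is consumed, then (?=1) checks the next character.
lookahead_after = re.search(r"x(?=1)", s)

print(lookahead_first)          # None
print(lookahead_after.group())  # x
```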

Notepad++ use both regular expressions and extended search

╄→尐↘猪︶ㄣ submitted on 2019-11-28 06:30:04
Question: I need to find all \r\n that do not precede the letter M. It seems I can't do this: \r\n[^M] I can only use \r\n with Extended search selected, or [^M] with Regular expression selected, but not both together. Answer 1: You should instead use this regex: \R(?!M) Explanation: \R matches any Unicode newline sequence. (?!M) is a negative lookahead: it asserts that "M" cannot be matched at this position. Answer 2: \r\n is also valid with Regular expression checked in the Find tab, i.e. not just with Extended checked: why not just use \r\n[^M] with
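The two answers are not quite interchangeable: the negated class consumes one extra character, while the lookahead does not. The sketch below shows the difference using Python's re module as a stand-in for Notepad++. Python has no \R, so \r\n is written out, and the sample text is invented for illustration.

```python
import re

text = "Mfirst\r\nMsecond\r\nother\r\n"

# Negative lookahead: zero-width, so the character after the newline is not
# part of the match, and the newline at the very end of the text still matches.
lookahead_hits = [m.group() for m in re.finditer(r"\r\n(?!M)", text)]

# Negated class: it must consume one more character, so the match includes
# the "o" and the final newline (with nothing after it) is missed.
class_hits = [m.group() for m in re.finditer(r"\r\n[^M]", text)]

print(lookahead_hits)  # ['\r\n', '\r\n']
print(class_hits)      # ['\r\no']
```

In a replace operation this matters: with \r\n[^M] the replacement text would also have to restore the swallowed character.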

Regex to match certain characters and exclude certain characters but without negative lookahead

旧城冷巷雨未停 submitted on 2019-11-28 05:50:30
Question: I want a regex that matches all emojis (or most of them) but excludes certain characters (such as “|”|‘|’|…|— ). This regex does the job via negative lookahead: /(?!\u201C|\u201D|\u2018|\u2019|\u2026|\u2014)(\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])/ But apparently Google Scripts doesn't support this. Error: Invalid regular expression pattern (?!“|”|‘|’|…|—)(©|®|[ -㌀]|?[퀀-?]|?[퀀-?]|?[퀀-?]) Is there another way to achieve my goal (a regex
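One lookahead-free approach is to cut the excluded code points out of the range by hand. The sketch below does this in Python, under two assumptions: Python strings are sequences of code points, so the three surrogate-pair alternatives are replaced by the single range U+1F000 to U+1FBFF that they cover, and whether the equivalent pattern is accepted by Google Scripts has not been checked here. The name SYMBOLS is hypothetical.

```python
import re

# Same intent as the lookahead version, but U+2014, U+2018, U+2019, U+201C,
# U+201D and U+2026 are removed from the U+2000-U+3300 range by splitting
# it into sub-ranges that skip them.
SYMBOLS = re.compile(
    r"[\u00a9\u00ae"
    r"\u2000-\u2013\u2015-\u2017\u201a-\u201b\u201e-\u2025\u2027-\u3300"
    r"\U0001F000-\U0001FBFF]"
)

# Curly quotes, ellipsis and em dash are skipped; the rest is matched.
sample = "\u201chi\u201d \u00a9 \U0001F600 \u2026 \u2014 \u2605"
found = SYMBOLS.findall(sample)
print(found)
```

In an engine that works on UTF-16 code units, the surrogate-pair alternatives from the original pattern would presumably stay as they are, with only the [\u2000-\u3300] class split up this way.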

Java regex: Negative lookahead

徘徊边缘 submitted on 2019-11-28 02:43:40
Question: I'm trying to craft two regular expressions that will match URIs. These URIs are of the format: /foo/someVariableData and /foo/someVariableData/bar/someOtherVariableData I need two regexes. Each needs to match one but not the other. The regexes I originally came up with are: /foo/.+ and /foo/.+/bar/.+ respectively. I think the second regex is fine. It will only match the second string. The first regex, however, matches both. So, I started playing around (for the first time) with negative
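A minimal sketch of one way to make the first pattern reject the longer URI, written in Python rather than Java. It uses fullmatch, which requires the whole string to match, as Java's String.matches does. The names FOO_ONLY and FOO_BAR are hypothetical, and the pattern is one possible choice, not necessarily the one the asker ended up with.

```python
import re

# First pattern: /foo/ followed by data that never contains "/bar/".
# (?:(?!/bar/).)+ checks the negative lookahead before consuming each character.
FOO_ONLY = re.compile(r"/foo/(?:(?!/bar/).)+")

# Second pattern, as given in the question.
FOO_BAR = re.compile(r"/foo/.+/bar/.+")

short_uri = "/foo/someVariableData"
long_uri = "/foo/someVariableData/bar/someOtherVariableData"

print(bool(FOO_ONLY.fullmatch(short_uri)))  # True
print(bool(FOO_ONLY.fullmatch(long_uri)))   # False
print(bool(FOO_BAR.fullmatch(short_uri)))   # False
print(bool(FOO_BAR.fullmatch(long_uri)))    # True
```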

Understanding negative lookahead

喜夏-厌秋 submitted on 2019-11-27 22:39:50
I'm trying to understand how negative lookaheads work on simple examples. For instance, consider the following regex: a(?!b)c I thought the negative lookahead matches a position. So, in that case, the regex matches any string that contains exactly 3 characters and is not abc. But that's not true, as can be seen in this demo. Why? Answer from nu11p01n73R: Lookaheads do not consume any characters. A lookahead just checks whether it can be matched or not: a(?!b)c So here, after matching a, the engine checks that the next character is not b, without consuming that character (which is c), and then requires the a to be followed by c. How a(?
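Because (?!b) is zero-width, the c in the pattern has to match the very character the lookahead just inspected. A short Python check over a few invented sample strings makes this visible:

```python
import re

pattern = re.compile(r"a(?!b)c")

# After "a", (?!b) only inspects the next character without consuming it;
# "c" must then match that same character. So only "a" directly followed
# by "c" can match, and the lookahead is redundant here.
candidates = ["ac", "abc", "adc", "xacx"]
results = {s: bool(pattern.search(s)) for s in candidates}

print(results)  # {'ac': True, 'abc': False, 'adc': False, 'xacx': True}
```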

Need to split a string into two parts in java

做~自己de王妃 submitted on 2019-11-27 19:30:20
Question: I have a string which contains a contiguous chunk of digits followed by a contiguous chunk of characters. I need to split it into two parts (one integer part and one string). I tried using String.split("\\D", 1), but it eats the first character. I checked the whole String API and didn't find a suitable method. Is there any method for doing this? Answer 1: Use lookarounds: str.split("(?<=\\d)(?=\\D)") String[] parts = "123XYZ".split("(?<=\\d)(?=\\D)"); System.out.println(parts[0] + "-" +
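The same lookaround split can be sketched in Python, assuming Python 3.7 or later, where re.split accepts patterns that match the empty string. The split point is the zero-width position with a digit behind it and a non-digit ahead, so no character is lost.

```python
import re

# Split at the position between the last digit and the first non-digit.
parts = re.split(r"(?<=\d)(?=\D)", "123XYZ")

number = int(parts[0])  # integer part
label = parts[1]        # string part

print(parts)  # ['123', 'XYZ']
```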

What are the differences between perl and java regex capabilities?

隐身守侯 submitted on 2019-11-27 19:07:17
Question: What are the differences between Perl and Java with regard to which regular expression features are supported? This question is limited to the regular expressions themselves, and specifically excludes differences in how regex can be used (i.e. the functions/methods available that use regex) and syntactic differences between the languages, such as the Java requirement to escape backslashes. Of particular interest is the partial/occasional support Java has for variable-length look-behinds. Answer 1: The

Regex expression not working with once or none

左心房为你撑大大i submitted on 2019-11-27 16:31:41
Below is my regex: [^4\d{3}-?\d{4}-?\d{4}-?\d{4}$] But it throws an error at - . I am using ? , which should allow - to appear zero or one time. Why is it giving errors? The problem with the regex is that the pattern is enclosed in [ and ], which are treated as character class markers (see Character Classes or Character Sets): With a "character class", also called "character set", you can tell the regex engine to match only one out of several characters. Simply place the characters you want to match between square brackets. If you want to match an a or an e, use [ae]. In character classes,
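A minimal sketch of the pattern with the enclosing brackets removed, so that ^ and $ act as anchors and -? means an optional hyphen. It assumes the intent is a 16-digit number starting with 4, in groups of four with optional hyphens; the name CARD and the sample numbers are hypothetical.

```python
import re

# Without the surrounding [ ], this reads: start of string, a 4 and three
# digits, then three more groups of four digits, each preceded by an
# optional hyphen, then end of string.
CARD = re.compile(r"^4\d{3}-?\d{4}-?\d{4}-?\d{4}$")

print(bool(CARD.match("4123-4567-8901-2345")))  # True
print(bool(CARD.match("4123456789012345")))     # True
print(bool(CARD.match("5123-4567-8901-2345")))  # False
```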