pcre | 易学教程

When does setting 'perl=TRUE' in 'strsplit' not work (as intended or at all)?

Question: I did some benchmarking while trying to optimise some code and observed that strsplit with perl=TRUE is faster than strsplit with perl=FALSE. For example:

    set.seed(1)
    ff <- function() paste(sample(10), collapse = " ")
    xx <- replicate(1e5, ff())

    system.time(t1 <- strsplit(xx, "[ ]"))
    #  user  system elapsed
    # 1.246   0.002   1.268

    system.time(t2 <- strsplit(xx, "[ ]", perl = TRUE))
    #  user  system elapsed
    # 0.389   0.001   0.392

    identical(t1, t2)
    # [1] TRUE

So my question (or rather a variation

strsplit inconsistent with gregexpr

Question: A comment on my answer to this question proposes a solution that should give the desired result using strsplit, but it does not, even though the regex seems to correctly match the first and last commas in a character vector. This can be shown using gregexpr and regmatches. So why does strsplit split on each comma in this example, even though regmatches returns only two matches for the same regex?

    # We would like to split on the first comma and
    # the last comma (positions 4 and 13 in this string)
    x <- "123,34,56,78,90"
    #

Installing MediaWiki on CentOS 7

These notes record the pitfalls I ran into. The setup is based on the Apache2 server.

1. Install apr-1.7.0

    wget http://mirrors.tuna.tsinghua.edu.cn/apache//apr/apr-1.7.0.tar.gz
    tar xvf apr-1.7.0.tar.gz
    cd apr-1.7.0
    ./configure --prefix=/usr/local/apr
    make && make install

2. Install apr-util-1.6.1

    wget http://mirrors.tuna.tsinghua.edu.cn/apache//apr/apr-util-1.6.1.tar.gz
    tar xvf apr-util-1.6.1.tar.gz
    cd apr-util-1.6.1
    ./configure --prefix=/usr/local/apr-util --with-apr=/usr/local/apr/bin/apr-1-config
    make && make install

3. Install pcre-8.43

    wget https://sourceforge.net/projects/pcre/files/pcre/8.43/pcre-8.43.tar.gz
    tar zxvf pcre-8.43.tar.gz
    cd pcre-8.43
    ./configure --prefix=

Linker error LNK2038: mismatch detected in Release mode

Question: I am trying to port a small app of mine from Win XP and VS 2005 to Win 7 and VS 2010. The app compiles and runs smoothly in Debug mode; however, in Release mode I get the following error:

    pcrecpp.lib(pcrecpp.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in LoginDlg.obj

Where should I start checking?

Answer 1: Your app is being compiled in release mode, but you're linking against the debug version of PCRE, which had /MTd (or similar) set, thus

Splitting by a semicolon not surrounded by quote signs

Question: Well, hello community. I'm workin' on a CSV decoder in PHP (yeah, I know there's already one, but as a challenge for me, since I'm learning it in my free time). Now the problem: the rows are split up by PHP_EOL, in this line:

    foreach(explode($sep, $str) as $line) {

where $sep is the variable that splits up the rows and $str is the string I wanna decode. But if I wanna split up the columns by a semicolon, there might be a situation where a semicolon is part of the content of a column. And as I
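One common way to split on semicolons that are not inside double quotes is a lookahead that requires an even number of quote characters in the rest of the line. The sketch below uses Python's re module rather than PHP, purely for illustration; the helper name split_outside_quotes is hypothetical, and the pattern assumes that quotes are balanced and never escaped inside a field.

```python
import re

# Split on ';' only when an even number of double quotes follows it,
# i.e. when the semicolon is not inside a quoted field.
# Assumption: quotes are balanced and never escaped inside a field.
SEMICOLON_OUTSIDE_QUOTES = re.compile(r';(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_outside_quotes(line):
    """Hypothetical helper: split one CSV row into its columns."""
    return SEMICOLON_OUTSIDE_QUOTES.split(line)


columns = split_outside_quotes('a;"b;c";d')
print(columns)
```

The same pattern should carry over to PHP's preg_split, since it uses only a lookahead and negated character classes. For real data, a dedicated CSV parser copes with escaped quotes as well.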

Literal delimiter ( delimiter inside \Q \E block )

Question: I've been trying to make a few functions based on RegEx, and most of them use \Q and \E because part of the RegEx pattern is user input. So, let's say hypothetically that we're using the delimiter / and want to match it against /; the function would construct something along the lines of /\Q/\E/. I'm not sure why /\Q/\E/ doesn't match /, when it works with every other delimiter unless you use the same delimiter as input. Maybe it considers the delimiter the end, even though it's in a literal

PCRE matching whole words in a string

Question: I'm trying to run a regexp in PHP (preg_match_all) that matches certain whole words in a string, but the problem is that it also matches words that contain only part of a tested word. Also, this is a sub-query in a larger regexp, so other PHP functions like strpos won't help me, sadly.

    String: "I test a string"
    Words to match: "testable", "string"
    Tried regexp: /([testable|string]+)/
    Expected result: "string" only!
    Result: "test", "a", "string"

Answer 1: If you really want to make sure you only
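The result in the question follows from the square brackets, which define a character class rather than a list of words. A minimal sketch of the difference, written with Python's re module as a stand-in for preg_match_all (an assumption made for illustration; the variable names are hypothetical):

```python
import re

text = "I test a string"

# Square brackets form a character class: any run of the characters
# t, e, s, a, b, l, |, r, i, n, g matches, not the words themselves.
char_class_matches = re.findall(r"([testable|string]+)", text)

# A group with alternation, wrapped in \b word boundaries, matches
# only the complete words.
whole_word_matches = re.findall(r"\b(?:testable|string)\b", text)

print(char_class_matches)  # ['test', 'a', 'string']
print(whole_word_matches)  # ['string']
```

In PCRE the equivalent pattern would be /\b(?:testable|string)\b/, which can also be embedded in a larger regexp.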

Why is this end of line (\\b) not recognised as a word boundary in stringr/ICU and Perl

Question: Using stringr, I tried to detect a € sign at the end of a string as follows:

    str_detect("my text €", "€\\b") # FALSE

Why is this not working? It is working in the following cases:

    str_detect("my text a", "a\\b") # TRUE - letter instead of €
    grepl("€\\b", "2009in €") # TRUE - base R solution

But it also fails in perl mode:

    grepl("€\\b", "2009in €", perl=TRUE) # FALSE

So what is wrong with the €\\b regex? The regex €$ works in all cases...

Answer 1: When you use base R regex functions without
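The FALSE results are what the usual definition of \b predicts: it matches only between a word character and a non-word character (or the edge of the string), and € is not a word character. The sketch below reproduces the three cases with Python's re module, which is an assumption made for illustration; it is not ICU or PCRE, but it follows the same definition for these inputs.

```python
import re

# \b needs a word character on exactly one side. "€" is not a word
# character, so there is no boundary between it and the string end.
euro_at_end = re.search(r"€\b", "my text €")

# A letter is a word character, so a boundary exists after it.
letter_at_end = re.search(r"a\b", "my text a")

# Anchoring with $ does not depend on word characters at all.
euro_anchored = re.search(r"€$", "my text €")

print(euro_at_end is None)        # True
print(letter_at_end is not None)  # True
print(euro_anchored is not None)  # True
```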

PCRE: backreferences not allowed in lookbehinds?

Question: The PCRE regex /..(?<=(.)\1)/ fails to compile: "Subpattern references are not allowed within a lookbehind assertion." Interestingly, it seems to be acceptable in lookaheads, like /(?=(.)\1)../, just not in lookbehinds. Is there a technical reason why backreferences are not allowed in lookbehinds specifically?

Answer 1: With Python's re module, group references are not supported in lookbehind, even if they match strings of some fixed length. Lookbehinds don't fully support PCRE rules. Concretely
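The asymmetry between lookahead and lookbehind can be observed with Python's re module, which the answer mentions. This is only a sketch of Python's behaviour on the two patterns from the question, assuming a recent Python 3; it does not run PCRE itself, and the variable names are hypothetical.

```python
import re

# Lookahead version: compiles, and finds two identical characters.
lookahead = re.compile(r"(?=(.)\1)..")
match = lookahead.search("xaab")

# Lookbehind version: Python's re rejects a reference to a group
# that is defined inside the same lookbehind.
try:
    re.compile(r"..(?<=(.)\1)")
    lookbehind_compiles = True
except re.error:
    lookbehind_compiles = False

print(match.group())        # aa
print(lookbehind_compiles)  # False
```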