pcre

When does setting 'perl=TRUE' in 'strsplit' not work (as intended or at all)?

强颜欢笑 posted on 2019-12-21 09:08:40
Question: I just did some benchmarking while trying to optimise some code and observed that strsplit with perl=TRUE is faster than strsplit with perl=FALSE. For example:

    set.seed(1)
    ff <- function() paste(sample(10), collapse= " ")
    xx <- replicate(1e5, ff())
    system.time(t1 <- strsplit(xx, "[ ]"))
    #  user  system elapsed
    # 1.246   0.002   1.268
    system.time(t2 <- strsplit(xx, "[ ]", perl=TRUE))
    #  user  system elapsed
    # 0.389   0.001   0.392
    identical(t1, t2)
    # [1] TRUE

So my question (or rather a variation

strsplit inconsistent with gregexpr

我的未来我决定 posted on 2019-12-21 07:41:03
Question: A comment on my answer to this question, which should give the desired result using strsplit, does not, even though it seems to correctly match the first and last commas in a character vector. This can be shown using gregexpr and regmatches. So why does strsplit split on each comma in this example, even though regmatches only returns two matches for the same regex?

    # We would like to split on the first comma and
    # the last comma (positions 4 and 13 in this string)
    x <- "123,34,56,78,90"
    #
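
One possible explanation can be sketched in Python. R's strsplit documentation describes an algorithm that repeatedly searches the remaining string and discards everything up to each match, so anchors and lookarounds are re-evaluated against a shorter string every time. The pattern and input below are hypothetical (the question's regex is cut off above), and `split_by_rescanning` is only an illustrative model, not R's implementation.

```python
import re

def split_by_rescanning(text, pattern):
    """Model of a splitter that re-searches the remaining text after each match."""
    pieces = []
    rest = text
    while rest:
        m = re.search(pattern, rest)
        if m is None:
            pieces.append(rest)
            break
        if m.end() == 0:
            # Empty matches at the start would never consume input in this model.
            raise ValueError("pattern matched the empty string at the start")
        pieces.append(rest[:m.start()])
        rest = rest[m.end():]          # everything up to the match is discarded
    return pieces

# Hypothetical pattern: a comma directly after three digits at the start of the string.
pattern = r"(?<=^\d{3}),"
text = "123,345,56"

# Searching the whole string once finds a single match (0-based offset 3).
whole_string_matches = [m.start() for m in re.finditer(pattern, text)]

# Rescanning the remainder makes "345,56" look like a fresh string,
# so the anchor matches again and a second split happens.
rescanned_pieces = split_by_rescanning(text, pattern)

print(whole_string_matches)
print(rescanned_pieces)
```

In this model the anchored pattern matches once against the full string but splits twice, which is the kind of mismatch the question describes between a match-all function and a splitter.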

Installing MediaWiki on CentOS 7

别来无恙 posted on 2019-12-21 04:55:54
These notes record the pitfalls I ran into; the setup is based on an Apache2 server.

1. Install apr-1.7.0

    wget http://mirrors.tuna.tsinghua.edu.cn/apache//apr/apr-1.7.0.tar.gz
    tar xvf apr-1.7.0.tar.gz
    cd apr-1.7.0
    ./configure --prefix=/usr/local/apr
    make && make install

2. Install apr-util-1.6.1

    wget http://mirrors.tuna.tsinghua.edu.cn/apache//apr/apr-util-1.6.1.tar.gz
    tar xvf apr-util-1.6.1.tar.gz
    cd apr-util-1.6.1
    ./configure --prefix=/usr/local/apr-util --with-apr=/usr/local/apr/bin/apr-1-config
    make && make install

3. Install pcre-8.43

    wget https://sourceforge.net/projects/pcre/files/pcre/8.43/pcre-8.43.tar.gz
    tar zxvf pcre-8.43.tar.gz
    cd pcre-8.43
    ./configure --prefix=

Linker error LNK2038: mismatch detected in Release mode

≯℡__Kan透↙ posted on 2019-12-20 10:17:43
Question: I am trying to port a small app of mine from Win XP and VS 2005 to Win 7 and VS 2010. The app compiles and runs smoothly in Debug mode; however, in Release mode I get the following error:

    pcrecpp.lib(pcrecpp.obj) : error LNK2038: mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in LoginDlg.obj

Where should I start checking?

Answer 1: Your app is being compiled in release mode, but you're linking against the debug version of PCRE, which had /MTd (or similar) set, thus

Splitting by a semicolon not surrounded by quote signs

旧城冷巷雨未停 posted on 2019-12-20 05:25:15
Question: Well, hello community. I'm workin' on a CSV decoder in PHP (yeah, I know there's already one, but as a challenge for me, since I'm learning it in my free time). Now the problem: the rows are split up by PHP_EOL, in this line:

    foreach(explode($sep, $str) as $line) {

where $sep is the variable that splits up the rows and $str is the string I wanna decode. But if I wanna split up the columns by a semicolon, there might be a situation where a semicolon is part of a column's content. And as I
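
One common way to split on semicolons that sit outside double quotes is a lookahead that requires an even number of quote characters between the semicolon and the end of the line. The sketch below shows the idea in Python rather than PHP; the helper name `split_columns` and the sample row are made up for illustration, and it assumes quotes are always balanced and never escaped inside a field.

```python
import re

# A semicolon followed by an even number of double quotes up to the end of the line
# is outside any quoted field (assuming balanced, unescaped quotes).
COLUMN_SEPARATOR = re.compile(r';(?=(?:[^"]*"[^"]*")*[^"]*$)')

def split_columns(line):
    """Split one row on semicolons that are not inside double quotes."""
    return COLUMN_SEPARATOR.split(line)

columns = split_columns('a;"b;c";d')
print(columns)
```

The lookahead rescans the rest of the line for every semicolon, so for long rows a small character-by-character state machine is likely to be faster and easier to extend to escaped quotes.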

Literal delimiter ( delimiter inside \Q \E block )

扶醉桌前 posted on 2019-12-20 04:45:08
Question: I've been trying to make a few functions based on RegEx, and most of them use \Q and \E because part of the RegEx pattern is user input. So, let's say hypothetically that we're using the delimiter / and want to match it against / : the function would construct something along the lines of /\Q/\E/ . I'm not sure why /\Q/\E/ doesn't match / when it works with every other delimiter, unless you use the same delimiter as the input. Maybe it considers the delimiter the end, even though it's in a literal
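
The suspicion in the question can be illustrated with a simplified model of how a delimited pattern string is cut into a body and trailing modifiers before the regex engine ever sees \Q and \E. The Python function below is a hypothetical sketch of that scanning step (skip backslash-escaped characters, stop at the first bare delimiter); it is not PHP's actual source.

```python
def split_delimited(pattern):
    """Return (body, modifiers) by scanning for the first unescaped closing delimiter."""
    delimiter = pattern[0]
    i = 1
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            i += 2                      # a backslash protects only the next character
            continue
        if pattern[i] == delimiter:
            return pattern[1:i], pattern[i + 1:]
        i += 1
    raise ValueError("no ending delimiter found")

# The scanner knows nothing about \Q...\E, so the "/" inside the block ends the pattern.
slash_case = split_delimited(r"/\Q/\E/")

# With a different delimiter the whole \Q/\E block survives as the body.
hash_case = split_delimited(r"#\Q/\E#")

# Backslash-escaping the delimiter keeps it inside the body.
escaped_case = split_delimited(r"/\//")

print(slash_case, hash_case, escaped_case)
```

Under this model the trailing `\E/` ends up where modifiers are expected, which would explain a failure. In PHP, preg_quote() accepts the delimiter as its second argument and escapes it in user input, which avoids relying on \Q and \E for this.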

PCRE matching whole words in a string

廉价感情. posted on 2019-12-19 19:13:52
Question: I'm trying to run a regexp in PHP ( preg_match_all ) that matches certain whole words in a string, but the problem is that it also matches words that contain only part of a tested word. Also, this is a sub-query in a larger regexp, so other PHP functions like strpos won't help me, sadly.

    String: "I test a string"
    Words to match: "testable", "string"
    Tried regexp: /([testable|string]+)/
    Expected result: "string" only!
    Result: "test", "a", "string"

Answer 1: If you really want to make sure you only
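
The tried regexp uses square brackets, which define a character class (any run of the letters t, e, s, a, b, l, r, i, n, g or a literal |) rather than a choice between two words. A minimal sketch in Python's re module, whose syntax for these constructs matches PCRE, compares it with a grouped alternation wrapped in \b word boundaries; the variable names are illustrative only.

```python
import re

text = "I test a string"

# Character class: matches any run of the listed characters, not whole words.
char_class_matches = re.findall(r"[testable|string]+", text)

# Alternation inside a non-capturing group, anchored with word boundaries.
whole_word_matches = re.findall(r"\b(?:testable|string)\b", text)

print(char_class_matches)   # ['test', 'a', 'string']
print(whole_word_matches)   # ['string']
```

Because the alternation is a self-contained group, it can be dropped into a larger expression as a sub-pattern, which is what the question asks for.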

Why is this end of line (\\b) not recognised as a word boundary in stringr/ICU and Perl

醉酒当歌 posted on 2019-12-19 17:45:50
Question: Using stringr I tried to detect a € sign at the end of a string as follows:

    str_detect("my text €", "€\\b") # FALSE

Why is this not working? It is working in the following cases:

    str_detect("my text a", "a\\b") # TRUE - letter instead of €
    grepl("€\\b", "2009in €") # TRUE - base R solution

But it also fails in perl mode:

    grepl("€\\b", "2009in €", perl=TRUE) # FALSE

So what is wrong with the €\\b regex? The regex €$ works in all cases...

Answer 1: When you use base R regex functions without
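
In engines that define \b as a position between a word character and a non-word character, € followed by the end of the string has no word character on either side, so no boundary exists there. The sketch below checks this in Python's re module, which is neither ICU nor PCRE but uses the same definition of \b; the variable names and the "costs €5" sample are made up for illustration.

```python
import re

# "€" is a currency symbol, not a word character, and the end of the string
# is not a word character either, so \b cannot match after it.
euro_then_boundary = re.search(r"€\b", "my text €") is not None

# A letter is a word character, so a boundary exists between it and the end.
letter_then_boundary = re.search(r"a\b", "my text a") is not None

# A boundary does exist after "€" when a word character follows it.
euro_before_digit = re.search(r"€\b", "costs €5") is not None

# Anchoring with $ does not depend on word characters at all.
euro_at_end = re.search(r"€$", "my text €") is not None

print(euro_then_boundary, letter_then_boundary, euro_before_digit, euro_at_end)
```

This matches the FALSE results reported for stringr and perl=TRUE and explains why €$ is the safer way to test for a trailing symbol.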

PCRE: backreferences not allowed in lookbehinds?

雨燕双飞 posted on 2019-12-19 17:41:18
Question: The PCRE regex /..(?<=(.)\1)/ fails to compile: "Subpattern references are not allowed within a lookbehind assertion." Interestingly, it seems to be acceptable in lookaheads, like /(?=(.)\1)../ , just not in lookbehinds. Is there a technical reason why backreferences are not allowed in lookbehinds specifically?

Answer 1: With Python's re module, group references are not supported in lookbehind, even if they match strings of some fixed length. Lookbehinds don't fully support PCRE rules. Concretely
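
The question's two patterns can be tried directly in Python's re module, the engine the answer mentions. This is only a sketch of how that one engine behaves on these inputs, and PCRE's rules may differ; the helper `compiles` and the sample string "xaab" are made up for illustration.

```python
import re

def compiles(pattern):
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A group defined inside a lookbehind and referenced within that same lookbehind.
lookbehind_ok = compiles(r"..(?<=(.)\1)")

# The same group and backreference inside a lookahead.
lookahead_ok = compiles(r"(?=(.)\1)..")

# The lookahead form finds two identical consecutive characters.
doubled = re.search(r"(?=(.)\1)..", "xaab")

print(lookbehind_ok, lookahead_ok, doubled.group())
```

The asymmetry is at least consistent with the idea that a lookbehind needs to know how far to step back before matching, which a backreference to a group captured inside the same assertion cannot tell it in advance.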