Escaping Angled Bracket acts similar to look-ahead

让人想犯罪 __ submitted on 2019-12-20 01:45:38

Question


Why does escaping the angled bracket > exhibit look-ahead-like behavior?

To be clear, I understand that the angled bracket does not need to be escaped.
The question is how the pattern is being interpreted such that it yields the matches shown:

## match bracket, with or without underscore
## replace with "greater_"
strings <- c("ten>eight", "ten_>_eight")
repl    <- "greater_"

## Unescaped. Yields desired results
gsub(">_?", repl, strings)
#  [1] "tengreater_eight"  "ten_greater_eight"

## All five of these yield the same result
gsub("\\>_?",   repl, strings)  # (a)
gsub("\\>(_?)", repl, strings)  # (b)
gsub("\\>(_)?", repl, strings)  # (c)
gsub("\\>?",    repl, strings)  # (d)
gsub("\\>",     repl, strings)  # (e)
#  [1] "tengreater_>eightgreater_"   "ten_greater_>_eightgreater_"

gregexpr("\\>?", strings)

Some follow up questions:

1.  Why do `(a)` and `(d)` yield the same result? 
2.  Why is the end-of-string matched?
3.  Why do none of `a, b, or c` match the underscore? 

Answer 1:


\\> is a word boundary: it matches the empty string between a word character (on its left) and a non-word character (on its right), or between a word character and the end-of-line anchor $.

> strings <- c("ten>eight", "ten_>_eight")
> gsub("\\>", "greater_", strings)
[1] "tengreater_>eightgreater_"   "ten_greater_>_eightgreater_"

In the example above, \\> matches only at word boundaries. In the first element it matches between the word character n and the non-word character >, and again between t and the end-of-line anchor. In the second element it matches between _ (also a word character) and >, and again between t and the end-of-line anchor ($). Because _ is itself a word character, it can never come directly after such a boundary, so the optional _? in patterns (a) to (c) matches nothing. gsub then inserts the replacement string at each matched boundary.
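To see that the backslash is what changes the meaning, the sketch below contrasts the boundary pattern with three ways of matching a literal >. The variable names are illustrative and not from the original post. The perl = TRUE line assumes that PCRE treats a backslash before a non-alphanumeric character as a plain escape, so \\> there means a literal >.

```r
strings <- c("ten>eight", "ten_>_eight")
repl    <- "greater_"

# Default regex engine: "\\>" is a word boundary, not a literal ">"
boundary <- gsub("\\>", repl, strings)

# Three ways to match the literal character instead
plain   <- gsub(">_?", repl, strings)                 # no escape needed
bracket <- gsub("[>]_?", repl, strings)               # bracket expression
pcre    <- gsub("\\>_?", repl, strings, perl = TRUE)  # assumption: PCRE reads "\\>" as ">"

print(boundary)
print(plain)
print(identical(plain, bracket))
print(identical(plain, pcre))
```

If only a fixed character needs replacing (no optional underscore), passing fixed = TRUE to gsub avoids regex interpretation entirely.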

A simple example:

> gsub("\\>", "*", "f:r(:")
[1] "f*:r*(:"

Consider the input string below (w means a word character, N means a non-word character):

    f : r ( :
    | | | | |
    w N w N N

So \\> matches between,

  • f and :
  • r and (
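The w/N labelling above can be reproduced by hand, without relying on \\> at all. This is a minimal sketch, assuming "word character" means a letter, digit, or underscore; the names chars, is_word, labels, and ends are hypothetical helpers introduced only for illustration.

```r
# Label each character: "w" for word characters, "N" for everything else
chars   <- strsplit("f:r(:", "")[[1]]
is_word <- grepl("^[[:alnum:]_]$", chars)
labels  <- ifelse(is_word, "w", "N")

# A boundary follows position i when chars[i] is a word character and the
# next character is a non-word character or the end of the string
next_is_word <- c(is_word[-1], FALSE)
ends <- which(is_word & !next_is_word)

print(labels)  # "w" "N" "w" "N" "N"
print(ends)    # 1 3
```

Positions 1 and 3 are f and r, which is where the * characters land in the gsub output. Running the same steps on the single-character string "f" gives position 1, because the end of the string counts as the right-hand side of the boundary.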

Example 2:

> gsub("\\>", "*", "f") 
[1] "f*"

Input string:

f$
||----End of the line anchor
w

Replacing the matched boundary with * will give the above result.



Source: https://stackoverflow.com/questions/26237735/escaping-angled-bracket-acts-similar-to-look-ahead
