Why does this regex used in findText gobble up the entire text as if it is greedy?

Submitted by 霸气de小男生 on 2020-11-29 23:56:38

Question


I can't for the life of me figure out why this regex is gobbling up the whole line in Google Docs. When I run the function below, I can't get it to return just {{ClientName}}; it returns the entire paragraph instead.

Here is my text from my document.

{{ClientName}} would like to have a {{Product}} {{done/created}}. The purpose of this {{Product}} is to {{ProductPurpose}}. We have experience with such testing and development, and will develop and test the {{Product}} for {{ClientName}}.

function searchAndFind () {
     var foundText = DocumentApp.getActiveDocument().getBody().findText('\{\{([^,\s}{][a-zA-Z]+)\}\}').getElement().asText().getText()
     return foundText
}

Answer 1:


Issue:

This is because findText() returns a RangeElement object, which provides methods for getting the full text Element as well as the offset of the actual matched text in the Element. When you use getElement(), you get the whole element instead of just the matched string.

Solution:

Get offsets from the range element to get the actual text in the element.

Code Snippet:

function searchAndFind() {
  var rangeElement = DocumentApp.getActiveDocument()
    .getBody()
    .findText('{{([^,\\s]+)}}');

  return rangeElement
    .getElement()
    .asText()
    .getText()
    .substring(
      rangeElement.getStartOffset(),
      rangeElement.getEndOffsetInclusive()+1
    );
}

References:

  • FindText
  • RangeElement
  • String#Substring



Answer 2:


Try this:

function searchAndFind() {
  // Call findText() once and reuse the RangeElement. Backslashes are doubled
  // because the pattern is passed as a string literal, where '\s' would otherwise become 's'.
  var rangeElement = DocumentApp.getActiveDocument().getBody().findText('\\{\\{([^,\\s}{][a-zA-Z]+)\\}\\}');
  var foundElement = rangeElement.getElement().asText().getText();
  var start = rangeElement.getStartOffset();
  var end = rangeElement.getEndOffsetInclusive();
  // The end offset is inclusive, so add 1 for slice()'s exclusive end index.
  var foundText = foundElement.slice(start, end + 1);
  Logger.log('\nfoundElement: %s\nstart: %s\nend: %s\nfoundText:%s\n', foundElement, start, end, foundText);
  return foundText;
}

Logger.log output:

[18-12-11 13:04:34:863 MST] 
foundElement: {{ClientName}} would like to have a {{Product}} {{done/created}}. The purpose of this {{Product}} is to {{ProductPurpose}}. We have experience with such testing and development, and will develop and test the {{Product}} for {{ClientName}}.
start: 0.0
end: 13.0
foundText:{{ClientName}}
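
The offsets in the log above can be checked outside Apps Script with plain string slicing. The following is only a minimal sketch: extractMatch is a hypothetical helper name, not part of the DocumentApp API, and the paragraph is a shortened copy of the sample text with the start and end values taken from the log.

```javascript
// Hypothetical helper (not part of DocumentApp): slice the matched text
// out of the element's full text using RangeElement-style offsets.
// endOffsetInclusive points at the last matched character, so add 1
// because String.prototype.slice treats its end index as exclusive.
function extractMatch(fullText, startOffset, endOffsetInclusive) {
  return fullText.slice(startOffset, endOffsetInclusive + 1);
}

// Shortened copy of the sample paragraph; offsets 0 and 13 come from the log.
var paragraph = '{{ClientName}} would like to have a {{Product}} {{done/created}}.';
console.log(extractMatch(paragraph, 0, 13)); // {{ClientName}}
```

Forgetting the +1 would drop the final closing brace, which is why both answers add it.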



Answer 3:


Regex quantifiers are 'greedy' by default. You can make a quantifier (i.e. +, ?, * or {}) non-greedy by following it with ?.

For example:

  • x??
  • x*?
  • x+?
  • x{n}?
  • x{n,}?
  • x{n,m}?

Modify your regex to leverage this feature.

Check out the regex documentation on MDN and search (Ctrl+F in Chrome) for the term 'greedy' for more information.
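
To see the difference between a greedy and a non-greedy quantifier, here is a small sketch using JavaScript's built-in RegExp on a shortened copy of the sample text. It assumes a deliberately loose pattern (.+ between the braces) rather than the pattern from the question, and it illustrates quantifier behavior in general, not findText() specifically.

```javascript
// Shortened copy of the sample paragraph from the question.
var text = '{{ClientName}} would like to have a {{Product}} {{done/created}}.';

// Greedy: .+ consumes as much as possible, so the match runs to the last "}}".
var greedy = text.match(/\{\{.+\}\}/)[0];

// Non-greedy: .+? consumes as little as possible, so the match stops at the first "}}".
var lazy = text.match(/\{\{.+?\}\}/)[0];

console.log(greedy); // {{ClientName}} would like to have a {{Product}} {{done/created}}
console.log(lazy);   // {{ClientName}}
```

A character class that excludes braces and whitespace, like the one in the question's pattern, also keeps a match from spanning several placeholders, so it is worth checking which of the two effects is actually in play.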



Source: https://stackoverflow.com/questions/53730641/why-does-this-regex-used-in-findtext-gobble-up-the-entire-text-as-if-it-is-greed
