Why does this regex used in findText gobble up the entire text as if it is greedy?

Question

I can't for the life of me figure out why this regex is gobbling up the whole line in Google Docs. When I run this, I can't get it to return just {{ClientName}}.

Here is the text from my document:

{{ClientName}} would like to have a {{Product}} {{done/created}}. The purpose of this {{Product}} is to {{ProductPurpose}}. We have experience with such testing and development, and will develop and test the {{Product}} for {{ClientName}}.

function searchAndFind() {
  var foundText = DocumentApp.getActiveDocument().getBody().findText('\{\{([^,\s}{][a-zA-Z]+)\}\}').getElement().asText().getText();
  return foundText;
}

Answer 1:

Issue:

This is because findText() returns a RangeElement object, which provides methods for getting the full text Element as well as the offset of the actual matched text in the Element. When you use getElement(), you get the whole element instead of just the matched string.
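
To make the idea concrete, here is a plain-JavaScript sketch that imitates a range element outside Apps Script. The fakeFind helper and the object it returns are hypothetical stand-ins, not part of the DocumentApp API, and the sketch uses JavaScript's own RegExp rather than the engine behind findText().

```javascript
// Hypothetical stand-in for a found range: the full element text
// plus the offsets of the match inside it (end offset is inclusive).
function fakeFind(elementText, regex) {
  var match = regex.exec(elementText);
  if (match === null) {
    return null;
  }
  return {
    elementText: elementText,
    startOffset: match.index,
    endOffsetInclusive: match.index + match[0].length - 1
  };
}

var paragraph = '{{ClientName}} would like to have a {{Product}} {{done/created}}.';
var found = fakeFind(paragraph, /\{\{[^,\s]+\}\}/);

// The element text is the whole paragraph ...
var whole = found.elementText;
// ... and only the offsets narrow it down to the matched placeholder.
var matched = whole.substring(found.startOffset, found.endOffsetInclusive + 1);
```

Here whole is the entire paragraph, while matched contains only the placeholder, which mirrors the difference between reading the element's text and applying the offsets.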

Solution:

Use the start and end offsets from the range element to cut just the matched text out of the element's full text.

Code Snippet:

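function searchAndFind() {
  var rangeElement = DocumentApp.getActiveDocument()
    .getBody()
    .findText('{{([^,\\s]+)}}');

  // findText() returns null when nothing matches.
  if (rangeElement === null) {
    return null;
  }

  // The end offset is inclusive, so add 1 for substring().
  return rangeElement
    .getElement()
    .asText()
    .getText()
    .substring(
      rangeElement.getStartOffset(),
      rangeElement.getEndOffsetInclusive() + 1
    );
}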
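
One detail in the snippet above is easy to miss: the pattern is passed as a string, so the backslash in \\s is doubled. In an ordinary JavaScript string literal, an escape such as \s or \{ that has no special meaning simply loses its backslash. A minimal check in plain JavaScript, with variable names chosen only for illustration:

```javascript
// Pattern as written in the question: single backslashes inside a string literal.
var asWritten = '\{\{([^,\s}{][a-zA-Z]+)\}\}';
// Same pattern with the backslashes doubled so they survive into the string.
var escaped = '\\{\\{([^,\\s}{][a-zA-Z]+)\\}\\}';

// The single backslashes are dropped, so "\s" has become a literal "s".
var lostBackslashes = asWritten === '{{([^,s}{][a-zA-Z]+)}}';
// The doubled version still contains a real backslash followed by "s".
var keptBackslashes = escaped.indexOf('\\s') !== -1;
```

Both flags come out true, so the question's pattern excludes the letter s rather than whitespace in its first character class. That is a separate matter from the getElement() issue described above.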

References:

FindText
RangeElement
String#Substring

Answer 2:

Try this:

function searchAndFind() {
  // Call findText() once and reuse the resulting RangeElement.
  // Backslashes are doubled so they survive the string literal.
  var rangeElement = DocumentApp.getActiveDocument().getBody().findText('\\{\\{([^,\\s}{][a-zA-Z]+)\\}\\}');
  var foundElement = rangeElement.getElement().asText().getText();
  var start = rangeElement.getStartOffset();
  var end = rangeElement.getEndOffsetInclusive();
  // The end offset is inclusive, so add 1 for slice().
  var foundText = foundElement.slice(start, end + 1);
  Logger.log('\nfoundElement: %s\nstart: %s\nend: %s\nfoundText:%s\n', foundElement, start, end, foundText);
  return foundText;
}

Logger.log output:

[18-12-11 13:04:34:863 MST] 
foundElement: {{ClientName}} would like to have a {{Product}} {{done/created}}. The purpose of this {{Product}} is to {{ProductPurpose}}. We have experience with such testing and development, and will develop and test the {{Product}} for {{ClientName}}.
start: 0.0
end: 13.0
foundText:{{ClientName}}

Answer 3:

Regex quantifiers are greedy by default. You can make a quantifier (i.e. +, ?, * or {}) non-greedy by following it with ?.

For example:

x??
x*?
x+?
x{n}?
x{n,}?
x{n,m}?

Modify your regex to leverage this feature.

For more information, check out the regex documentation on MDN and search the page (Ctrl+F in Chrome) for the term 'greedy'.
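
As a rough illustration of the difference, the sketch below runs a greedy and a lazy pattern over part of the sample text. It uses JavaScript's built-in RegExp, not findText(), and the two simplified patterns are assumptions made for the demo rather than the pattern from the question.

```javascript
var text = '{{ClientName}} would like to have a {{Product}} {{done/created}}.';

// Greedy: .+ runs as far as it can, so the match ends at the LAST "}}".
var greedy = text.match(/\{\{.+\}\}/)[0];

// Lazy: .+? stops at the first point where "}}" can follow.
var lazy = text.match(/\{\{.+?\}\}/)[0];
```

With this input, greedy spans from the first {{ to the last }}, while lazy is just {{ClientName}}.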

Source: https://stackoverflow.com/questions/53730641/why-does-this-regex-used-in-findtext-gobble-up-the-entire-text-as-if-it-is-greed

Tags

regex

google-apps-script

google-docs