Regex to replace HTML whitespace and leading whitespace in Notepad++

Submitted by 为君一笑 on 2019-12-22 11:24:03

Question


I have tried to use the following regex to remove HTML whitespace and leading whitespace:

Find:   \s*([<>])\s*

Replace: $1

But each time I do this, I end up with 186 occurrences of a literal $1 in my document. Any assistance would be greatly appreciated.

Here is an example of what I am talking about.

This:

<fieldset id="prod_desc">
<p>Original AA </p>
<b>Features:</b> 
<ul>
  <li>2 pole rectangular dome tent with 13.4 sq ft of vestibule storage </li>
  <li>Durable, shockcorded, self-supporting fiberglass frame and ring and pin/pole pocket assembly </li>
  <li>2 side opening door panels are constructed entirely of no see-um mesh to maximize air flow inside </li>
  <li>Poke-out vent in side wall allows the option of additional ventilation when needed </li>
  <li>2 interior storage pockets keep essential items handy Specifications: </li>
  <li>Season: 3 </li>
  <li>Sleeps: 2 </li>
  <li>Doors: 2 </li>
  <li>Windows: 2 </li>
  <li>Weight: 5 lbs 12 oz </li>
  <li>Area: 36.5 Sq. Ft. </li>
  <li>Center Height: 3' 7.5&quot;</li>
</ul>
</fieldset> 

should become:

<fieldset id="prod_desc"><p>Original AA</p><b>Features:</b><ul><li>2 pole rectangular dome tent with 13.4 sq ft of vestibule storage</li><li>Durable, shockcorded, self-supporting fiberglass frame and ring and pin/pole pocket assembly</li><li>2 side opening door panels are constructed entirely of no see-um mesh to maximize air flow inside</li><li>Poke-out vent in side wall allows the option of additional ventilation when needed</li><li>2 interior storage pockets keep essential items handy Specifications:</li><li>Season: 3</li><li>Sleeps: 2</li><li>Doors: 2</li><li>Windows: 2</li><li>Weight: 5 lbs 12 oz</li><li>Area: 36.5 Sq. Ft.</li><li>Center Height: 3' 7.5&quot;</li></ul></fieldset>

Answer 1:


Notepad++ did not support $1 for backreferences before version 6.0, which introduced PCRE support for find-and-replace. In older versions, use \1 for backreferences instead.

The pattern to search for is \s*(<[^>]+>)\s*, which matches a whole tag plus any whitespace around it and captures only the tag. As of Notepad++ version 6.0, released in March 2012, this alone should work for you. I tried your original regex and, much to my surprise, it works as well.
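To see what that pattern does outside the editor, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in for the Notepad++ engine (the two are not identical), and the function name `strip_tag_whitespace` is made up for illustration. Python writes the backreference as `\1` rather than `$1`.

```python
import re

# Pattern from the answer: optional whitespace, one whole tag (captured),
# then optional whitespace again.
TAG_WITH_PADDING = re.compile(r"\s*(<[^>]+>)\s*")


def strip_tag_whitespace(html: str) -> str:
    """Replace each padded tag with the bare tag (hypothetical helper)."""
    # \1 puts back only the captured tag, dropping the whitespace around it.
    return TAG_WITH_PADDING.sub(r"\1", html)


sample = "<ul>\n  <li>Season: 3 </li>\n  <li>Sleeps: 2 </li>\n</ul> "
print(strip_tag_whitespace(sample))
# prints: <ul><li>Season: 3</li><li>Sleeps: 2</li></ul>
```

Whitespace that is not next to a tag, such as the space in "Season: 3", is left alone because the pattern only matches when a tag is present.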

Versions before 6.0 cannot do multi-line regex replacements. To strip newlines there, perform the regex replacement first, then run a second find-and-replace in Extended search mode. For UNIX line endings, find:

\n

For Windows line endings:

\r\n

In either case, replace with nothing.
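The two-pass workflow for older versions can be sketched in Python as follows. This is only an approximation: `[ \t]*` is swapped in for `\s*` to imitate an engine that cannot match across line breaks, and the function name `strip_in_two_passes` is hypothetical.

```python
import re


def strip_in_two_passes(text: str) -> str:
    """Trim around tags line by line, then delete the line endings."""
    # Pass 1: trim spaces and tabs around each tag without crossing a
    # line break. [ \t]* stands in for \s* here (an assumption made to
    # mimic a single-line-only regex engine).
    trimmed = re.sub(r"[ \t]*(<[^>]+>)[ \t]*", r"\1", text)
    # Pass 2: the "extended" search step. Remove Windows endings first so
    # no stray \r is left behind, then remove UNIX endings.
    return trimmed.replace("\r\n", "").replace("\n", "")


windows_sample = "<ul>\r\n  <li>Doors: 2 </li>\r\n</ul> \r\n"
print(strip_in_two_passes(windows_sample))
# prints: <ul><li>Doors: 2</li></ul>
```

Note that the second pass removes every line break, including any inside running text, so it suits markup like the example where each line starts or ends with a tag.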




Answer 2:


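You could use the expression \s*\<([^>]*)\>\s* and replace with <$1> (or <\1> in Notepad++ versions before 6.0). The angle brackets sit outside the capture group, so the replacement has to put them back.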

Or you could use this approach:

  • first, match \s+\< and replace with <
  • second, match \>\s+ and replace with >
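The two-step approach above can be sketched in Python like this. It assumes Python's `re` module rather than the Notepad++ engine, and `strip_in_two_steps` is a name invented for the example. The backslashes before `<` and `>` are dropped because neither character needs escaping in Python.

```python
import re


def strip_in_two_steps(html: str) -> str:
    """Remove whitespace before '<', then whitespace after '>'."""
    # Step 1: whitespace immediately before an opening angle bracket.
    step1 = re.sub(r"\s+<", "<", html)
    # Step 2: whitespace immediately after a closing angle bracket.
    return re.sub(r">\s+", ">", step1)


sample = "<p>Original AA </p>\n<b>Features:</b> \n<ul>\n"
print(strip_in_two_steps(sample))
# prints: <p>Original AA</p><b>Features:</b><ul>
```

Because no capture group is involved, this variant avoids the $1 versus \1 question entirely.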


Source: https://stackoverflow.com/questions/4683021/regex-to-replace-html-whitespace-and-leading-whitespace-in-notepad
