tags
I've stumped myself trying to figure out how to remove carriage returns that occur between tags. (Technically I need to replace them with spaces, not
[\r\n]+(?=(?:[^<]+|<(?!/?p\b))*</p>)
The first part matches one or more of any kind of line separator (\n, \r\n, or \r). The rest is a lookahead that attempts to match everything up to the next closing </p> tag, but if it finds an opening <p> tag first, the match fails.
Note that this regex can be fooled very easily, for example by SGML comments, <script> elements, or plain old malformed HTML. Also, I'm assuming your regex flavor supports positive and negative lookaheads. That's a pretty safe assumption these days, but if the regex doesn't work for you, we'll need to know exactly which language or tool you're using.
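For anyone who wants to try it, here is a minimal sketch of the regex above in use, assuming Python's built-in `re` module (which supports both kinds of lookahead). The function name `join_paragraph_lines` and the sample strings are made up for illustration. The second sample shows one way the pattern can be fooled: a comment that contains `</p>`.

```python
import re

# The pattern from the answer, unchanged: a run of line separators that is
# followed by a closing </p> with no opening <p> in between.
LINE_BREAK_IN_P = re.compile(r"[\r\n]+(?=(?:[^<]+|<(?!/?p\b))*</p>)")


def join_paragraph_lines(html: str) -> str:
    """Replace each matched run of line separators with a single space.

    Hypothetical helper name, used only for this example.
    """
    return LINE_BREAK_IN_P.sub(" ", html)


# The line break inside the first paragraph is replaced; the one between
# the two paragraphs is left alone because the lookahead hits <p> first.
sample = "<p>one\r\ntwo</p>\n<p>three</p>"
print(repr(join_paragraph_lines(sample)))
# prints '<p>one two</p>\n<p>three</p>'

# False positive: this line break is not inside any paragraph, but the
# lookahead finds the </p> inside the comment and matches anyway.
fooled = "x\n<!-- </p> -->"
print(repr(join_paragraph_lines(fooled)))
# prints 'x <!-- </p> -->'
```

One thing to watch for, at least in backtracking engines like Python's: the `[^<]+` inside the repeated group can backtrack heavily when a long stretch of text is not followed by `</p>`. Writing it as `[^<]` (without the `+`) should match the same text with less backtracking.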