I have a string in JavaScript and it includes an <a> tag with an href. I want to remove all links and the text. I know how to just r
This will strip out everything between <a and /a>:
var mystr = "check this out <a href='http://www.google.com'>Click me</a>. cool, huh?";
alert(mystr.replace(/<a\b[^>]*>(.*?)<\/a>/i,""));
It's not really foolproof, but maybe it'll do the trick for your purpose...
I just commented about John Resig's HTML parser. Maybe it helps with your problem.
Just to clarify: stripping link tags while leaving everything between them untouched is a two-step process. Remove the opening tag, then remove the closing tag.
txt.replace(/<a\b[^>]*>/i,"").replace(/<\/a>/i, "");
Working sample:
<script>
function stripLink(txt) {
  return txt.replace(/<a\b[^>]*>/i,"").replace(/<\/a>/i, "");
}
</script>
<p id="strip">
<a href="#">
<em>Here's the text!</em>
</a>
</p>
<p>
<input value="Strip" type="button" onclick="alert(stripLink(document.getElementById('strip').innerHTML))">
</p>
The examples above do not remove all occurrences, because their regexes lack the global flag. Here is my solution:
str.replace(/<a\b[^>]*>/gi, '').replace(/<\/a>/gi, '')
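To put the two behaviors from this thread side by side (removing the links together with their text, or removing only the tags), here is a minimal sketch with the global flag applied to both. The function names and the sample string are made up for illustration, and the same caveat applies as elsewhere in the thread: this is regex matching, not real HTML parsing.

```javascript
// Hypothetical helper names; they do not come from the answers above.

// Removes every <a ...>...</a> pair, including the text between the tags.
function removeLinksAndText(html) {
  // [\s\S]*? matches lazily across line breaks, which .*? would not do.
  return html.replace(/<a\b[^>]*>[\s\S]*?<\/a>/gi, "");
}

// Removes only the <a ...> and </a> tags and keeps whatever is between them.
function unwrapLinks(html) {
  return html.replace(/<\/?a\b[^>]*>/gi, "");
}

var sample = "one <a href='#x'>first</a> two <A HREF='#y'><em>second</em></A> three";

console.log(removeLinksAndText(sample)); // "one  two  three"
console.log(unwrapLinks(sample));        // "one first two <em>second</em> three"
```

The `g` flag makes `replace` act on every match rather than only the first, and `i` covers upper-case tags such as `<A>`.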
If you only want to remove <a> elements, the following should work well:
s.replace(/<a [^>]+>[^<]*<\/a>/, '');
This should work for the example you gave, but it won't work when the link contains nested tags. For example, it wouldn't match this HTML:
<a href="http://www.google.com"><em>Google</em></a>
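A quick way to see the difference is to run the same pattern against a flat link and against the nested one. The variable names here are only for illustration:

```javascript
var pattern = /<a [^>]+>[^<]*<\/a>/;

var flat = "<a href='http://www.google.com'>Click me</a>";
var nested = '<a href="http://www.google.com"><em>Google</em></a>';

// The flat link matches, so it is removed entirely.
console.log(flat.replace(pattern, "") === "");       // true

// [^<]* cannot cross the "<" of <em>, so nothing matches
// and the string comes back unchanged.
console.log(nested.replace(pattern, "") === nested); // true
```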
Regexes are fundamentally bad at parsing HTML (see "Can you provide some examples of why it is hard to parse XML and HTML with a regex?" for why). What you need is an HTML parser. See "Can you provide an example of parsing HTML with your favorite parser?" for examples using a variety of parsers.
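As one possible illustration of the parser route, here is a rough sketch that uses the browser's built-in DOMParser instead of a regex. It assumes a browser environment (DOMParser is not available in plain Node.js), and the function name and the keepText parameter are invented for this example.

```javascript
// Hypothetical helper: strips <a> elements using the browser's HTML parser.
// keepText = true  -> keep the link's contents, drop only the <a> wrapper
// keepText = false -> drop the link and everything inside it
function stripLinksWithDom(html, keepText) {
  var doc = new DOMParser().parseFromString(html, "text/html");
  // querySelectorAll returns a static list, so removing nodes while looping is safe.
  doc.body.querySelectorAll("a").forEach(function (a) {
    if (keepText) {
      // Copy the child list into an array first, then move the children
      // into the anchor's place.
      a.replaceWith.apply(a, Array.prototype.slice.call(a.childNodes));
    } else {
      a.remove();
    }
  });
  return doc.body.innerHTML;
}

// Only run the demo where DOMParser exists (i.e. in a browser).
if (typeof DOMParser !== "undefined") {
  var input = "check this out <a href='http://www.google.com'>Click me</a>. cool, huh?";
  console.log(stripLinksWithDom(input, false)); // "check this out . cool, huh?"
  console.log(stripLinksWithDom(input, true));
}
```

Because the parser builds a real tree, nested markup such as <em> inside the link is handled without special cases. Note that the result is re-serialized by the browser, so attribute quoting and entities in the rest of the string may come back normalized.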