I'm looking for a regular expression that extracts the text between HTML tags of different types.
For ex:
Span 1
- O/p:
Your comment shows that you have neglected to escape the backslashes in your regex string.
And if you want to match lowercase letters, add a-z to the character classes, or use Pattern.CASE_INSENSITIVE (or add (?i) to the beginning of the regex):
"<([A-Za-z][A-Za-z0-9]*)\\b[^>]*>(.*?)</\\1>"
If the tag contents may contain newlines, then use Pattern.DOTALL, or add (?s) to the beginning of the regex to turn on dotall/singleline mode.
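
Putting this together, here is a minimal sketch of how the pattern might be used. The class name TagTextExtractor, the method extract, and the sample input are made up for illustration; the sketch assumes the closing tag is matched with </\\1> and passes both flags to Pattern.compile instead of embedding (?i) and (?s). Group 1 is the tag name and group 2 is the text between the tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class TagTextExtractor {
    // Backslashes are doubled because this is a Java string literal.
    // CASE_INSENSITIVE lets [A-Z] match lowercase tag names as well;
    // DOTALL lets '.' match newlines inside the tag contents.
    private static final Pattern TAG_TEXT = Pattern.compile(
            "<([A-Z][A-Z0-9]*)\\b[^>]*>(.*?)</\\1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the text found between each matching open/close tag pair.
    static List<String> extract(String html) {
        List<String> texts = new ArrayList<>();
        Matcher m = TAG_TEXT.matcher(html);
        while (m.find()) {
            texts.add(m.group(2)); // group 1 is the tag name, group 2 the contents
        }
        return texts;
    }

    public static void main(String[] args) {
        // Sample input invented for the example.
        String html = "<span class=\"a\">Span 1</span><DIV>line 1\nline 2</DIV>";
        for (String text : extract(html)) {
            System.out.println(text);
        }
    }
}
```

Note that a regex like this does not handle nested tags of the same type; for anything beyond simple, flat markup an HTML parser is usually the safer choice.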