How to find if a String contains HTML data?

遇见更好的自我 2020-12-15 18:13

How do I find out whether a string contains HTML data? The user provides input via a web interface, and it's quite possible that they entered either plain text or HTML.

7 Answers
  •  离开以前
    2020-12-15 19:11

    I know this is an old question, but I ran into it while looking for something more comprehensive that could detect things like HTML entities and would ignore other uses of the < and > symbols. I came up with the following class, which works well.

    You can play with it live at http://ideone.com/HakdHo

    I also uploaded this to GitHub with a bunch of JUnit tests.

    package org.github;
    
    import java.util.regex.Pattern;
    
    /**
     * Detect HTML markup in a string.
     * This will detect tags or entities.
     *
     * @author dbennett455@gmail.com - David H. Bennett
     */
    public class DetectHtml
    {
        // adapted from post by Phil Haack and modified to match better
        public final static String tagStart=
            "\\<\\w+((\\s+\\w+(\\s*\\=\\s*(?:\".*?\"|'.*?'|[^'\"\\>\\s]+))?)+\\s*|\\s*)\\>";
        public final static String tagEnd=
            "\\</\\w+\\>";
        public final static String tagSelfClosing=
            "\\<\\w+((\\s+\\w+(\\s*\\=\\s*(?:\".*?\"|'.*?'|[^'\"\\>\\s]+))?)+\\s*|\\s*)/\\>";
        public final static String htmlEntity=
            "&[a-zA-Z][a-zA-Z0-9]+;";
        public final static Pattern htmlPattern=Pattern.compile(
          "("+tagStart+".*"+tagEnd+")|("+tagSelfClosing+")|("+htmlEntity+")",
          Pattern.DOTALL
        );
    
        /**
         * Will return true if s contains HTML markup tags or entities.
         *
         * @param s String to test
         * @return true if string contains HTML
         */
        public static boolean isHtml(String s) {
            boolean ret=false;
            if (s != null) {
                ret=htmlPattern.matcher(s).find();
            }
            return ret;
        }
    
    }
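
    To see how this kind of pattern behaves on typical input, here is a minimal, self-contained sketch. It is not part of the original answer: the class name HtmlCheckDemo and the sample strings are hypothetical, the closing-tag pattern is assumed to match the form </name>, and the redundant backslashes before <, > and = are dropped, which should not change what the expressions match.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; HtmlCheckDemo is a hypothetical name.
final class HtmlCheckDemo {
    // Optional attribute list shared by opening and self-closing tags.
    private static final String ATTRS =
        "((\\s+\\w+(\\s*=\\s*(?:\".*?\"|'.*?'|[^'\">\\s]+))?)+\\s*|\\s*)";
    private static final String TAG_START = "<\\w+" + ATTRS + ">";
    // Assumption: a closing tag has the form </name>.
    private static final String TAG_END = "</\\w+>";
    private static final String TAG_SELF_CLOSING = "<\\w+" + ATTRS + "/>";
    private static final String HTML_ENTITY = "&[a-zA-Z][a-zA-Z0-9]+;";

    // Matches an opening tag followed somewhere by a closing tag,
    // or a self-closing tag, or a named entity.
    private static final Pattern HTML = Pattern.compile(
        "(" + TAG_START + ".*" + TAG_END + ")|("
            + TAG_SELF_CLOSING + ")|(" + HTML_ENTITY + ")",
        Pattern.DOTALL);

    static boolean isHtml(String s) {
        return s != null && HTML.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "<p>Hello</p>",      // paired tags
            "<br/>",             // self-closing tag
            "Fish &amp; chips",  // named entity
            "a < b and c > d",   // comparison operators, not markup
            "plain text",
        };
        for (String s : samples) {
            System.out.println(isHtml(s) + "  " + s);
        }
    }
}
```

    With these samples, the first three lines should print true and the last two false. Note that a lone opening tag with no closing tag, such as "<b>bold", is not reported as HTML by this pattern, so a stricter or more lenient check may be needed depending on the input you expect.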
    
