How to validate that HTML matches W3C standards

Anonymous (unverified), submitted on 2019-12-03 03:10:03

Question:

I have a project that generates HTML pages using Velocity templates and Java, but most of the pages do not comply with W3C standards. How can I validate those HTML pages and get a log telling me which errors and warnings occur on which pages?

Then I can fix the errors manually. I have tried JTidyFilter, but that doesn't work for me.

Answer 1:

You can use the W3C validator directly from Java; see w3c-jabi.



Answer 2:

There is also an experimental API available from W3C to help automate validation. They kindly ask that you throttle requests, and also offer instructions on setting up a validator on a local server. It's definitely more work, but if you're generating a lot of HTML pages, it would probably make sense to also automate the validation.

http://validator.w3.org/docs/api.html
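As a rough illustration of automated, throttled validation, here is a minimal sketch using the JDK's own HTTP client (Java 11+). The class name `ThrottledValidation`, the one-second pause, and the log format are assumptions made for this example. The `uri` and `output=soap12` parameters and the `X-W3C-Validator-*` response headers are taken from the API documentation linked above as I understand it, so check them against your validator installation before relying on them.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpHeaders;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any W3C library.
final class ThrottledValidation {
    // Assumed pause between requests; pick a value that respects the W3C's request to throttle.
    static final long PAUSE_MILLIS = 1000;

    /** Builds the check URL for a page the validator can reach, asking for SOAP 1.2 output. */
    static URI checkUri(String validatorBase, String pageUrl) {
        String encoded = URLEncoder.encode(pageUrl, StandardCharsets.UTF_8);
        return URI.create(validatorBase + "?uri=" + encoded + "&output=soap12");
    }

    /** Turns the validator's summary headers into one log line per page. */
    static String logLine(String pageUrl, HttpHeaders headers) {
        String status = headers.firstValue("X-W3C-Validator-Status").orElse("unknown");
        String errors = headers.firstValue("X-W3C-Validator-Errors").orElse("?");
        String warnings = headers.firstValue("X-W3C-Validator-Warnings").orElse("?");
        return pageUrl + ": " + status + ", errors=" + errors + ", warnings=" + warnings;
    }

    // Usage: java ThrottledValidation <validator base URL> <page URL>...
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        for (int i = 1; i < args.length; i++) {
            HttpRequest request = HttpRequest.newBuilder(checkUri(args[0], args[i])).GET().build();
            // Only the headers are needed for the summary, so the body is discarded.
            HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
            System.out.println(logLine(args[i], response.headers()));
            Thread.sleep(PAUSE_MILLIS); // throttle: wait before the next request
        }
    }
}
```

Pointing the first argument at a locally installed validator avoids loading the public service and lets it reach pages served from localhost.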



Answer 3:

The official Markup Validator Web Service API, available since 2007, lets you call a local or remote W3C checker.

The w3cValidator library offers a single-Java-class solution that uses Jersey and MOXy JAXB to read in the SOAP response.

This is the Maven dependency to use it:

<dependency>
  <groupId>com.bitplan</groupId>
  <artifactId>w3cValidator</artifactId>
  <version>0.0.2</version>
</dependency>

Here is a JUnit test for trying it:

/**
 * The URL of the official W3C Markup Validation service.
 * If you'd like to run the tests against your own installation, you might want to modify this.
 */
public static final String url = "http://validator.w3.org/check";

/**
 * Test the w3cValidator interface with some HTML code.
 * @throws Exception
 */
@Test
public void testW3CValidator() throws Exception {
    String preamble = "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"\n" +
            "   \"http://www.w3.org/TR/html4/loose.dtd\">\n" +
            "<html>\n" +
            "  <head>\n" +
            "    <meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n" +
            "    <title>test</title>\n" +
            "  </head>\n" +
            "  <body>\n";
    String footer = "  </body>\n" +
            "</html>\n";
    // Two deliberately invalid documents: each contains an unclosed <div>.
    String[] htmls = {
            preamble +
            "    <div>\n" +
            footer,
            "<!DOCTYPE html><html><head><title>test W3CChecker</title></head><body><div></body></html>"
    };
    // Minimum number of errors and warnings expected for each document.
    int[] expectedErrs = {1, 2};
    int[] expectedWarnings = {1, 2};
    int index = 0;
    System.out.println("Testing " + htmls.length + " html messages via " + url);
    for (String html : htmls) {
        W3CValidator checkResult = W3CValidator.check(url, html);
        List<ValidationError> errlist = checkResult.body.response.errors.errorlist;
        List<ValidationWarning> warnlist = checkResult.body.response.warnings.warninglist;
        Object first = errlist.get(0);
        assertTrue("if first is a string, then moxy is not activated", first instanceof ValidationError);
        //System.out.println(first.getClass().getName());
        //System.out.println(first);
        System.out.println("Validation result for test " + (index + 1) + ":");
        for (ValidationError err : errlist) {
            System.out.println("\t" + err.toString());
        }
        for (ValidationWarning warn : warnlist) {
            System.out.println("\t" + warn.toString());
        }
        System.out.println();
        assertTrue(errlist.size() >= expectedErrs[index]);
        assertTrue(warnlist.size() >= expectedWarnings[index]);
        index++;
    }
} // testW3CValidator

Instructions are also available for running your own W3C validator on an Ubuntu Linux system.
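If you would rather not add a dependency, the SOAP response can also be read with the JDK's built-in DOM parser to produce a per-page error log. The following is only a sketch: the class name `SoapErrorLog` and the log format are made up for this example, and the embedded sample response is hand-written and heavily trimmed. The namespace and the `error`, `line`, `col` and `message` element names follow the validator's SOAP 1.2 output as I recall it, so verify them against a real response from your installation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not the w3cValidator library's API.
final class SoapErrorLog {
    // Namespace assumed for the validator's response elements.
    static final String NS = "http://www.w3.org/2005/10/markup-validator";

    // Hand-written, trimmed sample; a real response carries many more fields.
    static final String SAMPLE =
            "<env:Envelope xmlns:env=\"http://www.w3.org/2003/05/soap-envelope\"><env:Body>"
            + "<m:markupvalidationresponse xmlns:m=\"" + NS + "\">"
            + "<m:errors><m:errorcount>1</m:errorcount><m:errorlist>"
            + "<m:error><m:line>9</m:line><m:col>9</m:col>"
            + "<m:message>end tag for div omitted</m:message></m:error>"
            + "</m:errorlist></m:errors>"
            + "</m:markupvalidationresponse></env:Body></env:Envelope>";

    /** Returns one "page:line:col message" entry per error element in the response. */
    static List<String> errorLines(String page, String soapXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the namespace-based lookups below
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(soapXml.getBytes(StandardCharsets.UTF_8)));
            NodeList errors = doc.getElementsByTagNameNS(NS, "error");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < errors.getLength(); i++) {
                Element error = (Element) errors.item(i);
                lines.add(page + ":" + text(error, "line") + ":" + text(error, "col")
                        + " " + text(error, "message"));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse validator response for " + page, e);
        }
    }

    /** Text of the first child element with the given local name, or "?" if it is absent. */
    static String text(Element parent, String localName) {
        NodeList nodes = parent.getElementsByTagNameNS(NS, localName);
        return nodes.getLength() == 0 ? "?" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        errorLines("index.html", SAMPLE).forEach(System.out::println);
    }
}
```

Writing these lines to a file for every generated page gives the kind of per-page log the question asks for.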



Answer 4:

After extensive research and a bit of code hacking, I've managed to use JTidyFilter in my project, and it is working beautifully now. JTidyFilter is part of JTidyServlet, a sub-project of JTidy written about five years ago. Recently the code was updated to comply with the Java 5 compiler. I downloaded the code, upgraded some dependencies and, most importantly, changed some lines in the JTidyFilter class, which handles the filtering, and finally got it to work nicely in my project.

There are still some issues with the reformatted HTML: the Firefox HTML validation plugin reports one or two errors, but otherwise most pages pass validation.


