Use HTML Tidy to just indent HTML code?

离开以前 2020-12-13 10:03

Is it possible to use HTML Tidy to just indent HTML code?

Sample Code

6 Answers
  • 2020-12-13 10:10

    I didn't find a way to only re-indent without any other changes. The following config file repairs as little as possible and (mostly) only re-indents the HTML. Tidy still corrects some error conditions, such as duplicated (repeated) attributes.

    #based on http://tidy.sourceforge.net/docs/quickref.html
    #HTML, XHTML, XML Options Reference
    anchor-as-name: no  #?
    doctype: omit
    drop-empty-paras: no
    fix-backslash: no
    fix-bad-comments: no
    fix-uri: no
    hide-endtags: yes   #?
    #input-xml: yes     #?
    join-styles: no
    literal-attributes: yes
    lower-literals: no
    merge-divs: no
    merge-spans: no
    output-html: yes
    preserve-entities: yes
    quote-ampersand: no
    quote-nbsp: no
    show-body-only: auto
    
    #Diagnostics Options Reference
    show-errors: 0
    show-warnings: 0
    
    #Pretty Print Options Reference
    break-before-br: yes
    indent: yes
    indent-attributes: no   #default
    indent-spaces: 4
    tab-size: 4
    wrap: 132
    wrap-asp: no
    wrap-jste: no
    wrap-php: no
    wrap-sections: no
    
    #Character Encoding Options Reference
    char-encoding: utf8
    
    #Miscellaneous Options Reference
    force-output: yes
    quiet: yes
    tidy-mark: no
    

    For example, the following HTML fragment

    <div>
    <div>
    <p>
    not closed para
    <h1>
    h1 head
    </h1>
    <ul>
    <li>not closed li
    <li>closed li</li>
    </ul>
    some text
    </div>
    </div>
    

    will be changed to

    <div>
        <div>
            <p>
                not closed para
            <h1>
                h1 head
            </h1>
            <ul>
                <li>not closed li
                <li>closed li
                </ul>some text
        </div>
    </div>
    

    As you can see, hide-endtags: yes drops the closing </li> of the second bullet in the input. Setting hide-endtags: no instead gives the following:

    <div>
        <div>
            <p>
                not closed para
            </p>
            <h1>
                h1 head
            </h1>
            <ul>
                <li>not closed li
                </li>
                <li>closed li
                </li>
            </ul>some text
        </div>
    </div>
    

    So tidy adds the closing </p> and the closing </li> for the first bullet.

    I didn't find a way to preserve everything in the input and only re-indent the file.
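With show-errors: 0 and quiet: yes in a config like the one above, tidy reports little or nothing about what it found, so the exit status is the main remaining signal: 0 for a clean run, 1 for warnings, 2 for errors. Here is a minimal sketch of running such a config and reporting that status. The function name reindent_with_config and the file names in the usage comment are made up for illustration.

```shell
# Hypothetical helper: re-indent INPUT into OUTPUT using a tidy config file.
# Usage: reindent_with_config tidy.conf input.html output.html
reindent_with_config() {
    rc=0
    # "|| rc=$?" captures the status without tripping "set -e" in a caller
    tidy -config "$1" "$2" >"$3" || rc=$?
    # tidy's exit codes: 0 = clean, 1 = warnings, 2 = errors
    case $rc in
        0) echo "ok: $2" >&2 ;;
        1) echo "warnings: $2" >&2 ;;
        *) echo "errors (status $rc): $2" >&2 ;;
    esac
    # Warnings are normal for fragments, so only errors count as failure
    [ "$rc" -le 1 ]
}
```

Because the result goes to a separate file, the original stays available for comparison if tidy repaired more than expected.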

  • 2020-12-13 10:14

    You need the following option:

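    tidy --show-body-only yes -i --indent-spaces 4 -w 80 -m file.html
    

    http://tidy.sourceforge.net/docs/quickref.html#show-body-only

    -i --indent-spaces 4 - indents, using 4 spaces per level (-i takes no argument of its own; older versions of tidy never use tabs)
    or
    --indent-with-tabs yes - indents with tabs instead, where the installed tidy supports it (--tab-size may affect wrapping)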

    -w 80 - wrap at column 80 (default on my system: 68, very narrow)

    -m - modify file inplace

    (you may want to leave out the last option, and examine the output first)

    Showing only the body naturally leaves out the tidy-mark (the generator meta tag).

    Another useful option is --quiet yes, which suppresses the W3C advertisements and other unnecessary output (errors are still reported).
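One way to examine the output before committing to -m is to write tidy's result to a temporary file and diff it against the original. This is a rough sketch using the options discussed in this answer. The function name preview_tidy is hypothetical, and the indent width is given through --indent-spaces.

```shell
# Hypothetical helper: show what tidy would change, without touching the file.
# Usage: preview_tidy file.html
preview_tidy() {
    out_tmp=$(mktemp) || return 1
    # No -m here, so the original file is left alone.
    # tidy's own exit status (1 on warnings) is deliberately ignored.
    tidy --show-body-only yes -i --indent-spaces 4 -w 80 -q "$1" >"$out_tmp" 2>/dev/null || :
    # diff exits 0 when nothing would change and 1 when the files differ
    rc=0
    diff -u "$1" "$out_tmp" || rc=$?
    rm -f "$out_tmp"
    return "$rc"
}
```

If the diff looks right, the same options plus -m apply the change in place.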

  • 2020-12-13 10:19

    Use the indent, tidy-mark, and quiet options:

    tidy \
      -indent \
      --indent-spaces 2 \
      -quiet \
      --tidy-mark no \
      index.html
    

    Or, using a config file rather than command-line options:

    indent: auto
    indent-spaces: 2
    quiet: yes
    tidy-mark: no
    

    Name it tidy_config.txt and save it in the same directory as the .html file. Run it like this:

    tidy -config tidy_config.txt index.html
    

    For more customization, use the tidy man page to find other relevant options such as markup: no or force-output: yes.
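By default tidy prints the re-indented markup to standard output. To save it instead, the -o (-output) option takes a file name. Here is a small sketch that creates the config file from this answer and writes the result next to the input. Both function names and the .tidy.html suffix are arbitrary choices for this example.

```shell
# Create the four-line config shown above (path defaults to tidy_config.txt).
write_tidy_config() {
    printf '%s\n' \
        'indent: auto' \
        'indent-spaces: 2' \
        'quiet: yes' \
        'tidy-mark: no' >"${1:-tidy_config.txt}"
}

# Re-indent FILE into a sibling file ending in .tidy.html; FILE is untouched.
# Usage: tidy_to_file index.html [config-file]
tidy_to_file() {
    tidy -config "${2:-tidy_config.txt}" -o "${1%.html}.tidy.html" "$1"
}
```

Calling write_tidy_config once and then tidy_to_file index.html would leave index.html as it was and put the indented copy in index.tidy.html.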

  • 2020-12-13 10:22

    To answer the poster's original question, using Tidy to just indent HTML code, here's what I use:

    tidy --indent auto --quiet yes --show-body-only auto --show-errors 0 --wrap 0 input.html

    input.html

    <form action="?" method="get" accept-charset="utf-8">
    
    <ul>
    <li>
    <label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q" />
    </li>
    <li><input class="submit" type="submit" value="Search" /></li>
    </ul>
    
    
    </form>
    

    Output:

    <form action="?" method="get" accept-charset="utf-8">
      <ul>
        <li><label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q"></li>
        <li><input class="submit" type="submit" value="Search"></li>
      </ul>
    </form>
    

    No extra HTML code added. Errors are suppressed. To find out what each option does, it's best to refer to the official reference.
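A quick way to check whether tidy really changed nothing but indentation is to compare the input and output with all whitespace stripped. This is only a crude sketch, and the function name same_ignoring_whitespace is made up for it.

```shell
# Succeeds (exit 0) when the two files are identical once all
# spaces, tabs and line breaks are removed.
same_ignoring_whitespace() {
    a=$(tr -d ' \t\r\n' <"$1")
    b=$(tr -d ' \t\r\n' <"$2")
    [ "$a" = "$b" ]
}
```

After saving tidy's output to a file, `same_ignoring_whitespace input.html output.html || echo "tidy changed more than whitespace"` flags any other rewrite. For the sample above it would report a difference, because the ` />` endings on the input elements became `>`. The check also ignores whitespace inside attribute values and `<pre>` blocks, so treat a pass as a hint rather than proof.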

  • 2020-12-13 10:27

    I am very late to the party :)

    But in your tidy config file, set

    tidy-mark: no

    By default this is set to yes.

    Once that is done, tidy will not add the meta generator tag to your HTML.

  • 2020-12-13 10:30

    If you'd like to simply format whatever HTML you receive, ignore errors, and indent the code nicely, this is a good one-liner using tidy (it reads from standard input):

    tidy --show-body-only yes -i --indent-spaces 4 -quiet --force-output yes -wrap 0 2>/dev/null
    

    You can use it with curl too:

    curl -s someUrl | tidy --show-body-only yes -i --indent-spaces 4 -quiet --force-output yes -wrap 0 2>/dev/null
    