How to download all files (but not HTML) from a website using wget?

Asked by 生来不讨喜 on 2020-11-29 14:19

How do I use wget to download all the files from a website?

I need every file except the webpage files themselves, like HTML, PHP, ASP, etc.

8 answers
  • 2020-11-29 15:08
    wget -m -A '*' -pk -e robots=off www.mysite.com/
    

    This will mirror the site, download all file types, rewrite links in the downloaded HTML so they point to the local copies, and ignore the site's robots.txt. (Note the quotes around *: without them, the shell expands the pattern before wget ever sees it.)
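    The quoting matters more than it looks. A minimal local sketch (using `echo` as a stand-in for wget, in a scratch directory) of what the shell does to an unquoted `*`:

    ```shell
    # Sketch: why `-A *` must be quoted. The shell expands an unquoted *
    # into the filenames of the current directory before the command runs.
    demo=$(mktemp -d)          # scratch directory for the demonstration
    cd "$demo"
    touch a.html b.pdf

    unquoted="$(echo -A *)"    # * is expanded by the shell first
    quoted="$(echo -A '*')"    # the literal pattern reaches the command

    echo "$unquoted"           # prints: -A a.html b.pdf
    echo "$quoted"             # prints: -A *
    ```

    Quoting the pattern (`-A '*'`), or simply omitting `-A` entirely (wget accepts everything by default), avoids the problem.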

  • 2020-11-29 15:11

    You may try:

    wget --user-agent=Mozilla --content-disposition --mirror --convert-links -E -K -p http://example.com/
    

    Also you can add:

    -A pdf,ps,djvu,tex,doc,docx,xls,xlsx,gz,ppt,mp4,avi,zip,rar
    

    to accept the specific extensions, or to reject only specific extensions:

    -R html,htm,asp,php
    

    or to exclude specific directories:

    -X "search*,forum*"
    

    If the files are blocked by robots.txt (e.g. to keep search engines out), you also have to add: -e robots=off
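    The accept/reject lists are applied per file name. A minimal local sketch (a hypothetical helper, not wget's actual code) of the decision that -R html,htm,asp,php makes for each downloaded file:

    ```shell
    # Sketch of the reject-list logic behind -R html,htm,asp,php.
    # should_keep is a hypothetical helper, not part of wget.
    should_keep() {
      case "${1##*.}" in              # extract the file extension
        html|htm|asp|php) return 1 ;; # on the reject list: skip
        *)                return 0 ;; # everything else: keep
      esac
    }

    for f in index.html paper.pdf script.php data.zip; do
      if should_keep "$f"; then echo "keep $f"; else echo "skip $f"; fi
    done
    # skip index.html
    # keep paper.pdf
    # skip script.php
    # keep data.zip
    ```

    Real wget also matches patterns (wildcards) in these lists, not just literal extensions; this sketch only shows the extension case.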
