How to exclude links using POST parameters with wget

拈花ヽ惹草 submitted on 2019-12-24 23:02:29

Question


I want to download all accessible HTML files under www.site.com/en/. However, the site links to many URLs with query-string parameters (the part after "?"), e.g. pages 1, 2, 3, ... for each product category. I do NOT want wget to download these links. I'm using

-R "*\?*"

But it's not perfect because it only removes the file after downloading it.

Is there a way, for example, to filter the links wget follows with a regex?


Answer 1:


It is possible to avoid those files with a regex: use --reject-regex '(.*)\?(.*)'. Unlike -R, this filters URLs before they are fetched. The option is only available in recent wget releases (version 1.15 or later, per this answer), so check your wget version first.
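
As a rough sketch of the above, the snippet below previews which links the pattern would skip before running a real crawl. The sample URLs are made up for illustration (only www.site.com/en/ comes from the question), and grep -E is used here only as an approximation of wget's regex matching, so treat the result as a sanity check rather than a guarantee.

```shell
# Hypothetical sample links; only the www.site.com/en/ prefix comes from the question.
urls='http://www.site.com/en/index.html
http://www.site.com/en/products.html?page=2
http://www.site.com/en/about.html'

# Preview the filter: grep -v drops every URL that matches the pattern,
# i.e. every URL containing a literal "?".
kept=$(printf '%s\n' "$urls" | grep -Ev '(.*)\?(.*)')
printf '%s\n' "$kept"

# To run the real crawl (needs network access and a wget that supports --reject-regex):
#   wget --version | head -n 1
#   wget --recursive --no-parent --reject-regex '(.*)\?(.*)' http://www.site.com/en/
```

The simpler pattern '\?' should behave the same way here, since the regex only needs to find a "?" somewhere in the URL.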



Source: https://stackoverflow.com/questions/24820412/how-to-exclude-links-using-post-parameters-with-wget
