Ignore URLs with specific parameters in robots.txt?

误落风尘 2020-12-02 16:41

I would like Google to ignore URLs like this:

http://www.mydomain.com/new-printers?dir=asc&order=price&p=3

All URLs that contain the parameters dir, order, or p should be excluded.

3 Answers
  • 2020-12-02 16:59

    Register your website with Google WebMaster Tools. There you can tell Google how to deal with your parameters.

    Site Configuration -> URL Parameters

    You should also have the pages that contain those parameters indicate that they should be excluded from indexing via the robots meta tag.
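    A minimal sketch of such a meta tag (placed in the `<head>` of each filtered page; `noindex, follow` keeps the page out of the index while still letting crawlers follow its links):

    ```html
    <!-- Exclude this page from indexing, but still follow its links -->
    <meta name="robots" content="noindex, follow" />
    ```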

  • 2020-12-02 17:07

    Here's a solution if you want to disallow all query strings:

    Disallow: /*?*
    

    or, if you want to be more precise about the query string:

    Disallow: /*?dir=*&order=*&p=*
    

    You can also add an Allow rule to robots.txt for URLs you want crawled:

    Allow: /new-printer$
    

    The $ anchors the match at the end of the URL, so only /new-printer itself (with nothing appended, not even a query string) will be allowed.
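    Putting the pieces together, a robots.txt for this case might look like the following (a sketch using the paths from the question; note that the middle pattern only matches when dir, order, and p appear in that exact order):

    ```
    User-agent: *
    Allow: /new-printer$
    Disallow: /*?dir=*&order=*&p=*
    ```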

    More info:

    http://code.google.com/web/controlcrawlindex/docs/robots_txt.html

    http://sanzon.wordpress.com/2008/04/29/advanced-usage-of-robotstxt-w-querystrings/

  • 2020-12-02 17:09

    You can block those specific query-string parameters with the following lines:

    Disallow: /*?*dir=
    Disallow: /*?*order=
    Disallow: /*?*p=
    

    So if any URL contains dir=, order=, or p= anywhere in the query string, it will be blocked.
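    To sanity-check which URLs these patterns block, here is a small Python sketch that expands the Google-style wildcards (`*` and a trailing `$`) into regular expressions. This only models the wildcard matching, not Google's full precedence rules between Allow and Disallow, so treat it as an approximation:

    ```python
    import re

    def robots_pattern_to_regex(pattern):
        # Google-style robots.txt patterns: '*' matches any run of characters,
        # and a trailing '$' anchors the match to the end of the URL.
        anchored = pattern.endswith("$")
        core = pattern[:-1] if anchored else pattern
        regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in core)
        return re.compile(regex + ("$" if anchored else ""))

    def is_blocked(url_path, disallow_patterns):
        # A path is blocked if any Disallow pattern matches from the start of the path.
        return any(robots_pattern_to_regex(p).match(url_path) for p in disallow_patterns)

    rules = ["/*?*dir=", "/*?*order=", "/*?*p="]
    print(is_blocked("/new-printers?dir=asc&order=price&p=3", rules))  # True
    print(is_blocked("/new-printers", rules))                          # False
    ```

    One caveat with short parameter names: `/*?*p=` matches the substring `p=` anywhere in the query string, so an unrelated URL like `/list?top=5` would also be blocked.
    
    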
