What is the smartest way to handle robots.txt in Express?

Submitted by 旧巷老猫 on 2019-11-27 11:04:49

Question


I'm working on an application built with Express (Node.js), and I want to know the smartest way to serve a different robots.txt for each environment (development, production).

This is what I have right now, but I'm not convinced by it; it feels dirty:

app.get '/robots.txt', (req, res) ->
  res.set 'Content-Type', 'text/plain'
  if app.settings.env == 'production'
    res.send 'User-agent: *\nDisallow: /signin\nDisallow: /signup\nDisallow: /signout\nSitemap: /sitemap.xml'
  else
    res.send 'User-agent: *\nDisallow: /'

(NB: this is CoffeeScript.)

There should be a better way. How would you do it?

Thank you.


Answer 1:


Use a middleware function. If it is registered before session, cookieParser, etc., robots.txt is handled before any of them run:

app.use('/robots.txt', function (req, res) {
    res.type('text/plain');
    res.send("User-agent: *\nDisallow: /");
});

With Express 4, app.get routes are handled in the order in which they are registered, so you can use that instead:

app.get('/robots.txt', function (req, res) {
    res.type('text/plain');
    res.send("User-agent: *\nDisallow: /");
});
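
To tie this back to the per-environment requirement in the question, the same route could pick its body from the environment name. This is only a minimal sketch: the names PRODUCTION_RULES, BLOCK_ALL, robotsBody and robotsHandler are hypothetical, and the rules are copied from the question's snippet.

```javascript
// Hypothetical names; the rule text is taken from the question's snippet.
const PRODUCTION_RULES = [
  'User-agent: *',
  'Disallow: /signin',
  'Disallow: /signup',
  'Disallow: /signout',
  'Sitemap: /sitemap.xml',
].join('\n');

// Everything that is not production blocks all crawling.
const BLOCK_ALL = 'User-agent: *\nDisallow: /';

// Returns the robots.txt body for the given environment name.
function robotsBody(env) {
  return env === 'production' ? PRODUCTION_RULES : BLOCK_ALL;
}

// Builds an Express-style route handler; the body is chosen once, at startup.
function robotsHandler(env) {
  const body = robotsBody(env);
  return function (req, res) {
    res.type('text/plain');
    res.send(body);
  };
}
```

Assuming an existing Express app, it could be mounted with app.get('/robots.txt', robotsHandler(app.get('env'))). Keeping the text in plain functions means the rules can be checked without starting a server.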



Answer 2:


  1. Create robots.txt with the following content (an empty Disallow permits crawling of the whole site):

    User-agent: *
    Disallow:
    
  2. Add it to the public/ directory.

Your robots.txt will then be available to crawlers at http://yoursite.com/robots.txt, assuming public/ is served as static files (for example with express.static).




Answer 3:


Looks like an OK way.

An alternative, if you'd like to edit robots.txt as a regular file and possibly have other files that you only want in production or development mode, is to use two separate directories and activate one or the other at startup.

if (app.settings.env === 'production') {
  app.use(express.static(__dirname + '/production'));
} else {
  app.use(express.static(__dirname + '/development'));
}

Then add the two directories, each with its own version of robots.txt.

PROJECT DIR
    development
        robots.txt  <-- dev version
    production
        robots.txt  <-- more permissive prod version

And you can keep adding more files in either directory and keep your code simpler.

(Sorry, this is JavaScript, not CoffeeScript.)
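
If more environments are added later, the directory choice could be pulled into a small helper. A sketch, assuming the two-directory layout above; staticDirFor is a hypothetical name, and treating every non-production environment as development is an assumption, not part of the original answer.

```javascript
const path = require('path');

// Hypothetical helper: maps an environment name to the static directory to serve.
// Assumption: anything other than 'production' uses the development directory.
function staticDirFor(env, baseDir) {
  const name = env === 'production' ? 'production' : 'development';
  return path.join(baseDir, name);
}
```

With an existing Express app this would be used as app.use(express.static(staticDirFor(app.settings.env, __dirname))).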




Answer 4:


This is what I did in my index routes. You can simply add the code below to your own.

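// Each file needs its own path; two handlers on '/' would never reach the second one.
router.get('/sitemap.xml', (req, res) => {
    res.sendFile(__dirname + '/public/sitemap.xml')
})

router.get('/robots.txt', (req, res) => {
    res.sendFile(__dirname + '/public/robots.txt')
})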



Answer 5:


To choose the robots.txt depending on the environment, using middleware:

var env = process.env.NODE_ENV || 'development';

if (env === 'development' || env === 'qa') {
  app.use(function (req, res, next) {
    if ('/robots.txt' === req.url) {
      res.type('text/plain');
      res.send('User-agent: *\nDisallow: /');
    } else {
      next();
    }
  });
}
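
One caveat with comparing against req.url is that it includes the query string, so a request for /robots.txt?x=1 would fall through to next(). A sketch of the same middleware matching on req.path instead; robotsBlocker is a hypothetical name.

```javascript
// Hypothetical middleware: answers /robots.txt with a block-all policy
// and passes every other request on to the next handler.
function robotsBlocker(req, res, next) {
  // req.path excludes the query string, unlike req.url.
  if (req.path === '/robots.txt') {
    res.type('text/plain');
    res.send('User-agent: *\nDisallow: /');
  } else {
    next();
  }
}
```

It would be registered the same way as above, for example app.use(robotsBlocker) inside the environment check.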


Source: https://stackoverflow.com/questions/15119760/what-is-the-smartest-way-to-handle-robots-txt-in-express
