NodeJs mirror website proxy

半世苍凉 submitted on 2020-03-03 07:15:20

Question


How would you write a server that simply mirrored a website when a request was received? For example, hitting http://localhost:5000 which is running NodeJS would render cnn.com with images and everything. Is this called a passthrough proxy?

I'm not looking for something that requires configuring an actual proxy within your browser settings, but instead just serves up essentially a mirror of another site by passing the requests through.


Answer 1:


First, let me make sure I understand your question.

You want to have your users browse to http://mynodeproxy.example.com and have that page in their browser render as if it was http://cnn.com. Right?

The answer is: you can't do it the way you think you can. It is only possible with one of two approaches:

  1. Users configure a real proxy server in their browser settings (this is why all browsers support configuring a proxy server). You could use an existing proxy server or try to write your own with Node and some specialized application logic. But the point is that users don't type your proxy address into the browser's address bar. They type your proxy address into the "proxy server" field of their browser settings and still type "http://cnn.com" into the address bar.

  2. If you control all outgoing traffic from your network, you can do hotel-style tricks like DNS hijacking or routing all traffic through your proxy.
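As a rough illustration of approach 1, here is a minimal sketch of a forward proxy in Node. It relies on the fact that a browser configured to use an HTTP proxy sends the full URL in the request line (for example `GET http://cnn.com/ HTTP/1.1`), so the proxy knows which site to fetch. The names `upstreamFor` and `createForwardProxy` and the port are made up for this example, and the sketch handles plain HTTP only: HTTPS goes through `CONNECT` tunnels, which are not covered here.

```javascript
const http = require('http');

// Work out where to forward a request. A browser that is configured to use
// a proxy sends an absolute URL as the request target; a request typed
// straight at the proxy's own address only carries a path such as "/".
function upstreamFor(req) {
  try {
    const target = new URL(req.url);
    return target.protocol === 'http:' ? target : null;
  } catch {
    return null; // not an absolute URL, so there is nothing to forward to
  }
}

// Hypothetical helper: builds the proxy server but does not start it.
function createForwardProxy() {
  return http.createServer((clientReq, clientRes) => {
    const target = upstreamFor(clientReq);
    if (!target) {
      clientRes.writeHead(400, { 'Content-Type': 'text/plain' });
      clientRes.end('This server only answers proxy-style requests.\n');
      return;
    }
    // Replay the request against the real site and stream the answer back.
    const upstreamReq = http.request(
      target,
      { method: clientReq.method, headers: clientReq.headers },
      (upstreamRes) => {
        clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(clientRes);
      }
    );
    upstreamReq.on('error', () => {
      if (!clientRes.headersSent) clientRes.writeHead(502);
      clientRes.end();
    });
    clientReq.pipe(upstreamReq);
  });
}

// To try it (port chosen arbitrarily), uncomment the next line and set
// localhost:8080 as the HTTP proxy in the browser settings:
// createForwardProxy().listen(8080);
```

Note that the address bar still shows the real site's URL; the proxy address only ever appears in the browser's proxy settings.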

But having your users put your passthrough proxy server's address in their browser's address bar won't work, because the HTML your proxy gets from CNN.com is going to have hyperlinks back to other cnn.com resources (other pages on the site, images, fonts, CSS, JS, etc.). If those links include the hostname instead of being relative to the containing HTML document, the browser will connect directly to cnn.com to load them, bypassing your proxy.
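The difference between relative and absolute links can be checked with Node's built-in `URL` class, which resolves an `href` against a base the same way a browser does. This is only a sketch: `PROXY_PAGE_URL`, `resolveTarget`, and `staysOnProxy` are hypothetical names, and the address is the one from the question.

```javascript
// The browser believes the page lives at the proxy's address.
const PROXY_PAGE_URL = 'http://localhost:5000/';

// Resolve an href as a browser would and return the origin
// (scheme + host + port) that will actually receive the request.
function resolveTarget(href, pageUrl = PROXY_PAGE_URL) {
  return new URL(href, pageUrl).origin;
}

// True when the request for this link would still reach the proxy.
function staysOnProxy(href, pageUrl = PROXY_PAGE_URL) {
  return resolveTarget(href, pageUrl) === new URL(pageUrl).origin;
}

console.log(staysOnProxy('/world/index.html'));           // relative: goes to the proxy
console.log(staysOnProxy('http://cnn.com/img/logo.png')); // absolute: goes straight to cnn.com
console.log(staysOnProxy('//cdn.cnn.com/app.js'));        // protocol-relative: also leaves the proxy
```

Only the first link is requested from the proxy; the other two are fetched directly from the hosts named in the link.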

Now imagine the CNN HTML has a link like <a href="http://cnn.com">View the CNN Home Page</a>. What happens when the user clicks that? That's right, your proxy is bypassed and entirely out of the picture. This is why proxy servers rely on explicit browser support.

Once CNN.com's JavaScript starts doing things like making Ajax requests, dynamically adding elements to the DOM, etc., you will see that this cannot be done by simply proxying and modifying the initial cnn.com home page HTML. Yes, you could do it for an extremely trivial, contrived example web page, but for a realistic modern, popular site like cnn.com it's not feasible.
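To make the "modify the HTML" idea and its limits concrete, here is a deliberately naive sketch that rewrites absolute `href` and `src` attributes pointing at the mirrored site into paths relative to the proxy. The host list and the function name `rewriteAbsoluteLinks` are assumptions for illustration, and a regular expression is not a real HTML parser.

```javascript
// Hosts that should be served through the mirror (assumed for this example).
const UPSTREAM_HOSTS = ['cnn.com', 'www.cnn.com'];

// Turn href/src attributes that point at the upstream site into
// root-relative paths, so the browser requests them from the proxy.
function rewriteAbsoluteLinks(html, hosts = UPSTREAM_HOSTS) {
  return html.replace(/\b(href|src)=(["'])(.*?)\2/gi, (match, attr, quote, value) => {
    let url;
    try {
      url = new URL(value); // throws for relative URLs, which need no change
    } catch {
      return match;
    }
    if (!hosts.includes(url.hostname)) return match; // third-party link
    return `${attr}=${quote}${url.pathname}${url.search}${url.hash}${quote}`;
  });
}

// A static link is rewritten and now points back at the proxy...
console.log(rewriteAbsoluteLinks('<a href="http://cnn.com">View the CNN Home Page</a>'));

// ...but a URL used from JavaScript is not an attribute, so it is left alone.
console.log(rewriteAbsoluteLinks('<script>fetch("http://cnn.com/api/data")</script>'));
```

The second call returns its input unchanged, which is the problem described above: any URL that the page's scripts use or build at runtime never passes through this kind of rewrite.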



Source: https://stackoverflow.com/questions/23234936/nodejs-mirror-website-proxy
