Question
I am trying to use this code to create an HTTP proxy cache server. When I run it, the server starts, binds to a port, and listens. For example, with the server listening on port 52523, typing localhost:52523/www.google.com in the browser works fine. But other sites, specifically plain HTTP ones such as localhost:52523/www.microcenter.com, or even just localhost:52523/google.com, make the browser display "localhost didn't send any data. ERR_EMPTY_RESPONSE". An exception also shows in the console, although the cache file is still created on my computer.
I would like to find out how to edit the code so that I can access any website through the proxy just as I normally would in the browser without it. It should work with www.microcenter.com.
import socket
import sys
import urllib
from urlparse import urlparse
Serv_Sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) # socket.socket function creates a socket.
port = Serv_Sock.getsockname()[1]
# Server socket created, bound and starting to listen
Serv_Sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) # socket.socket function creates a socket.
Serv_Sock.bind(('',port))
Serv_Sock.listen(5)
port = Serv_Sock.getsockname()[1]
# Prepare a server socket
print ("starting server on port %s...,"%(port))
def caching_object(splitMessage, Cli_Sock):
    #this method is responsible for caching
    Req_Type = splitMessage[0]
    Req_path = splitMessage[1]
    Req_path = Req_path[1:]
    print "Request is ", Req_Type, " to URL : ", Req_path
    #Searching available cache if file exists
    url = urlparse(Req_path)
    file_to_use = "/" + Req_path
    print file_to_use
    try:
        file = open(file_to_use[5:], "r")
        data = file.readlines()
        print "File Present in Cache\n"
        #Proxy Server Will Send A Response Message
        #Cli_Sock.send("HTTP/1.0 200 OK\r\n")
        #Cli_Sock.send("Content-Type:text/html")
        #Cli_Sock.send("\r\n")
        #Proxy Server Will Send Data
        for i in range(0, len(data)):
            print (data[i])
            Cli_Sock.send(data[i])
        print "Reading file from cache\n"
    except IOError:
        print "File Doesn't Exists In Cache\n fetching file from server \n creating cache"
        serv_proxy = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        host_name = Req_path
        print "HOST NAME:", host_name
        try:
            serv_proxy.connect((url.host_name, 80))
            print 'Socket connected to port 80 of the host'
            fileobj = serv_proxy.makefile('r', 0)
            fileobj.write("GET " + "http://" + Req_path + " HTTP/1.0\n\n")
            # Read the response into buffer
            buffer = fileobj.readlines()
            # Create a new file in the cache for the requested file.
            # Also send the response in the buffer to client socket
            # and the corresponding file in the cache
            tmpFile = open(file_to_use, "wb")
            for data in buffer:
                tmpFile.write(data)
                tcpCliSock.send(data)
        except:
            print 'Illegal Request'
    Cli_Sock.close()
while True:
    # Start receiving data from the client
    print 'Initiating server... \n Accepting connection\n'
    Cli_Sock, addr = Serv_Sock.accept() # Accept a connection from client
    #print addr
    print ' connection received from: ', addr
    message = Cli_Sock.recv(1024) # Receives data from socket
    splitMessage = message.split()
    if len(splitMessage) <= 1:
        continue
    caching_object(splitMessage, Cli_Sock)
Answer 1:
Your errors are not related to the URI scheme (http or https) but to how the code uses files and sockets.
When you try to open a file with:
file = open(file_to_use[1:], "r")
you are passing an illegal file path (http://ebay.com/ in your example).
Since you are working with URIs, you could use a parser like urlparse, which makes it easier to handle the scheme, hostname, and so on.
For example:
url = urlparse(Req_path)
file_to_use = url.hostname
file = open(file_to_use, "r")
and use only the hostname as a file name.
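One caveat worth checking: a path with no scheme, such as www.microcenter.com, is treated by urlparse as a plain path, so hostname comes back as None. Here is a minimal sketch of one way around that. The helper name cache_name_for is hypothetical (it is not in the original code), and prepending http:// to scheme-less paths is an assumption about what the proxy should do.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2, as in the question


def cache_name_for(req_path):
    """Return the hostname of req_path, for use as a cache file name.

    cache_name_for is a hypothetical helper, not part of the original code.
    """
    # Without a scheme, urlparse treats "www.google.com" as a path and
    # hostname is None, so add one (assumption: plain HTTP is intended).
    if "://" not in req_path:
        req_path = "http://" + req_path
    return urlparse(req_path).hostname


print(cache_name_for("www.microcenter.com"))
print(cache_name_for("http://ebay.com/"))
```

With this, both forms of the request path give a plain hostname that is safe to pass to open().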
Another problem is with the use of sockets. The connect function should receive a hostname, not a hostname with a scheme, which is what you are passing. Again, with the help of the parser:
serv_proxy.connect((url.hostname, 80))
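To make the connect step concrete, here is a rough sketch of fetching a page over a raw socket. The helper names build_request and fetch are hypothetical. Sending a Host header and CRLF line endings is my assumption about what many servers expect; the question's code sends neither.

```python
import socket


def build_request(hostname, path="/"):
    """Build a minimal HTTP/1.0 GET request as bytes (hypothetical helper)."""
    # HTTP lines end with CRLF; the Host header lets a server that hosts
    # several sites on one address pick the right one.
    request = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, hostname)
    return request.encode("ascii")


def fetch(hostname, path="/", port=80):
    """Connect to hostname (no scheme) and return the raw response bytes."""
    sock = socket.create_connection((hostname, port), timeout=10)
    try:
        sock.sendall(build_request(hostname, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sock.close()


print(build_request("www.microcenter.com"))
```

Calling fetch("www.microcenter.com") would then return the status line, headers, and body as one bytes object. The sketch is not a full HTTP client: it does not follow redirects, and many sites answer a plain HTTP request with a redirect to HTTPS.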
Besides that, you do not call listen on a client socket (see examples), so you can remove that line.
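As an illustration of the split between the two roles, here is a small loopback sketch. The address, variable names, and the "ping" payload are made up for the example and are not from the question.

```python
import socket

# Server side: bind, then listen, then accept.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
server.listen(5)
port = server.getsockname()[1]     # read the port only after bind()

# Client side: connect only, with no bind() or listen().
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))

conn, addr = server.accept()       # the connection is already queued
client.sendall(b"ping")
received = conn.recv(4)
print(received)

for s in (conn, client, server):
    s.close()
```

In the proxy, Serv_Sock plays the server role, while serv_proxy is a client and only needs connect.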
Finally, use the hostname again when creating the new cache file:
tmpFile = open(file_to_use, "wb")
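Putting the file handling together, a hedged sketch of the cache read and write could look like this. The helpers cache_path, store, and load are hypothetical, and keeping the cache in a temporary directory is an assumption made so the example is self-contained.

```python
import os
import tempfile


def cache_path(cache_dir, hostname):
    """Hypothetical helper: map a hostname to a file inside cache_dir."""
    return os.path.join(cache_dir, hostname)


def store(cache_dir, hostname, response):
    # "wb" because the response arrives from the socket as raw bytes.
    with open(cache_path(cache_dir, hostname), "wb") as f:
        f.write(response)


def load(cache_dir, hostname):
    """Return the cached bytes, or None on a cache miss."""
    try:
        with open(cache_path(cache_dir, hostname), "rb") as f:
            return f.read()
    except IOError:  # in Python 3, IOError is an alias of OSError
        return None


cache_dir = tempfile.mkdtemp()  # assumption: a throwaway cache directory
store(cache_dir, "www.microcenter.com", b"HTTP/1.0 200 OK\r\n\r\nhello")
print(load(cache_dir, "www.microcenter.com"))
print(load(cache_dir, "ebay.com"))
```

Note that keying the cache on the hostname alone means every page on the same host shares one cache entry, which is fine for a first test but would need a richer key for real use.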
Source: https://stackoverflow.com/questions/47062396/http-proxy-server-only-working-for-https-sites