Question
The following code doesn't output anything. Why?
#!/usr/bin/python
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.python.org" , 80))
print s.recv(4096)
s.close()
What do I have to change to output the source code of the Python website, as you would see with 'view source' in a browser?
Answer 1:
HTTP is a request/response protocol. You're not sending a request, so you're not getting a response.
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.python.org", 80))
s.sendall("GET / HTTP/1.0\r\n\r\n")  # you're missing this line: the request itself
print s.recv(4096)
s.close()
Of course, this sends only the most bare-bones HTTP/1.0 request, without handling HTTP errors, HTTP redirects, and so on. I would not recommend it for real use beyond an exercise to familiarize yourself with socket programming and HTTP.
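As a rough illustration of what the raw approach leaves out, here is a minimal Python 3 sketch (the code above is Python 2). The helper names build_request, recv_all, parse_status, and fetch_raw are hypothetical, and the sketch assumes the server closes the connection once it has sent its HTTP/1.0 response. It adds a Host header, reads until the peer closes instead of relying on a single recv, and extracts the status code so that a redirect can at least be noticed.

```python
import socket


def build_request(host, path="/"):
    """Build a minimal HTTP/1.0 GET request as bytes (Host header is an addition)."""
    lines = [
        "GET {} HTTP/1.0".format(path),
        "Host: {}".format(host),
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def recv_all(sock, bufsize=4096):
    """Read from the socket until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes means the other side closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


def parse_status(response):
    """Return the numeric status code from the first line of a raw response."""
    status_line = response.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = status_line.split(" ", 2)  # e.g. ["HTTP/1.0", "302", "Found"]
    return int(parts[1])


def fetch_raw(host, path="/", port=80, timeout=10.0):
    """Send one GET request over a plain TCP socket and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        return recv_all(sock)
```

Calling parse_status(fetch_raw("www.python.org")) would then show whether the server answered with the page itself or with a redirect. Following that redirect is still left to the caller.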
For HTTP, Python provides a few built-in modules: httplib (a bit lower level), and urllib and urllib2 (higher level). In Python 3, these live in http.client and the urllib package (urllib.request).
Answer 2:
You'll get a redirect (302) unless you use the full URL in your request.
Try this instead:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.python.org" , 80))
s.sendall("GET http://www.python.org HTTP/1.0\r\n\r\n")  # HTTP lines end with CRLF
print s.recv(4096)
s.close()
Of course, if you just want the content of a URL, this is far simpler. :)
import urllib2
print urllib2.urlopen('http://www.python.org').read()
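urllib2 exists only in Python 2. A rough Python 3 equivalent, assuming urllib.request as the replacement, might look like the sketch below. The function name fetch_source and the UTF-8 fallback are illustrative choices, not part of the original answer.

```python
from urllib.request import urlopen


def fetch_source(url, timeout=10.0):
    """Return the body of a URL as text, following redirects automatically."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset declared in the Content-Type header, if any;
        # falling back to UTF-8 is an assumption made for this sketch.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

With this in place, print(fetch_source("http://www.python.org")) should print the page source, since urlopen follows redirects on its own.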
Source: https://stackoverflow.com/questions/10600235/a-python-socket-client-that-outputs-the-source-code-of-a-website-why-isnt-this