So I\'m looking into urllib3 because it has connection pooling and is thread safe (so performance is better, especially for crawling), but the documentation is... minimal to
Is there not a problem with multiple cookies?
Some servers return multiple Set-Cookie headers, but urllib3 stores the headers in a dict and a dict does not allow multiple entries with the same key.
httplib2 has a similar problem.
Or maybe not: it turns out that the readheaders method of the HTTPMessage class in the httplib package -- which both urllib3 and httplib2 use -- has the following comment:
If multiple header fields with the same name occur, they are combined according to the rules in RFC 2616 sec 4.2:
Appending each subsequent field-value to the first, each separated
by a comma. The order in which header fields with the same field-name
are received is significant to the interpretation of the combined
field value.
So no headers are lost.
There is, however, a problem if there are commas within a header value. I have not yet figured out what is going on here, but from skimming RFC 2616 ("Hypertext Transfer Protocol -- HTTP/1.1") and RFC 2965 ("HTTP State Management Mechanism") I get the impression that any commas within a header value are supposed to be quoted.