How do I unit test a module that relies on urllib2?

前端 未结 7 1218
孤街浪徒
孤街浪徒 2020-12-08 05:16

I\'ve got a piece of code that I can\'t figure out how to unit test! The module pulls content from external XML feeds (twitter, flickr, youtube, etc.) with urllib2. Here\'s

相关标签:
7条回答
  • 2020-12-08 05:19

    You can use pymox to mock the behavior of anything and everything in the urllib2 (or any other) package. It's 2010, you shouldn't be writing your own mock classes.

    0 讨论(0)
  • 2020-12-08 05:25

    I think the easiest thing to do is to actually create a simple web server in your unit test. When you start the test, create a new thread that listens on some arbitrary port and when a client connects just returns a known set of headers and XML, then terminates.

    I can elaborate if you need more info.

    Here's some code:

    import threading, SocketServer, time
    
    # a request handler
    class SimpleRequestHandler(SocketServer.BaseRequestHandler):
        def handle(self):
            data = self.request.recv(102400) # token receive
            senddata = file(self.server.datafile).read() # read data from unit test file
            self.request.send(senddata)
            time.sleep(0.1) # make sure it finishes receiving request before closing
            self.request.close()
    
    def serve_data(datafile):
        server = SocketServer.TCPServer(('127.0.0.1', 12345), SimpleRequestHandler)
        server.datafile = datafile
        http_server_thread = threading.Thread(target=server.handle_request())
    

    To run your unit test, call serve_data() then call your code that requests a URL that looks like http://localhost:12345/anythingyouwant.

    0 讨论(0)
  • 2020-12-08 05:32

    Build a separate class or module responsible for communicating with your external feeds.

    Make this class able to be a test double. You're using python, so you're pretty golden there; if you were using C#, I'd suggest either in interface or virtual methods.

    In your unit test, insert a test double of the external feed class. Test that your code uses the class correctly, assuming that the class does the work of communicating with your external resources correctly. Have your test double return fake data rather than live data; test various combinations of the data and of course the possible exceptions urllib2 could throw.

    Aand... that's it.

    You can't effectively automate unit tests that rely on external sources, so you're best off not doing it. Run an occasional integration test on your communication module, but don't include those tests as part of your automated tests.

    Edit:

    Just a note on the difference between my answer and @Crast's answer. Both are essentially correct, but they involve different approaches. In Crast's approach, you use a test double on the library itself. In my approach, you abstract the use of the library away into a separate module and test double that module.

    Which approach you use is entirely subjective; there's no "correct" answer there. I prefer my approach because it allows me to build more modular, flexible code, something I value. But it comes at a cost in terms of additional code to write, something that may not be valued in many agile situations.

    0 讨论(0)
  • 2020-12-08 05:32

    Trying to improve a bit on @john-la-rooy answer, I've made a small class allowing simple mocking for unit tests

    Should work with python 2 and 3

    try:
        import urllib.request as urllib
    except ImportError:
        import urllib2 as urllib
    
    from io import BytesIO
    
    
    class MockHTTPHandler(urllib.HTTPHandler):
    
        def mock_response(self, req):
            url = req.get_full_url()
    
            print("incomming request:", url)
    
            if url.endswith('.json'):
                resdata = b'[{"hello": "world"}]'
                headers = {'Content-Type': 'application/json'}
    
                resp = urllib.addinfourl(BytesIO(resdata), header, url, 200)
                resp.msg = "OK"
    
                return resp
            raise RuntimeError('Unhandled URL', url)
        http_open = mock_response
    
    
        @classmethod
        def install(cls):
            previous = urllib._opener
            urllib.install_opener(urllib.build_opener(cls))
            return previous
    
        @classmethod
        def remove(cls, previous=None):
            urllib.install_opener(previous)
    

    Used like this:

    class TestOther(unittest.TestCase):
    
        def setUp(self):
            previous = MockHTTPHandler.install()
            self.addCleanup(MockHTTPHandler.remove, previous)
    
    0 讨论(0)
  • 2020-12-08 05:38

    urllib2 has a functions called build_opener() and install_opener() which you should use to mock the behaviour of urlopen()

    import urllib2
    from StringIO import StringIO
    
    def mock_response(req):
        if req.get_full_url() == "http://example.com":
            resp = urllib2.addinfourl(StringIO("mock file"), "mock message", req.get_full_url())
            resp.code = 200
            resp.msg = "OK"
            return resp
    
    class MyHTTPHandler(urllib2.HTTPHandler):
        def http_open(self, req):
            print "mock opener"
            return mock_response(req)
    
    my_opener = urllib2.build_opener(MyHTTPHandler)
    urllib2.install_opener(my_opener)
    
    response=urllib2.urlopen("http://example.com")
    print response.read()
    print response.code
    print response.msg
    
    0 讨论(0)
  • It would be best if you could write a mock urlopen (and possibly Request) which provides the minimum required interface to behave like urllib2's version. You'd then need to have your function/method which uses it able to accept this mock urlopen somehow, and use urllib2.urlopen otherwise.

    This is a fair amount of work, but worthwhile. Remember that python is very friendly to ducktyping, so you just need to provide some semblance of the response object's properties to mock it.

    For example:

    class MockResponse(object):
        def __init__(self, resp_data, code=200, msg='OK'):
            self.resp_data = resp_data
            self.code = code
            self.msg = msg
            self.headers = {'content-type': 'text/xml; charset=utf-8'}
    
        def read(self):
            return self.resp_data
    
        def getcode(self):
            return self.code
    
        # Define other members and properties you want
    
    def mock_urlopen(request):
        return MockResponse(r'<xml document>')
    

    Granted, some of these are difficult to mock, because for example I believe the normal "headers" is an HTTPMessage which implements fun stuff like case-insensitive header names. But, you might be able to simply construct an HTTPMessage with your response data.

    0 讨论(0)
提交回复
热议问题