I'm trying to create a set of unit tests for the Google client library for BigQuery. I'm struggling to write a unittest file which will mock the client and will let me test
It took a fair amount of Googling, and trial and error, to figure out how to do this, and I just got it working, so I thought it was worth sharing.
unittest.mock provides patch, which lets you mock an object at the point of use, i.e. replace a Google API call in your code under test, and Mock, which lets you further customise the result of accessing attributes and calling functions on that mock.
The unittest docs explain patching here:
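As a small illustration of the Mock half of that sentence, here is a minimal sketch that configures a stand-in client by hand. The variable name `client` and the values are made up for the example; only `Mock`, `return_value` and `assert_called_once_with` come from the standard library.

```python
from unittest.mock import Mock

# A Mock accepts any attribute access and any call without complaining.
client = Mock()

# Configure what a method call should return.
client.list_blobs.return_value = ["a", "b"]

# Plain attributes can be set directly.
client.project = "test"

# Use the mock the way the code under test would.
result = client.list_blobs("bucket", prefix="prefix")

# The mock records how it was called, so the call can be asserted afterwards.
client.list_blobs.assert_called_once_with("bucket", prefix="prefix")
```

Nothing here touches Google's libraries, which is the point: the test only checks how your code talks to the client.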
https://docs.python.org/3/library/unittest.mock.html#where-to-patch
This does explain how it works, but the best explanation I found in order to understand how to do this properly is: http://alexmarandon.com/articles/python_mock_gotchas/
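To make the "where to patch" rule concrete, here is a minimal sketch that does not need any Google package. The modules `fake_lib_demo` and `app_demo` are hypothetical and are built in memory purely for the demonstration; in a real project they would be a third-party library and your own module.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for a third-party library.
fake_lib = types.ModuleType("fake_lib_demo")
exec(
    "class Client:\n"
    "    def ping(self):\n"
    "        return 'real'\n",
    fake_lib.__dict__,
)
sys.modules["fake_lib_demo"] = fake_lib

# Hypothetical code under test. The `from ... import` binds its own
# reference to Client inside app_demo when the module is loaded.
app = types.ModuleType("app_demo")
exec(
    "from fake_lib_demo import Client\n"
    "def call():\n"
    "    return Client().ping()\n",
    app.__dict__,
)
sys.modules["app_demo"] = app


def patched_at_definition():
    # Replaces fake_lib_demo.Client, but app_demo still holds the original.
    with patch("fake_lib_demo.Client") as mock_client:
        mock_client.return_value.ping.return_value = "mocked"
        return app.call()


def patched_at_use():
    # Replaces the name that app_demo actually looks up when call() runs.
    with patch("app_demo.Client") as mock_client:
        mock_client.return_value.ping.return_value = "mocked"
        return app.call()
```

With this setup, `patched_at_definition()` still reaches the real class, while `patched_at_use()` gets the mock. That is why the patch target should name the module under test rather than the library the object was imported from.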
Here is a Python script to be tested, mocking_google.py, containing references to Google Storage and BigQuery APIs:
from google.cloud.bigquery import Client as bigqueryClient
from google.cloud.storage import Client as storageClient


def list_blobs():
    storage_client = storageClient(project='test')
    blobs = storage_client.list_blobs('bucket', prefix='prefix')
    return blobs


def extract_table():
    bigquery_client = bigqueryClient(project='test')
    job = bigquery_client.extract_table('project.dataset.table_id', destination_uris='uri')
    return job
Here is the unit test:
import pytest
from unittest.mock import Mock, patch
from src.data.mocking_google import list_blobs, extract_table


@pytest.fixture
def extract_result():
    'Mock extract_job result with properties needed'
    er = Mock()
    er.return_value = 1
    return er


@pytest.fixture
def extract_job(extract_result):
    'Mock extract_job with properties needed'
    ej = Mock()
    ej.job_id = 1
    ej.result.return_value = 2
    return ej


@patch("src.data.mocking_google.storageClient")
def test_list_blobs(storageClient):
    storageClient().list_blobs.return_value = [1, 2]
    blob_list = list_blobs()
    storageClient().list_blobs.assert_called_with('bucket', prefix='prefix')
    assert blob_list == [1, 2]


@patch("src.data.mocking_google.bigqueryClient")
def test_extract_table(bigqueryClient, extract_job):
    bigqueryClient().extract_table.return_value = extract_job
    job = extract_table()
    bigqueryClient().extract_table.assert_called_with('project.dataset.table_id', destination_uris='uri')
    assert job.job_id == 1
    assert job.result() == 2
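One detail the tests rely on: calling a mock always returns the same `return_value` object, so `storageClient()` in the test and `storageClient(project='test')` in the code under test refer to the same fake instance. A minimal sketch of that behaviour, using a hand-made `MagicMock` (the type `patch` creates by default) with the illustrative name `fake_client_cls`:

```python
from unittest.mock import MagicMock

# Stands in for the patched class that @patch passes into the test.
fake_client_cls = MagicMock()

# Every call returns the same return_value, whatever the arguments are.
first = fake_client_cls(project="test")
second = fake_client_cls()
same_instance = first is second is fake_client_cls.return_value

# Configuring via return_value is equivalent to configuring via a call.
fake_client_cls.return_value.list_blobs.return_value = [1, 2]
blobs = first.list_blobs("bucket", prefix="prefix")

# The assertion can be made through return_value as well.
fake_client_cls.return_value.list_blobs.assert_called_once_with(
    "bucket", prefix="prefix"
)
```

Using `.return_value` instead of calling the mock in the test is optional, but it avoids recording extra calls on the class mock, which matters if you also want to assert how the client was constructed.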
Here are the test results:
pytest -v src/tests/data/test_mocking_google.py
============================================================ test session starts =============================================================
platform darwin -- Python 3.7.6, pytest-5.3.5, py-1.8.1, pluggy-0.13.1 -- /Users/gaya/.local/share/virtualenvs/autoencoder-recommendation-copy-zpYZ6J1x/bin/python3
cachedir: .pytest_cache
rootdir: /Users/gaya/Documents/GitHub/mlops-autoencoder-recommendation, inifile: tox.ini
plugins: cov-2.8.1
collected 2 items
src/tests/data/test_mocking_google.py::test_list_blobs PASSED [ 50%]
src/tests/data/test_mocking_google.py::test_extract_table PASSED [100%]
============================================================= 2 passed in 1.14s ==============================================================
Happy to explain further if how this works is not clear :)