I am using pytest to test a web scraper that pushes the data to a database. The class only pulls the HTML and pushes the HTML to a database to be p
There are a couple of ways to handle this, but I'll go over two common approaches I've seen in Python codebases.
1) Separate your tests by putting the "optional" tests in another directory.
Not sure what your project layout looks like, but you can do something like this (only the test directory is important, the rest is just a toy example layout):
README.md
setup.py
requirements.txt
test/
    unit/
        test_something.py
        test_something_else.py
    integration/
        test_optional.py
application/
    __init__.py
    some_module.py
Then, when you invoke pytest, you invoke it by doing pytest test/unit if you want to run just the unit tests (i.e. only test_something*.py files), or pytest test/integration if you want to run just the integration tests (i.e. only test_optional.py), or pytest test if you want to run all the tests. So, by default, you can just run pytest test/unit.
I recommend wrapping these calls in some sort of script. I prefer make since it is powerful for this type of wrapping. Then you can say make test and it just runs your default (fast) test suite, or make test_all, and it'll run all the tests (which may or may not be slow).
Example Makefile you could wrap with:
.PHONY: all clean install test test_int test_all uninstall

all: install

clean:
	rm -rf build
	rm -rf dist
	rm -rf *.egg-info

install:
	python setup.py install

test: install
	pytest -v -s test/unit

test_int: install
	pytest -v -s test/integration

test_all: install
	pytest -v -s test

uninstall:
	pip uninstall app_name
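If make isn't available (on Windows, for example), the same wrapping could be done with a small Python script. This is only a sketch: the file name run_tests.py and the helpers build_command and run_suite are made-up names, the suite paths assume the toy layout above, and unlike the Makefile it does not run the install step first.

```python
"""run_tests.py (hypothetical name): pick a test suite by name."""
import subprocess

# Suite name -> directory, mirroring the toy layout above (an assumption).
SUITES = {
    "unit": "test/unit",
    "integration": "test/integration",
    "all": "test",
}


def build_command(suite="unit"):
    """Return the pytest command line for the given suite name."""
    if suite not in SUITES:
        raise ValueError(f"unknown suite {suite!r}; choose from {sorted(SUITES)}")
    return ["pytest", "-v", "-s", SUITES[suite]]


def run_suite(suite="unit"):
    """Run pytest for the suite and return its exit code."""
    return subprocess.call(build_command(suite))


# Show the command that the default (fast) suite would run.
print(" ".join(build_command()))
```

Calling run_suite("all") would then play the same role as make test_all, and run_suite() with no argument the same role as make test.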
2) Mark your tests judiciously with the @pytest.mark.skipif decorator, but use an environment variable as the trigger
I don't like this solution as much; it feels a bit haphazard to me (it's hard to tell which set of tests is being run on any given pytest run). However, what you can do is define an environment variable and then read that environment variable in the test module to decide whether to run all your tests. The syntax for setting environment variables is shell dependent, but I'll pretend you have a bash environment since that's a popular shell.
You could do export TEST_LEVEL="unit" for just fast unit tests (so this would be your default), or export TEST_LEVEL="all" for all your tests. Then in your test files, you can do what you were originally trying to do like this:
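import os
import pytest
...
# Skipped when TEST_LEVEL is "unit" or unset; skipif needs a reason for boolean conditions.
@pytest.mark.skipif(os.environ.get("TEST_LEVEL", "unit") == "unit",
                    reason="slow test: set TEST_LEVEL=all to run it")
def test_scrape_website():
    ...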
Note: Naming the test levels "unit" and "integration" is irrelevant. You can name them whatever you want. You can also have many many levels (like maybe nightly tests or performance tests).
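To make the "many levels" idea concrete, here is one possible sketch of ordered levels. The level names, their ordering, and the level_enabled helper are all made up for illustration; only the TEST_LEVEL variable and the "unit" and "all" values come from the example above.

```python
import os

# Hypothetical levels, ordered from cheapest to most expensive.
LEVELS = ["unit", "integration", "nightly"]


def level_enabled(required, current=None):
    """Return True if tests at the `required` level should run.

    `current` defaults to the TEST_LEVEL environment variable, falling back
    to "unit" when it is unset. The special value "all" enables every level.
    An unknown level name raises ValueError.
    """
    if current is None:
        current = os.environ.get("TEST_LEVEL", "unit")
    if current == "all":
        return True
    return LEVELS.index(required) <= LEVELS.index(current)


print(level_enabled("unit", "unit"))            # True
print(level_enabled("integration", "unit"))     # False
print(level_enabled("integration", "nightly"))  # True
```

A test could then be decorated with something like @pytest.mark.skipif(not level_enabled("integration"), reason="needs TEST_LEVEL=integration or higher"), so each test states the cheapest level at which it should run.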
Also, I think option 1 is the best way to go, since it not only clearly separates the kinds of tests, but also adds semantics and clarity to what the tests mean and represent. But there is no "one size fits all" in software, so you'll have to decide which approach you like based on your particular circumstances.
HTH!