I'm struggling to get my head around the difference between the following two TypeVars
from typing impor
After a bunch of reading, I believe mypy correctly raises the type-var error in the OP's question:
generics.py:31: error: Value of type variable "T" of "X" cannot be "AA"
See the explanation below.
Second Case: TypeVar("T", bound=Union[A, B])
I think @Michael0x2a's answer does a great job of describing what's happening. See that answer.
First Case: TypeVar("T", A, B)
The reason boils down to the Liskov Substitution Principle (LSP), also known as behavioral subtyping. Explaining it in full is outside the scope of this answer; you will need to read up on the meaning of invariance vs. covariance.
From python's typing docs for TypeVar:
By default type variables are invariant.
Based on this information, T = TypeVar("T", A, B) means the type variable T is value-restricted to the classes A and B, and because it's invariant, it accepts only those two (and not any child classes of A or B).
Thus, when passed AA, mypy correctly raises a type-var error.
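To make the first case concrete, here is a minimal sketch of the kind of code that triggers the error. The question's original snippet is not fully shown, so the class layout below (A, B, AA, X, XA, XAA) is an assumed reconstruction, not the OP's exact code.

```python
from typing import Generic, TypeVar

class A: pass
class B: pass
class AA(A): pass  # child class of A

# Value-restricted type variable: only A and B are listed.
T = TypeVar("T", A, B)

class X(Generic[T]): pass

class XA(X[A]): pass    # A is one of the listed types
class XAA(X[AA]): pass  # the line mypy is expected to flag with the type-var error

# The restriction is purely static: Python itself runs this module without complaint.
```

Note that the error only shows up when running mypy over the file; executing it with the interpreter raises nothing.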
You might then say: well, doesn't AA properly match behavioral subtyping of A? And in my opinion, you would be correct.
Why? Because one can properly substitute an A with an AA, and the behavior of the program would be unchanged.
However, because mypy is a static type checker, mypy can't figure this out (it can't check runtime behavior). One has to state the covariance explicitly, via the syntax covariant=True.
Also note: when specifying a covariant TypeVar, one should use the suffix _co in the type variable's name. This is documented in PEP 484.
from typing import TypeVar, Generic
class A: pass
class AA(A): pass
T_co = TypeVar("T_co", AA, A, covariant=True)
class X(Generic[T_co]): pass
class XA(X[A]): pass
class XAA(X[AA]): pass
Output: Success: no issues found in 1 source file
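As a rough illustration of what covariance buys you in practice, the sketch below uses an unconstrained covariant type variable (unlike the constrained one above) and a hypothetical read-only Box class and read_a function, neither of which comes from the question. The idea is that a Box[AA] can be handed to code that expects a Box[A].

```python
from typing import Generic, TypeVar

class A: pass
class AA(A): pass

# Assumption for this sketch: no value restrictions, just covariance.
T_co = TypeVar("T_co", covariant=True)

class Box(Generic[T_co]):
    """Hypothetical read-only container: it only ever returns T_co."""

    def __init__(self, item: T_co) -> None:
        self._item = item

    def get(self) -> T_co:
        return self._item

def read_a(box: Box[A]) -> A:
    # Declared to take Box[A]; with a covariant T_co, Box[AA] is also acceptable.
    return box.get()

box_aa: Box[AA] = Box(AA())
result = read_a(box_aa)
```

If T_co were declared without covariant=True, a type checker would be expected to reject the read_a(box_aa) call, because Box[AA] would no longer count as a subtype of Box[A].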
So, what should you do?
I would use TypeVar("T", bound=Union[A, B]), since:
A and B aren't related
Further reading on LSP-related issues in mypy:
When you do T = TypeVar("T", bound=Union[A, B]), you are saying T can be bound to either Union[A, B] or any subtype of Union[A, B]. It's upper-bounded to the union.
So for example, if you had a function of type def f(x: T) -> T, it would be legal to pass in values of any of the following types:
Union[A, B] (or a union of any subtypes of A and B such as Union[A, BChild])
A (or any subtype of A)
B (or any subtype of B)
This is how generics behave in most programming languages: they let you impose a single upper bound.
But when you do T = TypeVar("T", A, B), you are basically saying T must be either upper-bounded by A or upper-bounded by B. That is, instead of establishing a single upper-bound, you get to establish multiple!
So this means while it would be legal to pass values of either type A or type B into f, it would not be legal to pass in Union[A, B] since the union is neither upper-bounded by A nor B.
So for example, suppose you had an iterable that could contain either ints or strs.
If you want this iterable to contain any arbitrary mixture of ints or strs, you only need a single upper-bound of a Union[int, str]. For example:
from typing import TypeVar, Union, List, Iterable
mix1: List[Union[int, str]] = [1, "a", 3]
mix2: List[Union[int, str]] = [4, "x", "y"]
all_ints = [1, 2, 3]
all_strs = ["a", "b", "c"]
T1 = TypeVar('T1', bound=Union[int, str])
def concat1(x: Iterable[T1], y: Iterable[T1]) -> List[T1]:
    out: List[T1] = []
    out.extend(x)
    out.extend(y)
    return out
# Type checks
a1 = concat1(mix1, mix2)
# Also type checks (though your type checker may need a hint to deduce
# you really do want a union)
a2: List[Union[int, str]] = concat1(all_ints, all_strs)
# Also type checks
a3 = concat1(all_strs, all_strs)
In contrast, if you want to enforce that the function will accept either a list of all ints or a list of all strs but never a mixture of the two, you'll need multiple upper bounds.
T2 = TypeVar('T2', int, str)
def concat2(x: Iterable[T2], y: Iterable[T2]) -> List[T2]:
    out: List[T2] = []
    out.extend(x)
    out.extend(y)
    return out
# Does NOT type check
b1 = concat2(mix1, mix2)
# Also does NOT type check
b2 = concat2(all_ints, all_strs)
# But this type checks
b3 = concat2(all_ints, all_ints)
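The two declarations can also be told apart at runtime by inspecting the TypeVar objects themselves. This is only a small sketch for exploration; it re-declares T1 and T2 so it is self-contained, and it does not change how a type checker treats them.

```python
from typing import TypeVar, Union

T1 = TypeVar("T1", bound=Union[int, str])  # one upper bound (the union)
T2 = TypeVar("T2", int, str)               # two separate value restrictions

# A bounded TypeVar records its bound and has no constraints.
print(T1.__bound__, T1.__constraints__)

# A value-restricted TypeVar records its constraints and has no bound.
print(T2.__bound__, T2.__constraints__)
```

Neither form is enforced when the program runs: the "does NOT type check" calls above are rejected by the type checker, not by the interpreter.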