The rules for picking which class template specialization is preferred involve rewriting the specializations into function templates and determining which function template is more specialized.
Clang is being GCC-compatible (and compatible with existing code that depends on both of these behaviors).
Consider [temp.deduct.type]p1:
[...] an attempt is made to find template argument values (a type for a type parameter, a value for a non-type parameter, or a template for a template parameter) that will make P, after substitution of the deduced values (call it the deduced A), compatible with A.
The crux of the issue is what "compatible" means here.
When partially ordering function templates, Clang merely deduces in both directions; if deduction succeeds in one direction but not the other, it assumes that means the result will be "compatible", and uses that as the ordering result.
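To make the "deduce in both directions" rule concrete, here is a minimal sketch with a deliberately uncontroversial pair of overloads. The name `pick` is hypothetical, and the example does not reproduce the problematic case; it only illustrates the mechanism.

```cpp
// Two overloads of a hypothetical function template `pick`.
template <typename T>
constexpr int pick(T) { return 1; }    // #1: parameter type P = T

template <typename T>
constexpr int pick(T*) { return 2; }   // #2: parameter type P = T*

// Partial ordering deduces each template's P from the other's parameter
// type, using a unique invented type U in place of T:
//   #1's P (T)  from #2's type (U*): succeeds with T = U*.
//   #2's P (T*) from #1's type (U):  fails, since U is not a pointer type.
// Deduction succeeds in only one direction, so #2 is more specialized.
static_assert(pick(static_cast<int*>(nullptr)) == 2,
              "both overloads are viable; partial ordering selects #2");
static_assert(pick(0) == 1,
              "only #1 is viable for a non-pointer argument");
```

For a pointer argument both overloads are exact matches, so partial ordering alone decides between them.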
When partially ordering class template partial specializations, however, Clang interprets "compatible" as meaning "the same". Therefore, it considers one partial specialization to be more specialized than another only if substituting the deduced arguments from one of them into the other would reproduce the original partial specialization.
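As a rough illustration of the "the same" check, consider the sketch below. `Kind` is a hypothetical class template chosen for this example, and it is a simple case where both readings of "compatible" give the same answer, not a case where they diverge.

```cpp
// Hypothetical class template with two partial specializations.
template <typename T>
struct Kind { static constexpr int value = 0; };            // primary

template <typename T>
struct Kind<T*> { static constexpr int value = 1; };        // #1

template <typename T>
struct Kind<const T*> { static constexpr int value = 2; };  // #2

// Deducing #1's arguments from #2 (using a unique invented type U):
//   T* against const U* gives T = const U, and substituting that back
//   into #1 yields Kind<const U*>, which reproduces #2 exactly.
// Deducing #2's arguments from #1:
//   const T* against U* fails, because U* has no const to match.
// So #2 is more specialized than #1.
static_assert(Kind<int>::value == 0, "only the primary template matches");
static_assert(Kind<int*>::value == 1, "only #1 matches");
static_assert(Kind<const int*>::value == 2,
              "#1 and #2 both match; #2 is more specialized");
```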
Changing either of these two to match the other breaks substantial amounts of real code. :(