Question
The following code will work to select data from two tables:
SELECT t1.foo, t2.bar FROM TABLE1 t1 INNER JOIN TABLE2 t2 ON t1.foo=t2.foo
I could just as easily have written:
SELECT t2.foo, t2.bar FROM TABLE1 t1 INNER JOIN TABLE2 t2 ON t1.foo=t2.foo
t1.foo or t2.foo: six of one, half a dozen of the other. Why not just foo?
I've been wondering why the SQL server doesn't just automatically return the data without me specifying one table or the other, since the choice is entirely arbitrary (as far as I can tell).
I can make up a scenario where you would need to specify the table, such as
SELECT t1.foo, t2.bar FROM TABLE1 t1 INNER JOIN TABLE2 t2 ON t1.foo + 1 = t2.foo
However, such scenarios are far from the norm in my experience.
Can anyone enlighten me as to why the language is designed so that I have to make this seemingly arbitrary decision in my code?
Answer 1:
Because equality in MS SQL doesn't necessarily mean the two values are identical. Consider the following two values for foo: "Bar" and "baR". SQL Server will treat them as equal for the join because the comparison is case insensitive, but which one were you asking for? SQL Server doesn't know, and it can't guess. You must tell it explicitly.
Edit: As @Lukas Eder pointed out, not all implementations of SQL use case-insensitive comparisons. I know MS SQL does, and my answer is written with that in mind.
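To make the "Bar" / "baR" case concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in, not SQL Server itself: SQLite compares text case-sensitively by default, so the columns are declared with COLLATE NOCASE to imitate a case-insensitive collation, and the table contents are made up for illustration.

```python
import sqlite3

# Assumption: SQLite + COLLATE NOCASE imitates a case-insensitive
# SQL Server collation. The rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (foo TEXT COLLATE NOCASE);
    CREATE TABLE TABLE2 (foo TEXT COLLATE NOCASE, bar INTEGER);
    INSERT INTO TABLE1 VALUES ('Bar');
    INSERT INTO TABLE2 VALUES ('baR', 1);
""")

# The join condition is satisfied, yet the two columns hold different strings.
row = conn.execute("""
    SELECT t1.foo, t2.foo
    FROM TABLE1 t1 INNER JOIN TABLE2 t2 ON t1.foo = t2.foo
""").fetchone()

print(row)
```

The row comes back with both spellings side by side, which is why the engine cannot pick one on your behalf.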
Answer 2:
Your reasoning is not quite right. While t1.foo = t2.foo may hold true, that doesn't mean they're the same. Some examples:
- One could be VARCHAR(1), the other VARCHAR(2).
- One could be VARCHAR(1), the other NUMBER(1).
- t1 could be a simple table, whereas t2 is a view (or nested select) that makes hyper-complex calculations for the value of foo. The projection cost of either foo might not be the same in some RDBMS.
And there are dozens of other reasons why it would be ambiguous to just write foo.
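The type-mismatch examples above can be sketched with Python's sqlite3 module. This relies on SQLite's own rules (type affinity converts the text '1' before comparing it with an integer column), which is an assumption that will not carry over unchanged to other engines; the TEXT and INTEGER columns are hypothetical stand-ins for VARCHAR(1) and NUMBER(1).

```python
import sqlite3

# Assumption: SQLite's type-affinity rules stand in for the implicit
# conversions other engines perform; the schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (foo TEXT);
    CREATE TABLE TABLE2 (foo INTEGER, bar TEXT);
    INSERT INTO TABLE1 VALUES ('1');
    INSERT INTO TABLE2 VALUES (1, 'x');
""")

# The join matches, but each side keeps its own stored type.
t1_foo, t2_foo = conn.execute("""
    SELECT t1.foo, t2.foo
    FROM TABLE1 t1 INNER JOIN TABLE2 t2 ON t1.foo = t2.foo
""").fetchone()

print(type(t1_foo).__name__, type(t2_foo).__name__)
```

Selecting t1.foo hands the application a string, while t2.foo hands it an integer, so the choice is visible to the caller even though the join treats the values as equal.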
Answer 3:
If you're sure that the columns represent the same thing, you could join with a USING clause:
SELECT foo, t2.bar FROM TABLE1 t1 INNER JOIN TABLE2 t2 USING (foo);
Otherwise, there's no guarantee that t1.foo is the same thing as t2.foo.
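As a rough illustration of the difference, the sketch below runs both forms against SQLite through Python's sqlite3 module. SQLite is an assumption here: it accepts USING, but not every engine does (SQL Server, for one, has no USING clause), and the sample rows are invented.

```python
import sqlite3

# Assumption: SQLite is used because it supports USING; rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (foo INTEGER);
    CREATE TABLE TABLE2 (foo INTEGER, bar TEXT);
    INSERT INTO TABLE1 VALUES (1), (2);
    INSERT INTO TABLE2 VALUES (1, 'a'), (3, 'c');
""")

# With ON, an unqualified foo is rejected as ambiguous.
try:
    conn.execute("""
        SELECT foo, t2.bar
        FROM TABLE1 t1 INNER JOIN TABLE2 t2 ON t1.foo = t2.foo
    """)
    on_error = None
except sqlite3.OperationalError as exc:
    on_error = str(exc)

# With USING, the shared column may be written without a table qualifier.
using_rows = conn.execute("""
    SELECT foo, t2.bar
    FROM TABLE1 t1 INNER JOIN TABLE2 t2 USING (foo)
""").fetchall()

print(on_error)
print(using_rows)
```

USING works here because the clause itself declares that the two columns are to be treated as one, which is exactly the information the ON form does not give the engine.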
Answer 4:
In this case you have an INNER JOIN on equality, so it's clear that the decision is arbitrary. But there are many situations where, even if you join on foo, the two are not the same.
For example, with a LEFT JOIN, or with a condition like ON t1.foo = t2.foo +/- whatever.
The engine needs your input to know where to take the data from.
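The LEFT JOIN case is easy to reproduce. The following is a small sketch with Python's sqlite3 module and hypothetical sample rows: for a row in TABLE1 with no partner in TABLE2, t1.foo holds a value while t2.foo is NULL.

```python
import sqlite3

# Hypothetical sample data: foo = 2 exists only in TABLE1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (foo INTEGER);
    CREATE TABLE TABLE2 (foo INTEGER, bar TEXT);
    INSERT INTO TABLE1 VALUES (1), (2);
    INSERT INTO TABLE2 VALUES (1, 'a');
""")

# Unmatched rows from TABLE1 are kept, with NULL for every TABLE2 column.
rows = conn.execute("""
    SELECT t1.foo, t2.foo
    FROM TABLE1 t1 LEFT JOIN TABLE2 t2 ON t1.foo = t2.foo
    ORDER BY t1.foo
""").fetchall()

print(rows)
```

For the unmatched row, picking t1.foo yields 2 and picking t2.foo yields None (SQL NULL), so the two columns are clearly not interchangeable.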
Answer 5:
The reason you need to make this decision is that it isn't arbitrary. The system does not know which table has the data you want. You need to specify it. When the system designs the execution plan, it does not figure out which columns contain the same data in both tables. As far as it is concerned, these two columns could have different data. It isn't going to infer, just because your join condition says the columns are equal, that it may display either one when you don't specify which.
Answer 6:
In that particular case, t1.foo and t2.foo are the same thing, but the engine isn't optimized for that (and it would be confusing if it were). What if your join did something where they may not be the same, like this?
SELECT t2.foo, t2.bar FROM TABLE1 t1 INNER JOIN TABLE2 t2 ON t1.foo<t2.foo
Since we are using <, foo on t1 and t2 could be very different things. The engine can't "guess" in this case.
Even if such scenarios are "far from the norm" in your experience, the engine has to allow for them; otherwise some types of queries would be extremely difficult to write.
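A quick sketch of the < join, again using Python's sqlite3 module with invented rows, shows how far apart the two columns can drift once the condition is not equality.

```python
import sqlite3

# Hypothetical sample data chosen so several pairs satisfy t1.foo < t2.foo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (foo INTEGER);
    CREATE TABLE TABLE2 (foo INTEGER, bar TEXT);
    INSERT INTO TABLE1 VALUES (1), (2);
    INSERT INTO TABLE2 VALUES (2, 'b'), (3, 'c');
""")

# Every returned pair has different values in t1.foo and t2.foo.
rows = conn.execute("""
    SELECT t1.foo, t2.foo, t2.bar
    FROM TABLE1 t1 INNER JOIN TABLE2 t2 ON t1.foo < t2.foo
    ORDER BY t1.foo, t2.foo
""").fetchall()

for t1_foo, t2_foo, bar in rows:
    print(t1_foo, t2_foo, bar)
```

No row has matching foo values on both sides, so an unqualified foo would have no sensible meaning in this query.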
Answer 7:
SQL doesn't do it because it simply doesn't resolve ambiguities. (But, as you note, in this case they are equivalent.)
Over the application's lifecycle it's ultimately better to resolve them yourself: if a column changes name or the join type changes, your code is less likely to break, and it's more obvious what your intentions were. But those benefits weren't intentional, I'm sure.
Source: https://stackoverflow.com/questions/5966331/ambiguous-column-name-sql-error-with-inner-join-why