问题
When writing correlated sub-queries in BigQuery using a WHERE clause, it is resulting in an Error. LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.
However it works in other SQL engines, I am not understanding the error or how to rewrite the queries.
Different queries of the same type causing the same error.
SELECT
out.*
FROM
`august-monolith-205810.Assignment.students` AS out
WHERE
2 >= (
SELECT
COUNT(*)
FROM
`august-monolith-205810.Assignment.students` AS t1
WHERE
out.age < t1.age)
reporting students who are older than at most two other students.
回答1:
Below is for BigQuery Standard SQL
#standardSQL
SELECT * EXCEPT(cnt)
FROM (
SELECT *, COUNT(1) OVER(ORDER BY age RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) cnt
FROM `august-monolith-205810.Assignment.students`
)
WHERE cnt BETWEEN 1 AND 2
Another, less effective and thus not recommended but still an option is to use CROSS JOIN to mimic non-equality LEFT JOIN
#standardSQL
SELECT ANY_VALUE(out).*
FROM `august-monolith-205810.Assignment.students` AS out
CROSS JOIN `august-monolith-205810.Assignment.students` AS t1
WHERE out.age > t1.age
GROUP BY out.id
HAVING COUNT(t1.age) <= 2
回答2:
Horrible thing you are trying to do, but lets think about what you DO have. For every "OUT" record, you are requerying every assignment again to get the count. What I would suggest is a 3-level query. The inner-most is to get all distinct "age" values (hoping integer based). Then get the counts LESS than to create a pre-aggregate result. Then you can join to the original out based on the outer-most OUT value vs repeated querying. Something like
select
out
from
`august-monolith-205810.Assignment.students` AS out
JOIN ( select
UniqAge.Age,
( SELECT COUNT(*)
FROM `august-monolith-205810.Assignment.students`
WHERE age < UniqAge.age) CountUnderAge
from
( select distinct age
from `august-monolith-205810.Assignment.students` ) UniqAge
) PQByAge
on out.age = PQByAge
AND PQByAge.CountUnderAge <= 2
This query is based on the Age column being INTEGER based (non decimal).
来源:https://stackoverflow.com/questions/55972351/correlated-sub-query-causing-this-error-left-outer-join-cannot-be-used-without