Question
select
CASE
WHEN .....
ELSE .....
END AS carrier,
count(vehicle_id) as cnt
from test.vehicle_info
WHERE vehicle_id NOT IN(select hardware_id
from TABLE_DATE_RANGE(test.gps32_,DATE_ADD(CURRENT_TIMESTAMP(), -6, 'DAY'),DATE_ADD(CURRENT_TIMESTAMP(), -1, 'DAY')))
group by carrier
order by cnt
And I got this error:
Query Failed
Error: Table too large for JOIN. Consider using JOIN EACH. For more details, please see https://developers.google.com/bigquery/docs/query-reference#joins
Job ID: red-road-574:job_e2o6sBjO9Dt5QrU_cRM2VHSRTso
What causes this error, and how can I solve it?
Answer 1:
@Hobbs's guess above is correct. SEMIJOIN (using WHERE ... IN ...) and ANTIJOIN (using WHERE ... NOT IN ...) are implemented as JOIN operations, so they are subject to the same table-size restrictions as an explicit JOIN. The way to work around these restrictions is to rewrite the query as a join yourself, using JOIN EACH. That is:
select
CASE
WHEN .....
ELSE .....
END AS carrier,
count(vi.vehicle_id) as cnt
from test.vehicle_info vi
LEFT OUTER JOIN EACH (select hardware_id FROM TABLE_DATE_RANGE(...)) hi
ON vi.vehicle_id = hi.hardware_id
WHERE hi.hardware_id is NULL
group by carrier
order by cnt
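As a rough check that the rewrite returns the same rows as the original NOT IN filter, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for BigQuery. The table contents are made up for illustration, the gps table is a hypothetical replacement for the TABLE_DATE_RANGE(...) subquery, and the EACH keyword is omitted because it exists only in BigQuery legacy SQL.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the original question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle_info (vehicle_id TEXT);
    CREATE TABLE gps (hardware_id TEXT);  -- stand-in for TABLE_DATE_RANGE(...)
    INSERT INTO vehicle_info VALUES ('v1'), ('v2'), ('v3'), ('v4');
    INSERT INTO gps VALUES ('v2'), ('v4'), ('v4');
""")

# Original form: anti-join expressed with NOT IN.
not_in_rows = conn.execute("""
    SELECT vehicle_id
    FROM vehicle_info
    WHERE vehicle_id NOT IN (SELECT hardware_id FROM gps)
    ORDER BY vehicle_id
""").fetchall()

# Rewritten form: LEFT OUTER JOIN, keeping only rows with no match on the right.
anti_join_rows = conn.execute("""
    SELECT vi.vehicle_id
    FROM vehicle_info vi
    LEFT OUTER JOIN (SELECT hardware_id FROM gps) hi
      ON vi.vehicle_id = hi.hardware_id
    WHERE hi.hardware_id IS NULL
    ORDER BY vi.vehicle_id
""").fetchall()

print(not_in_rows)
print(anti_join_rows)
```

Both queries return the vehicles that have no GPS rows. One caveat under standard SQL semantics: if the subquery can return NULL values, NOT IN filters out every row, while the LEFT JOIN form does not, so the two are only interchangeable when hardware_id is never NULL.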
Source: https://stackoverflow.com/questions/25530188/too-large-to-join-error-when-im-not-using-join