Compare Tables in BigQuery

Submitted by 喜欢而已 on 2019-12-05 00:55:28

Question


How can I compare two tables (Table1 and Table2) and find all the new or changed entries in Table2?

In SQL Server I can use:

Select * from Table1
Except
Select * from Table2

Here is a sample of what I want:

Table1

 A   |  1
 B   |  2
 C   |  3

Table2

 A   |  1
 B   |  2
 C   |  2
 D   |  4

So, if I compare the two tables, I want my results to show the following:

C   |   2
D   |   4

I tried a few statements, with no luck.


Answer 1:


Now that I have your actual sample dataset, I can write a query that finds every domain in one table that is not in the other table:

https://bigquery.cloud.google.com/table/inbound-acolyte-377:demo.1024 has 24,729,816 rows.
https://bigquery.cloud.google.com/table/inbound-acolyte-377:demo.1025 has 24,732,640 rows.

Let's look at everything in 1025 that is not in 1024:

SELECT a.domain
FROM [inbound-acolyte-377:demo.1025] a
LEFT OUTER JOIN EACH [inbound-acolyte-377:demo.1024] b
ON a.domain = b.domain
WHERE b.domain IS NULL

Result: 39,629 rows. (8.1s elapsed, 2.04 GB processed)
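
The same anti-join pattern can be tried on the small sample from the question. The following is a minimal sketch, not the query that was run above: it uses Python's built-in sqlite3 module as a stand-in for BigQuery, the column names `k` and `v` are hypothetical (the question does not name its columns), and it uses a plain LEFT OUTER JOIN because the legacy `JOIN EACH` keyword is BigQuery-specific. It joins on both columns so that a changed value counts as a difference.

```python
import sqlite3


def rows_only_in_table2(conn):
    """Return rows of Table2 that have no identical row in Table1 (anti-join)."""
    query = """
        SELECT b.k, b.v
        FROM Table2 AS b
        LEFT OUTER JOIN Table1 AS a
          ON a.k = b.k AND a.v = b.v
        WHERE a.k IS NULL      -- no matching row was found in Table1
        ORDER BY b.k
    """
    return conn.execute(query).fetchall()


# Build the sample tables from the question in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (k TEXT, v INTEGER);
    CREATE TABLE Table2 (k TEXT, v INTEGER);
    INSERT INTO Table1 VALUES ('A', 1), ('B', 2), ('C', 3);
    INSERT INTO Table2 VALUES ('A', 1), ('B', 2), ('C', 2), ('D', 4);
""")

diff = rows_only_in_table2(conn)
print(diff)  # [('C', 2), ('D', 4)]
```

Joining on every compared column is what makes the changed row (C) show up alongside the new row (D); joining on `k` alone would report only D.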




Answer 2:


To get the differences (given that tkey is your unique row identifier):

SELECT a.tkey, a.name, b.name
FROM [your.tableold] a
JOIN EACH [your.tablenew] b
ON a.tkey = b.tkey
WHERE a.name != b.name
LIMIT 100

For the new rows, one way is the one you proposed:

SELECT col1, col2
FROM Table2
WHERE col1 NOT IN
  (SELECT col1 FROM Table1)

(You'll have to switch to a JOIN EACH when Table1 gets too large.)
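
The two queries in this answer can be checked against the sample data from the question. This is a rough sketch under stated assumptions: sqlite3 stands in for BigQuery, the column names `k` (the unique key) and `v` are hypothetical, and a plain JOIN replaces the BigQuery-specific `JOIN EACH`.

```python
import sqlite3

# Sample tables from the question, loaded into an in-memory database.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Table1 (k TEXT, v INTEGER);
    CREATE TABLE Table2 (k TEXT, v INTEGER);
    INSERT INTO Table1 VALUES ('A', 1), ('B', 2), ('C', 3);
    INSERT INTO Table2 VALUES ('A', 1), ('B', 2), ('C', 2), ('D', 4);
""")

# Changed rows: the key exists in both tables but the value differs.
changed = db.execute("""
    SELECT a.k, a.v, b.v
    FROM Table1 AS a
    JOIN Table2 AS b ON a.k = b.k
    WHERE a.v != b.v
    ORDER BY a.k
""").fetchall()

# New rows: the key appears in Table2 but not in Table1.
new_rows = db.execute("""
    SELECT k, v
    FROM Table2
    WHERE k NOT IN (SELECT k FROM Table1)
    ORDER BY k
""").fetchall()

print(changed)   # [('C', 3, 2)]
print(new_rows)  # [('D', 4)]
```

Keeping the two queries separate shows which rows changed and which are new, and the first one also reports the old value next to the new one.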



Source: https://stackoverflow.com/questions/19575599/compare-tables-in-bigquery
