问题
How would I compare two tables (Table1
and Table2
) and find all the new entries or changes in Table2
.
Using SQL Server I can use
Select * from Table1
Except
Select * from Table2
Here a sample of what I want
Table1
A | 1
B | 2
C | 3
Table2
A | 1
B | 2
C | 2
D | 4
So, if I comparing the two tables I want my results to show me the following
C | 2
D | 4
I tried a few statements with no luck.
回答1:
Now that I have your actual sample dataset, I can write a query that finds every domain in one table that is not on the other table:
https://bigquery.cloud.google.com/table/inbound-acolyte-377:demo.1024 has 24,729,816 rows. https://bigquery.cloud.google.com/table/inbound-acolyte-377:demo.1025 has 24,732,640 rows.
Let's look at everything in 1025 that is not in 1024:
SELECT a.domain
FROM [inbound-acolyte-377:demo.1025] a
LEFT OUTER JOIN EACH [inbound-acolyte-377:demo.1024] b
ON a.domain = b.domain
WHERE b.domain IS NULL
Result: 39,629 rows. (8.1s elapsed, 2.04 GB processed)
回答2:
To get the differences (given that tkey is your unique row identifier):
SELECT a.tkey, a.name, b.name
FROM [your.tableold] a
JOIN EACH [your.tablenew] b
ON a.tkey = b.tkey
WHERE a.name != b.name
LIMIT 100
For the new rows, one way is the one you proposed:
SELECT col1, col2
FROM table2
WHERE col1 NOT IN
(SELECT col1 FROM Table1)
(you'll have to switch to a JOIN EACH when Table1 gets too large)
来源:https://stackoverflow.com/questions/19575599/compare-tables-in-bigquery