Question
I have a table with multiple dated snapshots per user, and a second table (generated by a query) that holds the latest snapshot date for each user.
I've tried a number of variations on a simple join of the two, but with no luck. I want to select all records from the snapshots table that match the user id and date in the other table.
I've gotten a variety of errors, but this is the latest attempt (the sub-selects and renames were added to debug which field might be causing the problem):
SELECT t1.uuid, t1.username, t1.d
FROM (SELECT uuid, username, date AS d FROM [Activity.user_snapshots]) as t1
JOIN EACH (SELECT uuid, date AS dg FROM [Activity.latest_snapshots]) as t2
ON t1.uuid = t2.uuid AND t1.d = t2.dg;
The error response that I get in this case is:
Error: Field 'dg' not found in table '__S0'.
When I try the much more straightforward query:
SELECT t1.uuid, t1.username, t1.date
FROM [Activity.user_snapshots] as t1
JOIN EACH [Activity.latest_snapshots] as t2
ON t1.uuid = t2.uuid AND t1.date = t2.date;
I get this error:
Error: Field date from table __S0 is not a leaf field.
Any ideas?
Answer 1:
There is a bug joining on timestamp values. If you coerce them to their underlying microsecond values, you should be good. This query should work:
SELECT t1.uuid, t1.username, USEC_TO_TIMESTAMP(t1.d)
FROM (
SELECT uuid, username, TIMESTAMP_TO_USEC(date) AS d
FROM [Activity.user_snapshots]) as t1
JOIN EACH (
SELECT uuid, TIMESTAMP_TO_USEC(date) AS dg
FROM [Activity.latest_snapshots]) as t2
ON t1.uuid = t2.uuid AND t1.d = t2.dg;
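To make "underlying microsecond values" concrete, here is a minimal Python sketch of the round trip the query above relies on. The helper names timestamp_to_usec and usec_to_timestamp are hypothetical stand-ins that only mirror the BigQuery functions by name; the sketch assumes the value is a count of microseconds since the Unix epoch in UTC, and it is not BigQuery's implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the "underlying value" is microseconds since 1970-01-01 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_usec(ts: datetime) -> int:
    """Hypothetical analogue of TIMESTAMP_TO_USEC: timestamp -> integer microseconds."""
    return (ts - EPOCH) // timedelta(microseconds=1)


def usec_to_timestamp(usec: int) -> datetime:
    """Hypothetical analogue of USEC_TO_TIMESTAMP: integer microseconds -> timestamp."""
    return EPOCH + timedelta(microseconds=usec)


# The join condition then compares plain integers instead of timestamp objects.
snapshot = datetime(2013, 6, 11, 12, 30, 0, tzinfo=timezone.utc)
latest = datetime(2013, 6, 11, 12, 30, 0, tzinfo=timezone.utc)
keys_match = timestamp_to_usec(snapshot) == timestamp_to_usec(latest)

# Converting back for display recovers the original timestamp.
round_trip = usec_to_timestamp(timestamp_to_usec(snapshot))
```

The point of the workaround is that both sides of the ON clause become integers, and the outer SELECT converts back only for display.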
Answer 2:
In case it's helpful to anyone else: the problem was in how I created the latest_snapshots table. I had to convert the STRING date field into a timestamp in order to apply MAX to it, and the result was saved to the new table as a timestamp, so the date columns in the two tables no longer had the same type.
So the error messages are misleading. Annoyingly, I had to create yet another table in which I converted the timestamp back into a string, since there was no way to do that in the JOIN ... ON clause.
If anyone knows how to do all of this in one query, without the extra table creation, that would be cool. Thus far, my attempts to do it with sub-selects have failed.
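For what it's worth, the general shape of a single-query version can be sketched outside BigQuery. The example below uses Python's built-in sqlite3 module, not BigQuery, so it says nothing about whether the same sub-select works in BigQuery's dialect. The sample rows are invented, and it assumes the date strings are in a sortable ISO-like format so that MAX on the string already yields the latest date and both sides of the join stay the same type.

```python
import sqlite3

# In-memory stand-in for [Activity.user_snapshots]; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_snapshots (uuid TEXT, username TEXT, date TEXT);
INSERT INTO user_snapshots VALUES
  ('u1', 'alice', '2013-06-01 00:00:00'),
  ('u1', 'alice', '2013-06-10 00:00:00'),
  ('u2', 'bob',   '2013-05-20 00:00:00'),
  ('u2', 'bob',   '2013-06-05 00:00:00');
""")

# The sub-select computes the latest date per user in place, so no
# intermediate latest_snapshots table is needed. Both join keys are the
# same type (TEXT) on each side.
LATEST_PER_USER = """
SELECT t1.uuid, t1.username, t1.date
FROM user_snapshots AS t1
JOIN (
  SELECT uuid, MAX(date) AS latest
  FROM user_snapshots
  GROUP BY uuid
) AS t2
ON t1.uuid = t2.uuid AND t1.date = t2.latest
ORDER BY t1.uuid
"""

rows = conn.execute(LATEST_PER_USER).fetchall()
for row in rows:
    print(row)
```

If the stored strings are not sortable as text, the conversion needed for MAX would have to be applied identically on both sides of the join, which is the same idea as the microsecond coercion in Answer 1.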
Note: the join-on-timestamp issue was fixed in a previous release; please let us know if you continue to see problems with it.
Source: https://stackoverflow.com/questions/17054338/bigquery-subselect-in-join-does-not-recognize-fields