How do I Authenticate a Service Account to Make Queries against a GDrive Sheet-Backed BigQuery Table?

Posted by 醉酒当歌 on 2019-12-02 08:17:21

You should be able to get this working with the following steps:

First share the sheet with the email/"service account id" associated with the service account.
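If you are not sure which address to share the sheet with, it can be read from the key file. This is a minimal sketch, assuming a standard service account JSON key that contains a client_email field; the helper name service_account_email is made up for illustration.

```python
import json


def service_account_email(key_path):
    """Return the client_email field of a service account JSON key file."""
    with open(key_path, encoding="utf-8") as fh:
        key = json.load(fh)
    # This is the address to enter in the sheet's "Share" dialog.
    return key["client_email"]
```

Calling service_account_email('<path_to_json>') should return the address that needs at least view access to the sheet.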

Then you'll be able to access your sheet-backed table if you create a Client with both the bigquery and drive scopes. (You might need to have domain-wide delegation enabled on the service account.)

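from google.cloud import bigquery
from oauth2client.service_account import ServiceAccountCredentials

# Both scopes are needed: BigQuery runs the query, Drive reads the sheet.
scopes = ['https://www.googleapis.com/auth/bigquery', 'https://www.googleapis.com/auth/drive']

credentials = ServiceAccountCredentials.from_json_keyfile_name(
    '<path_to_json>', scopes=scopes)

# Instantiates a client (PROJECT is your project ID, q is your query string)
client = bigquery.Client(project=PROJECT, credentials=credentials)

# run_sync_query() exists only in older google-cloud-bigquery releases
bqQuery = client.run_sync_query(q)
bqQuery.run()
rows = bqQuery.fetch_data()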
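For readers on current libraries: oauth2client is deprecated and run_sync_query() is gone from recent google-cloud-bigquery releases. The same idea can probably be sketched with google-auth and client.query() as below. This is not the code from the answer; the function names make_client and run_query are hypothetical, and the imports are done inside the function only so the file can be loaded without the Google libraries installed.

```python
SCOPES = [
    "https://www.googleapis.com/auth/bigquery",
    "https://www.googleapis.com/auth/drive",
]


def make_client(key_path, project):
    """Build a BigQuery client whose credentials carry both scopes."""
    # Lazy imports: only needed when a real client is created.
    from google.cloud import bigquery
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=SCOPES
    )
    return bigquery.Client(project=project, credentials=credentials)


def run_query(client, sql):
    """Run a query and return all rows as a list."""
    # query() starts the job; result() waits for it and iterates the rows.
    return list(client.query(sql).result())
```

Usage would be along the lines of rows = run_query(make_client('<path_to_json>', PROJECT), q), with the sheet still shared with the service account as described above.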

While Orbit's answer helped me find a solution, there are a few more things to consider, so I would like to add my detailed solution. It is needed if Orbit's basic solution does not work, in particular if you use G Suite and your policies do not allow sharing sheets/docs with accounts outside of your domain. In that case you cannot share a doc/sheet directly with the service account.

Before you start:

  1. Create or select a service account in your project.
  2. Enable Domain-wide Delegation (DwD) in the account settings. If the service account does not have an OAuth client ID yet, this generates one.
  3. Make sure the delegated user (e.g. user@company.com) has access to the sheet.
  4. Add the required scopes to your service account's OAuth client (you may need to ask a G Suite admin to do this for you):

    • https://www.googleapis.com/auth/bigquery
    • https://www.googleapis.com/auth/drive
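When asking an admin to authorize the scopes, it helps to hand over the client ID together with the scope list. The sketch below assumes the key file has the usual client_id field and that the admin form accepts the scopes as one comma-separated string; both are assumptions worth checking in your own setup, and delegation_entry is a hypothetical helper name.

```python
import json

DWD_SCOPES = [
    "https://www.googleapis.com/auth/bigquery",
    "https://www.googleapis.com/auth/drive",
]


def delegation_entry(key_path, scopes=DWD_SCOPES):
    """Return (client_id, comma-separated scopes) for the delegation form."""
    with open(key_path, encoding="utf-8") as fh:
        key = json.load(fh)
    return key["client_id"], ",".join(scopes)
```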

If the delegated user can access your drive-based table in the BigQuery UI, your service account should now also be able to access it on behalf of the delegated user.

Here is a full code snippet that worked for me:

#!/usr/bin/env python

import httplib2
from google.cloud import bigquery
from oauth2client.service_account import ServiceAccountCredentials

scopes = [
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/bigquery",
]

delegated_user = "user@example.com"
project        = 'project-name'
table          = 'dataset-name.table-name'
query          = 'SELECT count(*) FROM [%s:%s]' % (project, table)

creds = ServiceAccountCredentials.from_json_keyfile_name('secret.json', scopes=scopes)
creds = creds.create_delegated(delegated_user)

http = creds.authorize(httplib2.Http())
client = bigquery.Client(http=http)

# run_sync_query() exists only in older google-cloud-bigquery releases
bq = client.run_sync_query(query)
bq.run()
print(bq.fetch_data())

Note that I was not able to set up the delegation directly. Instead I had to create delegated credentials with creds = creds.create_delegated(delegated_user) and build an authorized HTTP client with http = creds.authorize(httplib2.Http()). The authorized HTTP client can then be passed to the BigQuery client: client = bigquery.Client(http=http).
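With the newer google-auth library the delegation step looks simpler, since service account credentials offer with_subject() and can be passed straight to the client. The following is a sketch under that assumption, not the code I originally ran; delegated_client is a hypothetical name, and the imports are inside the function only so the file loads without the Google libraries installed.

```python
DELEGATION_SCOPES = [
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/bigquery",
]


def delegated_client(key_path, delegated_user, project):
    """Build a BigQuery client that acts on behalf of delegated_user."""
    # Lazy imports: only needed when a real client is created.
    from google.cloud import bigquery
    from google.oauth2 import service_account

    creds = service_account.Credentials.from_service_account_file(
        key_path, scopes=DELEGATION_SCOPES
    )
    # with_subject() returns a copy of the credentials that impersonates
    # the given user; this requires domain-wide delegation to be enabled.
    delegated = creds.with_subject(delegated_user)
    return bigquery.Client(project=project, credentials=delegated)
```

A call such as delegated_client('secret.json', 'user@example.com', 'project-name').query(sql).result() would then run the query as the delegated user. Keep in mind that client.query() uses standard SQL by default, so the table would be written as `project-name.dataset-name.table-name` in backticks rather than in the bracketed legacy form.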

Also note that the service account does not need any predefined roles assigned in the project settings, i.e., you do not have to make it a BigQuery User or even a project owner. I suppose it acquires access primarily via delegation.
