I have a googlesheet where a column may contain no information in it. While iterating through the rows and looking at that column, if the column is blank, it\'s not returni
The way I solved this issue was converting the values into a Pandas dataframe. I fetched the particular columns that I wanted in my Google Sheets, then converted those values into a Pandas dataframe. Once I converted my dataset into a Pandas dataframe, I did some data formatting, then converted the dataframe back into a list. By converting the list to a Pandas dataframe, each column is preserved. Pandas already creates null values for empty trailing rows and columns. However, I needed to also convert the non trailing rows with null values to keep consistency.
# Authenticate and create the service for the Google Sheets API
credentials = ServiceAccountCredentials.from_json_keyfile_name(KEY_FILE_LOCATION, SCOPES)
http = credentials.authorize(Http())
discoveryUrl = ('https://sheets.googleapis.com/$discovery/rest?version=v4')
service = discovery.build('sheets', 'v4',
http=http,discoveryServiceUrl=discoveryUrl)
spreadsheetId = 'id of your sheet'
rangeName = 'range of your dataset'
result = service.spreadsheets().values().get(
spreadsheetId=spreadsheetId, range=rangeName).execute()
values = result.get('values', [])
#convert values into dataframe
df = pd.DataFrame(values)
#replace all non trailing blank values created by Google Sheets API
#with null values
df_replace = df.replace([''], [None])
#convert back to list to insert into Redshift
processed_dataset = df_replace.values.tolist()