pandas to_gbq claims a schema mismatch while the schemas are exactly the same. On GitHub all the issues are claimed to have been solved in 2017

Submitted by ∥☆過路亽.° on 2020-08-07 06:50:28

Question


I am trying to append one table to another through pandas, pulling the data from BigQuery and sending it to a different BigQuery dataset. Although the table schemas are exactly the same, I get the error "pandas_gbq.gbq.InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table."

This error occurred earlier, and at that point I worked around it by overwriting the table, but in this case the datasets are too large to do that (and it is not a sustainable solution).

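    import pandas as pd

    # query, bigquery_key, dataset and projectid are defined elsewhere in the script
    df = pd.read_gbq(query, project_id="my-project", credentials=bigquery_key,
                     dialect='standard')
    # the second positional argument is the destination table ('dataset.tablename')
    pd.io.gbq.to_gbq(df, dataset, projectid,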
                     if_exists='append',
                     table_schema=[{'name': 'Date','type': 'STRING'},
                                   {'name': 'profileId','type': 'STRING'},
                                   {'name': 'Opco','type': 'STRING'},
                                   {'name': 'country','type': 'STRING'},
                                   {'name': 'deviceType','type': 'STRING'},
                                   {'name': 'userType','type': 'STRING'},
                                   {'name': 'users','type': 'INTEGER'},
                                   {'name': 'sessions','type': 'INTEGER'},
                                   {'name': 'bounceRate','type': 'FLOAT'},
                                   {'name': 'sessionsPerUser','type': 'FLOAT'},
                                   {'name': 'avgSessionDuration','type': 'FLOAT'},
                                   {'name': 'pageviewsPerSession','type': 'FLOAT'}
                                   ],
                     credentials=bigquery_key)

The schema in BigQuery is as follows:

Date                STRING      
profileId           STRING  
Opco                STRING  
country             STRING  
deviceType          STRING  
userType            STRING  
users               INTEGER 
sessions            INTEGER 
bounceRate          FLOAT   
sessionsPerUser     FLOAT   
avgSessionDuration  FLOAT   
pageviewsPerSession FLOAT   

I then get the following error:

Traceback (most recent call last):
  File "..file.py", line 63, in <module>
    main()
  File "..file.py", line 57, in main
    updating_general_data(bigquery_key)
  File "..file.py", line 46, in updating_general_data
    credentials=bigquery_key)
  File "..\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\gbq.py", line 162, in to_gbq
    credentials=credentials, verbose=verbose, private_key=private_key)
  File "..\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas_gbq\gbq.py", line 1141, in to_gbq
    "Please verify that the structure and "
pandas_gbq.gbq.InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table.

To me it seems that there is a one-to-one match. Other threads about this error mainly discuss date formats, but here the date column is already a string, and table_schema still declares it as STRING.


Answer 1:


Most likely, the problem arises because the column names in the DataFrame and in the schema do not match.
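
One way to check this before calling to_gbq is to compare the DataFrame's columns with the schema list yourself. The snippet below is only a sketch: diff_schema and KIND_TO_BQ are hypothetical names made up for this example, and the dtype-to-type mapping is an assumption for illustration, not the table pandas-gbq uses internally. It takes (column name, dtype kind) pairs, which you could build with [(c, df[c].dtype.kind) for c in df.columns], and reports names that differ (including case-only differences), a different column order, and columns whose dtype does not look like the declared type.

    # Assumed mapping from NumPy dtype "kind" codes to BigQuery type names.
    # Illustration only; pandas-gbq may infer types differently.
    KIND_TO_BQ = {
        "i": "INTEGER",
        "u": "INTEGER",
        "f": "FLOAT",
        "b": "BOOLEAN",
        "M": "TIMESTAMP",
        "O": "STRING",
        "U": "STRING",
        "S": "STRING",
    }


    def diff_schema(column_kinds, table_schema):
        """List differences between DataFrame columns and a to_gbq-style schema.

        column_kinds: (name, dtype_kind) pairs in DataFrame column order.
        table_schema: list of {'name': ..., 'type': ...} dicts.
        """
        problems = []
        df_names = [name for name, _ in column_kinds]
        schema_names = [field["name"] for field in table_schema]

        # Name comparison is case-sensitive, so 'date' and 'Date' count as different.
        missing = [n for n in schema_names if n not in df_names]
        extra = [n for n in df_names if n not in schema_names]
        for n in missing:
            problems.append(f"missing from DataFrame: {n!r}")
        for n in extra:
            problems.append(f"not in schema: {n!r}")
        if not missing and not extra and df_names != schema_names:
            problems.append("same columns, different order")

        # Compare the type suggested by each dtype with the declared type.
        schema_types = {f["name"]: f["type"].upper() for f in table_schema}
        for name, kind in column_kinds:
            expected = schema_types.get(name)
            inferred = KIND_TO_BQ.get(kind, "UNKNOWN")
            if expected is not None and inferred != expected:
                problems.append(
                    f"{name!r}: DataFrame looks like {inferred}, schema says {expected}"
                )
        return problems


    # Made-up example: 'date' differs only in case, and 'users' has a float dtype
    # (pandas turns an integer column into float when it contains NaN).
    example_schema = [{"name": "Date", "type": "STRING"},
                      {"name": "users", "type": "INTEGER"}]
    for line in diff_schema([("date", "O"), ("users", "f")], example_schema):
        print(line)

An empty result means the names, order and rough types agree under these assumptions; it does not guarantee that the append will succeed.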



Source: https://stackoverflow.com/questions/56545738/pandas-to-gbq-claims-a-schema-mismatch-while-the-schemas-are-exactly-the-same
