Cloud AI Platform Training Fails to Read from Bucket

Submitted by 自闭症网瘾萝莉.ら on 2021-01-29 20:22:14

Question


I'm trying to use Cloud AI Platform for training (gcloud ai-platform jobs submit training). I created my bucket and am sure the training file is there (gsutil ls gs://sat3_0_bucket/data/train_input.csv).

However, my job is failing with the log message:

File "/root/.local/lib/python3.7/site-packages/ktrain/text/data.py", line 175, in texts_from_csv
    with open(train_filepath, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'gs://sat3_0_bucket/data/train_input.csv'

Am I missing something?


Answer 1:


The error is probably happening because ktrain tries to auto-detect the character encoding using open(train_filepath, 'rb'), and Python's built-in open() cannot read gs:// URLs. One solution is to explicitly provide the encoding to texts_from_csv as an argument so this step is skipped (the default is None, which means auto-detect).
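A quick way to see why the auto-detection step fails: the built-in open() treats a gs:// URL as an ordinary (nonexistent) local path, which reproduces the exact traceback from the job log.

```python
# open() only understands local filesystem paths, so the gs:// URL that
# ktrain's encoding auto-detection passes to open(..., 'rb') is treated
# as a nonexistent relative path -- reproducing the error from the log:
err = None
try:
    with open('gs://sat3_0_bucket/data/train_input.csv', 'rb') as f:
        f.read()
except FileNotFoundError as e:
    err = e

print(type(err).__name__)  # FileNotFoundError
```

Passing an explicit encoding (e.g. encoding='utf-8') to texts_from_csv avoids this open() call entirely.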

Alternatively, you can read the data in yourself as a pandas DataFrame. Pandas supports GCS paths (via the gcsfs package), so you can simply do: df = pd.read_csv('gs://bucket/your_path.csv')

Then, using ktrain, you can use ktrain.text.texts_from_df (or ktrain.text.texts_from_array) to load and preprocess your data.
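The DataFrame route can be sketched as below. An in-memory CSV stands in for the bucket so the snippet is self-contained; with gcsfs installed, the same pd.read_csv call accepts the gs:// path directly. The column names ('text', 'label') are assumptions, so adjust them to match your CSV.

```python
import io
import pandas as pd

# Stand-in for the bucket file; with gcsfs installed you can instead pass
# 'gs://sat3_0_bucket/data/train_input.csv' straight to pd.read_csv.
csv_data = io.StringIO("text,label\ngreat movie,1\nterrible plot,0\n")
df = pd.read_csv(csv_data)
print(df.shape)  # (2, 2)

# With the DataFrame loaded, hand it to ktrain for preprocessing
# (column names here are hypothetical -- use your CSV's actual columns):
# from ktrain import text
# trn, val, preproc = text.texts_from_df(
#     df, text_column='text', label_columns=['label'])
```

This sidesteps ktrain's file handling entirely, since texts_from_df never needs to open a path itself.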



Source: https://stackoverflow.com/questions/62460368/cloud-ai-platform-training-fails-to-read-from-bucket
