Error while importing Kaggle dataset on Colab

Posted by 不问归期 on 2019-12-03 15:35:33

It suddenly stopped working for me as well. Apparently the Kaggle API was no longer looking for the kaggle.json file in the right place. Since I was using the Kaggle API inside a Colab notebook, I had been importing kaggle.json like this:

from googleapiclient.discovery import build
import io, os
from googleapiclient.http import MediaIoBaseDownload
from google.colab import auth

auth.authenticate_user()

drive_service = build('drive', 'v3')
results = drive_service.files().list(
        q="name = 'kaggle.json'", fields="files(id)").execute()
kaggle_api_key = results.get('files', [])

filename = "/content/.kaggle/kaggle.json"
os.makedirs(os.path.dirname(filename), exist_ok=True)

request = drive_service.files().get_media(fileId=kaggle_api_key[0]['id'])
fh = io.FileIO(filename, 'wb')
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
os.chmod(filename, 0o600)  # octal mode: owner read/write only (decimal 600 sets different bits)

It worked just fine. But now the Kaggle API looks for kaggle.json in this location:

~/.kaggle/kaggle.json

So, I just had to move/copy the file I downloaded to the right place:

!mkdir -p ~/.kaggle
!cp /content/.kaggle/kaggle.json ~/.kaggle/kaggle.json

And it started working again.
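The same move can be done from Python instead of shell commands. The following is only a minimal sketch: the function name install_kaggle_token and its home parameter are hypothetical, introduced here for illustration, and it assumes the token has already been downloaded somewhere on the VM.

```python
import os
import shutil


def install_kaggle_token(src, home=None):
    """Copy a downloaded kaggle.json into <home>/.kaggle and restrict its mode.

    `home` is a hypothetical parameter; it defaults to the current user's
    home directory, so nothing depends on whether that is /content or /root.
    """
    home = home or os.path.expanduser("~")
    config_dir = os.path.join(home, ".kaggle")
    os.makedirs(config_dir, exist_ok=True)  # same effect as `mkdir -p`
    dest = os.path.join(config_dir, "kaggle.json")
    shutil.copyfile(src, dest)
    os.chmod(dest, 0o600)  # octal literal: owner read/write only
    return dest
```

In the notebook above this would be called as install_kaggle_token("/content/.kaggle/kaggle.json").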

This simple thing did it for me on Google Colab.

!echo '{"username":"USERNAME","key":"KEY"}' > ~/.kaggle/kaggle.json
!kaggle datasets download -d mmoreaux/environmental-sound-classification-50

--

edit, might have changed to:

!echo '{"username":"USERNAME","key":"KEY"}' > /root/.kaggle/kaggle.json
!kaggle datasets download -d mmoreaux/environmental-sound-classification-50
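Writing the file with echo fails if the .kaggle directory does not exist yet, and the exact home path has evidently changed over time. As a hedged alternative, here is a small Python sketch that creates the directory, writes the JSON, and restricts the file mode in one go. The function name write_kaggle_credentials and its config_dir parameter are assumptions made for this example, not part of the Kaggle API.

```python
import json
import os


def write_kaggle_credentials(username, key, config_dir=None):
    """Write {"username": ..., "key": ...} to <config_dir>/kaggle.json.

    `config_dir` is a hypothetical parameter; by default it resolves to
    ~/.kaggle for whichever user runs the notebook.
    """
    config_dir = config_dir or os.path.expanduser("~/.kaggle")
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "kaggle.json")
    # Create (or truncate) the file with owner-only permissions requested.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump({"username": username, "key": key}, fh)
    os.chmod(path, 0o600)  # also tightens a file that already existed
    return path
```

After write_kaggle_credentials("USERNAME", "KEY") the !kaggle datasets download command can be run as before.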

I initially had trouble copying the .json file into the Colab VM. Eventually the following worked for me. Working in Google Colaboratory, you first need to install the Kaggle API with:

!pip install kaggle

Further information and instructions are at https://github.com/Kaggle/kaggle-api. Next, that page instructs you to activate the API with a file you can download with your Kaggle user at kaggle.com -> My account -> create new API token. This file is kaggle.json.

Next, to get this kaggle.json file onto the Colab VM for activation, you can first upload it to your Google Drive (simply drag it into your Drive). Then enter the following commands in Colab to mount your Drive:

from google.colab import drive
drive.mount('/content/gdrive')

After authorization is completed, you can copy the file from the Drive to Colab with:

!cp /content/gdrive/My\ Drive/kaggle.json ~/.kaggle/kaggle.json

And finally, hopefully, you will be able to run the command:

!kaggle competitions download -c <competition-name>

I hope this helps!

Check the permissions on your kaggle.json file as well. I got the same error because, after running a different kaggle command, I got this warning:

Warning: Your Kaggle API key is readable by other users on this system! To fix this, you can run 'chmod 600 /...etc/kaggle.json'

I ran what they suggested, and got the same error you did until I changed the permissions back to what they'd been before.

This is my own machine (the other user is a mentor I trust), so I used chmod 666 /.../kaggle.json and that solved it, but be careful and only grant permissions that make sense for your own setup.
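To see what the permissions currently are before changing them, the mode bits can be inspected from Python. This is only a sketch: the helper names token_mode and readable_by_others are made up for this example, and the second one approximates the condition the warning describes rather than reproducing the exact check the Kaggle client performs.

```python
import os
import stat


def token_mode(path):
    """Return the permission bits of `path` as an octal string, e.g. '600'."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "o")


def readable_by_others(path):
    """True if the group or other users have read access to `path`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))
```

Calling token_mode on your kaggle.json before and after a chmod makes it easy to restore the earlier permissions if a change breaks things.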

Five easy steps:

Step 1: Import the drive

from google.colab import drive
drive.mount('/content/gdrive')

Get the authorization code from https://accounts.google.com/o/oauth2/auth?client_id=xxx and enter it at the "Enter your authorization code:" prompt.

Step 2: Download the kaggle.json file to your local system

kaggle.com -> My account -> create new API token

Step 3: Upload the kaggle.json file. Click > at the top left corner of Colab to open the panel, then Files -> UPLOAD

Step 4: Copy the file to Colab

!cp /your path/kaggle.json ~/.kaggle/kaggle.json

Step 5: Fix Warning

Your Kaggle API key is readable by other users on this system!

!chmod 600 /root/.kaggle/kaggle.json

TEST

!pip install kaggle
import kaggle
!kaggle competitions list --csv

RESULT

ref,deadline,category,reward,teamCount,userHasEntered
digit-recognizer,2030-01-01 00:00:00,Getting Started,Knowledge,2867,False
titanic,2030-01-01 00:00:00,Getting Started,Knowledge,11221,False
house-prices-advanced-regression-techniques,2030-01-01 00:00:00,Getting Started,Knowledge,4353,True
imagenet-object-localization-challenge,2029-12-31 07:00:00,Research,Knowledge,40,False
competitive-data-science-predict-future-sales,2019-12-31 23:59:00,Playground,Kudos,2780,False
two-sigma-financial-news,2019-07-15 23:59:00,Featured,"$100,000",2927,False
aerial-cactus-identification,2019-07-08 23:59:00,Playground,Knowledge,377,False
jigsaw-unintended-bias-in-toxicity-classification,2019-06-26 23:59:00,Featured,"$65,000",982,False
inaturalist-2019-fgvc6,2019-06-10 23:59:00,Research,Kudos,75,False
freesound-audio-tagging-2019,2019-06-10 11:59:00,Research,"$5,000",250,False
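Some reward values in this output contain commas inside quotes (for example "$100,000"), so splitting each line on commas would give the wrong number of columns. If you want to work with the output in the notebook, here is a minimal sketch using Python's standard csv module. The function name parse_competitions is hypothetical, and it assumes you have captured the command output as a string.

```python
import csv
import io


def parse_competitions(csv_text):
    """Parse the text printed by `kaggle competitions list --csv`.

    Returns one dict per competition, keyed by the header row. All values
    stay strings; quoted fields such as "$100,000" are kept as one field.
    """
    return list(csv.DictReader(io.StringIO(csv_text)))
```

Each returned row can then be read by column name, such as row["ref"] or row["reward"].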

It looks like the home directory in Colab changed recently from /content to /root. Using ~ in paths to refer to HOME, rather than hard-coding /content, will fix it.

I've updated the step-by-step workflow in this answer to reflect the changes. Sorry for the trouble!

Make sure you have installed the Kaggle API first: pip install kaggle. Then grab your API token from https://www.kaggle.com/kaggle_user_name/account.

Then just download the data for the competition (here, dogs-vs-cats-redux-kernels-edition):

! mkdir -p /root/.kaggle
! touch /root/.kaggle/kaggle.json
! chmod 600 /root/.kaggle/kaggle.json
! echo '{"username":"kaggle_user_name","key":"0000000000000000000000000000000000"}' > /root/.kaggle/kaggle.json
! kaggle competitions download -c "dogs-vs-cats-redux-kernels-edition"