How to access files from Google Cloud Storage in Colab Notebooks
Recently, while working with a large dataset, I wanted to use Google Cloud Storage in my Colab notebook. In this post, I will show how to access files stored in Google Cloud Storage from a Google Colab notebook.
It turns out that even if your file is public, you can’t simply point curl or wget at its gs:// path to access a file stored on Google Cloud Storage.
Here’s how you can upload and download files on Google Cloud Storage.
Step 1: Authenticate using your Google Account
First, you need to authenticate yourself in Colab. When you run the code below, it will ask you to follow a link to log in and then enter the access token you receive upon successful login.
from google.colab import auth
auth.authenticate_user()
Step 2: Install the GCloud SDK
We will use the gsutil command to upload and download files, so we first need to install the GCloud SDK.
!curl https://sdk.cloud.google.com | bash
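Before running the installer, it can be worth checking whether the runtime already has gsutil on its PATH. Here is a minimal sketch using only the Python standard library; the helper name sdk_tool_available is my own and not part of any Google SDK.

```python
import shutil


def sdk_tool_available(tool="gsutil"):
    """Return True if `tool` can be found on the PATH of the current runtime."""
    return shutil.which(tool) is not None


# Report whether the install step is needed in this runtime.
if sdk_tool_available("gsutil"):
    print("gsutil is already available; the install step can be skipped.")
else:
    print("gsutil was not found; run the install command above.")
```

If the check reports that gsutil is available, you can move straight on to configuring the SDK.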
Step 3: Init the SDK
Next, initialize the SDK to configure the project settings.
!gcloud init
Once you run the above command, it will ask you a few questions to configure the SDK.
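If answering prompts inside a notebook cell feels awkward, one possible alternative is to set the active project directly with gcloud config set project. The sketch below wraps that command from Python; it assumes gcloud is on the PATH, and "my-project-id" is a placeholder rather than a value from this post. The helper names are hypothetical.

```python
import subprocess
from typing import List


def set_project_command(project_id: str) -> List[str]:
    """Build the gcloud command that sets the active project."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return ["gcloud", "config", "set", "project", project_id]


def set_project(project_id: str) -> None:
    """Run the command; this requires gcloud to be installed and on the PATH."""
    subprocess.run(set_project_command(project_id), check=True)


# Show the command that would be run. "my-project-id" is a placeholder.
print(" ".join(set_project_command("my-project-id")))
```

Calling set_project("my-project-id") with your own project ID would then apply the setting without the interactive questions.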
Step 4: Upload and Download files
Finally, you are all set to upload and download files using Google Cloud Storage.
Download file from Cloud Storage to Google Colab
!gsutil cp gs://maskaravivek-data/data_file.csv .
Upload file from Google Colab to Cloud
!gsutil cp test.csv gs://maskaravivek-data/
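If you would rather stay in Python than shell out to gsutil, the google-cloud-storage client library is one option. This is a rough sketch, not the method used in this post: it assumes the library is installed in the runtime, that you have already authenticated as in Step 1, and that you pass your own project ID. The helper names split_gs_uri, download_file and upload_file are my own.

```python
import os


def split_gs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    return bucket, object_name


def download_file(gs_uri, local_path, project_id):
    """Copy one object from Cloud Storage to the local filesystem."""
    # Imported lazily so the helpers above work even without the library.
    from google.cloud import storage

    bucket_name, object_name = split_gs_uri(gs_uri)
    client = storage.Client(project=project_id)
    client.bucket(bucket_name).blob(object_name).download_to_filename(local_path)


def upload_file(local_path, gs_uri, project_id):
    """Copy one local file to Cloud Storage."""
    from google.cloud import storage

    bucket_name, object_name = split_gs_uri(gs_uri)
    # If the destination looks like a folder, reuse the local file name.
    if not object_name or object_name.endswith("/"):
        object_name += os.path.basename(local_path)
    client = storage.Client(project=project_id)
    client.bucket(bucket_name).blob(object_name).upload_from_filename(local_path)


# Example calls (commented out because they need credentials and a real project):
# download_file("gs://maskaravivek-data/data_file.csv", "data_file.csv", "my-project-id")
# upload_file("test.csv", "gs://maskaravivek-data/", "my-project-id")
```

The gsutil commands above are shorter for one-off copies; a wrapper like this is mostly useful when the file names come from Python variables.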
That’s it.
Using Google Cloud Storage might not be the ideal solution for you; most of the time, mounting Google Drive should suffice.
Written on June 8, 2020 by Vivek Maskara.
Originally published on Medium