How to Upload Data on Google Colab

Data science is nothing without data. Yes, that's obvious. What is not so obvious is the series of steps involved in getting the data into a format that allows you to explore it. You may be in possession of a dataset in CSV format (short for comma-separated values) but have no idea what to do next. This post will help you get started in data science by showing you how to load your CSV file into Colab.

Colab (short for Colaboratory) is a free platform from Google that allows users to code in Python. Colab is essentially the Google Suite version of a Jupyter Notebook. Some of the advantages of Colab over Jupyter include easier installation of packages and sharing of documents. However, loading files such as CSV files requires some extra coding. I will show you three ways to load a CSV file into Colab and insert it into a Pandas dataframe.

(Note: there are Python packages that carry common datasets in them. I will not discuss loading those datasets in this article.)

To start, log into your Google Account and go to Google Drive. Click on the New button on the left and select Colaboratory if it is installed (if not, click on Connect more apps, search for Colaboratory and install it). From there, import Pandas as shown below (Colab has it installed already).

    import pandas as pd

1) From GitHub (Files < 25MB)

The easiest way to upload a CSV file is from your GitHub repository. Click on the dataset in your repository, then click on View Raw. Copy the link to the raw dataset and store it as a string variable called url in Colab as shown below (a cleaner method, but it's not necessary). The last step is to pass the url to Pandas read_csv to get the dataframe.

    url = 'copied_raw_GH_link'
    df1 = pd.read_csv(url)
    # Dataset is now stored in a Pandas Dataframe
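
If you only have the regular GitHub page link to the file rather than the raw link, the raw link can usually be derived from it. The helper below is a minimal sketch of that conversion, not an official GitHub API: the function name github_raw_url and the example repository are hypothetical, and it assumes a public file whose page link has the form github.com/owner/repo/blob/branch/path.

```python
from urllib.parse import urlparse


def github_raw_url(blob_url):
    """Convert a GitHub file page link into the matching raw-content link.

    Hypothetical helper: assumes the link looks like
    https://github.com/<owner>/<repo>/blob/<branch>/<path to file>.
    """
    parsed = urlparse(blob_url)
    parts = parsed.path.strip("/").split("/")
    # Expect at least: owner, repo, "blob", branch, filename
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        raise ValueError("Not a GitHub file page link: " + blob_url)
    owner, repo, _blob, *rest = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


# Example with a made-up repository
url = github_raw_url(
    "https://github.com/example-user/example-repo/blob/main/data/sample.csv"
)
print(url)
```

The resulting url can then be passed to pd.read_csv exactly as in the snippet above.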

2) From a local drive

To upload from your local drive, start with the following code:

    from google.colab import files
    uploaded = files.upload()

It will prompt you to select a file. Click on "Choose Files", then select and upload the file. Wait for the file to be 100% uploaded. You should see the name of the file once Colab has uploaded it.

Finally, type in the following code to import it into a dataframe (make sure the filename matches the name of the uploaded file).

    import io
    df2 = pd.read_csv(io.BytesIO(uploaded['Filename.csv']))
    # Dataset is now stored in a Pandas Dataframe
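
The reason for io.BytesIO is that files.upload() returns a dictionary mapping each uploaded filename to its raw bytes, while read_csv expects a path or a file-like object. The sketch below avoids hardcoding the filename when only one file was uploaded. The helper name uploaded_to_buffer is my own, and the dictionary here is a stand-in for the real upload result so that the example runs outside Colab.

```python
import io


def uploaded_to_buffer(uploaded, filename=None):
    """Return (name, file-like buffer) for one entry of an upload dictionary.

    Hypothetical helper: `uploaded` is assumed to map filenames to bytes,
    as the dictionary returned by files.upload() does.
    """
    if not uploaded:
        raise ValueError("No files were uploaded")
    if filename is None:
        if len(uploaded) != 1:
            raise ValueError("Several files uploaded; pass filename explicitly")
        filename = next(iter(uploaded))
    if filename not in uploaded:
        raise KeyError(filename + " not found; uploaded: " + ", ".join(uploaded))
    return filename, io.BytesIO(uploaded[filename])


# Stand-in for the dictionary that files.upload() would return in Colab
fake_uploaded = {"Filename.csv": b"city,population\nOslo,700000\n"}
name, buffer = uploaded_to_buffer(fake_uploaded)
print(name)
```

In Colab, you would pass the real uploaded dictionary and then call pd.read_csv(buffer).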

3) From Google Drive via PyDrive

This is the most complicated of the three methods. I'll show it for those who have uploaded CSV files into their Google Drive for workflow control. First, type in the following code:

    # Code to read csv file into Colaboratory:
    !pip install -U -q PyDrive
    from pydrive.auth import GoogleAuth
    from pydrive.drive import GoogleDrive
    from google.colab import auth
    from oauth2client.client import GoogleCredentials
    # Authenticate and create the PyDrive client.
    auth.authenticate_user()
    gauth = GoogleAuth()
    gauth.credentials = GoogleCredentials.get_application_default()
    drive = GoogleDrive(gauth)

When prompted, click on the link to get authentication to allow Google to access your Drive. You should see a screen with "Google Cloud SDK wants to access your Google Account" at the top. After you allow permission, copy the given verification code and paste it in the box in Colab.

Once you have completed verification, go to the CSV file in Google Drive, right-click on it and select "Get shareable link". The link will be copied to your clipboard. Paste this link into a string variable in Colab.

    link = 'https://drive.google.com/open?id=1DPZZQ43w8brRhbEMolgLqOWKbZbE-IQu' # The shareable link

What you want is the id portion after the equals sign. To get that portion, type in the following code:

    fluff, id = link.split('=')
    print(id)  # Verify that you have everything after '='
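
Splitting on the equals sign works for links of the open?id=... form shown here. Shareable links can also come in a /file/d/FILE_ID/view form, where a plain split would pick up the wrong piece. As a hedged alternative, the sketch below parses the link with the standard library and handles both shapes; the helper name drive_file_id and the second example link are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def drive_file_id(link):
    """Extract the file id from a Google Drive shareable link.

    Hypothetical helper covering two link shapes:
      .../open?id=<FILE_ID>
      .../file/d/<FILE_ID>/view?usp=sharing
    """
    parsed = urlparse(link)
    query = parse_qs(parsed.query)
    if "id" in query:
        return query["id"][0]
    parts = parsed.path.split("/")
    if "d" in parts:
        position = parts.index("d") + 1
        if position < len(parts) and parts[position]:
            return parts[position]
    raise ValueError("Could not find a file id in: " + link)


# The link format used in this article
print(drive_file_id("https://drive.google.com/open?id=1DPZZQ43w8brRhbEMolgLqOWKbZbE-IQu"))
# A made-up link in the /file/d/ format
print(drive_file_id("https://drive.google.com/file/d/abc123/view?usp=sharing"))
```

The returned value can be assigned to id and used in the next step in the same way.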

Finally, type in the following code to get this file into a dataframe:

    downloaded = drive.CreateFile({'id':id})
    downloaded.GetContentFile('Filename.csv')
    df3 = pd.read_csv('Filename.csv')
    # Dataset is now stored in a Pandas Dataframe

Final Thoughts

These are three approaches to uploading CSV files into Colab. Each has its benefits depending on the size of the file and how one wants to organize the workflow. Once the data is in a nicer format like a Pandas Dataframe, you are ready to go to work.

Bonus Method — My Drive

Thank you so much for your support. In honor of this article reaching 50k Views and 25k Reads, I'm offering a bonus method for getting CSV files into Colab. This one is quite simple and clean. In your Google Drive ("My Drive"), create a folder called data in the location of your choosing. This is where you will upload your data.

From a Colab notebook, type the following:

    from google.colab import drive
    drive.mount('/content/drive')

Just like with the third method, the commands will bring you to a Google Authentication step. You should see a screen with "Google Drive File Stream wants to access your Google Account". After you allow permission, copy the given verification code and paste it in the box in Colab.

In the notebook, click on the charcoal > on the top left of the notebook and click on Files. Locate the data folder you created earlier and find your data. Right-click on your data and select Copy Path. Store this copied path into a variable and you are ready to go.

    path = "copied path"
    df_bonus = pd.read_csv(path)
    # Dataset is now stored in a Pandas Dataframe

What is great about this method is that you can access a dataset from a separate dataset folder you created in your own Google Drive without the extra steps involved in the third method.
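
If you would rather build the path in code than copy it from the Files panel, a small helper can assemble it. This is only a sketch under stated assumptions: the name drive_csv_path and the file sales.csv are hypothetical, the mount point is taken to be /content/drive, and the top-level folder is assumed to appear as MyDrive under the mount point (check the Files panel, since this name may differ in your session).

```python
from pathlib import PurePosixPath


def drive_csv_path(filename, folder="data",
                   mount_point="/content/drive", drive_root="MyDrive"):
    """Build the path to a file inside a folder of a mounted Google Drive.

    Hypothetical helper: `drive_root` is an assumption about how
    "My Drive" is named under the mount point.
    """
    return str(PurePosixPath(mount_point) / drive_root / folder / filename)


# Made-up filename inside the data folder created earlier
path = drive_csv_path("sales.csv")
print(path)
```

The resulting path can be passed to pd.read_csv in the same way as a copied path.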

Source: https://towardsdatascience.com/3-ways-to-load-csv-files-into-colab-7c14fcbdcb92
