[Python] Beginner’s Guide on Google Colab

Before you can analyze data, you need a place to write and run code. Traditionally, this meant installing software on your computer, which involves downloading programs, configuring environment variables, managing package dependencies, and troubleshooting version conflicts. Google Colab eliminates this entirely. It is a free, browser-based coding environment that runs Python (and R) without any installation. If you can use Google Docs, you can use Colab 👩🏻‍💻

Cloud Computing

When you run Python on your laptop, your computer does all the work. Your processor executes calculations, your RAM (Random Access Memory) holds data temporarily while the program runs, and your hard drive stores files. RAM size matters: if your dataset exceeds available RAM, your program will crash or freeze. A laptop with 8GB of RAM may struggle with datasets containing hundreds of thousands of rows.
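As a rough back-of-the-envelope check, you can estimate how much RAM a table needs before you load it. The sketch below assumes every value is stored as an 8-byte number (the case for pandas' default int64 and float64 columns; text columns usually take more). The function name and the row and column counts are made up for illustration.

```python
def estimate_table_gb(n_rows, n_cols, bytes_per_value=8):
    """Rough size of a purely numeric table in gigabytes (1 GB = 1024**3 bytes)."""
    return n_rows * n_cols * bytes_per_value / (1024 ** 3)

# Hypothetical survey: 500,000 rows and 200 numeric columns
size_gb = estimate_table_gb(500_000, 200)
print(f"Estimated size: {size_gb:.2f} GB")
```

Operations such as merging or copying a table can temporarily need several times this amount, which is why a laptop can run out of memory well before the raw data size reaches its installed RAM.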

Cloud computing means using remote servers over the internet. When you run code in Google Colab, your code executes on Google’s servers, not on your laptop. Your browser is simply a window into that remote machine. Think of it like this: instead of cooking in your tiny apartment kitchen, you’re renting time in a professional commercial kitchen with industrial equipment. You still decide what to cook and how to cook it, but the heavy lifting happens in a better-equipped space.

| | Local Computing | Cloud Computing (Colab) |
| --- | --- | --- |
| Where code runs | Your computer | Google’s servers |
| Setup required | Install Python, configure paths, manage dependencies | None (browser only) |
| RAM availability | Limited by your hardware (typically 8-16 GB) | ~12 GB free tier, up to 51 GB with Pro |
| Hardware constraints | Fixed to your machine’s specs | Access to GPU/TPU for intensive tasks |
| File access | Local files directly | Via Google Drive or upload |
| Internet required | No | Yes |
| Session persistence | Always available | Disconnects after ~90 min idle |
| Cost | Free (uses your hardware) | Free tier available; Pro for more resources |

The tradeoff is that you need an internet connection, and Google imposes time limits on free usage (your session disconnects after a period of inactivity). For learning and most practical analyses, these limitations rarely matter. If you need more RAM or more sessions, you can upgrade to the paid tier or choose the pay-as-you-go option. Google Colab also offers a no-cost Pro membership for students and educators.

https://colab.research.google.com/signup

Although Colab is set to Python by default, it also supports R and Julia. R is popular in statistics and academic research; if you’ve used R before or encounter R code in published research, you can run it in Colab by creating a new notebook and switching its language (Runtime → Change runtime type → R). The concepts in this guide (cells, execution, file management) work the same way regardless of language.

Getting Started: Creating Your First Notebook

Go to colab.research.google.com and sign in with your Google account. Click “New notebook” to create a blank notebook.

A Colab notebook contains both code and text, organized into “cells.” Each cell is a block you can run independently. This differs from traditional programming where you write an entire program and run it all at once. The cell-based approach lets you work incrementally: write code, run it, see the result, then continue. When something goes wrong, you only need to fix that specific cell.

Your new notebook opens with one empty code cell. Type the following and press Shift + Enter to run it:

Python
print("Hello, Social World!")

You should see “Hello, Social World!” appear below the cell. The “Hello World” program is a tradition in programming education dating back to the 1970s, a simple test to confirm everything is working 🙂
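To see the incremental, cell-by-cell workflow in action, here is a minimal sketch of three small cells. The comment lines mark where each cell would begin in a notebook, and the ages are made-up values. A variable defined in one cell stays available to every cell you run afterwards in the same session.

```python
# --- Cell 1: define some data (hypothetical values) ---
ages = [34, 29, 41, 52]

# --- Cell 2: use the variable defined in cell 1 ---
average_age = sum(ages) / len(ages)
print(average_age)

# --- Cell 3: if this cell fails, fix it and re-run only this cell ---
oldest = max(ages)
print(oldest)
```

If cell 3 contained a typo, cells 1 and 2 would not need to be re-run after the fix, because their results are still in memory.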

Understanding Cells: Code vs. Text

Colab notebooks contain two types of cells.

Code cells contain Python instructions that the computer executes. When you run a code cell, Python reads your instructions, performs the requested operations, and displays any output. Code cells have a gray background and a play button (▶) on the left.

Text cells (also called Markdown cells) contain human-readable notes, explanations, and documentation. They don’t execute as code—they’re just for communication. You might use text cells to explain your analysis methodology, document your data sources, or write interpretations of your results.

To add a new cell, you can use the “+ Code” or “+ Text” buttons in the toolbar. But keyboard shortcuts are faster once you learn them.

Essential Keyboard Shortcuts

Learning a few keyboard shortcuts will dramatically speed up your workflow. In Colab, there are two modes: edit mode (when you’re typing inside a cell) and command mode (when you’ve clicked outside a cell or pressed Escape).

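In command mode (press Escape first). Note that Colab’s default bindings put Ctrl + M in front of each of these keys (for example, Ctrl + M then B); open Tools → Keyboard shortcuts to check which bindings your notebook uses: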

B : Insert a new cell below the current cell
A : Insert a new cell above the current cell
M : Convert the current cell to Markdown (text)
Y : Convert the current cell to code
D D : Delete the current cell (press D twice)
Z : Undo cell deletion

In any mode:

Shift + Enter : Run the current cell and move to the next one
Ctrl + Enter : Run the current cell and stay on it
Ctrl + S : Save the notebook

Try this sequence: press Escape to enter command mode, then press B to add a new cell below. Press M to convert it to a text cell. Now click into the cell and type some text.

Markdown: Formatting Your Text Cells

When you create a text cell, you write in Markdown. Markdown lets you create headers, bold text, links, and more using plain text symbols. When you share an analysis with your supervisor or return to your own work six months later, the text cells explain what each section does and why.

Here are the basics:

Markdown
# This is a large header (like a title)
## This is a medium header (like a section)
### This is a small header (like a subsection)

Regular text just appears as regular text.

**This text will be bold**
*This text will be italic*

You can create links like this: [link text](https://example.com)

The # symbol creates headers. One # gives the largest header, two (##) a smaller one, and three (###) a smaller one still. Use headers to organize your notebook into sections like “Data Import,” “Cleaning,” “Analysis,” and “Results.” You can also embed images in text cells.

After typing Markdown, press Shift + Enter to render it. The raw symbols disappear and you see the formatted text. Double-click the cell to edit it again.

Comments in Code: The # Symbol

Inside code cells, the # symbol has a different purpose: it creates comments. Any text after # on a line is ignored by Python.

Python
# This entire line is a comment - Python ignores it
client_age = 34  # This comment explains what this line does

Comments serve a different purpose than Markdown text cells. Text cells are for high-level explanations aimed at readers. Comments are for low-level notes embedded within code, often explaining specific technical decisions. Good code comments explain why, not what. The code itself shows what happens; comments explain the reasoning:

Python
# Using median instead of mean because income data has extreme outliers
central_tendency = df['income'].median()

# PHQ-9 cutoff of 10 based on Kroenke et al. (2001) clinical guidelines
df['depression_flag'] = df['phq9_score'] >= 10

Configuring Your Runtime

Remember that Colab runs on Google’s servers, not your computer. The “runtime” is the virtual machine Google allocates to run your code. You can configure this machine’s specifications.

Click Runtime → Change runtime type to see your options.

Hardware accelerator: For most data analysis, you’ll use “None” (meaning CPU only). GPU (Graphics Processing Unit) and TPU (Tensor Processing Unit) accelerate machine learning computations.

RAM: The free tier provides around 12 GB of RAM. If you’re working with large datasets and run out of memory, you might see errors like “Your session crashed after using all available RAM.” Colab Pro and Pro+ subscriptions offer more RAM (up to 51 GB), but for learning and most practical analyses, the free tier is sufficient.

Disk space: Each session provides temporary disk storage. Files you create exist only during your session unless you save them to Google Drive.

To check your current resources, you can run:

Python
# Check total and currently available RAM
import psutil
mem = psutil.virtual_memory()
print(f"Total RAM: {mem.total / (1024**3):.1f} GB")
print(f"Available RAM: {mem.available / (1024**3):.1f} GB")

You can also check your session’s RAM and disk usage by clicking the RAM/Disk indicator near the top-right of the notebook. It shows how much RAM and disk space is in use in real time.
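Disk space can be checked from code as well. This is a minimal sketch using Python's standard shutil module; it reports on the root file system of whatever machine runs it, so the numbers in Colab describe the temporary session disk, not your Google Drive.

```python
import shutil

# Disk usage for the root file system of the current machine
usage = shutil.disk_usage("/")
gb = 1024 ** 3
print(f"Disk total: {usage.total / gb:.1f} GB")
print(f"Disk used:  {usage.used / gb:.1f} GB")
print(f"Disk free:  {usage.free / gb:.1f} GB")
```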

Connecting to Google Drive

Your Colab session is a temporary virtual computer. When the session ends, any files you created disappear. To preserve your work, you need to connect Colab to permanent storage: specifically, your Google Drive.

This connection is called “mounting.” The metaphor comes from how operating systems handle external storage: when you plug in a USB drive, the system “mounts” it, making its contents accessible through the file system. Mounting Google Drive makes your Drive folders accessible to your Colab code.

Run this code to mount your Drive:

Python
from google.colab import drive
drive.mount('/content/drive')

A popup will ask you to authorize access. Select your Google account and grant permission. After authorization, you’ll see “Mounted at /content/drive” printed below the cell.

Your Drive is then accessible at /content/drive/MyDrive/. You can browse it visually by clicking the folder icon (📁) in the left sidebar. Navigate through the folders and right-click any file to copy its path.

I recommend putting your mount command at the top of every notebook. Each session is a fresh machine without Drive connected, so you need to mount every time you start working.
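One way to make that first cell safe to re-run is to mount only when Drive is not already available. The sketch below is an illustration, not an official pattern: the helper name mount_drive_if_needed is hypothetical, and it assumes the /content/drive mount point used above. Outside Colab the google.colab import fails, so the function simply reports that and returns False.

```python
import os

DRIVE_ROOT = "/content/drive"  # same mount point used earlier in this guide

def mount_drive_if_needed(mount_point=DRIVE_ROOT):
    """Mount Google Drive in Colab; return True if Drive is available afterwards."""
    if os.path.isdir(os.path.join(mount_point, "MyDrive")):
        return True  # already mounted in this session
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        print("Not running in Colab; skipping Drive mount.")
        return False
    drive.mount(mount_point)
    return os.path.isdir(os.path.join(mount_point, "MyDrive"))

drive_ready = mount_drive_if_needed()
```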

Reading Data Files

Once Drive is mounted, you can read files using pandas, Python’s primary library for working with tabular data.

Python
import pandas as pd

# Read a CSV file from Google Drive
df = pd.read_csv('/content/drive/MyDrive/SW672/client_data.csv')

# Display the first 5 rows
df.head()

The pd.read_csv() function reads a CSV file and creates a DataFrame—pandas’ structure for working with tables. The .head() method shows the first five rows so you can verify the data loaded correctly.
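Beyond .head(), a few quick checks help confirm that the data loaded as expected. The sketch below uses a small made-up CSV held in a string (via io.StringIO) so it runs without any file; with real data you would pass the file path to pd.read_csv() instead. The column names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file; in practice pass a file path instead
csv_text = """client_id,age,phq9_score
1,34,12
2,29,
3,41,7
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)         # (number of rows, number of columns)
print(df.dtypes)        # data type pandas inferred for each column
print(df.isna().sum())  # count of missing values per column
```

A row count that is far off, or a numeric column that shows up as text, is usually the first sign that something went wrong during import.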

For Excel files, use pd.read_excel():

Python
# Excel files require the openpyxl package
df = pd.read_excel('/content/drive/MyDrive/SW672/survey_responses.xlsx')

Uploading Files Directly

For quick, one-time analyses, you can upload files directly without mounting Drive:

Python
from google.colab import files

uploaded = files.upload()  # Opens a file picker dialog

After running this, click the “Choose Files” button and select a file from your computer. The file uploads to the session’s temporary storage and you can read it by filename alone:

Python
df = pd.read_csv('uploaded_file.csv')
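The value returned by files.upload() is a dictionary that maps each uploaded filename to its contents as bytes, so you can also read the data without typing the filename. The sketch below assumes that structure and simulates it with a hand-made dictionary so it runs outside Colab; the helper name and the file contents are hypothetical.

```python
import io
import pandas as pd

def read_first_uploaded_csv(uploaded):
    """Load the first entry of a files.upload() result into a DataFrame."""
    filename, content = next(iter(uploaded.items()))
    print(f"Reading {filename} ({len(content)} bytes)")
    return pd.read_csv(io.BytesIO(content))

# Simulated upload result so the sketch runs outside Colab (hypothetical file)
fake_upload = {"uploaded_file.csv": b"client_id,age\n1,34\n2,29\n"}
df_uploaded = read_first_uploaded_csv(fake_upload)
print(df_uploaded)
```

In a notebook you would call read_first_uploaded_csv(uploaded) with the dictionary returned by files.upload().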

Remember: directly uploaded files disappear when your session ends. For anything you want to keep, use Google Drive.

Saving Your Work

Saving happens at two levels: saving your notebook and saving data files.

Notebooks save automatically to your Google Drive under “Colab Notebooks.” You can also save manually with Ctrl + S or File → Save. Notebooks persist even when sessions end—you’re saving the document, not the running session.

Data files that your code creates need explicit saving. To save a DataFrame to Drive:

Python
# Save DataFrame as CSV to Google Drive
df.to_csv('/content/drive/MyDrive/SW672/analysis_results.csv', index=False)
print("Saved successfully!")

The index=False parameter prevents pandas from adding an extra column of row numbers.
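To see what index=False changes, compare the two outputs side by side. This sketch uses a tiny made-up DataFrame and calls to_csv() without a path, which returns the CSV text as a string instead of writing a file.

```python
import pandas as pd

df_demo = pd.DataFrame({"client_id": [101, 102], "age": [34, 29]})  # made-up data

# With no path argument, to_csv() returns the CSV text instead of writing a file
with_index = df_demo.to_csv()
without_index = df_demo.to_csv(index=False)

print(with_index)     # header starts with an empty column name for the index
print(without_index)  # header contains only the real columns
```

The unnamed first column in the default output is what later shows up as "Unnamed: 0" when the file is read back with pd.read_csv().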

To download a file directly to your local computer:

Python
from google.colab import files

df.to_csv('results.csv', index=False)
files.download('results.csv')

Installing Additional Packages

Colab comes with common data science packages pre-installed (pandas, numpy, matplotlib, seaborn, scikit-learn). Occasionally you’ll need something else.

The ! prefix tells Colab to run a system command rather than Python code. To install a package:

Python
!pip install openpyxl  # For reading Excel files
!pip install geopandas  # For geographic data

Installed packages persist only for the current session. If you rely on additional packages, put the install commands at the top of your notebook so they run whenever you start a new session.
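If you prefer not to reinstall a package that is already present, one option is to check for it first. The sketch below is an assumption-laden illustration: ensure_package is a hypothetical helper, and it calls pip through the current Python interpreter, which does the same job as the !pip line in a cell.

```python
import importlib.util
import subprocess
import sys

def ensure_package(import_name, pip_name=None):
    """Install a package with pip only if it cannot already be imported."""
    if importlib.util.find_spec(import_name) is not None:
        return False  # already available, nothing installed
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or import_name]
    )
    return True  # a fresh install happened

# "json" ships with Python, so this call never triggers an install
installed_now = ensure_package("json")
print(installed_now)
```

In a notebook you might call ensure_package("openpyxl") at the top. The optional pip_name argument covers packages whose install name differs from their import name, such as scikit-learn, which is imported as sklearn.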

Troubleshooting Common Issues

“Your session crashed after using all available RAM”

Your data is too large for available memory. Options: load only necessary columns (pd.read_csv(..., usecols=['col1', 'col2'])), process data in chunks, or filter early to reduce size.
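The column-selection and chunking options can be combined. The sketch below generates a made-up 1,000-row CSV in memory so that it runs anywhere; with real data you would pass the file path instead, and the column names and chunk size are only illustrative. Each chunk is an ordinary DataFrame, so only 250 rows are held in memory at a time.

```python
import io
import pandas as pd

# Hypothetical stand-in for a large CSV file; in practice pass a file path
csv_text = "client_id,age,income\n" + "\n".join(
    f"{i},{20 + i % 50},{30000 + i}" for i in range(1000)
)

total_rows = 0
age_sum = 0
# usecols loads only the needed column; chunksize yields 250-row DataFrames
for chunk in pd.read_csv(io.StringIO(csv_text), usecols=["age"], chunksize=250):
    total_rows += len(chunk)
    age_sum += chunk["age"].sum()

mean_age = age_sum / total_rows
print(total_rows, mean_age)
```

This pattern works for statistics that can be accumulated piece by piece (counts, sums, means); operations that need the whole table at once, such as sorting, call for a different approach.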

“FileNotFoundError: No such file or directory”

The path is wrong. Check for typos, verify Drive is mounted, and use the file browser to copy the exact path.
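A small helper can narrow down which part of the path is wrong. This is a sketch using Python's standard pathlib module; check_path is a hypothetical name, and the path passed to it is only an example.

```python
from pathlib import Path

def check_path(path_str):
    """Return True if the file exists; otherwise print hints for fixing the path."""
    path = Path(path_str)
    if path.is_file():
        return True
    print(f"Not found: {path}")
    if path.parent.is_dir():
        # The folder exists, so the filename is probably misspelled
        names = sorted(p.name for p in path.parent.iterdir())
        print("Files in that folder:", names[:20])
    else:
        print(f"Folder does not exist either: {path.parent} (is Drive mounted?)")
    return False

# Hypothetical path for illustration
found = check_path("/content/drive/MyDrive/SW672/client_data.csv")
```

If the folder exists but the file does not, the printed listing usually reveals a typo or a different file extension.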

Session disconnected / “Runtime disconnected”

Your session timed out due to inactivity (around 90 minutes of no interaction) or ran for too long (up to 12 hours for free tier). Reconnect, re-mount Drive, and re-run your cells from the top.

“ModuleNotFoundError: No module named 'xxx'”

The package isn’t installed. Run !pip install xxx and then import again.

A classic programming joke: “The two hardest problems in computer science are cache invalidation, naming things, and off-by-one errors.” You’ll encounter plenty of errors as you learn—this is normal and expected. Each error message, frustrating as it seems, teaches you something about how the system works.

Resources

Google Colab Introduction Notebook: https://colab.research.google.com/notebooks/intro.ipynb
Google Colab FAQ: https://research.google.com/colaboratory/faq.html
Markdown Guide: https://www.markdownguide.org/basic-syntax/
Python Official Tutorial: https://docs.python.org/3/tutorial/
pandas Documentation: https://pandas.pydata.org/docs/

  • December 23, 2025