[Python] A Beginner’s Guide to Google Colab: From Setup to Deep Learning using Hugging Face

If you’re new to Python, data science, or deep learning, Google Colab is one of the easiest places to start. It’s like a notebook you can use in your browser without installing anything on your computer — and it even gives you free access to GPUs for deep learning.

👩🏻‍💻 What is Google Colab?

Google Colaboratory (or Colab) is a free Jupyter Notebook environment provided by Google. You can:

  • Write and run Python code in your browser
  • Use free GPUs and TPUs for faster computation
  • Save your work to Google Drive
  • Share notebooks with others easily

You don’t need to install Python, Jupyter, or anything else. Using your Google account, you can simply go to https://colab.research.google.com and start coding!

Step 1: Mount Google Drive (To Access the Files in Your Drive)

When working in Colab, your files are not on your computer — they’re in the cloud. So if you want to save your work or load a dataset, it’s best to connect your Google Drive.

Python
from google.colab import drive
import os

drive.mount('/content/drive')

Running this cell opens a link asking you to sign in and grant permission. After that, your Google Drive becomes available under /content/drive/MyDrive/.
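
A quick way to confirm the mount worked is to list the top-level contents of your Drive:

Python
import os

# If the mount succeeded, this prints the folders and files in your Drive
print(os.listdir('/content/drive/MyDrive'))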

Step 2: Set a Default Folder (So You Don’t Get Lost)

Let’s say you want to keep all your Colab work in a folder called "Colab Projects" in your Drive. Without setting a folder, you’ll have to write long paths every time.

Python
os.chdir('/content/drive/MyDrive/Colab Projects')
print("Current working directory:", os.getcwd())
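If the "Colab Projects" folder doesn't exist yet, os.chdir will raise an error. A minimal sketch that creates the folder first:

Python
import os

folder = '/content/drive/MyDrive/Colab Projects'
os.makedirs(folder, exist_ok=True)  # create the folder if it's missing
os.chdir(folder)
print("Current working directory:", os.getcwd())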

Step 3: Turn on GPU or TPU (for Deep Learning!)

By default, Colab uses a regular CPU. To run deep learning models, we want a GPU (Graphics Processing Unit) or TPU (Tensor Processing Unit). To turn one on:

  1. Go to the top menu: Runtime > Change runtime type
  2. Under “Hardware accelerator,” select GPU or TPU
  3. Click “Save”

You can check whether the GPU is active:

Python
import torch
print(torch.cuda.is_available())  # Should be True

When using Google Colab for deep learning or LLM tasks, the type of GPU you get matters — it can dramatically affect training speed, memory availability, and what size model you can run.

The availability of each GPU depends on your account status, usage limits, and system load. You may not always see or be able to select high-end GPUs like the A100.

▶️ Colab GPU Types

GPU Type | VRAM (approx.) | Speed | Notes
T4 | 16 GB | Moderate | Common for free and Pro users. Good for small to mid-size models.
L4 | 24 GB | Faster than T4 | Rare, newer. Sometimes shows up for Pro users.
A100 | 40–80 GB | Very fast | Only available to Pro users with compute units. Ideal for LLMs.
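
To see which GPU your session was actually assigned, you can query it directly:

Python
!nvidia-smi  # shows the GPU model, total VRAM, and current usage

import torch
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4"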

Step 4: Install Deep Learning Libraries

Colab comes with some libraries pre-installed, but it’s good practice to install the versions you need.

Python
!pip install torch torchvision
!pip install tensorflow
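
Since Colab's pre-installed versions change over time, it's worth confirming what you actually got:

Python
import torch
import tensorflow as tf

print("PyTorch:", torch.__version__)
print("TensorFlow:", tf.__version__)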

Step 5: Use Hugging Face Transformers for NLP

If you’re working on Natural Language Processing (NLP), including Large Language Models, you’ve probably heard of BERT, GPT, T5… These are all available through Hugging Face. Hugging Face is a company and open-source platform that makes it easy to use AI models—especially models that understand and generate human language. Think of it as a library full of pre-trained AI brains that you can download and use in your own projects with just a few lines of code.

Whether you want to:

  • Summarize a news article
  • Translate text into another language
  • Analyze emotions in social media posts
  • Chat with an AI assistant

Hugging Face gives you access to the tools and models you need—many of them built by researchers and organizations around the world.

Python
!pip install transformers 

from transformers import pipeline

# Initialize the classifier
classifier = pipeline(
    task="text-classification",
    model="SamLowe/roberta-base-go_emotions",
    top_k=None  # Returns scores for all labels
)

# Sample text
text = "I'm thrilled with the results!"

# Perform classification
results = classifier(text)

# Output the results
for result in results[0]:
    print(f"Label: {result['label']}, Score: {result['score']:.4f}")

This prints a confidence score for each emotion label the model supports, for your text 🙂
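
The same pipeline API covers the other tasks listed above. As a rough sketch, summarization looks like this (if you don't pass a model name, pipeline downloads a default model for the task):

Python
from transformers import pipeline

summarizer = pipeline(task="summarization")  # uses the task's default model

article = "Google Colab is a free, browser-based Jupyter environment. ..."  # paste a longer article here
summary = summarizer(article, max_length=60, min_length=10)
print(summary[0]['summary_text'])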

▶️ TIP! Loading Large Language Models in your own Colab:

Colab isn’t just for small models — you can also use it to run Large Language Models (LLMs) like GPT-style transformers for text generation, summarization, chat interfaces, and more.

For example, we can generate text using the following code:

Python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "mistralai/Mistral-7B-v0.1"  # Or any other LLM on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "In the future, artificial intelligence will"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

This will generate coherent text using a real 7B-parameter open-source model. If you run into out-of-memory issues, try smaller models like TinyLlama, Phi-2, or use 4-bit quantized versions.
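
As a rough sketch of the 4-bit route, here is one way to load the same example model with quantization (this assumes the bitsandbytes and accelerate packages; exact memory savings depend on the model):

Python
!pip install bitsandbytes accelerate

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_name = "mistralai/Mistral-7B-v0.1"  # same example model as above

# Load the weights in 4-bit to cut GPU memory use substantially
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)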

Before loading any LLM, it’s smart to check how well it performs. Hugging Face hosts a public leaderboard that ranks open-source LLMs based on multiple benchmarks (MMLU, GSM8K, HumanEval, etc.).

https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard

Here is a brief introduction to the key columns:

  • Average score: Higher is better across general tasks
  • Architecture: Mistral, LLaMA, Gemma, Mixtral, etc.
  • Size: Balance between performance and memory usage (7B models are a good fit for Colab; anything larger is likely to hit out-of-memory errors)
