Configuring Google Colab to use a GPU and installing Unsloth gives you a ready-made environment for fine-tuning 7B–8B parameter models without a local GPU. Unsloth is designed to make LoRA/QLoRA training faster (often 2× or more) and to use significantly less VRAM than standard Hugging Face + PEFT setups, so a free Colab T4 (15GB VRAM) or an A100 with Colab Pro is enough to run this codelab.
Why Colab + Unsloth?
Fine-tuning LLMs usually demands powerful hardware. Colab plus Unsloth changes that:
- No local GPU: Training runs entirely in the browser; you don't need a powerful machine.
- Free tier: Colab's free T4 GPU is enough for QLoRA fine-tuning of 8B models with Unsloth.
- Unsloth: Uses optimized kernels and memory layouts so that LoRA/QLoRA training is faster and fits in less VRAM than default PyTorch/transformers pipelines.
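To see why a free T4 is plausible for an 8B model, a back-of-the-envelope VRAM estimate helps. The numbers below are rough illustrative figures (the adapter size and byte counts are assumptions, not Unsloth's exact memory accounting):

```python
# Rough QLoRA VRAM estimate for an 8B-parameter model (illustrative only).
params = 8e9              # base model parameters
bits_per_weight = 4       # 4-bit (NF4) quantized base weights
lora_params = 42e6        # example adapter size; varies with rank and target modules

weights_gb = params * bits_per_weight / 8 / 1e9
# Adapter weights (fp16, 2 B) + gradients (fp16, 2 B) + two AdamW states (fp32, 4 B each)
adapter_gb = lora_params * (2 + 2 + 4 + 4) / 1e9

total_gb = weights_gb + adapter_gb
print(f"~{weights_gb:.1f} GB quantized weights + ~{adapter_gb:.2f} GB adapter state "
      f"= ~{total_gb:.1f} GB before activations")
```

Activations and CUDA overhead add a few more gigabytes on top, which is why the quantized base model has to stay small for everything to fit in a T4's ~15GB.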
Enabling the GPU runtime
Colab notebooks default to a CPU runtime. You need to switch to a GPU so PyTorch and Unsloth can use CUDA.
- Go to Google Colab and sign in with your Google account.
- Create a new notebook: File → New notebook.
- Open Runtime → Change runtime type.
- Set Hardware accelerator to T4 GPU (or A100 if you have Colab Pro).
- Click Save.
After the runtime restarts, the notebook is attached to a VM with a GPU. Confirm it in a cell:
# Confirm GPU is attached and visible to the system
!nvidia-smi
You should see a Tesla T4 (or similar) and about 15GB VRAM. Keep this runtime active for the rest of the codelab.
Installing Unsloth and dependencies
Unsloth must be installed in the same environment as transformers, datasets, accelerate, and whichever trainer you use (for example the standard Hugging Face Trainer). In Colab, the recommended way is to install from PyPI.
PyPI install
Run the following in a single Colab cell. Installation can take a few minutes.
# Install Unsloth for Colab (Python 3.10, CUDA 12.x)
# See https://docs.unsloth.ai/get-started/install for other platforms
!pip install unsloth
Unsloth's pip package typically pulls in compatible versions of transformers, torch, datasets, and related libraries.
Alternative: install from source
If you hit version conflicts or need a specific commit, you can install from the Unsloth repository with the Colab extra:
# Optional: install from source with colab-new extra for reproducibility
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
For the official one-line install commands (pip or uv), check the Unsloth installation page.
Verifying the environment
After installation, confirm that PyTorch sees the GPU and that Unsloth imports correctly.
import torch
print("CUDA available:", torch.cuda.is_available())
print("Device:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU")
import unsloth
print("Unsloth version:", unsloth.__version__)
You should see CUDA available: True, the GPU name (e.g. Tesla T4), and an Unsloth version string. If any of these fail, ensure the runtime is set to GPU and that you restarted the runtime after changing it.
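When an import fails, it can help to check which packages are actually installed before re-running the install cell. The stdlib-only helper below is a hypothetical convenience (modules_available is not part of Unsloth or Colab); it reports presence without triggering a potentially slow or failing import:

```python
import importlib.util

def modules_available(*names: str) -> dict:
    """Map each top-level module name to whether it can be imported here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Example: modules_available("torch", "unsloth", "transformers")
```

If a package shows as missing even after installation, restart the runtime (Runtime → Restart runtime) so the new packages are picked up.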
Optional: Hugging Face login
If you plan to save or push the fine-tuned adapter to the Hugging Face Hub, log in once in the notebook:
from huggingface_hub import login
# Get your token from https://huggingface.co/settings/tokens
login(token="YOUR_HF_TOKEN") # or use notebook_login() for an interactive prompt
You can skip this now and add it before the step where you save or push the model.
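Rather than pasting the token into a cell (where it can end up in a shared notebook), you can read it from an environment variable or a hidden prompt. The helper below is a sketch: get_hf_token is a hypothetical name, though HF_TOKEN is a widely used environment variable for this purpose.

```python
import os
from getpass import getpass

def get_hf_token() -> str:
    """Return a Hugging Face token without hard-coding it in the notebook.

    Checks the HF_TOKEN environment variable first (e.g. set via Colab's
    Secrets panel), then falls back to a hidden interactive prompt.
    """
    token = os.environ.get("HF_TOKEN")
    if not token:
        token = getpass("Hugging Face token: ")
    return token
```

You would then call login(token=get_hf_token()) in place of the hard-coded string.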
Summary
You now have a Colab runtime with a GPU and Unsloth installed. The same setup will be used in the next steps to load a base model, prepare your dataset, and run training.
Key takeaways:
- GPU runtime — Colab must be set to T4 GPU (or A100) via Runtime → Change runtime type.
- Unsloth — Install with pip install unsloth; it brings in compatible transformers, torch, and datasets.
- Verification — Use torch.cuda.is_available() and import unsloth to confirm the environment before continuing.
In the next step you will choose a base model and load it with Unsloth's FastLanguageModel in 4-bit (QLoRA) or 16-bit (LoRA) so it fits in Colab's VRAM.