Best practices in Google Colab for sharing with non-technical teams#

In teams where technical and non-technical profiles coexist, it is common for the technical team to develop Google Colab notebooks for recurring processes: monthly reports, data analyses, repetitive tasks. The problem arises when these processes must be run regularly with small variations (a different month, another department, a new input file) and the responsibility for running them always falls on the technical team.

This dynamic creates unnecessary overhead: the technical team becomes a bottleneck for tasks that, with the right structure, anyone could run autonomously.

The solution is to structure notebooks so that non-technical teams can run them on their own, without risk of altering the logic and without needing to understand the code. This notebook collects the best practices I have found most effective in my own work, and it also serves as a downloadable template you can adapt to your own processes.

Process documentation#

Every shared notebook must start with clear documentation. It is the first impression the user gets and determines whether they can use it without help. The documentation should include: what the process does, what input data it needs (files and location), what output data it generates (files and location) and step-by-step instructions for running it.

Below is an example of documentation for a report generation process.

Process: Monthly departmental report.

Objective: Generate a monthly summary from daily operations data for a department.

Input data: CSV file in My Drive/reports/data/ named data_YYYY_MM.csv (e.g., data_2026_02.csv).

Output data: CSV report in My Drive/reports/results/ named report_DEPARTMENT_YYYY_MM.csv (e.g., report_sales_2026_02.csv).

Instructions:

  1. Run the first code cell and authorize Google Drive access.

  2. Modify the parameters (year, month and department) in the Parameters cell.

  3. From the menu, select Runtime > Run all (or Ctrl+F9).

  4. Wait for completion. Progress will be shown in the last cell.

Imports and Drive initialization#

This block is placed as the first code cell for two reasons:

  1. Google Drive requires user authorization. If this cell comes first, the user grants permission immediately and the rest of the notebook runs without interruptions.

  2. Early error detection: if there is a problem with the Drive connection or any dependency, it is caught before executing any logic.

#@title Imports and Drive connection
import pandas as pd
from datetime import datetime
from pathlib import Path

from google.colab import drive

drive.mount("/content/drive")
print("Drive connected successfully.")
Mounted at /content/drive
Drive connected successfully.

Parameters (non-technical team)#

This is the only cell the non-technical team should modify. Parameters are presented as Colab form fields using the #@param syntax, which generates visual controls (text fields, dropdowns, etc.) that make editing easy without touching code.

An important tip: use closed lists (#@param ["option1", ...]) instead of free text whenever possible. This prevents typos and ensures valid values.

#@title Report parameters
year = 2026 #@param {type:"integer"}
month = 2 #@param {type:"integer"}
department = "sales" #@param ["sales", "marketing", "operations"]

Technical variables (technical team)#

The following cells contain internal configuration that the non-technical team does not need to see or modify. In Google Colab, a cell is hidden by adding #@title as its first line and setting its view to form (cellView: form). This collapses the cell visually, showing only the title.

To configure this:

  • Add #@title Descriptive title as the first line of the code cell.

  • Click the three dots on the cell > Form view.

  • Or set "cellView": "form" in the cell metadata.
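The same setting can also be applied in bulk. As a hedged sketch (assuming the standard .ipynb JSON layout, where each cell carries a metadata dict; the helper name hide_titled_cells is my own), the following marks every cell that starts with #@title as form view by editing the notebook file directly:

```python
import json
from pathlib import Path


def hide_titled_cells(notebook_path):
    """Set cellView: form on every cell whose source starts with #@title."""
    path = Path(notebook_path)
    nb = json.loads(path.read_text())
    for cell in nb.get("cells", []):
        src = cell.get("source") or []
        # Notebook sources are stored as a list of lines or a single string.
        first = src if isinstance(src, str) else (src[0] if src else "")
        if first.startswith("#@title"):
            cell.setdefault("metadata", {})["cellView"] = "form"
    path.write_text(json.dumps(nb, indent=1))
```

This is convenient when a notebook with many technical cells is prepared for hand-off: one pass hides every titled cell at once instead of clicking through each one.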

#@title Technical variables (do not modify)
BASE_DIR = Path("/content/drive/MyDrive/reports")
INPUT_DIR = BASE_DIR / "data"
OUTPUT_DIR = BASE_DIR / "results"

INPUT_FILE = INPUT_DIR / f"data_{year}_{month:02d}.csv"
OUTPUT_FILE = OUTPUT_DIR / f"report_{department}_{year}_{month:02d}.csv"

REQUIRED_COLUMNS = ["date", "department", "category", "amount"]

Processing logic (technical team)#

The processing logic is also hidden. All functionality is encapsulated in functions with descriptive names. This makes maintenance easier for the technical team and prevents the non-technical team from accidentally modifying the logic.

#@title Processing functions (do not modify)

def validate_params(year, month, department):
    """Validates that parameters are correct."""
    errors = []
    if not (2020 <= year <= 2030):
        errors.append(f"Year out of range: {year}")
    if not (1 <= month <= 12):
        errors.append(f"Invalid month: {month}")
    if department not in ["sales", "marketing", "operations"]:
        errors.append(f"Unrecognized department: {department}")
    return errors


def load_data(path):
    """Loads the CSV file and validates columns."""
    if not path.exists():
        raise FileNotFoundError(
            f"File not found: {path}\n"
            f"Please verify the file exists in Google Drive."
        )
    df = pd.read_csv(path)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns in file: {missing}")
    return df


def process_data(df, department):
    """Filters and aggregates data by department."""
    df_filtered = df[df["department"] == department].copy()
    if df_filtered.empty:
        raise ValueError(
            f"No data found for department: {department}"
        )
    summary = (
        df_filtered
        .groupby("category")
        .agg(
            total=("amount", "sum"),
            count=("amount", "count"),
            average=("amount", "mean"),
        )
        .reset_index()
        .sort_values("total", ascending=False)
    )
    return summary


def save_report(df, path):
    """Saves the report to the specified path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, index=False)

Execution#

This cell runs the full process. It remains visible so the non-technical team can verify progress and results. Log messages indicate each step.

print("=" * 50)
print(f"Report: {department} - {year}/{month:02d}")
print("=" * 50)
print()

# Step 1: Validate parameters
print("▶ Validating parameters...")
errors = validate_params(year, month, department)
if errors:
    for error in errors:
        print(f"  ✗ {error}")
    raise SystemExit("Process stopped due to parameter errors.")
print("  ✓ Parameters valid.")

# Step 2: Load data
print(f"▶ Loading data from: {INPUT_FILE.name}")
df = load_data(INPUT_FILE)
print(f"  ✓ {len(df)} records loaded.")

# Step 3: Process data
print(f"▶ Processing data for '{department}'...")
summary = process_data(df, department)
print(f"  ✓ {len(summary)} categories in summary.")

# Step 4: Save report
print(f"▶ Saving report to: {OUTPUT_FILE.name}")
save_report(summary, OUTPUT_FILE)
print("  ✓ Report saved successfully.")

print()
print("=" * 50)
print("Process completed.")
print("=" * 50)
==================================================
Report: sales - 2026/02
==================================================

▶ Validating parameters...
  ✓ Parameters valid.
▶ Loading data from: data_2026_02.csv
  ✓ 300 records loaded.
▶ Processing data for 'sales'...
  ✓ 4 categories in summary.
▶ Saving report to: report_sales_2026_02.csv
  ✓ Report saved successfully.

==================================================
Process completed.
==================================================

Naming conventions#

One of the most important best practices is to use fixed, programmatically generated names for files and folders. This prevents human error and makes automation easier.

Names to avoid#

  • report january.csv: space in the name, missing year, free-form text.

  • Data_Sales_2026.CSV: inconsistent casing, uppercase extension.

  • my report (final) v2.csv: special characters, manual versioning.
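A minimal sketch of programmatic naming, using the data_YYYY_MM.csv pattern this notebook expects (the helper and regex names are my own): generating and validating names in code means they are never typed by hand.

```python
import re

# Expected input pattern data_YYYY_MM.csv (hypothetical validation helper).
NAME_PATTERN = re.compile(r"^data_\d{4}_\d{2}\.csv$")


def build_input_name(year: int, month: int) -> str:
    """Build the input file name with a zero-padded month."""
    return f"data_{year}_{month:02d}.csv"


print(build_input_name(2026, 2))                        # data_2026_02.csv
print(bool(NAME_PATTERN.match("data january.csv")))     # False
```

The same regex can be used to reject badly named files at load time, turning a silent FileNotFoundError into an actionable message for the user.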

Execution time considerations#

Google Colab has time limits worth knowing about:

  • Free sessions disconnect after ~90 minutes of inactivity or 12 hours of continuous execution.

  • Colab Pro offers longer sessions but is still limited.

Recommendations:

  1. Save intermediate results: If the process is long, save partial results to Drive after each stage. If the session disconnects, progress is not lost.

  2. Progress indicators: Use print() or tqdm so the user knows the process is still running.

  3. Batch processing: If data is very large, process in chunks instead of loading everything into memory.

  4. Error handling: Wrap execution in try/except blocks to save partial results on failure.
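Recommendation 1 can be sketched as a small checkpointing helper (a hedged example; run_stage and the checkpoint directory are my own names, not part of the notebook above). Each stage saves its result to Drive, and if the session disconnects and reconnects, completed stages are reloaded instead of recomputed:

```python
import pandas as pd
from pathlib import Path


def run_stage(name, func, checkpoint_dir):
    """Run one stage, reusing a saved checkpoint if it already exists."""
    checkpoint = Path(checkpoint_dir) / f"{name}.csv"
    if checkpoint.exists():
        print(f"  ↺ Reusing checkpoint for stage '{name}'.")
        return pd.read_csv(checkpoint)
    result = func()  # The stage itself returns a DataFrame.
    checkpoint.parent.mkdir(parents=True, exist_ok=True)
    result.to_csv(checkpoint, index=False)
    print(f"  ✓ Stage '{name}' saved to {checkpoint.name}.")
    return result
```

In the notebook above, the checkpoint directory could point inside OUTPUT_DIR so that partial results survive a disconnect; delete the checkpoint files to force a full re-run.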

Best practices summary#

  1. Document the process at the beginning of the notebook.

  2. Place imports and Drive connection as the first code cell.

  3. Separate editable parameters in a visible cell with Colab forms (#@param).

  4. Hide technical variables and processing logic using #@title and form view.

  5. Use a visible execution cell with progress messages.

  6. Use fixed or programmatic file names, never free-form text.

  7. Account for Colab execution time limits.

  8. Prefer closed lists over free text fields for parameters.
