Best practices in Google Colab for sharing with non-technical teams#
In teams where technical and non-technical profiles coexist, it is common for the technical team to develop notebooks in Google Colab for periodic processes: monthly reports, data analysis, recurring tasks. The problem arises when these processes require periodic execution with small variations — a different month, another department, a new input file — and the responsibility of running them always falls on the technical team.
This dynamic creates unnecessary overhead: the technical team becomes a bottleneck for tasks that, with the right structure, anyone could run autonomously.
The solution is to structure notebooks so that non-technical teams can execute them on their own, without risk of altering the logic and without needing to understand the code. This notebook presents the best practices I have applied in my experience and also serves as a downloadable template you can adapt to your own processes.
Recommended structure#
The structure that has worked best for me organizes the notebook into five well-defined blocks:
Documentation: Markdown cells explaining what the process does, what data it needs, what results it produces and how to use it.
Imports and Drive initialization: First code cell, always visible. Requests Google Drive permissions as early as possible so the user grants access immediately and the rest runs uninterrupted.
Editable parameters: Visible code cell with Colab form fields. The only values the non-technical team should modify.
Technical variables and processing logic: Hidden code cells to avoid confusion. Contain internal configurations and functions.
Main execution: Visible code cell that runs the full process and displays progress messages and results.
Process documentation#
Every shared notebook must start with clear documentation. It is the first impression the user gets and determines whether they can use it without help. The documentation should include: what the process does, what input data it needs (files and location), what output data it generates (files and location) and step-by-step instructions for running it.
Below is an example of documentation for a report generation process.
Process: Monthly departmental report.
Objective: Generate a monthly summary from daily operations data for a department.
Input data: CSV file in `My Drive/reports/data/` named `data_YYYY_MM.csv` (e.g., `data_2026_02.csv`).
Output data: CSV report in `My Drive/reports/results/` named `report_DEPARTMENT_YYYY_MM.csv` (e.g., `report_sales_2026_02.csv`).
Instructions:
Run the first code cell and authorize Google Drive access.
Modify the parameters (year, month and department) in the Parameters cell.
From the menu, select Runtime > Run all (or `Ctrl+F9`).
Wait for completion. Progress will be shown in the last cell.
Imports and Drive initialization#
This block is placed as the first code cell for two reasons:
Google Drive requires user authorization. If this cell comes first, the user grants permission immediately and the rest of the notebook runs without interruptions.
Early error detection: if there is a problem with the Drive connection or any dependency, it is caught before executing any logic.
#@title Imports and Drive connection
import pandas as pd
from datetime import datetime
from pathlib import Path
from google.colab import drive
drive.mount("/content/drive")
print("Drive connected successfully.")
Mounted at /content/drive
Drive connected successfully.
Parameters (non-technical team)#
This is the only cell the non-technical team should modify.
Parameters are presented as Colab form fields using the
#@param syntax, which generates visual controls (text fields,
dropdowns, etc.) that make editing easy without touching code.
An important tip: use closed lists
(#@param ["option1", ...]) instead of free text whenever
possible. This prevents typos and ensures valid values.
#@title Report parameters
year = 2026 #@param {type:"integer"}
month = 2 #@param {type:"integer"}
department = "sales" #@param ["sales", "marketing", "operations"]
Technical variables (technical team)#
The following cells contain internal configurations that the
non-technical team does not need to see or modify. In Google
Colab, cells are hidden using #@title as the first line and
setting the cell view to form (cellView: form). This
collapses them visually, showing only the title.
To configure this:
Add `#@title Descriptive title` as the first line of the code cell.
Click the three dots on the cell > Form view.
Or set `"cellView": "form"` in the cell metadata.
#@title Technical variables (do not modify)
BASE_DIR = Path("/content/drive/MyDrive/reports")
INPUT_DIR = BASE_DIR / "data"
OUTPUT_DIR = BASE_DIR / "results"
INPUT_FILE = INPUT_DIR / f"data_{year}_{month:02d}.csv"
OUTPUT_FILE = OUTPUT_DIR / f"report_{department}_{year}_{month:02d}.csv"
REQUIRED_COLUMNS = ["date", "department", "category", "amount"]
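Under the hood, an `.ipynb` file is plain JSON, and the form view is recorded as `"cellView": "form"` in the cell's metadata. The sketch below is a hypothetical minimal cell showing where that flag lives (the cell content is illustrative; in practice the Colab UI sets this for you):

```python
import json

# A hypothetical minimal code cell as stored in the .ipynb JSON.
cell = {
    "cell_type": "code",
    "metadata": {},
    "source": [
        "#@title Technical variables (do not modify)\n",
        "BASE_DIR = '/content/drive/MyDrive/reports'\n",
    ],
    "outputs": [],
    "execution_count": None,
}

# Collapse any cell whose source starts with #@title to its form view,
# which is exactly what "Form view" in the Colab cell menu does.
if "".join(cell["source"]).startswith("#@title"):
    cell["metadata"]["cellView"] = "form"

print(json.dumps(cell["metadata"]))  # {"cellView": "form"}
```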
Processing logic (technical team)#
The processing logic is also hidden. All functionality is encapsulated in functions with descriptive names. This makes maintenance easier for the technical team and prevents the non-technical team from accidentally modifying the logic.
#@title Processing functions (do not modify)
def validate_params(year, month, department):
"""Validates that parameters are correct."""
errors = []
if not (2020 <= year <= 2030):
errors.append(f"Year out of range: {year}")
if not (1 <= month <= 12):
errors.append(f"Invalid month: {month}")
if department not in ["sales", "marketing", "operations"]:
errors.append(f"Unrecognized department: {department}")
return errors
def load_data(path):
"""Loads the CSV file and validates columns."""
if not path.exists():
raise FileNotFoundError(
f"File not found: {path}\n"
f"Please verify the file exists in Google Drive."
)
df = pd.read_csv(path)
missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
if missing:
raise ValueError(f"Missing columns in file: {missing}")
return df
def process_data(df, department):
"""Filters and aggregates data by department."""
df_filtered = df[df["department"] == department].copy()
if df_filtered.empty:
raise ValueError(
f"No data found for department: {department}"
)
summary = (
df_filtered
.groupby("category")
.agg(
total=("amount", "sum"),
count=("amount", "count"),
average=("amount", "mean"),
)
.reset_index()
.sort_values("total", ascending=False)
)
return summary
def save_report(df, path):
"""Saves the report to the specified path."""
path.parent.mkdir(parents=True, exist_ok=True)
df.to_csv(path, index=False)
Execution#
This cell runs the full process. It remains visible so the non-technical team can verify progress and results. Log messages indicate each step.
print(f"{'=' * 50}")
print(f"Report: {department} - {year}/{month:02d}")
print(f"{'=' * 50}")
print()
# Step 1: Validate parameters
print("▶ Validating parameters...")
errors = validate_params(year, month, department)
if errors:
for error in errors:
print(f" ✗ {error}")
raise SystemExit("Process stopped due to parameter errors.")
print(" ✓ Parameters valid.")
# Step 2: Load data
print(f"▶ Loading data from: {INPUT_FILE.name}")
df = load_data(INPUT_FILE)
print(f" ✓ {len(df)} records loaded.")
# Step 3: Process data
print(f"▶ Processing data for '{department}'...")
summary = process_data(df, department)
print(f" ✓ {len(summary)} categories in summary.")
# Step 4: Save report
print(f"▶ Saving report to: {OUTPUT_FILE.name}")
save_report(summary, OUTPUT_FILE)
print(f" ✓ Report saved successfully.")
print()
print(f"{'=' * 50}")
print("Process completed.")
print(f"{'=' * 50}")
==================================================
Report: sales - 2026/02
==================================================
▶ Validating parameters...
✓ Parameters valid.
▶ Loading data from: data_2026_02.csv
✓ 300 records loaded.
▶ Processing data for 'sales'...
✓ 4 categories in summary.
▶ Saving report to: report_sales_2026_02.csv
✓ Report saved successfully.
==================================================
Process completed.
==================================================
Naming conventions#
One of the most important best practices is using fixed formats and programmatically built names for files and folders. This prevents human errors and makes automation easier.
Names to avoid#
Typical problems to avoid in file names:
- Spaces in the name, no year, free-form text
- Inconsistent casing, uppercase extension
- Special characters, manual versioning
Recommended format#
| Type | Format | Example |
|---|---|---|
| Input data | `data_YYYY_MM.csv` | `data_2026_02.csv` |
| Output report | `report_DEPARTMENT_YYYY_MM.csv` | `report_sales_2026_02.csv` |
| Logs | a timestamped name derived from the run date | — |
Name variables (year, month, department) are built programmatically from the parameters. The non-technical team never types file names manually.
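When input files do arrive from outside the notebook, it can help to check their names against the convention before processing. A minimal sketch (the `parse_input_name` helper and its regex are assumptions matching the `data_YYYY_MM.csv` pattern used above):

```python
import re

# Hypothetical helper: validate that a filename follows the agreed
# data_YYYY_MM.csv convention before trusting it.
INPUT_PATTERN = re.compile(r"^data_(\d{4})_(\d{2})\.csv$")

def parse_input_name(name):
    """Return (year, month) if the name matches data_YYYY_MM.csv, else None."""
    m = INPUT_PATTERN.match(name)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

print(parse_input_name("data_2026_02.csv"))  # (2026, 2)
print(parse_input_name("data march.csv"))    # None
```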
Execution time considerations#
Google Colab has time limits worth knowing about:
Free sessions disconnect after ~90 minutes of inactivity or 12 hours of continuous execution.
Colab Pro offers longer sessions but is still limited.
Recommendations:
Save intermediate results: If the process is long, save partial results to Drive after each stage. If the session disconnects, progress is not lost.
Progress indicators: Use `print()` or `tqdm` so the user knows the process is still running.
Batch processing: If the data is very large, process it in chunks instead of loading everything into memory.
Error handling: Wrap execution in `try/except` blocks to save partial results on failure.
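The first and last recommendations can be combined into a small checkpointing wrapper: each stage saves its result to disk, so a disconnected session loses at most the stage in progress. A minimal sketch, assuming a `run_stage` helper of my own invention and a relative `checkpoints/` folder (in Colab this would live under `/content/drive/MyDrive/...`):

```python
from pathlib import Path

import pandas as pd

# Hypothetical checkpoint location; point this at a Drive folder in Colab.
CHECKPOINT_DIR = Path("checkpoints")

def run_stage(name, func, *args):
    """Run one stage, persisting its result so a disconnect loses at most one stage."""
    checkpoint = CHECKPOINT_DIR / f"{name}.csv"
    if checkpoint.exists():
        # Resume: reuse the previously saved result instead of recomputing.
        print(f"  ↻ Reusing checkpoint: {checkpoint.name}")
        return pd.read_csv(checkpoint)
    result = func(*args)
    CHECKPOINT_DIR.mkdir(parents=True, exist_ok=True)
    result.to_csv(checkpoint, index=False)
    return result
```

Re-running the notebook after a disconnect then skips every stage that already has a checkpoint; deleting the `checkpoints/` folder forces a full recompute.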
Best practices summary#
Document the process at the beginning of the notebook.
Place imports and Drive connection as the first code cell.
Separate editable parameters in a visible cell with Colab forms (`#@param`).
Hide technical variables and processing logic using `#@title` and form view.
Use a visible execution cell with progress messages.
Use fixed or programmatic file names, never free-form text.
Account for Colab execution time limits.
Prefer closed lists over free text fields for parameters.
References#
Google Colab FAQ. Google.
Adding form fields to Colab notebooks. Google Colab.
Mounting Google Drive in Colab. Google Colab.