1. Create a dataset
From your dashboard, click Data Workshop → New dataset. Pick a slug; that's how you'll reference it everywhere.
Upload a CSV, Parquet, or JSON file. Run standard SQL against it. Get results in milliseconds — without paying per query, paying for egress, or letting your data leave Canada.
Datasets group your tables, much like a BigQuery dataset or a database schema. Each dataset is a single DuckDB file on disk, with the raw source files kept alongside for lineage.
Drag a .csv, .parquet, .json, or .jsonl file into the Uploads tab. The file lives untouched alongside your dataset — you can re-ingest with a different table name later.
Switch to the Query tab. Type SQL. Hit Cmd/Ctrl + Enter. Results stream back; the underlying engine fans out across every core on the box.
SELECT region,
       AVG(price)    AS avg_price,
       COUNT(*)      AS orders,
       SUM(quantity) AS units
FROM sales
GROUP BY region
ORDER BY avg_price DESC;
PostgreSQL stores data row by row. SELECT AVG(price) FROM orders reads every column of every row, even when you only asked for one. Data Workshop uses DuckDB, which stores each column separately. Aggregations touch only the columns the query references, often 2-5% of the on-disk bytes.
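To make the row-versus-column difference concrete, here is a minimal Python sketch. It is not how PostgreSQL or DuckDB actually lay out data on disk: the five-column table, the fixed 8-byte encoding, and every function name are assumptions made for illustration. It packs the same rows two ways and counts how many bytes an AVG(price) has to read in each layout.

```python
import struct

# Hypothetical table: five numeric columns, 8 bytes per value (illustrative only).
COLUMNS = ["order_id", "customer_id", "price", "quantity", "discount"]
CODES = "qqdqd"  # struct codes: q = 64-bit int, d = 64-bit float
ROWS = [(i, i % 100, 10.0 + i % 7, 1 + i % 3, 0.0) for i in range(1000)]

ROW_FMT = "<" + CODES
ROW_SIZE = struct.calcsize(ROW_FMT)  # 40 bytes per packed row


def build_row_store(rows):
    """Row layout: all columns of a row stored next to each other."""
    return b"".join(struct.pack(ROW_FMT, *row) for row in rows)


def build_column_store(rows):
    """Column layout: one contiguous buffer per column."""
    return {
        name: struct.pack(f"<{len(rows)}{code}", *(row[i] for row in rows))
        for i, (name, code) in enumerate(zip(COLUMNS, CODES))
    }


def avg_price_row_store(blob):
    """AVG(price) must walk every full row to reach the price field."""
    total, count, bytes_read = 0.0, 0, 0
    for offset in range(0, len(blob), ROW_SIZE):
        record = struct.unpack_from(ROW_FMT, blob, offset)
        bytes_read += ROW_SIZE
        total += record[2]  # price is the third column
        count += 1
    return total / count, bytes_read


def avg_price_column_store(store):
    """AVG(price) reads only the price buffer."""
    blob = store["price"]
    values = struct.unpack(f"<{len(blob) // 8}d", blob)
    return sum(values) / len(values), len(blob)


row_avg, row_bytes = avg_price_row_store(build_row_store(ROWS))
col_avg, col_bytes = avg_price_column_store(build_column_store(ROWS))
print(f"row store read {row_bytes} bytes, column store read {col_bytes} bytes")
```

With five equally sized columns the column layout reads one fifth of the bytes. The wider the table, the smaller that share gets, which is where low single-digit percentages come from.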
On top of that, DuckDB parallelizes a single query across every available core, processes data in vectorized batches, and supports automatic predicate pushdown into Parquet metadata. On 10-million-row tables, you get BigQuery-like ergonomics on a Quebec-hosted box, without the per-byte-scanned fee.
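The idea behind predicate pushdown into Parquet metadata can be sketched in a few lines of Python. Parquet files keep min/max statistics per row group, so a reader can skip any group whose range cannot satisfy the filter. The RowGroup class and the functions below are hypothetical stand-ins written for this illustration. They are not DuckDB or Parquet APIs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group and its min/max statistics."""
    values: List[int]
    min_value: int
    max_value: int


def make_row_groups(values: List[int], group_size: int) -> List[RowGroup]:
    """Split values into fixed-size groups and record each group's min and max."""
    groups = []
    for start in range(0, len(values), group_size):
        chunk = values[start:start + group_size]
        groups.append(RowGroup(chunk, min(chunk), max(chunk)))
    return groups


def count_greater_than(groups: List[RowGroup], threshold: int) -> Tuple[int, int]:
    """Count values > threshold. Returns (matches, groups actually scanned)."""
    matches = 0
    scanned = 0
    for group in groups:
        if group.max_value <= threshold:
            continue  # the statistics alone prove no row in this group can match
        scanned += 1
        matches += sum(1 for v in group.values if v > threshold)
    return matches, scanned


# Ten groups of 1,000 sorted values; the filter only needs the last two groups.
groups = make_row_groups(list(range(10_000)), 1_000)
matches, scanned = count_greater_than(groups, 8_499)
print(f"{matches} matches after scanning {scanned} of {len(groups)} groups")
```

Skipping works best when the filtered column is sorted or clustered. On randomly ordered data most groups span the full value range and have to be read anyway.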
What you can't do here: scan a 10 TB table across 5,000 workers. For that, BigQuery is still the right tool. Data Workshop is for the long tail of dashboards, ad-hoc analysis, and BI pipelines that fit comfortably in tens of gigabytes — the workloads that don't justify a warehouse but outgrow Postgres + manual aggregation.
User-supplied queries run read-only with filesystem access disabled: read_csv_auto('/etc/passwd') and similar calls that reach outside the database return a clean Permission Error. Each query gets a cgroup-bounded memory cap, a wall-clock timeout, and its own subprocess; a runaway aggregation runs out of memory inside its own scope, never on the host.
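The per-query isolation described above can be approximated with the Python standard library. This is a rough sketch under stated assumptions, not Data Workshop's implementation: it swaps the cgroup memory cap for a POSIX address-space rlimit (Unix only), runs a Python snippet rather than a SQL query, and the run_isolated function name and its status strings are hypothetical.

```python
import resource
import subprocess
import sys


def run_isolated(code: str, timeout_s: float, memory_bytes: int) -> str:
    """Run `code` in its own child process with a memory cap and a wall-clock timeout.

    Returns 'succeeded', 'failed', or 'timeout'.
    """
    def limit_memory():
        # Runs in the child just before exec. Lower only the soft limit,
        # and never ask for more than the existing hard limit allows.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = memory_bytes if hard == resource.RLIM_INFINITY else min(memory_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            preexec_fn=limit_memory,
            capture_output=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "succeeded" if proc.returncode == 0 else "failed"


ONE_GIB = 1 << 30
print(run_isolated("print(sum(range(10)))", timeout_s=10, memory_bytes=ONE_GIB))
print(run_isolated("while True: pass", timeout_s=1, memory_bytes=ONE_GIB))
```

Because the work runs in a separate process, an allocation failure or a kill on timeout ends only the child. The parent just records the outcome.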
Ingest accepts .csv, .parquet, .json, and .jsonl / .ndjson. Types are inferred from a full scan on ingest, so a column parses cleanly even when its first 1,000 rows are not representative of the rest of the file. Queries use DuckDB SQL, a dialect closely modeled on PostgreSQL, with window functions, CTEs, full JSON path operators, and array types. Standard ANSI SQL works.
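Why a full scan matters for type inference is easiest to see with a small example. The Python sketch below is illustrative only. The inference rules, the function names, and the sample_rows parameter are assumptions, and the type names are merely borrowed from DuckDB. It infers a column type from the first 1,000 rows and then from the whole file.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of BIGINT, DOUBLE, VARCHAR that fits every value."""
    def fits(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if fits(int):
        return "BIGINT"
    if fits(float):
        return "DOUBLE"
    return "VARCHAR"


def infer_column(csv_text, column, sample_rows=None):
    """Infer a column's type from the first `sample_rows` rows, or all rows if None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = []
    for index, row in enumerate(reader):
        if sample_rows is not None and index >= sample_rows:
            break
        values.append(row[column])
    return infer_type(values)


# 1,000 numeric SKUs followed by a single alphanumeric one on the last row.
csv_text = "\n".join(["sku"] + [str(n) for n in range(1000)] + ["A-1001"])

sampled = infer_column(csv_text, "sku", sample_rows=1000)
full = infer_column(csv_text, "sku")
print(f"first 1,000 rows: {sampled}, full scan: {full}")
```

A sampled guess of BIGINT would fail on the last row during load, while the full scan settles on VARCHAR up front.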
Bundled into every Canner plan. Starter gives you enough to try real work; Basic and Pro scale to genuine BI workloads.
On disk in Montreal, owned by the deploy user on the same VPS that runs your projects. No third-party data plane. No telemetry on the contents of your queries — only aggregate counts (succeeded / failed / timeout) for billing. The query log shows you which SQL ran when; the result rows themselves are never persisted beyond the response.
Phase 2 ships scheduled queries (run nightly, results cached), CSV/Parquet export from any query, and federated reads against your project's tenant Postgres database (no ETL — query Postgres + uploaded files in the same SELECT). Phase 3 adds a canner workshop CLI subcommand for piping query results into a terminal.
Starter is free forever and includes Data Workshop. Upload a CSV in under a minute.