Data Workshop

Columnar analytics on your own data, from Quebec.

Upload a CSV, Parquet, or JSON file. Run standard SQL against it. Get results in milliseconds — without paying per query, paying for egress, or letting your data leave Canada.

Three steps

Datasets group your tables, much like a BigQuery dataset or a Postgres schema. Each dataset is a single DuckDB file on disk, with the raw source files kept alongside for lineage.

1. Create a dataset

From your dashboard, click Data Workshop → New dataset. Pick a slug; that's how you'll reference it everywhere.

2. Upload a file

Drag a .csv, .parquet, .json, or .jsonl / .ndjson file into the Uploads tab. The file is kept untouched alongside your dataset, so you can re-ingest it later under a different table name.

3. Run SQL

Switch to the Query tab. Type SQL. Hit Cmd/Ctrl + Enter. Results stream back; the underlying engine fans out across every core on the box.

SELECT region,
       AVG(price)    AS avg_price,
       COUNT(*)      AS orders,
       SUM(quantity) AS units
  FROM sales
 GROUP BY region
 ORDER BY avg_price DESC;
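
To make the shape of the result concrete, here is a small sketch that runs the query above against three made-up rows. It uses Python's built-in sqlite3 module purely as a stand-in engine because it ships with Python; Data Workshop itself runs DuckDB, and the sample rows, column types, and the run_sample_query helper are hypothetical.

```python
import sqlite3

QUERY = """
SELECT region,
       AVG(price)    AS avg_price,
       COUNT(*)      AS orders,
       SUM(quantity) AS units
  FROM sales
 GROUP BY region
 ORDER BY avg_price DESC;
"""

# Hypothetical sample data: (region, price, quantity)
SAMPLE_ROWS = [
    ("east", 10.0, 2),
    ("east", 30.0, 1),
    ("west", 50.0, 4),
]


def run_sample_query():
    """Load the sample rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (region TEXT, price REAL, quantity INTEGER)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_sample_query():
        print(row)
```

Each output row is one region with its average price, order count, and total units, sorted so the highest average price comes first.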

Why it's fast

PostgreSQL stores data row by row. SELECT AVG(price) FROM orders reads every column of every row — even when you only asked for one. Data Workshop uses DuckDB, which stores each column separately. Aggregations touch only the columns the query references, often 2-5% of the on-disk bytes.
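
The difference between the two layouts can be sketched in a few lines of Python. This is a toy model, not how PostgreSQL or DuckDB actually lay out pages: the five-column orders table, its field widths, and every function name here are assumptions chosen for illustration. It packs the same rows once row by row and once column by column, then computes AVG(price) and reports how many bytes each approach had to read.

```python
import struct

# Hypothetical five-column orders table; every field is 8 bytes wide.
COLUMNS = ["order_id", "customer_id", "price", "quantity", "region_code"]
FIELD_CODES = "qqdqq"            # q = int64, d = float64
ROW_FORMAT = "<" + FIELD_CODES   # little-endian, no padding: 40 bytes per row


def make_rows(n):
    """Generate n synthetic order rows."""
    return [(i, i % 100, float(i % 50), i % 7, i % 4) for i in range(n)]


def to_row_store(rows):
    """Row layout: all fields of a row sit next to each other."""
    return b"".join(struct.pack(ROW_FORMAT, *row) for row in rows)


def to_column_store(rows):
    """Column layout: one contiguous buffer per column."""
    n = len(rows)
    return {
        name: struct.pack(f"<{n}{code}", *(row[i] for row in rows))
        for i, (name, code) in enumerate(zip(COLUMNS, FIELD_CODES))
    }


def avg_price_row_store(buf):
    """Decode every row in full to reach its price field."""
    n = len(buf) // struct.calcsize(ROW_FORMAT)
    total = 0.0
    for _, _, price, _, _ in struct.iter_unpack(ROW_FORMAT, buf):
        total += price
    return total / n, len(buf)          # (average, bytes read)


def avg_price_column_store(columns):
    """Read only the price buffer; the other columns are never touched."""
    price_bytes = columns["price"]
    n = len(price_bytes) // 8
    prices = struct.unpack(f"<{n}d", price_bytes)
    return sum(prices) / n, len(price_bytes)


if __name__ == "__main__":
    rows = make_rows(1000)
    print(avg_price_row_store(to_row_store(rows)))
    print(avg_price_column_store(to_column_store(rows)))
```

With five equal-width columns the column scan reads one fifth of the bytes. On wider tables the share read by a single-column aggregation shrinks further.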

On top of that, DuckDB parallelizes a single query across every available core, processes data in vectorized batches, and pushes filter predicates down into Parquet scans, where row-group statistics let it skip data that cannot match. On 10-million-row tables, you get BigQuery-like ergonomics on a Quebec-hosted box, without the per-byte-scanned fee.
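
Predicate pushdown is easier to picture with a toy model. Parquet files keep minimum and maximum statistics for each row group, and a reader can use them to skip groups that cannot contain a match. The sketch below imitates that idea in plain Python. It is not DuckDB's implementation, and the RowGroup class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """A chunk of one column plus the min/max statistics stored with it."""
    min_value: int
    max_value: int
    values: list


def build_row_groups(values, group_size):
    """Split a column into fixed-size groups and record each group's range."""
    groups = []
    for start in range(0, len(values), group_size):
        chunk = values[start:start + group_size]
        groups.append(RowGroup(min(chunk), max(chunk), chunk))
    return groups


def count_greater_than(groups, threshold):
    """Evaluate `WHERE value > threshold`, skipping groups via their stats."""
    matches = 0
    groups_scanned = 0
    for group in groups:
        if group.max_value <= threshold:
            continue  # no value in this group can match, so never read it
        groups_scanned += 1
        matches += sum(1 for v in group.values if v > threshold)
    return matches, groups_scanned


if __name__ == "__main__":
    groups = build_row_groups(list(range(1000)), group_size=100)
    print(count_greater_than(groups, 899))
```

Skipping works best when the filtered column is roughly sorted within the file, because each group then covers a narrow range of values.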

What you can't do here: scan a 10 TB table across 5,000 workers. For that, BigQuery is still the right tool. Data Workshop is for the long tail of dashboards, ad-hoc analysis, and BI pipelines that fit comfortably in tens of gigabytes — the workloads that don't justify a warehouse but outgrow Postgres + manual aggregation.

Safe to expose user SQL

User-supplied queries run read-only with filesystem access disabled: read_csv_auto('/etc/passwd') and similar calls that reach outside the database return a clean Permission Error. Each query gets a cgroup-bounded memory cap, a wall-clock timeout, and its own subprocess, so a runaway aggregation runs out of memory inside its own scope, never on the host.
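
The isolation pattern described above (one subprocess per query, a memory cap, a wall-clock timeout) can be sketched with the Python standard library. This is an illustration under loud assumptions, not Data Workshop's code: it runs a Python snippet instead of SQL, it caps memory with RLIMIT_AS from the Unix-only resource module instead of cgroups, and run_isolated and BOOTSTRAP are hypothetical names.

```python
import subprocess
import sys

# Runs inside the child: cap its own address space, then execute the
# untrusted code received on stdin. RLIMIT_AS stands in for a cgroup limit.
BOOTSTRAP = """
import resource, sys
limit = int(sys.argv[1])
resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
exec(sys.stdin.read())
"""


def run_isolated(code, timeout_s, memory_bytes):
    """Run `code` in its own process; return (status, detail)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", BOOTSTRAP, str(memory_bytes)],
            input=code,
            capture_output=True,
            text=True,
            timeout=timeout_s,   # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    if proc.returncode != 0:
        lines = proc.stderr.strip().splitlines()
        return "failed", lines[-1] if lines else ""
    return "succeeded", proc.stdout.strip()


if __name__ == "__main__":
    one_gib = 1024 ** 3
    print(run_isolated("print(1 + 1)", 10, one_gib))
    print(run_isolated("while True: pass", 0.5, one_gib))
    print(run_isolated("x = bytearray(4 * 1024 ** 3)", 10, one_gib))
```

The point of the design is that the failure stays in the child: an oversized allocation or an endless loop ends that one process, and the parent only sees a status.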

Supported formats and SQL

Ingest accepts .csv, .parquet, .json, and .jsonl / .ndjson. Types are inferred from a full scan on ingest, so columns parse cleanly even when values deeper in the file differ from what the first 1,000 rows suggest. Queries use DuckDB SQL, a dialect that closely follows PostgreSQL, with window functions, CTEs, JSON path operators, and array types. Standard ANSI SQL works.
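
Why a full scan matters is easiest to see on a file whose last row breaks the pattern. The sketch below is a simplified model of type inference, not the ingest code: the three-type ladder (BIGINT, DOUBLE, VARCHAR), the function names, and the sample_rows parameter are assumptions for illustration.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of BIGINT, DOUBLE, VARCHAR that fits all values."""
    def fits(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if fits(int):
        return "BIGINT"
    if fits(float):
        return "DOUBLE"
    return "VARCHAR"


def infer_column_types(csv_text, sample_rows=None):
    """Infer a type per column from the first sample_rows rows, or all rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    if sample_rows is not None:
        rows = rows[:sample_rows]
    return {
        name: infer_type([row[name] for row in rows])
        for name in reader.fieldnames
    }


if __name__ == "__main__":
    # 1,000 numeric codes followed by a single alphanumeric one.
    lines = ["id,code"] + [f"{i},{i}" for i in range(1000)] + ["1000,A17"]
    text = "\n".join(lines) + "\n"
    print(infer_column_types(text, sample_rows=1000))  # sampled inference
    print(infer_column_types(text))                    # full-scan inference
```

Sampling only the first 1,000 rows would type the code column as an integer and then fail on the last row; the full scan sees the outlier and picks a text type up front.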

Plan tiers

Bundled into every Canner plan. Starter gives you enough to try real work; Basic and Pro scale to genuine BI workloads.

Plan     Storage  Daily queries  Query timeout  Per-query memory
Starter  100 MB   50 / day       5 s            256 MB
Basic    5 GB     2,000 / day    30 s           512 MB
Pro      50 GB    unlimited      5 min          1 GB

Where your data lives

On disk in Montreal, owned by the deploy user on the same VPS that runs your projects. No third-party data plane. No telemetry on the contents of your queries — only aggregate counts (succeeded / failed / timeout) for billing. The query log shows you which SQL ran when; the result rows themselves are never persisted beyond the response.

What's next

Phase 2 ships scheduled queries (run nightly, results cached), CSV/Parquet export from any query, and federated reads against your project's tenant Postgres database (no ETL — query Postgres + uploaded files in the same SELECT). Phase 3 adds a canner workshop CLI subcommand for piping query results into a terminal.

Open the dashboard and try it.

Starter is free forever and includes Data Workshop. Upload a CSV in under a minute.
