Observable Framework 1.0.0 GitHub

DuckDB

DuckDB is “an in-process SQL OLAP Database Management System. DuckDB-Wasm brings DuckDB to every browser thanks to WebAssembly.” DuckDB-Wasm is available by default as duckdb in Markdown, but you can explicitly import it as:

import * as duckdb from "npm:@duckdb/duckdb-wasm";

For convenience, we provide a DatabaseClient implementation on top of DuckDB-Wasm, DuckDBClient. This is also available by default in Markdown, but you can explicitly import it like so:

import {DuckDBClient} from "npm:@observablehq/duckdb";

To get a DuckDB client, pass zero or more named tables to DuckDBClient.of. Each table can be expressed as a FileAttachment, Arquero table, Arrow table, an array of objects, or a promise to the same. For example, below we load a sample of 250,000 stars from the Gaia Star Catalog as a Apache Parquet file:

const db = DuckDBClient.of({gaia: FileAttachment("gaia-sample.parquet")});

Now we can run a query using db.sql to bin the stars by right ascension (ra) and declination (dec):

const bins = db.sql`SELECT
  floor(ra / 2) * 2 + 1 AS ra,
  floor(dec / 2) * 2 + 1 AS dec,
  count() AS count
FROM
  gaia
GROUP BY
  1,
  2`

These bins can quickly be turned into a heatmap with Plot’s raster mark, showing the milky way.

Plot.plot({
  aspectRatio: 1,
  x: {domain: [0, 360]},
  y: {domain: [-90, 90]},
  marks: [
    Plot.frame({fill: 0}),
    Plot.raster(bins, {
      x: "ra",
      y: "dec",
      fill: "count",
      width: 360 / 2,
      height: 180 / 2,
      imageRendering: "pixelated"
    })
  ]
})

For externally-hosted data, you can create an empty DuckDBClient and load a table from a SQL query, say using read_parquet or read_csv.

const db = await DuckDBClient.of();

await db.sql`CREATE TABLE addresses
  AS SELECT *
  FROM read_parquet('https://static.data.gouv.fr/resources/bureaux-de-vote-et-adresses-de-leurs-electeurs/20230626-135723/table-adresses-reu.parquet')
  LIMIT 100`;

As an alternative to db.sql, there’s also db.query:

db.query("SELECT * FROM gaia LIMIT 10")

And db.queryRow:

db.queryRow("SELECT count() AS count FROM gaia")

See the DatabaseClient Specification for more details.