DuckDB allows you to efficiently query the data in your data lake. With Hydra, you can query that data directly from Postgres. For example, you can:

  • Write queries with JOINs that combine data from your data lake with data in your Postgres tables, views, and materialized views
  • Copy data to your data lake from your Postgres tables
  • Write views that encapsulate the logic for querying your data lake

Data Lake Providers

Hydra allows you to query data in S3, R2, and GCS. To enable a particular data lake, you must provide a credential that has access to the data you want to query. To do so, add a secret to the duckdb.secrets table:

INSERT INTO duckdb.secrets
(cloud_type, cloud_id, cloud_secret, cloud_region)
VALUES ('S3', 'access_key_id', 'secret_access_key', 'us-east-1');

  • The cloud_type must be one of R2, S3, or GCS.
  • For R2, you must also set the r2_account_id column. For details, see the R2 documentation.
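
As a rough sketch of the R2 case described above, the insert might look like the following. All values are placeholders, and leaving out cloud_region for R2 is an assumption rather than something this page confirms. Check the R2 documentation for the exact columns required.

```sql
-- Sketch only: every value below is a placeholder, not a real credential.
-- Assumption: cloud_region may be omitted for R2; add it if your setup requires it.
INSERT INTO duckdb.secrets
(cloud_type, cloud_id, cloud_secret, r2_account_id)
VALUES ('R2', 'access_key_id', 'secret_access_key', 'my_r2_account_id');
```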

Please note that these secrets are stored unencrypted. You may wish to restrict access to this table to trusted users.
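
One way to restrict access is with standard Postgres privileges. The sketch below assumes a role named lake_admin already exists. That name is hypothetical, so substitute your own. It also assumes that duckdb.secrets accepts ordinary GRANT and REVOKE statements like any other table.

```sql
-- Remove any access granted to all roles through PUBLIC.
REVOKE ALL ON TABLE duckdb.secrets FROM PUBLIC;

-- Allow only a trusted role to read and manage secrets.
-- "lake_admin" is a hypothetical role name used for illustration.
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE duckdb.secrets TO lake_admin;

-- Review which roles currently hold privileges on the table.
SELECT grantee, privilege_type
FROM information_schema.role_table_grants
WHERE table_schema = 'duckdb' AND table_name = 'secrets';
```

Revoking from PUBLIC does not remove privileges granted directly to individual roles, so review the output of the final query and revoke those separately if needed. Users who lose read access to this table may also lose the ability to query the data lake, so test the change with a non-admin account before relying on it.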

Formats

The following formats are supported: