Caching files from your data lake is a feature in early development.

You can choose to cache a particular file by calling:

SELECT duckdb.cache('s3://your-bucket/path/to/file');

This query blocks until the file has been downloaded. While the download is in progress, you will not be able to query this file.

Once the download completes, future queries against this file automatically use the cached copy.

Managing Files

Currently, cached files must be removed from disk manually. They are stored in the duckdb_cache directory inside the Postgres data directory. In the future, you will be able to set a maximum cache size and delete files manually from SQL.
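
Until cleanup is available from SQL, a small script can help with the manual step. The following is a minimal sketch, not part of the extension: the helper names list_cached_files and remove_cached_files are hypothetical, and it assumes only what is stated above, namely that cached files sit directly inside duckdb_cache under the Postgres data directory. You can look up the data directory by running SHOW data_directory; in Postgres.

```python
from pathlib import Path


def list_cached_files(data_dir):
    """Return (path, size_in_bytes) for each regular file in <data_dir>/duckdb_cache."""
    # Assumption: the cache directory is named duckdb_cache, as described above.
    cache_dir = Path(data_dir) / "duckdb_cache"
    if not cache_dir.is_dir():
        return []
    return sorted((p, p.stat().st_size) for p in cache_dir.iterdir() if p.is_file())


def remove_cached_files(data_dir, keep=()):
    """Delete every cached file whose name is not in `keep`; return the names removed."""
    keep = set(keep)
    removed = []
    for path, _size in list_cached_files(data_dir):
        if path.name not in keep:
            path.unlink()
            removed.append(path.name)
    return removed
```

For example, calling list_cached_files with your data directory shows what is cached and how much space each file uses, and remove_cached_files with the same argument clears the cache. The script needs to run as a user with write access to the data directory, typically the same user that runs Postgres.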