Attaching Data¶
Data is added to the bundle via the .attach() method.
Basic Usage¶
Attaching to a Join Pack¶
By default, data attaches to the base pack. Use the pack parameter to attach data to a joined pack instead.
Path Resolution¶
The attach() method handles paths flexibly:
- Paths can be any supported URL and the data will be read from there.
- Paths can be relative to the data_dir, but they cannot use .. to reach a parent directory.
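The relative-path rule above can be illustrated with a small check. This is only a sketch of the rule, not bundlebase's implementation: `resolve_in_data_dir` is a hypothetical helper, and the directory and file names are placeholders.

```python
from pathlib import PurePosixPath


def resolve_in_data_dir(data_dir: str, relative: str) -> PurePosixPath:
    """Join a relative path onto data_dir, rejecting any '..' component.

    Hypothetical helper illustrating the rule; not part of bundlebase.
    """
    rel = PurePosixPath(relative)
    if rel.is_absolute():
        raise ValueError(f"expected a relative path, got {relative!r}")
    if ".." in rel.parts:
        raise ValueError(f"path may not traverse to a parent dir: {relative!r}")
    return PurePosixPath(data_dir) / rel


# A path below data_dir resolves normally.
print(resolve_in_data_dir("/bundles/sales", "raw/2024.csv"))
```

A path such as `../other.csv` raises `ValueError` in this sketch, which mirrors the restriction that relative paths must stay inside the data_dir.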
Attaching From Another Bundle¶
You can attach the query output of another committed bundle using a bundle:// URL. This reads the target bundle's full query output — including any filters, column operations, and joins that have been applied.
For filesystem bundles, use bundle:// followed by the path.
For remote bundles (S3, etc.), use the compound scheme bundle+<scheme>://.
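As a rough illustration of how the two URL forms differ, the sketch below splits a bundle URL into the underlying storage scheme and location. `split_bundle_url` is a hypothetical helper, and the example path and bucket name are placeholders. It is not bundlebase's own parser, so the exact URL shapes it accepts should be checked against the bundlebase reference.

```python
def split_bundle_url(url: str) -> tuple[str, str]:
    """Return (storage_scheme, location) for a bundle URL.

    Hypothetical helper for illustration; not bundlebase's parser.
    """
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError(f"not a URL: {url!r}")
    if scheme == "bundle":
        # Plain bundle:// refers to a bundle on the local filesystem.
        return "file", rest
    if scheme.startswith("bundle+") and len(scheme) > len("bundle+"):
        # bundle+<scheme>:// wraps a remote URL such as s3://...
        inner = scheme[len("bundle+"):]
        return inner, f"{inner}://{rest}"
    raise ValueError(f"not a bundle URL: {url!r}")


print(split_bundle_url("bundle:///data/other_bundle"))
print(split_bundle_url("bundle+s3://example-bucket/other_bundle"))
```

In this sketch the filesystem form yields the path that follows bundle://, while the compound form yields the remote URL with the bundle+ prefix removed.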
Note
The target bundle must be committed. The attached data reflects the target's full query output as of read time.
Supported Formats¶
- CSV (.csv)
- TSV (.tsv), tab-separated values
- JSON Lines (.json, .jsonl)
- Parquet (.parquet)
Note
Only JSON Lines format (one JSON object per line) can be directly attached. For arbitrary JSON files — including API responses with wrapper objects, nested structures, or JSON arrays — use a connector (CREATE SOURCE USING http, remote_dir, etc.) with json_record_path. The connector transforms and copies the data into the bundle as Parquet. See Sources: JSON Options.
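To make the distinction concrete, the following sketch checks whether some text is in JSON Lines form, meaning every non-blank line parses as a standalone JSON object. `is_json_lines` is a hypothetical helper written with the Python standard library only. It is not part of bundlebase and says nothing about how bundlebase itself validates input.

```python
import json


def is_json_lines(text: str) -> bool:
    """Return True if every non-blank line is a standalone JSON object."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return False
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            return False
        if not isinstance(record, dict):
            return False
    return True


json_lines = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'
wrapped = '{\n  "results": [\n    {"id": 1},\n    {"id": 2}\n  ]\n}\n'
json_array = '[{"id": 1}, {"id": 2}]\n'

print(is_json_lines(json_lines))  # one object per line
print(is_json_lines(wrapped))     # pretty-printed wrapper object
print(is_json_lines(json_array))  # top-level array
```

Only the first sample passes the check. A wrapper object that happens to be written on a single line would also pass, but it would be read as one record rather than as the rows nested inside it, which is the case the connector with json_record_path is meant for.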
Column Types¶
CSV and TSV files are imported with all columns as text (Utf8). Because these are text-based formats, type inference from sampled rows is unreliable — a column that looks numeric in the first 100 rows might contain non-numeric values later. By defaulting to text, bundlebase avoids silent data corruption.
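The sampling problem described above is easy to reproduce. The sketch below uses a naive, hypothetical `infer_numeric_columns` function (not bundlebase code) on made-up data in which the first 100 rows of a column are numeric and the next row is not.

```python
import csv
import io


def infer_numeric_columns(csv_text: str, sample_rows: int) -> dict[str, bool]:
    """Guess which columns are numeric from the first sample_rows data rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    numeric = {name: True for name in reader.fieldnames or []}
    for index, row in enumerate(reader):
        if index >= sample_rows:
            break
        for name, value in row.items():
            try:
                float(value)
            except ValueError:
                numeric[name] = False
    return numeric


# 100 numeric rows followed by one row with a non-numeric amount.
rows = ["id,amount"] + [f"{i},{i * 1.5}" for i in range(100)] + ["100,N/A"]
csv_text = "\n".join(rows) + "\n"

sampled = infer_numeric_columns(csv_text, sample_rows=100)
full = infer_numeric_columns(csv_text, sample_rows=len(rows))
print(sampled)
print(full)
```

With a 100-row sample the amount column looks numeric, while a full scan shows that it is not. Importing every column as text sidesteps this mismatch and leaves the conversion as an explicit step.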
JSON files retain their native types (string, number, boolean) since the JSON format encodes types directly in the data.
Parquet files retain their native types since the schema is embedded in the file.
To convert text columns to specific types after attaching CSV data, use cast_column.
See Cast Column for more details.
Detaching Data¶
Remove a previously attached block by its location with detach_block().
Replacing Data¶
Use replace_block() to change where a block's data is read from without changing the block's identity.