Configuration
GenDB is configured via gendb.yaml. Pass a custom path with --config.
Precedence
Settings are applied in this order (highest priority first):
- `gendb.yaml`: file-based column instructions
- LLM generation: the LLM generates data based on schema context and instructions
Full Reference
llm:
  provider: ollama  # ollama | openai | custom
  model: qwen2.5:7b  # Model name
  base_url: http://localhost:11434/v1  # LLM API endpoint
  api_key: ""  # API key (required for openai/custom)
  structured_output: true  # Use JSON Schema to guarantee valid output
  temperature:  # Sampling temperature (optional, model default)
  chunk_size: 50  # Rows per LLM request
generation:
  default_rows: 100  # Rows per table unless overridden
  tables:  # Per-table overrides
    users:
      rows: 500
      columns:
        bio:
          prompt: "Write a short professional bio"
        role:
          generator: one_of
          values: ["admin", "user", "moderator"]
    orders:
      rows: 2000
  column_rules:  # Pattern-based rules
    - pattern: "*_sku"
      generator: regex
      format: "[A-Z]{3}-[0-9]{6}"
Notes
- `llm.provider`: LLM provider for data generation. One of:
  - `ollama` (default): local Ollama instance
  - `openai`: OpenAI API
  - `custom`: any OpenAI-compatible endpoint
- `llm.structured_output`: When `true` (the default), GenDB sends a JSON Schema with LLM requests so the model is constrained to produce valid JSON. This dramatically reduces parsing failures, especially with local models. Set to `false` if your model or provider does not support structured output.
- `llm.chunk_size`: Number of rows generated per LLM call. Larger values mean fewer API calls but longer responses. Default is 50.
- `generation.default_rows`: Default number of rows to generate per table. Can be overridden per table in the `tables` section.
- `generation.tables`: Per-table configuration. Each table can specify a custom row count and per-column instructions for the LLM.
- `generation.column_rules`: Pattern-based rules applied across all tables. Patterns use glob syntax (`*` for prefix/suffix matching). Rules are matched against column names.
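To get a feel for how `llm.chunk_size` affects the number of requests, here is a minimal Python sketch. It assumes GenDB sends one request per chunk and that the last chunk may be smaller than `chunk_size`; the helper name `llm_calls` is hypothetical and not part of GenDB.

```python
import math

def llm_calls(rows: int, chunk_size: int = 50) -> int:
    """Estimate LLM requests for one table, assuming one request per chunk."""
    if rows < 0 or chunk_size <= 0:
        raise ValueError("rows must be >= 0 and chunk_size must be > 0")
    return math.ceil(rows / chunk_size)

# Row counts taken from the Full Reference example (users, orders).
print(llm_calls(500))    # 10
print(llm_calls(2000))   # 40
# Hypothetical table with 120 rows: two full chunks plus one of 20 rows.
print(llm_calls(120))    # 3
```

Doubling `chunk_size` roughly halves the request count, at the cost of longer responses per request.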
Column Configuration
Each column override supports these fields:
| Field | Description |
|---|---|
| `generator` | Override type: `one_of`, `regex`, or `skip` |
| `prompt` | Direct instruction to the LLM for this column |
| `values` | List of allowed values for `one_of` |
| `format` | Regex pattern for `regex` |
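The pairing between `generator` and its companion field can be checked before a run. The following Python sketch is illustrative only: `check_override` is a hypothetical helper, it assumes `values` is required for `one_of` and `format` is required for `regex`, and it uses Python's `re` module, whose regex dialect may differ from the one GenDB uses.

```python
import re
from typing import List

# Companion field each generator is assumed to need (None means no extra field).
REQUIRED_FIELDS = {"one_of": "values", "regex": "format", "skip": None}

def check_override(override: dict) -> List[str]:
    """Return a list of problems found in one column override (empty if none)."""
    generator = override.get("generator")
    if generator is None:
        # No generator: the column is left to the LLM, optionally guided by `prompt`.
        return []
    if generator not in REQUIRED_FIELDS:
        return [f"unknown generator: {generator!r}"]
    problems = []
    needed = REQUIRED_FIELDS[generator]
    if needed and not override.get(needed):
        problems.append(f"generator {generator!r} needs a {needed!r} field")
    if generator == "regex" and override.get("format"):
        try:
            re.compile(override["format"])  # only checks that the pattern parses
        except re.error as exc:
            problems.append(f"invalid regex in 'format': {exc}")
    return problems

print(check_override({"generator": "one_of", "values": ["admin", "user", "moderator"]}))  # []
print(check_override({"generator": "regex"}))
```

The second call reports the missing `format` field; an override with only a `prompt` passes without problems.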
Column Rules
Column rules apply instructions based on column name patterns across all tables:
column_rules:
  - pattern: "*_sku"
    generator: regex
    format: "[A-Z]{3}-[0-9]{6}"
  - pattern: "*_status"
    generator: skip
Patterns support * as a wildcard at the start, end, or both:
- `*_email`: matches columns ending with `_email`
- `phone*`: matches columns starting with `phone`
- `*name*`: matches columns containing `name`
- `status`: exact match only
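The wildcard behavior described above can be approximated with Python's standard `fnmatch` module. This is a sketch of the matching semantics, not GenDB's actual matcher: it assumes case-sensitive matching, and `fnmatch` also accepts `?` and `[...]`, which GenDB may not support. The column names are made up for illustration.

```python
from fnmatch import fnmatchcase

def matching_columns(pattern: str, columns: list) -> list:
    """Return the column names that a glob-style pattern matches (case-sensitive)."""
    return [name for name in columns if fnmatchcase(name, pattern)]

# Hypothetical column names.
columns = ["work_email", "phone_home", "first_name", "status", "order_status"]

for pattern in ["*_email", "phone*", "*name*", "status"]:
    print(pattern, "->", matching_columns(pattern, columns))
# *_email -> ['work_email']
# phone* -> ['phone_home']
# *name* -> ['first_name']
# status -> ['status']
```

Note that the bare pattern `status` does not match `order_status`; a rule meant to cover both would use `*status`.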