tabular-remap (Transform Node)
The Tabular Remap node transforms a tabular dataset (CSV or TSV) into a new format by applying column-level operations defined in its configuration.
This node is ideal for standardizing schema, renaming columns, creating derived metrics, or injecting constants before data aggregation.
Configuration Schema
| Property | Type | Required | Description |
|---|---|---|---|
inputFormat | "auto" | "csv" | "tsv" | ✖ | How to parse the input file. Defaults to "auto". |
outputFormat | "csv" | "tsv" | ✖ | Format of the output. Defaults to "csv". |
failOnMissingColumns | boolean | ✖ | If true, the node fails when a rename source column does not exist. |
treatEmptyAsNone | boolean | ✖ | When enabled, empty strings are interpreted as None in transform expressions. |
columns | array<object> | ✔ | List of column operations (rename, transform, constant). Evaluated by ascending priority. |
Each columns[] entry supports:
| Field | Required | Description |
|---|---|---|
name | ✔ | Name of the output column. |
priority | ✖ | Execution order (lower values run first). Defaults to insertion order. |
action | ✔ | "rename", "transform", or "constant". |
source | only for rename | Source column name to copy from. |
expr | only for transform | Python expression evaluated against the input row (variable: row). |
value | only for constant | Static value assigned to the column. |
Example Configuration
{
"inputFormat": "auto",
"outputFormat": "csv",
"failOnMissingColumns": true,
"treatEmptyAsNone": true,
"columns": [
{ "name": "timestamp", "priority": 10, "action": "rename", "source": "time" },
{ "name": "driver", "priority": 20, "action": "rename", "source": "driver_name" },
{ "name": "speed_kph", "priority": 30, "action": "transform", "expr": "float(row['speed_mps']) * 3.6 if row['speed_mps'] is not None else None" },
{ "name": "unit", "priority": 40, "action": "constant", "value": "kph" }
]
}Example Input and Output
Input CSV:
time,driver_name,speed_mps
2025-01-01T00:00:00Z,Alice,10
2025-01-01T00:00:01Z,Bob,12.5Output CSV:
timestamp,driver,speed_kph,unit
2025-01-01T00:00:00Z,Alice,36.0,kph
2025-01-01T00:00:01Z,Bob,45.0,kphTransform Expression Context
Transform expressions execute within a safe, sandboxed Python environment:
- Access the current row as
row['column_name']. - Built-ins available:
math,float,int,str,len,min,max,sum,abs,round. - Empty strings become
NoneiftreatEmptyAsNoneis true.
Example:
{ "name": "speed_category", "action": "transform", "expr": "'fast' if float(row['speed_kph']) > 40 else 'slow'" }Example: Mixed Operations
Configuration:
{
"columns": [
{ "name": "C1", "priority": 10, "action": "rename", "source": "a" },
{ "name": "C2", "priority": 20, "action": "transform", "expr": "int(row['b']) + 5" },
{ "name": "C3", "priority": 30, "action": "constant", "value": "processed" }
]
}Input CSV:
a,b
1,2
3,4Output CSV:
C1,C2,C3
1,7,processed
3,9,processedError Handling
| Scenario | Behavior |
|---|---|
Missing required column (when failOnMissingColumns=true) | Node fails with descriptive log message. |
| Invalid Python expression | Node fails, logs the exception. |
| Empty input payload | Node fails gracefully and aborts downstream processing. |
Tips & Best Practices
- Define explicit
priorityvalues if later columns depend on earlier ones. - Keep expressions short — complex logic should be precomputed upstream.
- Always quote string literals in expressions.
- Use
treatEmptyAsNoneto simplify numeric casts in mixed data sources. - Ensure downstream nodes expect the same delimiter as
outputFormat.