
Tool Registry

Tool types, interfaces, and how they connect.

Overview

Tools are the core building blocks of Sigilweaver. Each tool:

  • Has a type (e.g., "Filter", "Join")
  • Has sockets (input/output connection points)
  • Has a config (settings specific to that tool)
  • Is implemented in both frontend (UI) and backend (execution)

Architecture

Frontend                              Backend
─────────────────────────────────────────────────
TOOL_DEFINITIONS    ──────────────►   ToolRegistry
(data/tools.ts)                       (domain/tools/registry.py)

ToolDefinition      ──────────────►   BaseTool
(UI/layout/sockets)                   (execute/schema/validate)

ConfigPanel         ──────────────►   Implementation
(component)                           (filter.py, join.py, etc.)

Frontend: ToolDefinition

Every tool needs a frontend definition in frontend/src/data/tools.ts:

interface ToolDefinition {
  type: string;            // Must match backend tool type
  name: string;            // Display name in palette
  icon: string;            // Emoji for palette/node
  category: ToolCategory;  // "inout" | "preparation" | "join" | "aggregate"
  defaultConfig: ToolConfig;
  sockets: Socket[];
}

Socket Definition

interface Socket {
  id: string;                      // Unique within tool (e.g., "input", "output-true")
  direction: "input" | "output";
  type?: string;                   // Optional label (e.g., "L", "R", "T", "F")
  multiple?: boolean;              // Allow multiple connections (Union tool)
}

Example: Filter Tool

{
  type: 'Filter',
  name: 'Filter',
  icon: '🔍',
  category: 'preparation',
  defaultConfig: {
    expression: '',
    mode: 'Custom',
  },
  sockets: [
    { id: 'input', direction: 'input' },
    { id: 'output-true', direction: 'output', type: 'T' },
    { id: 'output-false', direction: 'output', type: 'F' },
  ],
}

The Filter has:

  • One input socket
  • Two output sockets: rows that match (T) and rows that don't (F)
  • Config for the filter expression and mode

Backend: BaseTool

All tools inherit from BaseTool:

from abc import ABC, abstractmethod
from typing import Any

import polars as pl

# DataSchema is the project's schema type; its import is omitted in this excerpt.


class BaseTool(ABC):
    @abstractmethod
    async def execute(
        self,
        config: dict[str, Any],
        inputs: dict[str, pl.LazyFrame | list[pl.LazyFrame]],
    ) -> dict[str, pl.LazyFrame]:
        """Execute tool logic. Keep it lazy!"""
        pass

    @abstractmethod
    async def get_output_schema(
        self,
        config: dict[str, Any],
        input_schemas: dict[str, DataSchema | list[DataSchema]],
    ) -> dict[str, DataSchema]:
        """Get output schema without execution."""
        pass

    async def validate_config(self, config: dict[str, Any]) -> list[str]:
        """Return list of validation errors (empty if the config is valid)."""
        return []

    def to_python_code(
        self,
        tool_id: str,
        config: dict[str, Any],
        input_vars: dict[str, str],
    ) -> tuple[list[str], dict[str, str]]:
        """Generate Python code for this tool."""
        pass

Key Principles

  1. Lazy Execution: Use LazyFrame throughout. Never call .collect() unless absolutely necessary.
  2. Schema Inference: get_output_schema must work without executing upstream tools.
  3. Validation: Catch config errors before execution.
  4. Code Generation: Support exporting workflows to standalone Python.
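
To make principle 4 concrete, here is a minimal sketch of what a Filter tool's to_python_code could return. It is not the project's implementation: the standalone function name, the output variable naming scheme ({tool_id}_true / {tool_id}_false), and the use of a negated expression for the false branch are all assumptions made for illustration.

```python
def filter_to_python_code(
    tool_id: str,
    config: dict,
    input_vars: dict[str, str],
) -> tuple[list[str], dict[str, str]]:
    """Sketch: emit source lines for a Filter tool plus a socket -> variable map."""
    source = input_vars["input"]      # name of the variable holding the upstream LazyFrame
    expr = config["expression"]       # raw Polars expression string from the tool config
    true_var = f"{tool_id}_true"      # assumed naming scheme for generated variables
    false_var = f"{tool_id}_false"
    lines = [
        f"{true_var} = {source}.filter({expr})",
        f"{false_var} = {source}.filter(~({expr}))",
    ]
    outputs = {"output-true": true_var, "output-false": false_var}
    return lines, outputs


lines, outputs = filter_to_python_code(
    "filter_1",
    {"expression": "pl.col('value') > 100", "mode": "Custom"},
    {"input": "df_input"},
)
print("\n".join(lines))
```

The generated lines stay lazy, since they only chain LazyFrame.filter calls. One caveat of this sketch: Polars drops rows whose predicate evaluates to null, so with a plain negation such rows would appear in neither output.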

Backend: Registry Pattern

Tools register themselves via decorator:

from app.domain.tools.registry import register_tool
from app.domain.tools.base import BaseTool


@register_tool("Filter")
class FilterTool(BaseTool):
    async def execute(self, config, inputs):
        # Implementation
        pass

The registry is populated when app.domain.tools.register is imported (which happens at app startup).

Getting a Tool

from app.domain.tools.registry import ToolRegistry

tool = ToolRegistry.get("Filter") # Returns FilterTool instance
result = await tool.execute(config, inputs)
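
For readers unfamiliar with the pattern, the decorator and the lookup shown above could be implemented roughly as follows. Only the names register_tool and ToolRegistry.get come from this page; the internal dictionary, the register classmethod, and the error handling are assumptions, and FilterTool is reduced to an empty stand-in class.

```python
from typing import Callable


class ToolRegistry:
    """Sketch: maps tool type strings to tool classes (internals are assumed)."""

    _tools: dict[str, type] = {}

    @classmethod
    def register(cls, tool_type: str, tool_cls: type) -> None:
        # Refuse duplicates so two classes cannot silently claim the same type.
        if tool_type in cls._tools:
            raise ValueError(f"Tool type already registered: {tool_type}")
        cls._tools[tool_type] = tool_cls

    @classmethod
    def get(cls, tool_type: str):
        """Return a new instance of the class registered under tool_type."""
        tool_cls = cls._tools.get(tool_type)
        if tool_cls is None:
            raise KeyError(f"Unknown tool type: {tool_type}")
        return tool_cls()


def register_tool(tool_type: str) -> Callable[[type], type]:
    """Class decorator that records the decorated class in ToolRegistry."""
    def decorator(tool_cls: type) -> type:
        ToolRegistry.register(tool_type, tool_cls)
        return tool_cls
    return decorator


@register_tool("Filter")
class FilterTool:  # stand-in for the real BaseTool subclass
    pass


tool = ToolRegistry.get("Filter")
print(type(tool).__name__)
```

Because a class decorator runs when the class definition is evaluated, a module that is never imported never registers its tool. That is why the implementation modules have to be imported at startup.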

Available Tools

In/Out

Type      Description            Sockets
Input     Load data from file    output
Output    Export data to file    input

Preparation

Type       Description                      Sockets
Filter     Filter rows by expression        input → output-true, output-false
Select     Select/rename/reorder columns    input → output
Sort       Sort rows by columns             input → output
Formula    Add calculated column            input → output

Join

Type     Description                           Sockets
Union    Combine multiple tables vertically    input (multiple) → output
Join     Join two tables on keys               input-left, input-right → output-left, output-match, output-right

Aggregate

Type         Description               Sockets
Summarize    Group by and aggregate    input → output

Config Schemas

Each tool has its own config shape. Here are the key ones:

Input

{
  "source": "/path/to/file.csv",
  "delimiter": ",",
  "hasHeader": true,
  "encoding": "utf-8"
}

Filter

{
  "expression": "pl.col('value') > 100",
  "mode": "Custom"
}

The expression is a raw Polars expression string (Python syntax) that is evaluated on the backend.
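
Since the expression is Python source, a cheap syntax check can catch typos before execution. The sketch below shows one way validate_config could do that for Filter using the standard library's ast module. The function name and error messages are hypothetical, and the check confirms only that the string parses as a Python expression, not that it is a valid or safe Polars expression.

```python
import ast


def validate_filter_config(config: dict) -> list[str]:
    """Sketch: return a list of validation errors for a Filter config."""
    errors: list[str] = []
    expression = config.get("expression", "")
    if not isinstance(expression, str) or not expression.strip():
        errors.append("expression must be a non-empty string")
        return errors
    try:
        # mode="eval" accepts a single expression and rejects statements.
        ast.parse(expression.strip(), mode="eval")
    except SyntaxError as exc:
        errors.append(f"expression is not valid Python syntax: {exc.msg}")
    return errors


ok = validate_filter_config({"expression": "pl.col('value') > 100", "mode": "Custom"})
empty = validate_filter_config({"expression": "", "mode": "Custom"})
broken = validate_filter_config({"expression": "pl.col('value') >", "mode": "Custom"})
print(ok, empty, broken)
```

Under this sketch the Filter's default config (an empty expression) is reported as invalid until the user fills it in; whether the real tool treats an empty expression that way is not stated here.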

Select

{
  "columnConfigs": [
    { "name": "id", "enabled": true, "alias": null },
    { "name": "value", "enabled": true, "alias": "amount" }
  ]
}
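
One way to read this config is as an ordered list of columns to keep plus a rename map. The helper below is a hypothetical sketch of that interpretation. It assumes that list order is the output order, that "enabled": false drops the column, and that a null alias keeps the original name; none of these are confirmed by this page.

```python
def select_plan(config: dict) -> tuple[list[str], dict[str, str]]:
    """Sketch: derive (columns to keep, rename mapping) from a Select config."""
    kept: list[str] = []
    renames: dict[str, str] = {}
    for col in config["columnConfigs"]:
        if not col["enabled"]:
            continue                      # disabled columns are dropped (assumption)
        kept.append(col["name"])
        if col.get("alias"):
            renames[col["name"]] = col["alias"]
    return kept, renames


kept, renames = select_plan({
    "columnConfigs": [
        {"name": "id", "enabled": True, "alias": None},
        {"name": "value", "enabled": True, "alias": "amount"},
        {"name": "debug", "enabled": False, "alias": None},
    ]
})
print(kept, renames)
# With Polars this could then be applied lazily as: lf.select(kept).rename(renames)
```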

Sort

{
  "sortColumns": [
    { "column": "created_at", "ascending": false },
    { "column": "id", "ascending": true }
  ]
}
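
Polars' LazyFrame.sort takes a descending flag per column rather than an ascending one, so a backend implementation has to invert the config values somewhere. A small sketch of that translation, with a hypothetical helper name:

```python
def sort_arguments(config: dict) -> tuple[list[str], list[bool]]:
    """Sketch: turn a Sort config into (by, descending) lists for LazyFrame.sort."""
    by = [entry["column"] for entry in config["sortColumns"]]
    # The config stores "ascending"; Polars expects "descending", so flip each flag.
    descending = [not entry["ascending"] for entry in config["sortColumns"]]
    return by, descending


by, descending = sort_arguments({
    "sortColumns": [
        {"column": "created_at", "ascending": False},
        {"column": "id", "ascending": True},
    ]
})
print(by, descending)
# Applied lazily as: lf.sort(by, descending=descending)
```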

Formula

{
  "column_name": "total",
  "expression": "pl.col('price') * pl.col('quantity')"
}

Union

{
  "mode": "ByName"
}

Modes: ByName (match by column name), ByPosition (match by column order).

Join

{
  "joinType": "inner",
  "leftKeys": ["id"],
  "rightKeys": ["user_id"]
}

Join types: inner, left, right, outer, cross.
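
A Join config has a few constraints that are easy to check before execution. The sketch below is a hypothetical validator, not the project's code. It assumes that leftKeys and rightKeys are paired by position and must have equal length, and that a cross join needs no keys.

```python
JOIN_TYPES = {"inner", "left", "right", "outer", "cross"}


def validate_join_config(config: dict) -> list[str]:
    """Sketch: return a list of validation errors for a Join config."""
    errors: list[str] = []
    join_type = config.get("joinType")
    left_keys = config.get("leftKeys") or []
    right_keys = config.get("rightKeys") or []

    if join_type not in JOIN_TYPES:
        errors.append(f"joinType must be one of {sorted(JOIN_TYPES)}")
    if join_type == "cross":
        return errors                     # keys are ignored for cross joins (assumption)
    if not left_keys or not right_keys:
        errors.append("leftKeys and rightKeys must not be empty")
    elif len(left_keys) != len(right_keys):
        errors.append("leftKeys and rightKeys must have the same length")
    return errors


print(validate_join_config({"joinType": "inner", "leftKeys": ["id"], "rightKeys": ["user_id"]}))
```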

Summarize

{
  "groupByColumns": ["category"],
  "aggregations": [
    { "column": "value", "operation": "Sum", "outputName": "total_value" },
    { "column": "id", "operation": "Count", "outputName": "count" }
  ],
  "includeNulls": false
}

Operations: Sum, Mean, Min, Max, Count, First, Last, StdDev, Var.
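
Each operation name plausibly corresponds to a Polars expression method. The mapping below is an assumption made for illustration (the page does not say how operations are implemented), and the helper name is hypothetical. It produces Polars source text, the kind of line a code-generation step might emit.

```python
# Assumed mapping from operation names to polars.Expr method names.
OPERATION_METHODS = {
    "Sum": "sum",
    "Mean": "mean",
    "Min": "min",
    "Max": "max",
    "Count": "count",
    "First": "first",
    "Last": "last",
    "StdDev": "std",
    "Var": "var",
}


def aggregation_source(agg: dict) -> str:
    """Sketch: render one aggregation entry as Polars expression source."""
    method = OPERATION_METHODS[agg["operation"]]
    return f"pl.col({agg['column']!r}).{method}().alias({agg['outputName']!r})"


print(aggregation_source({"column": "value", "operation": "Sum", "outputName": "total_value"}))
```

Note that Expr.count counts non-null values only; whether the tool's Count is meant to include nulls is not specified here.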


Adding a New Tool

See Adding Tools for a complete guide. The key steps:

  1. Add ToolDefinition to frontend/src/data/tools.ts
  2. Create config panel component in frontend/src/components/tools/
  3. Create implementation in backend/app/domain/tools/implementations/
  4. Register with @register_tool("YourTool") decorator
  5. Import in backend/app/domain/tools/register.py
  6. Add tests

Type Matching

Socket types must match when connecting:

  • Standard output → Standard input: Always allowed
  • Typed socket → Same type: Allowed (e.g., "L" → "L")
  • Typed socket → Different type: Prevented by UI

The frontend enforces type compatibility during drag-and-drop connections.
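
The three rules can be expressed as a small predicate. This sketch is written in Python for illustration, although the real check lives in the frontend. It assumes that a "standard" socket is one without a type label, and it allows the mixed case (typed to untyped), which the rules above do not cover. That choice is an assumption, made because Filter's "T" and "F" outputs would otherwise be unable to feed an unlabeled input.

```python
from typing import Optional


def can_connect(output_type: Optional[str], input_type: Optional[str]) -> bool:
    """Sketch: decide whether an output socket may connect to an input socket."""
    if output_type is None and input_type is None:
        return True                       # standard -> standard: always allowed
    if output_type is not None and input_type is not None:
        return output_type == input_type  # typed -> typed: only if labels match
    return True                           # mixed case: allowed here by assumption


print(can_connect(None, None), can_connect("L", "L"), can_connect("L", "R"))
```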

