
Data Types

Sigilweaver uses Polars data types. Understanding these types helps you work with data effectively and avoid type-related errors.

Numeric Types

Integers

| Type | Size | Range | Use Case |
| --- | --- | --- | --- |
| Int8 | 1 byte | -128 to 127 | Small counters, flags |
| Int16 | 2 bytes | -32,768 to 32,767 | Small numbers |
| Int32 | 4 bytes | -2.1B to 2.1B | Most integer data |
| Int64 | 8 bytes | -9.2e18 to 9.2e18 | Large IDs, timestamps |
| UInt8 | 1 byte | 0 to 255 | Byte values, small positive |
| UInt16 | 2 bytes | 0 to 65,535 | Port numbers, small positive |
| UInt32 | 4 bytes | 0 to 4.3B | Positive integers |
| UInt64 | 8 bytes | 0 to 1.8e19 | Large positive integers |

Floating Point

| Type | Size | Precision | Use Case |
| --- | --- | --- | --- |
| Float32 | 4 bytes | ~7 digits | Memory-constrained, less precision needed |
| Float64 | 8 bytes | ~15 digits | Most decimal numbers, financial data |

Text Types

| Type | Description |
| --- | --- |
| String | UTF-8 encoded text (variable length) |
| Utf8 | Alias for String |

String and Utf8 are two names for the same type, so they are interchangeable. Use either for any text data.

Boolean

| Type | Values |
| --- | --- |
| Boolean | True / False |

Temporal Types

| Type | Description | Example |
| --- | --- | --- |
| Date | Calendar date (no time) | 2024-01-15 |
| Datetime | Date + time with up to nanosecond precision | 2024-01-15 14:30:00.123456789 |
| Time | Time of day only | 14:30:00 |
| Duration | Time span | 1 day, 2:30:00 |

Other Types

| Type | Description |
| --- | --- |
| Categorical | Enum-like string values (memory efficient for repeated values) |
| List | Nested list of values |
| Struct | Nested structure with named fields |

Type Inference

When loading data, Sigilweaver infers types automatically:

  • CSV files: Types are inferred by sampling rows
  • Parquet files: Types are stored in the file metadata

You can override inferred types using the Select tool's type casting feature.

Casting Types

Use the Select tool or Formula expressions to convert between types:

In Select Tool

  1. Open the Select tool configuration
  2. Find the column you want to cast
  3. Select the target type from the dropdown

In Formula Expressions

# Cast to integer
pl.col("str_number").cast(pl.Int64)

# Cast to float
pl.col("value").cast(pl.Float64)

# Cast to string
pl.col("id").cast(pl.Utf8)

# Parse date from string
pl.col("date_str").str.to_date()

Type Compatibility

When joining or unioning data, column types should match:

| Operation | Requirement |
| --- | --- |
| Join keys | Should be same type (e.g., both Int64) |
| Union | Columns are upcast if types differ |

Automatic Upcasting

Polars automatically upcasts when combining different numeric types:

  • Int32 + Int64 = Int64
  • Float32 + Float64 = Float64
  • Int64 + Float64 = Float64

Choosing Types

Memory Optimization

Choose the smallest type that fits your data:

| If your values are... | Use |
| --- | --- |
| 0-255 | UInt8 |
| Small positive integers | UInt16 or UInt32 |
| Integers with negatives | Int32 or Int64 |
| Decimal numbers | Float64 |
| Repeated string values | Categorical |
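
For integer columns, the choice can be reasoned about from the minimum and maximum values. The helper below, `smallest_int_type`, is a hypothetical pure-Python sketch and not part of Sigilweaver or Polars. It walks the integer types from smallest to largest and returns the first one whose range covers the data.

```python
# Integer types ordered by size; unsigned is tried first at each size.
INT_TYPES = [
    ("UInt8", 0, 2**8 - 1),
    ("Int8", -(2**7), 2**7 - 1),
    ("UInt16", 0, 2**16 - 1),
    ("Int16", -(2**15), 2**15 - 1),
    ("UInt32", 0, 2**32 - 1),
    ("Int32", -(2**31), 2**31 - 1),
    ("UInt64", 0, 2**64 - 1),
    ("Int64", -(2**63), 2**63 - 1),
]


def smallest_int_type(lo: int, hi: int) -> str:
    """Return the name of the smallest integer type that holds lo..hi."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    for name, type_min, type_max in INT_TYPES:
        if type_min <= lo and hi <= type_max:
            return name
    raise ValueError("range does not fit in any 64-bit integer type")


print(smallest_int_type(0, 255))
print(smallest_int_type(-40_000, 40_000))
```

In practice it is worth leaving headroom for future values: a column that fits UInt8 today may overflow once new data arrives.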

Precision

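For financial calculations, use Float64 rather than Float32 to minimize rounding errors. Float64 is still binary floating point, so some decimal fractions such as 0.1 are not stored exactly.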
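
The difference in rounding error can be illustrated without any library. The sketch below adds 0.1 ten times, once in 64-bit floats (Python's native `float`) and once with every intermediate result rounded to 32 bits via the standard `struct` module. The helper `to_float32` is hypothetical and only emulates how a Float32 value would be stored.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


sum64 = 0.0
sum32 = 0.0
for _ in range(10):
    sum64 = sum64 + 0.1                                   # Float64 arithmetic
    sum32 = to_float32(sum32 + to_float32(0.1))           # emulated Float32

print(f"Float64 error: {abs(sum64 - 1.0):.2e}")
print(f"Float32 error: {abs(sum32 - 1.0):.2e}")
```

Neither sum is exactly 1.0, but the Float64 error is many orders of magnitude smaller than the Float32 error.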

Common Type Errors

"Could not convert to Int64"

The column contains non-numeric values. Clean the data first:

# Filter predicate: true only for values that are whole numbers
pl.col("value").str.contains("^-?\\d+$")

"Type mismatch in join"

Join keys have different types. Cast one to match the other:

pl.col("id").cast(pl.Int64)