Summarize

Group data and calculate aggregations (sums, averages, counts, etc.).

Sockets

| Socket | Direction | Description |
| --- | --- | --- |
| input | Input | Data to summarize |
| output | Output | Aggregated results |

Configuration

The Summarize tool shows a table of all available columns. For each column, select an action:

| Action | Description | Works On |
| --- | --- | --- |
| Group By | Group rows by this column's values | Any type |
| Sum | Calculate total | Numeric only |
| Mean | Calculate average | Numeric only |
| Median | Calculate median value | Numeric only |
| Min | Find minimum value | Any type |
| Max | Find maximum value | Any type |
| Count | Count non-null values | Any type |
| Count Distinct | Count unique values | Any type |
| First | Get first value in group | Any type |
| Last | Get last value in group | Any type |
| Std Dev | Calculate standard deviation | Numeric only |
| Variance | Calculate variance | Numeric only |
| Concat | Concatenate strings with comma | String only |

Output Column Names

Each aggregation creates an output column named {Action}_{Column} by default (e.g., Sum_amount).

You can customize output names in the configuration.
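
The default naming pattern can be sketched as a small helper. This is an illustration, not the tool's implementation: the function name `default_output_name` is hypothetical, and dropping spaces from the action name is an assumption inferred from the `CountDistinct_product` column that appears in the examples on this page.

```python
def default_output_name(action: str, column: str) -> str:
    """Build a default output column name in the {Action}_{Column} pattern.

    Assumption: spaces in the action name are dropped, so "Count Distinct"
    becomes "CountDistinct" (inferred from the examples, not confirmed).
    """
    return f"{action.replace(' ', '')}_{column}"


print(default_output_name("Sum", "amount"))              # Sum_amount
print(default_output_name("Count Distinct", "product"))  # CountDistinct_product
```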

How It Works

With Group By Columns

When you select one or more columns as "Group By":

  1. Rows are grouped by unique combinations of those column values
  2. Aggregations are calculated within each group
  3. Output has one row per group
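
The three steps above can be sketched in plain Python. This is a minimal illustration that assumes rows are dictionaries; the helper name `group_rows`, the `year` column, and the sample values are made up for the sketch and are not part of the tool.

```python
from collections import defaultdict


def group_rows(rows, group_by):
    """Step 1: bucket rows by the unique combination of their group_by values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_by)
        groups[key].append(row)
    return groups


# Made-up sample rows for illustration.
rows = [
    {"region": "East", "year": 2023, "amount": 100},
    {"region": "East", "year": 2023, "amount": 150},
    {"region": "East", "year": 2024, "amount": 40},
    {"region": "West", "year": 2023, "amount": 200},
]

# Steps 2 and 3: aggregate within each group and emit one output row per group.
output = [
    {"region": region, "year": year, "Sum_amount": sum(r["amount"] for r in members)}
    for (region, year), members in group_rows(rows, ["region", "year"]).items()
]

for out_row in output:
    print(out_row)
```

Grouping on two columns yields three output rows here, one for each distinct (region, year) combination, even though only two regions appear in the input.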

Without Group By (Global Aggregation)

If no Group By columns are selected:

  1. All rows are treated as a single group
  2. Aggregations are calculated across the entire dataset
  3. Output has a single row
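
A minimal sketch of global aggregation, using the amount values from the Examples section of this page. Python's built-in functions stand in for the tool's aggregations here; this is an illustration, not the tool's implementation.

```python
from statistics import mean

# Amount values taken from the example input on this page.
amounts = [100, 150, 200, 75]

# With no Group By columns the whole dataset is one group,
# so each aggregation yields one value and the output is a single row.
output_row = {
    "Sum_amount": sum(amounts),
    "Mean_amount": mean(amounts),
    "Count_amount": len(amounts),
}

print(output_row)
```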

Examples

Total Sales by Region

Input:

| region | product | amount |
| --- | --- | --- |
| East | Widget | 100 |
| East | Gadget | 150 |
| West | Widget | 200 |
| West | Widget | 75 |

Configuration:

  • region: Group By
  • amount: Sum

Output:

| region | Sum_amount |
| --- | --- |
| East | 250 |
| West | 275 |

Multiple Aggregations

Configuration:

  • region: Group By
  • amount: Sum
  • amount: Count
  • product: Count Distinct

Output:

| region | Sum_amount | Count_amount | CountDistinct_product |
| --- | --- | --- | --- |
| East | 250 | 2 | 2 |
| West | 275 | 2 | 1 |
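
The output above can be reproduced with a short Python sketch, assuming this example uses the same input rows as "Total Sales by Region". The code only mirrors the documented behavior (group, then aggregate, skipping nulls); it is not the tool's implementation.

```python
from collections import defaultdict

# Input rows from the "Total Sales by Region" example.
rows = [
    {"region": "East", "product": "Widget", "amount": 100},
    {"region": "East", "product": "Gadget", "amount": 150},
    {"region": "West", "product": "Widget", "amount": 200},
    {"region": "West", "product": "Widget", "amount": 75},
]

# Group rows by region.
groups = defaultdict(list)
for row in rows:
    groups[row["region"]].append(row)

# Apply three aggregations per group; nulls (None) are skipped.
output = []
for region, members in groups.items():
    amounts = [r["amount"] for r in members if r["amount"] is not None]
    products = {r["product"] for r in members if r["product"] is not None}
    output.append({
        "region": region,
        "Sum_amount": sum(amounts),
        "Count_amount": len(amounts),
        "CountDistinct_product": len(products),
    })

# Sort explicitly, since group output order is not guaranteed.
output.sort(key=lambda r: r["region"])
for out_row in output:
    print(out_row)
```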

Global Totals (No Group By)

Configuration:

  • amount: Sum
  • amount: Mean
  • amount: Count

Output:

| Sum_amount | Mean_amount | Count_amount |
| --- | --- | --- |
| 525 | 131.25 | 4 |

Distinct Values Only

Use Group By without any aggregations to get unique values.

Configuration:

  • region: Group By

Output:

| region |
| --- |
| East |
| West |

Aggregation Details

Numeric Aggregations

| Aggregation | Behavior |
| --- | --- |
| Sum | Total of all values (nulls ignored) |
| Mean | Average (nulls ignored) |
| Median | Middle value when sorted |
| Std Dev | Standard deviation |
| Variance | Variance |
| Min/Max | Minimum/maximum value |
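
A small sketch of how ignoring nulls affects these results, using Python's standard library as a stand-in for the tool. The sample values are made up, and applying the same null filtering to Median, Min, and Max is an assumption; the table states it only for Sum and Mean.

```python
from statistics import mean, median

# Made-up column values; None stands in for null.
values = [10, None, 30, 20, None]

# Drop nulls before aggregating.
non_null = [v for v in values if v is not None]

total = sum(non_null)
average = mean(non_null)     # divides by 3 non-null values, not by 5 rows
middle = median(non_null)    # assumption: nulls are also dropped for Median
lowest, highest = min(non_null), max(non_null)

print(total, average, middle, lowest, highest)  # 60 20 20 10 30
```

Because nulls are dropped rather than treated as zero, the mean is 20 and not 12.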

Count Aggregations

| Aggregation | Behavior |
| --- | --- |
| Count | Number of non-null values |
| Count Distinct | Number of unique values (nulls excluded) |
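
The difference between the two counts can be sketched in a few lines of Python. The sample values are made up for illustration.

```python
# Made-up column values; None stands in for null.
values = ["Widget", None, "Gadget", "Widget", None]

non_null = [v for v in values if v is not None]

count = len(non_null)                # Count: non-null values only
count_distinct = len(set(non_null))  # Count Distinct: unique non-null values

print(count, count_distinct)  # 3 2
```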

String Aggregations

| Aggregation | Behavior |
| --- | --- |
| Concat | Joins values with a comma separator |
| First/Last | First or last value in the group |
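
A minimal sketch of these three aggregations in Python. The sample values are made up, and the exact separator is an assumption: the sketch uses a bare comma, while the tool may also insert a space after it.

```python
# Made-up values for one group, in input order.
values = ["Widget", "Gadget", "Gizmo"]

# Assumption: a bare comma is used; the tool may add a space after it.
SEPARATOR = ","

concat = SEPARATOR.join(values)  # Concat
first = values[0]                # First
last = values[-1]                # Last

print(concat)  # Widget,Gadget,Gizmo
print(first)   # Widget
print(last)    # Gizmo
```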

Notes

  • Empty groups: Groups with all null values produce null aggregation results
  • Row order: Group By output order is not guaranteed (sort if needed)
  • Multiple aggregations per column: You can apply multiple aggregations to the same column
  • Null handling: Most aggregations skip null values