Pandas Series

A single-column proxy for a PandasApiTdsFrame.

A Series is conceptually similar to a pandas.Series: it represents one column of a frame and supports element-wise arithmetic, string methods, date-part extraction, and other transformations.

Obtaining a Series

Use bracket notation on a PandasApiTdsFrame, for example frame["Order Id"]. The returned Series subclass matches the column type: an integer column yields an IntegerSeries.

Transformations

A Series supports the same operator overloads as the underlying primitive type. For example, an IntegerSeries supports +, -, *, /, %, comparisons, etc. A StringSeries supports .upper(), .lower(), .len(), .startswith(), .contains(), .replace(), concatenation with +, etc. A DateTimeSeries supports .year(), .month(), .day(), etc.

Transforming a Series produces a new Series with an updated expression tree — the original column is never mutated.
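
The non-mutating behaviour can be illustrated with a toy class. The sketch below is plain Python and is not pylegend's implementation; ToyColumn and its expr attribute are hypothetical names used only to show how each operation can return a new expression while leaving the original untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyColumn:
    """Hypothetical stand-in for a lazily evaluated column expression."""
    expr: str  # textual form of the expression built so far

    def __add__(self, other):
        # Build a new node instead of changing this one.
        return ToyColumn(f"({self.expr} + {other!r})")

    def __mul__(self, other):
        return ToyColumn(f"({self.expr} * {other!r})")


base = ToyColumn('"Order Id"')
derived = (base + 5) * 2

print(base.expr)     # the original expression is unchanged
print(derived.expr)  # the derived expression records both operations
```

Nothing is computed here; the object only records what would be computed, which is the sense in which a Series is an expression builder.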

Assigning back to the frame

Use bracket assignment (__setitem__) to write a Series back into the frame — either overwriting an existing column or creating a new one.

Constants (int, float, str, bool, date, datetime) and callables (lambda) are also accepted on the right-hand side.

Important

A Series can only be assigned to the same frame it was derived from. Assigning a Series from a different frame raises ValueError.
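
One way such a same-frame rule could be enforced is an identity check on the frame a series came from. The following is a toy sketch under that assumption, not pylegend's code; ToyFrame, ToySeries and the error message are hypothetical.

```python
class ToySeries:
    """Hypothetical series that remembers the frame it was derived from."""

    def __init__(self, owner, name):
        self.owner = owner
        self.name = name


class ToyFrame:
    """Hypothetical frame that only accepts series derived from itself."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __getitem__(self, name):
        return ToySeries(owner=self, name=name)

    def __setitem__(self, name, value):
        # Reject a series whose owner is another frame instance.
        if isinstance(value, ToySeries) and value.owner is not self:
            raise ValueError("series was derived from a different frame")
        if name not in self.columns:
            self.columns.append(name)


left = ToyFrame(["a"])
right = ToyFrame(["a"])

left["b"] = left["a"]        # same frame: accepted
try:
    left["c"] = right["a"]   # different frame: rejected
    cross_frame_error = None
except ValueError as exc:
    cross_frame_error = str(exc)

print(left.columns, cross_frame_error)
```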

Window functions on a Series

Certain window functions such as rank() can be called on a Series. The result is a new Series whose values are the window-function output for that column.

Series with applied functions (aggregations or window functions) can also be combined with arithmetic in the same assignment, but only one function call is allowed per expression. If multiple function calls are needed, split them into separate steps.

See also

PandasApiTdsFrame: The parent frame class.
PandasApiGroupbyTdsFrame: Groupby object (returns GroupbySeries when bracket-indexed).

Notes

Differences from pandas:

A Series is not a first-class data container. It is an expression builder that lazily constructs the query. No data is materialised until execute_frame_to_string() or to_pandas() is called.
Cross-frame assignment is not allowed. In pandas you can freely assign a Series from one DataFrame to another (alignment happens on the index); here the Series must originate from the same frame instance. If you need cross-frame assignment, use join or merge.
Applying a function to a computed Series expression is not supported in certain cases. For example, (frame['col'] + 5).rank() raises NotImplementedError. Instead, apply the function first and the arithmetic afterwards: frame['col'].rank() + 5.

agg

Series.agg(func, axis=0, *args, **kwargs)

Alias for aggregate().

See aggregate() for full documentation.

Return type:

Union[PandasApiTdsFrame, Series]

aggregate

Series.aggregate(func, axis=0, *args, **kwargs)

Aggregate the Series using one or more operations.

Reduce the single column to one or more scalar values. The result is returned as a single-row PandasApiTdsFrame.

Parameters:

func (Union[F, List[F], Mapping[Hashable, Union[F, List[F]]]], where F abbreviates Union[Callable[..., Union[int, float, str, bool, date, datetime, Decimal, PyLegendPrimitive]], str, ufunc]) –
Aggregation specification:
- str — a named aggregation ('sum', 'mean', 'min', 'max', 'count', 'std', 'var', plus aliases 'len', 'size').
- callable — a lambda receiving the Series and calling one of its aggregation methods (e.g. lambda x: x.sum()).
- list of str — multiple named aggregations. Result columns are named "agg(col_name)".
- dict — {column_name: agg_spec}. Keys must match the Series’ column name.
axis (Union[int, str]) – Must be 0 or 'index'.

Returns:

A single-row frame with the aggregated value(s).

Return type:

Union[PandasApiTdsFrame, Series]

Raises:

NotImplementedError – If called on a computed Series expression (e.g. (frame['col'] + 5).aggregate('sum')). Assign the expression to a column first, then aggregate.
ValueError – If a dict key does not match the Series’ column name.

See also

agg: Alias for aggregate.
sum: Sum of the column.
mean: Mean of the column.

Notes

Differences from pandas:

In pandas, Series.aggregate can return a scalar, a Series, or a DataFrame depending on the input. Here the result is always a single-row PandasApiTdsFrame.
Aggregation on a computed Series expression is not supported. Assign the expression to the frame first.
When func is a dict, keys must exactly match the Series’ column name — no other column names are valid.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

# Single named aggregation
frame["Order Id"].aggregate("sum").to_pandas()

	Order Id
0	8849875

# Multiple aggregations via a list
frame["Order Id"].aggregate(["sum", "min", "max"]).to_pandas()

	sum(Order Id)	min(Order Id)	max(Order Id)
0	8849875	10248	11077

# Lambda aggregation
frame["Order Id"].aggregate(lambda x: x.count()).to_pandas()

	Order Id
0	830
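
The outputs above (count 830, minimum 10248, maximum 11077, sum 8849875) are consistent with the Order Id values forming the contiguous range 10248..11077. Assuming that, the following plain-Python sketch cross-checks the figures and mimics the "agg(col_name)" naming used for list input. It does not call pylegend; named_aggs is a hypothetical lookup table.

```python
from statistics import mean

# Assumption: the sample ids form the contiguous range 10248..11077.
order_ids = list(range(10248, 11078))

# Plain-Python counterparts of a few named aggregations.
named_aggs = {"sum": sum, "min": min, "max": max, "count": len, "mean": mean}

# Mimic the "agg(col_name)" column naming used for list input.
row = {f"{name}(Order Id)": named_aggs[name](order_ids)
       for name in ["sum", "min", "max"]}

print(row)
print(named_aggs["count"](order_ids), named_aggs["mean"](order_ids))
```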

concat_legend_ext

Series.concat_legend_ext(other)

Concatenate this series with another series vertically.

PyLegend extension — not present in pandas.

Performs a UNION ALL of this series with other. Both series must have compatible schemas (same column name and type).

Parameters:

other (Series) – Another series with the same column name and type.

Returns:

A new series containing rows from both series.

Return type:

Series

Raises:

ValueError – If the schemas of the two series are incompatible.

Notes

Differences from pandas:

In pandas, pd.concat is a top-level function. Here, concat_legend_ext is a method on a Series and only supports vertical concatenation (UNION ALL) of two single-column series with the same schema.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

s1 = frame.head(3)["Order Id"]
s2 = frame.head(3)["Order Id"]
s1.concat_legend_ext(s2).to_pandas()

	Order Id
0	10248
1	10249
2	10250
3	10248
4	10249
5	10250
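
The duplicated rows in the output follow from UNION ALL semantics, which keep every row from both inputs. A plain-Python illustration (lists stand in for the two series; this is not pylegend code):

```python
s1 = [10248, 10249, 10250]
s2 = [10248, 10249, 10250]

# UNION ALL keeps every row, duplicates included.
union_all = s1 + s2

# For contrast: a de-duplicating UNION would keep each value once (order aside).
union_distinct = list(dict.fromkeys(s1 + s2))

print(union_all)
print(union_distinct)
```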

count

Series.count(axis=0, numeric_only=False, **kwargs)

Return the count of non-null values in the Series.

Parameters:

axis (Union[int, str]) – Must be 0 or 'index'.
numeric_only (bool) – Must be False. True is not supported.

Returns:

A single-row frame with the count.

Return type:

Union[PandasApiTdsFrame, Series]

Notes

Equivalent to series.aggregate("count"). Maps to SQL COUNT(column).

Differences from pandas: returns a single-row PandasApiTdsFrame instead of a scalar integer.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].count().to_pandas()

	Order Id
0	830

cume_dist_legend_ext

Series.cume_dist_legend_ext(ascending=True)

Compute the cumulative distribution of this column.

PyLegend extension — not present in pandas.

Maps to SQL CUME_DIST() OVER (ORDER BY col) and Pure cumulativeDistribution.

Parameters:

ascending (bool) – Whether to order in ascending direction.

Returns:

A series containing cumulative distribution values (floats between 0 and 1).

Return type:

Series

See also

rank: Compute ranks.
ntile_legend_ext: Assign rows to numbered buckets.

Notes

Differences from pandas:

This method has no pandas equivalent. CUME_DIST is exposed as a pylegend extension.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["CumeDist"] = frame["Order Id"].cume_dist_legend_ext()
frame.head(5).to_pandas()

	Order Id	Order Date	Required Date	Shipped Date	Ship Name	CumeDist
0	10248	1996-07-04	1996-08-01	1996-07-16	Vins et alcools Chevalier	0.001205
1	10249	1996-07-05	1996-08-16	1996-07-10	Toms Spezialitäten	0.00241
2	10250	1996-07-08	1996-08-05	1996-07-12	Hanari Carnes	0.003614
3	10251	1996-07-08	1996-08-05	1996-07-15	Victuailles en stock	0.004819
4	10252	1996-07-09	1996-08-06	1996-07-11	Suprêmes délices	0.006024
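
The CumeDist values can be reproduced by hand: for ascending order, CUME_DIST is the fraction of rows whose value is less than or equal to the current row's value. The sketch below is plain Python, not pylegend, and assumes the Order Id values are the contiguous range 10248..11077 (830 rows), which matches the aggregates shown elsewhere on this page.

```python
from bisect import bisect_right


def cume_dist(values):
    """Fraction of rows whose value is <= each row's value (ascending order)."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


# Assumption: the sample ids form the contiguous range 10248..11077.
order_ids = list(range(10248, 11078))
first_five = [round(x, 6) for x in cume_dist(order_ids)[:5]]
print(first_five)
```

The first row evaluates to 1/830, the second to 2/830, and so on. Tied values share the same result because they count each other.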

expanding

Series.expanding(min_periods=1, axis=0, method=None, order_by=None, ascending=True)

Create an expanding (cumulative) window on this column.

An expanding window includes all rows from the start of the frame up to the current row, enabling running totals, running averages, and similar cumulative calculations on a single column.

Parameters:

min_periods (int) – Minimum number of observations required to produce a value.
axis (Union[int, str]) – Only 0 / 'index' is supported.
method (Optional[str]) – Not supported. Must be None.
order_by (Union[str, Sequence[str], None]) – Column(s) to order by within the window.
ascending (Union[bool, Sequence[bool]]) – Sort direction(s) for order_by columns.

Returns:

A window series on which aggregates (sum, mean, etc.) can be called.

Return type:

WindowSeries

Raises:

NotImplementedError – If axis is not 0, or method is not None.

See also

rolling: Fixed-size sliding window on a column.
window_frame_legend_ext: Custom window specification.

Notes

Differences from pandas:

order_by and ascending are pylegend extensions not present in pandas.
axis=1 and method are not supported.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].expanding(
    order_by="Order Id"
).sum().to_pandas().head(5)

	Order Id
0	10248
1	20497
2	30747
3	40998
4	51250
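
The running totals above can be checked with a cumulative sum over the first five ids. This is a plain-Python sketch of what an expanding window computes, not a pylegend call.

```python
from itertools import accumulate

# First five ids, taken from the sample output on this page.
order_ids = [10248, 10249, 10250, 10251, 10252]

# Each position aggregates every row from the start up to the current row.
running_sum = list(accumulate(order_ids))
running_mean = [total / (i + 1) for i, total in enumerate(running_sum)]

print(running_sum)
print(running_mean)
```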

max

Series.max(axis=0, skipna=True, numeric_only=False, **kwargs)

Return the maximum of the Series values.

Parameters:

axis (Union[int, str]) – Must be 0 or 'index'.
skipna (bool) – Must be True. False is not supported.
numeric_only (bool) – Must be False. True is not supported.

Returns:

A single-row frame with the maximum value.

Return type:

Union[PandasApiTdsFrame, Series]

Notes

Equivalent to series.aggregate("max"). Works on string columns as well (lexicographic maximum).

Differences from pandas: returns a single-row PandasApiTdsFrame instead of a scalar.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].max().to_pandas()

	Order Id
0	11077

mean

Series.mean(axis=0, skipna=True, numeric_only=False, **kwargs)

Return the mean of the Series values.

Parameters:

axis (Union[int, str]) – Must be 0 or 'index'.
skipna (bool) – Must be True. False is not supported.
numeric_only (bool) – Must be False. True is not supported.

Returns:

A single-row frame with the mean.

Return type:

Union[PandasApiTdsFrame, Series]

Notes

Equivalent to series.aggregate("mean"). Maps to SQL AVG().

Differences from pandas: returns a single-row PandasApiTdsFrame instead of a scalar.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].mean().to_pandas()

	Order Id
0	10662.5

min

Series.min(axis=0, skipna=True, numeric_only=False, **kwargs)

Return the minimum of the Series values.

Parameters:

axis (Union[int, str]) – Must be 0 or 'index'.
skipna (bool) – Must be True. False is not supported.
numeric_only (bool) – Must be False. True is not supported.

Returns:

A single-row frame with the minimum value.

Return type:

Union[PandasApiTdsFrame, Series]

Notes

Equivalent to series.aggregate("min"). Works on string columns as well (lexicographic minimum).

Differences from pandas: returns a single-row PandasApiTdsFrame instead of a scalar.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].min().to_pandas()

	Order Id
0	10248

ntile_legend_ext

Series.ntile_legend_ext(num_buckets, ascending=True)

Assign rows to numbered buckets based on this column’s ordering.

PyLegend extension — not present in pandas.

Maps to SQL NTILE(n) OVER (ORDER BY col) and Pure ntile.

Parameters:

num_buckets (int) – Number of buckets to distribute rows into.
ascending (bool) – Whether to order in ascending direction.

Returns:

A series containing bucket numbers (1-based).

Return type:

Series

See also

rank: Compute ranks.
cume_dist_legend_ext: Cumulative distribution.

Notes

Differences from pandas:

This method has no pandas equivalent. NTILE is exposed as a pylegend extension.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Quartile"] = frame["Order Id"].ntile_legend_ext(4)
frame.head(5).to_pandas()

	Order Id	Order Date	Required Date	Shipped Date	Ship Name	Quartile
0	10248	1996-07-04	1996-08-01	1996-07-16	Vins et alcools Chevalier	1
1	10249	1996-07-05	1996-08-16	1996-07-10	Toms Spezialitäten	1
2	10250	1996-07-08	1996-08-05	1996-07-12	Hanari Carnes	1
3	10251	1996-07-08	1996-08-05	1996-07-15	Victuailles en stock	1
4	10252	1996-07-09	1996-08-06	1996-07-11	Suprêmes délices	1
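
All five rows shown fall into bucket 1 because, with 830 rows and 4 buckets, the first bucket alone holds more than 200 rows. The plain-Python sketch below shows the bucket sizes, assuming the usual SQL NTILE convention that rows are split as evenly as possible and any remainder goes to the lowest-numbered buckets. The executing backend may differ, and ntile here is a hypothetical helper, not a pylegend function.

```python
def ntile(n_rows, num_buckets):
    """Bucket number (1-based) for each row position of an ordered result."""
    base, extra = divmod(n_rows, num_buckets)
    buckets = []
    for bucket in range(1, num_buckets + 1):
        # Earlier buckets absorb the remainder, one extra row each.
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets


quartiles = ntile(830, 4)
sizes = [quartiles.count(b) for b in range(1, 5)]
print(quartiles[:5], sizes)
```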

rank

Series.rank(axis=0, method='min', numeric_only=False, na_option='bottom', ascending=True, pct=False)

Compute the rank of values in this Series.

Return a new Series containing the rank of each value. The result can be assigned back to the parent frame as a new column, or executed directly as a standalone single-column query.

Parameters:

axis (Union[int, str]) – Must be 0 or 'index'.
method (str) –
How to rank equal values:
- 'min' : Lowest rank in the group of ties (SQL RANK()).
- 'first' : Ranks by order of appearance (SQL ROW_NUMBER()).
- 'dense' : Like 'min' but no gaps (SQL DENSE_RANK()).
numeric_only (bool) – If True, only rank numeric columns.
na_option (str) – Only 'bottom' is supported.
ascending (bool) – Whether to rank in ascending order.
pct (bool) – If True, compute percentage ranks (SQL PERCENT_RANK()). Returns a FloatSeries. Only supported with method='min'.

Returns:

An IntegerSeries (or FloatSeries when pct=True) containing the ranks.

Return type:

Series

Raises:

NotImplementedError – Raised in any of the following cases: the method is called on a computed Series expression (e.g. (frame['col'] + 5).rank(); call rank() first, then apply arithmetic); method is not 'min', 'first', or 'dense'; na_option is not 'bottom'; pct=True is combined with a method other than 'min'.

See also

PandasApiTdsFrame.rank: Rank all columns of a frame.
PandasApiGroupbyTdsFrame.rank: Rank within groups.

Notes

Differences from pandas:

The 'average' and 'max' methods are not supported.
na_option only supports 'bottom'.
pct=True is only supported with method='min'.
The result is a pylegend Series, not a pandas.Series. It can be assigned to the frame or executed directly.
Calling rank() on a computed Series expression is not supported. Do the rank first, then apply arithmetic: frame['col'].rank() + 5.
Only one window-function call is allowed per expression. To combine multiple, use separate assignments.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

# Execute a ranked Series directly as a single-column query
frame["Order Id"].rank().to_pandas().head()

	Order Id
0	1
1	2
2	3
3	4
4	5

# Assign a rank column to the frame
frame["Order Rank"] = frame["Order Id"].rank()
frame.head(5).to_pandas()

	Order Id	Order Date	Required Date	Shipped Date	Ship Name	Order Rank
0	10248	1996-07-04	1996-08-01	1996-07-16	Vins et alcools Chevalier	1
1	10249	1996-07-05	1996-08-16	1996-07-10	Toms Spezialitäten	2
2	10250	1996-07-08	1996-08-05	1996-07-12	Hanari Carnes	3
3	10251	1996-07-08	1996-08-05	1996-07-15	Victuailles en stock	4
4	10252	1996-07-09	1996-08-06	1996-07-11	Suprêmes délices	5
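
Because the Order Id values are unique, the sample output does not show how the three supported methods differ. The plain-Python sketch below ranks a small list containing a tie; rank here is a hypothetical helper written for illustration, not the pylegend method.

```python
def rank(values, method="min"):
    """Rank values in ascending order using SQL-style tie handling."""
    order = sorted(range(len(values)), key=lambda i: values[i])  # stable sort
    ranks = [0] * len(values)
    dense = 0
    for position, i in enumerate(order, start=1):
        is_tie = position > 1 and values[i] == values[order[position - 2]]
        if not is_tie:
            dense += 1           # next distinct value
            min_rank = position  # lowest position within this group of ties
        if method == "min":
            ranks[i] = min_rank
        elif method == "first":
            ranks[i] = position
        elif method == "dense":
            ranks[i] = dense
        else:
            raise NotImplementedError(method)
    return ranks


scores = [10, 20, 20, 30]
print(rank(scores, "min"))    # ties share the lowest rank, leaving a gap
print(rank(scores, "first"))  # ties broken by order of appearance
print(rank(scores, "dense"))  # ties share a rank, no gap
```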

rolling

Series.rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None, step=None, method=None, order_by=None, ascending=True)

Create a fixed-size sliding window on this column.

A rolling window includes a fixed number of preceding rows for each row, enabling moving averages, moving sums, and similar calculations on a single column.

Parameters:

window (int) – Size of the moving window (number of rows).
min_periods (Optional[int]) – Minimum observations required. Defaults to window.
center (bool) – Not supported. Must be False.
win_type (Optional[str]) – Not supported. Must be None.
on (Optional[str]) – Not supported. Must be None.
axis (Union[int, str]) – Only 0 / 'index' is supported.
closed (Optional[str]) – Not supported. Must be None.
step (Optional[int]) – Not supported. Must be None.
method (Optional[str]) – Not supported. Must be None.
order_by (Union[str, Sequence[str], None]) – Column(s) to order by within the window.
ascending (Union[bool, Sequence[bool]]) – Sort direction(s) for order_by columns.

Returns:

A window series on which aggregates (sum, mean, etc.) can be called.

Return type:

WindowSeries

Raises:

NotImplementedError – If center, win_type, on, closed, step, or method are set to non-default values, or axis is not 0.

See also

expanding: Expanding (cumulative) window on a column.
window_frame_legend_ext: Custom window specification.

Notes

Differences from pandas:

order_by and ascending are pylegend extensions not present in pandas.
center, win_type, on, closed, step, and method are not supported.
axis=1 is not supported.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].rolling(
    window=3, order_by="Order Id"
).mean().to_pandas().head(5)

	Order Id
0	10248.0
1	10248.5
2	10249.0
3	10250.0
4	10251.0
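
The sample output contains values for the first two rows, which suggests that each window uses the current row plus however many of the preceding window - 1 rows exist. Under that assumption, the figures can be reproduced in plain Python; rolling_mean is a hypothetical helper, not part of pylegend.

```python
def rolling_mean(values, window):
    """Mean over the current row and up to window - 1 preceding rows."""
    result = []
    for i in range(len(values)):
        # Clip the window at the start of the data.
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# First five ids, taken from the sample output on this page.
order_ids = [10248, 10249, 10250, 10251, 10252]
print(rolling_mean(order_ids, window=3))
```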

std

Series.std(axis=0, skipna=True, ddof=1, numeric_only=False, **kwargs)

Return the standard deviation of the Series values.

Parameters:

axis (Union[int, str]) – Must be 0 or 'index'.
skipna (bool) – Must be True. False is not supported.
ddof (int) – Degrees of freedom. 1 for sample standard deviation (STDDEV_SAMP), 0 for population standard deviation (STDDEV_POP).
numeric_only (bool) – Must be False. True is not supported.

Returns:

A single-row frame with the standard deviation.

Return type:

Union[PandasApiTdsFrame, Series]

Raises:

NotImplementedError – If ddof is not 0 or 1, or if skipna, numeric_only, or **kwargs are set to unsupported values.

Notes

Equivalent to series.aggregate("std") (ddof=1) or series.aggregate("std_dev_population") (ddof=0). Maps to SQL STDDEV_SAMP() or STDDEV_POP().

Differences from pandas: returns a single-row PandasApiTdsFrame instead of a scalar. Only ddof=0 and ddof=1 are supported.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].std().to_pandas()

	Order Id
0	239.744656
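
The difference between ddof=1 and ddof=0 can be seen with Python's statistics module, where stdev divides by n - 1 and pstdev divides by n. The sketch assumes the Order Id values are the contiguous range 10248..11077, which is consistent with the count, minimum, maximum and sum shown on this page; it does not call pylegend.

```python
from statistics import pstdev, stdev

# Assumption: the sample ids form the contiguous range 10248..11077.
order_ids = list(range(10248, 11078))

sample_std = stdev(order_ids)       # divides by n - 1, like ddof=1
population_std = pstdev(order_ids)  # divides by n, like ddof=0

print(round(sample_std, 6), round(population_std, 6))
```

Under that assumption the sample figure agrees with the 239.744656 shown above, and the population figure is slightly smaller.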

sum

Series.sum(axis=0, skipna=True, numeric_only=False, min_count=0, **kwargs)

Return the sum of the Series values.

Parameters:

axis (Union[int, str]) – Must be 0 or 'index'.
skipna (bool) – Must be True. False is not supported (SQL aggregation ignores nulls by default).
numeric_only (bool) – Must be False. True is not supported.
min_count (int) – Must be 0. Non-zero values are not supported.

Returns:

A single-row frame with the sum.

Return type:

Union[PandasApiTdsFrame, Series]

Notes

Equivalent to series.aggregate("sum").

Differences from pandas: returns a single-row PandasApiTdsFrame instead of a scalar. skipna=False, numeric_only=True, and min_count != 0 are not supported.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].sum().to_pandas()

	Order Id
0	8849875

var

Series.var(axis=0, skipna=True, ddof=1, numeric_only=False, **kwargs)

Return the variance of the Series values.

Parameters:

axis (Union[int, str]) – Must be 0 or 'index'.
skipna (bool) – Must be True. False is not supported.
ddof (int) – Degrees of freedom. 1 for sample variance (VAR_SAMP), 0 for population variance (VAR_POP).
numeric_only (bool) – Must be False. True is not supported.

Returns:

A single-row frame with the variance.

Return type:

Union[PandasApiTdsFrame, Series]

Raises:

NotImplementedError – If ddof is not 0 or 1, or if skipna, numeric_only, or **kwargs are set to unsupported values.

Notes

Equivalent to series.aggregate("var") (ddof=1) or series.aggregate("variance_population") (ddof=0). Maps to SQL VAR_SAMP() or VAR_POP().

Differences from pandas: returns a single-row PandasApiTdsFrame instead of a scalar. Only ddof=0 and ddof=1 are supported.

Examples

import pylegend
frame = pylegend.samples.pandas_api.northwind_orders_frame()

frame["Order Id"].var().to_pandas()

	Order Id
0	57477.5
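
The role of ddof is the divisor: the sum of squared deviations from the mean is divided by n - ddof. The plain-Python sketch below spells this out; variance is a hypothetical helper that mirrors the documented restriction to ddof 0 or 1, and the contiguous id range is an assumption consistent with the aggregates shown on this page.

```python
def variance(values, ddof=1):
    """Sum of squared deviations divided by (n - ddof); only ddof 0 or 1."""
    if ddof not in (0, 1):
        raise NotImplementedError("only ddof=0 and ddof=1 are handled in this sketch")
    n = len(values)
    centre = sum(values) / n
    return sum((v - centre) ** 2 for v in values) / (n - ddof)


# Assumption: the sample ids form the contiguous range 10248..11077.
order_ids = list(range(10248, 11078))
print(variance(order_ids, ddof=1), variance(order_ids, ddof=0))
```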

window_frame_legend_ext

Series.window_frame_legend_ext(frame_spec=<pylegend.core.language.pandas_api.pandas_api_frame_spec.RowsBetween object>, order_by=None, ascending=True)

Create a custom window specification on this column.

PyLegend extension — not present in pandas.

The frame_spec argument controls the ROWS BETWEEN or RANGE BETWEEN clause.

Parameters:

frame_spec (Optional[FrameSpec]) – A window-frame specification created via rows_between() or range_between().
order_by (Union[str, Sequence[str], None]) – Column(s) to order by within the window.
ascending (Union[bool, Sequence[bool]]) – Sort direction(s) for order_by columns.

Returns:

A window series on which aggregates can be called.

Return type:

WindowSeries

Raises:

TypeError – If frame_spec is not a RowsBetween or RangeBetween.

See also

expanding: Cumulative window on a column.
rolling: Fixed-size sliding window on a column.

Notes

Differences from pandas:

This method has no pandas equivalent. It is a pylegend extension for fine-grained control over the SQL ROWS BETWEEN / RANGE BETWEEN clause.

Examples

import pylegend
from pylegend.core.language.pandas_api.pandas_api_frame_spec import (
    RowsBetween,
)
frame = pylegend.samples.pandas_api.northwind_orders_frame()

spec = RowsBetween(-2, 0)
frame["Order Id"].window_frame_legend_ext(
    spec, order_by="Order Id"
).sum().to_pandas().head(5)

	Order Id
0	10248
1	20497
2	30747
3	30750
4	30753
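
The output is consistent with reading RowsBetween(-2, 0) as "from two rows before the current row up to the current row". Under that reading, which is an assumption drawn from the sample output, the sums can be reproduced in plain Python; rows_between_sum is a hypothetical helper, not a pylegend function.

```python
def rows_between_sum(values, start, end):
    """Sum over rows i + start .. i + end (inclusive), clipped to the data."""
    result = []
    for i in range(len(values)):
        lo = max(0, i + start)
        hi = min(len(values) - 1, i + end)
        result.append(sum(values[lo: hi + 1]))
    return result


# First five ids, taken from the sample output on this page.
order_ids = [10248, 10249, 10250, 10251, 10252]
print(rows_between_sum(order_ids, -2, 0))
```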
