Analytics is a class-method facade: you never instantiate it. Auth flows through the module-level vectorshift.api_key, exactly like Pipeline, Table, and KnowledgeBase. Every entry point has an async sibling whose name is prefixed with a (for example, aquery for query).
from vectorshift import Analytics
from vectorshift.analytics import (
    EventField, EventKind, EventStatus, GroupBy, Interval,
    AggregationField, AggregationOp, AggregationOperation, Percentile,
    Filter, FieldFilter, FilterGroup, LogicalOp, ExportFormat,
)

Entry point

query

Analytics.query(
    *,
    object: Any | list[Any] | None = None,
    object_type: str | None = None,
    object_ids: list[str] | None = None,
    kinds: list[EventKind] | None = None,
    limit: int = 50,
    offset: int = 0,
) -> Query
Build a Query scoped by object and (optionally) event kind. All arguments are keyword-only. aquery returns a Query whose terminals route to the a* async pairs.
Parameters
object
Any
default:"None"
A single resource instance (Pipeline, Chatbot, Agent, …). The SDK reads its .id and derives object_type from the class name. Scope is one object — passing more than one resolves to multiple ids and raises AnalyticsInvalidQuery; query each separately and merge client-side.
object_type
Optional[str]
default:"None"
Scope to all objects of a type — e.g. "pipeline", "chatbot", "form", "voicebot", "session", "portal".
object_ids
Optional[list[str]]
default:"None"
A single raw object id when you don’t have the typed instance, e.g. ["cb_123"]. Passing more than one id raises AnalyticsInvalidQuery — the engine matches one id per scope filter, so query each separately and merge results client-side.
kinds
Optional[list[EventKind]]
default:"None"
Restrict to one or more EventKind span types. Overrides any kinds derived from object=.
limit
int
default:"50"
Default page size for paginated terminals (events, traces, table). Override per-call.
offset
int
default:"0"
Default offset for paginated terminals.
Returns
returns
Query
A chainable Query. There is no automatic time window — bound the query with .where(EventField.EVENT_START_TIME …).
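Because a scope resolves to one object id at a time, covering several objects means running one query per id and merging the results client-side. The sketch below shows only the merge step, on plain dicts shaped like the EventPage type described under Types. The helper name merge_event_pages and the sample pages are hypothetical; each page would presumably come from a separate Analytics.query(object_ids=[one_id]) chain ending in .events().

```python
from typing import Any, Iterable


def merge_event_pages(pages: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Concatenate EventPage-shaped dicts, keeping the first copy of each span_id."""
    seen: set[str] = set()
    merged: list[dict[str, Any]] = []
    for page in pages:
        for event in page.get("events", []):
            span_id = event.get("span_id")
            if span_id is not None:
                if span_id in seen:
                    continue  # already collected from an earlier page
                seen.add(span_id)
            merged.append(event)
    return merged


# Hypothetical sample pages, one per queried object id.
page_a = {"events": [{"span_id": "s1", "object_id": "cb_123"},
                     {"span_id": "s2", "object_id": "cb_123"}]}
page_b = {"events": [{"span_id": "s2", "object_id": "cb_123"},
                     {"span_id": "s3", "object_id": "cb_456"}]}

merged = merge_event_pages([page_a, page_b])
print([event["span_id"] for event in merged])
```

The helper preserves input order rather than sorting by timestamp, since RFC3339 strings with different offsets do not sort chronologically as plain text.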

Single-object lookups

These read one object by id and sit outside the Query chain.

trace

Analytics.trace(
    trace_id: str,
    event_ids: list[str] | None = None,
    kinds: list[EventKind] | None = None,
) -> Trace
Fetch the full event tree for one trace. Optional event_ids / kinds narrow the returned set. Raises AnalyticsNotFound if the trace doesn’t exist.

event

Analytics.event(event_id: str, trace_id: str | None = None) -> Event
Fetch one Event by id. Pass trace_id to scope the lookup.

run

Analytics.run(
    object: Any | None = None,
    object_type: str | None = None,
    object_id: str | None = None,
    run_id: str | None = None,
) -> RunData
Fetch per-run interface detail (a chatbot conversation, form submission, voicebot transcript, session, etc.). run_id is required; pass either object= or both object_type= + object_id=. Returns a RunData with exactly one interface key populated.

Refining a query

Each refining method returns a fresh Query (the dataclass is immutable).

where

Query.where(
    *positional: Filter | FieldFilter | FilterGroup,
    logical_op: LogicalOp = LogicalOp.AND,
    **kwargs: Any,
) -> Query
Add filters in any of three intermixable forms:
  • Positional Filter / FieldFilter / FilterGroup instances.
  • Operator-overload expressions on EventField members (EventField.X > value), also passed positionally.
  • Equality kwargs resolved against the alias table: status="failure", interface_type="chatbot", trace_id=…, source=…, caller=…, interface_name=…, error_message=…, session_name=…, session_users=…, session_source=…, event_start_time=…, event_end_time=….
Positional args inside one .where() combine via logical_op; multiple .where() calls AND together at the top level. logical_op=LogicalOp.OR wraps that call’s positional args in an OR group.
Time scoping. The SDK lifts the first > / >= / < / <= comparison on EventField.EVENT_START_TIME into the request’s top-level start_time / end_time fields. Datetimes must be timezone-aware — a naive datetime raises AnalyticsInvalidQuery.
EventField.LATENCY, EventField.MODEL_ID, and EventField.NODE_TYPE are aggregation / group-by dimensions only — they have no filterable column on the wire and raise AnalyticsInvalidQuery if used in .where(...). != is not supported (the proto has no NEQ operator) and raises NotImplementedError.
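The time-scoping rule above depends on timezone-aware datetimes, so it can help to build the bounds with a small helper before handing them to .where(). This is a minimal sketch using only the standard library; is_aware and utc_window are hypothetical names, not part of the SDK.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_aware(dt: datetime) -> bool:
    """True when dt carries a usable UTC offset (the standard-library definition of 'aware')."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None


def utc_window(days: int, now: Optional[datetime] = None) -> tuple[datetime, datetime]:
    """Return (start, end) covering the last `days` days, both timezone-aware."""
    end = now if now is not None else datetime.now(timezone.utc)
    if not is_aware(end):
        raise ValueError("now must be timezone-aware")
    return end - timedelta(days=days), end


# A fixed 'now' keeps the example deterministic.
start, end = utc_window(7, now=datetime(2024, 5, 8, tzinfo=timezone.utc))
print(start.isoformat(), end.isoformat())
```

Going by the operator forms documented here, the two bounds would then be passed as .where(EventField.EVENT_START_TIME >= start, EventField.EVENT_START_TIME < end).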

group_by

Query.group_by(
    slice_: GroupBy | str | list[GroupBy | str],
    interval: Interval | str | None = None,
) -> _GroupedQuery
Bucket subsequent aggregations by one or more GroupBy dimensions. GroupBy.TIME requires an interval. Returns a grouped query that re-exposes every aggregation terminal:
  • Single dimension → the aggregation returns a flat dict[str, float] keyed by slice label.
  • Multiple dimensions → returns an AggregationResult with nested buckets.
A grouped query also passes through the non-aggregation terminals (events, count, table, traces, export, export_and_wait).
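For the multi-dimension case, a nested AggregationResult is often easier to consume once flattened. The sketch below works on plain dicts shaped like the AggregationResult and AggregationBucket types listed under Types. It assumes that only leaf buckets carry the value of interest, which the reference does not state; flatten_buckets and the sample data are hypothetical.

```python
from typing import Any


def flatten_buckets(
    buckets: list[dict[str, Any]], prefix: tuple[str, ...] = ()
) -> dict[tuple[str, ...], float]:
    """Map each leaf bucket to a tuple of labels, one label per group-by dimension."""
    flat: dict[tuple[str, ...], float] = {}
    for bucket in buckets:
        path = prefix + (bucket.get("label", ""),)
        nested = bucket.get("nested") or []
        if nested:
            # Assumption: for inner buckets, the interesting values live in `nested`.
            flat.update(flatten_buckets(nested, path))
        elif "value" in bucket:
            flat[path] = bucket["value"]
    return flat


# Hypothetical result of grouping by time and then by model id.
result = {
    "buckets": [
        {"label": "2024-05-01T00:00:00Z", "nested": [
            {"label": "model-a", "value": 3.0},
            {"label": "model-b", "value": 1.5},
        ]},
        {"label": "2024-05-02T00:00:00Z", "nested": [
            {"label": "model-a", "value": 2.0},
        ]},
    ]
}

flat = flatten_buckets(result["buckets"])
for path, value in flat.items():
    print(path, value)
```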

Aggregation terminals

Each has an a* async sibling. On an ungrouped query they return a scalar; on a group_by(...) they return a dict or AggregationResult.
Aggregations don’t paginate, so every aggregation terminal (sum, mean, min, max, count_distinct, percentile, raw_aggregate) requires a lower time bound — a .where(EventField.EVENT_START_TIME > …) (or >=) predicate. Without one the query would scan all history, so the SDK raises AnalyticsInvalidQuery. count() and events() are not aggregations and don’t require it.

sum / mean / min / max

Query.sum(field: AggregationField | str) -> float
Query.mean(field: AggregationField | str) -> float
Query.min(field: AggregationField | str) -> float
Query.max(field: AggregationField | str) -> float
Scalar aggregation over an AggregationField (or its string value). mean over a range spanning multiple calendar years isn’t supported on an ungrouped query — use group_by(GroupBy.TIME, Interval.YEAR) or narrow the range.
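One way to "narrow the range" for an ungrouped mean is to split a long window at calendar-year boundaries and query each piece separately. The helper below is a sketch under the assumption that year boundaries are evaluated in UTC, which the reference does not confirm; split_by_year is a hypothetical name.

```python
from datetime import datetime, timezone


def split_by_year(start: datetime, end: datetime) -> list[tuple[datetime, datetime]]:
    """Split [start, end) into windows that each stay inside one UTC calendar year."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    start, end = start.astimezone(timezone.utc), end.astimezone(timezone.utc)
    if start >= end:
        raise ValueError("start must be earlier than end")
    windows: list[tuple[datetime, datetime]] = []
    cursor = start
    while cursor < end:
        # First instant of the following year, used as an exclusive upper bound.
        next_year = datetime(cursor.year + 1, 1, 1, tzinfo=timezone.utc)
        upper = min(next_year, end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


windows = split_by_year(
    datetime(2023, 6, 1, tzinfo=timezone.utc),
    datetime(2025, 2, 1, tzinfo=timezone.utc),
)
for lower, upper in windows:
    print(lower.date(), "to", upper.date())
```

Per-window means cannot simply be averaged into an overall mean, because each window may contain a different number of events; a combined figure would need per-window counts as weights.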

count_distinct

Query.count_distinct(field: AggregationField | str) -> int

percentile

Query.percentile(field: AggregationField | str, n: int) -> float
Percentile aggregation. field must be latency or node_latency; n must be one of 50, 75, 95, 99. Anything else raises AnalyticsInvalidQuery.
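When percentile arguments come from user input or configuration, checking them locally gives a clearer error before any request is built. The sketch below mirrors the constraints stated above; check_percentile_args is a hypothetical helper that raises a plain ValueError, not the SDK's own exception.

```python
ALLOWED_PERCENTILES = frozenset({50, 75, 95, 99})
PERCENTILE_FIELDS = frozenset({"latency", "node_latency"})


def check_percentile_args(field: str, n: int) -> None:
    """Raise ValueError if (field, n) falls outside the documented percentile constraints."""
    if field not in PERCENTILE_FIELDS:
        raise ValueError(f"percentile field must be one of {sorted(PERCENTILE_FIELDS)}, got {field!r}")
    if n not in ALLOWED_PERCENTILES:
        raise ValueError(f"percentile must be one of {sorted(ALLOWED_PERCENTILES)}, got {n!r}")


check_percentile_args("latency", 95)  # passes silently

try:
    check_percentile_args("latency", 90)
except ValueError as exc:
    print("rejected:", exc)
```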

raw_aggregate

Query.raw_aggregate(
    operations: list[AggregationOperation],
    group_by: list[GroupBy | str] | None = None,
    interval: Interval | str | None = None,
) -> AggregationResult
Run multiple (field, op) operations — and optional group-by — in a single API round-trip. Each operation is an AggregationOperation. Aggregations can’t filter by data columns (status, error_message, latency, …); filter on scope fields, or pre-filter with .events() / .table() instead.

Count, events, and traces

count

Query.count() -> EventCount
Number of matching events. Returns EventCount.

events

Query.events(limit: int | None = None, offset: int | None = None) -> EventPage
Paginated event list. limit / offset default to the values set on Analytics.query(...); override per-call. Returns an EventPage.

traces

Query.traces(limit: int | None = None, offset: int | None = None) -> list[Trace]
Page through events and group them by trace_id, returning a list of Trace. For the complete tree of a single trace, use Analytics.trace(trace_id).

Data tables and export

table

Query.table(
    columns: list[EventField | str],
    include_interface_data: bool = False,
    return_record_count: bool = False,
    limit: int | None = None,
    offset: int | None = None,
) -> DataTableResult
Project specific EventField columns into a flat data table. Set include_interface_data=True to attach per-run interface payloads, return_record_count=True to populate total_records. Returns a DataTableResult.
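Each row in a DataTableResult wraps its cells in a "values" mapping, so a small unwrapping step makes the table easier to pass to other tools. The sketch below operates on a plain dict shaped like the DataTableResult type under Types; table_records and the sample rows are hypothetical, and the column descriptors are left empty because their keys are not shown in this reference.

```python
from typing import Any


def table_records(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Unwrap each row's "values" mapping into a plain record dict."""
    return [dict(row.get("values", {})) for row in result.get("rows", [])]


# Hypothetical table with two projected columns.
sample_table = {
    "columns": [],  # descriptor keys are not documented here, so none are invented
    "rows": [
        {"values": {"status": "success", "interface_type": "chatbot"}},
        {"values": {"status": "failure", "interface_type": "form"}},
    ],
    "total_records": 2,
}

records = table_records(sample_table)
failures = [record for record in records if record.get("status") == "failure"]
print(records)
print(len(failures), "failure(s)")
```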

export

Query.export(format: ExportFormat | str, file_name: str) -> ExportTask
Kick off a server-side export (CSV, XLSX, or JSON). Fire-and-forget — returns an ExportTask that may already be ready, or pending with a task_id to poll.

export_and_wait

Query.export_and_wait(
    format: ExportFormat | str,
    file_name: str,
    timeout: float = 120,
    poll_interval: float = 1.0,
) -> ExportTask
Blocking variant — polls GET /analytics/exports/{task_id} until the task is ready (returning the ExportTask with a download_url) or fails. Raises AnalyticsExportFailed if the task ends in failed, AnalyticsExportTimeout on timeout.
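The blocking behaviour described above amounts to a poll-until-done loop with a deadline. The following is an illustrative sketch of that pattern, not the SDK's implementation: wait_for_export and LocalExportFailed are hypothetical, fetch_task stands in for whatever call retrieves the task, and the task dicts follow the ExportTask fields under Types.

```python
import time
from typing import Any, Callable


class LocalExportFailed(RuntimeError):
    """Hypothetical stand-in for an export-failure error."""


def wait_for_export(
    fetch_task: Callable[[], dict[str, Any]],
    timeout: float = 120.0,
    poll_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll fetch_task() until task_status is 'ready', raising on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task()
        state = task.get("task_status")
        if state == "ready":
            return task
        if state == "failed":
            raise LocalExportFailed(task.get("error", "export failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"export not ready after {timeout} seconds")
        sleep(poll_interval)


# Fake task source: pending twice, then ready.
_states = iter([
    {"task_id": "t1", "task_status": "pending"},
    {"task_id": "t1", "task_status": "pending"},
    {"task_id": "t1", "task_status": "ready",
     "download_url": "https://example.invalid/export.csv"},
])

task = wait_for_export(lambda: next(_states), timeout=5, poll_interval=0,
                       sleep=lambda seconds: None)
print(task["task_status"], task["download_url"])
```

Using time.monotonic for the deadline keeps the timeout immune to wall-clock adjustments during the wait.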

Filter helpers

Use these when operator overloads and equality kwargs aren’t enough (OR groups, explicit construction). All are frozen dataclasses.

Filter

Top-level scope filter — for scope-category EventField members (OBJECT_ID, TRACE_ID, PROJECT_ID, EXECUTION_ID, …).
Filter.eq(column: EventField, value: Any) -> Filter
Filter.in_(column: EventField, values: list[Any]) -> Filter
Calling Filter.eq on a data-column field raises TypeError with a redirect to FieldFilter.

FieldFilter

Data-column filter — for data-category fields (STATUS, ERROR_MESSAGE, EVENT_START_TIME, INTERFACE_TYPE, …).
FieldFilter.eq(column, value)        FieldFilter.gt(column, value)
FieldFilter.gte(column, value)       FieldFilter.lt(column, value)
FieldFilter.lte(column, value)       FieldFilter.matches(column, pattern)
FieldFilter.includes(column, values)

FilterGroup

FilterGroup(
    filters: list[Filter | FieldFilter | FilterGroup],
    op: LogicalOp = LogicalOp.AND,
)
A group of filters joined by a LogicalOp. Nested FilterGroup trees are not yet representable on the wire in v1 — a flat OR group is the supported case.

EventField operators

Every EventField supports operator overloading, dispatched to the right Filter / FieldFilter automatically:
Expression                         Builds
field == value                     equality
field > / >= / < / <= value        comparison (data columns)
field.in_([…])                     IN (scope) / INCLUDES (data)
field.matches("pattern")           regex / substring match
field.includes([…])                compound INCLUDES
field != value                     unsupported; raises NotImplementedError
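To make the dispatch idea concrete, here is a toy model of how comparison operators on a field object can build filter values instead of booleans. It is a self-contained illustration of the Python mechanism only; ToyField and ToyFilter are hypothetical and say nothing about how EventField is implemented internally.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToyFilter:
    """A (column, operator, value) triple produced by an operator expression."""
    column: str
    op: str
    value: Any


class ToyField:
    """Toy column object whose comparison operators return ToyFilter instances."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: Any) -> ToyFilter:  # type: ignore[override]
        return ToyFilter(self.name, "EQ", other)

    def __gt__(self, other: Any) -> ToyFilter:
        return ToyFilter(self.name, "GT", other)

    def __ge__(self, other: Any) -> ToyFilter:
        return ToyFilter(self.name, "GTE", other)

    def __lt__(self, other: Any) -> ToyFilter:
        return ToyFilter(self.name, "LT", other)

    def __le__(self, other: Any) -> ToyFilter:
        return ToyFilter(self.name, "LTE", other)

    def __ne__(self, other: Any) -> ToyFilter:  # type: ignore[override]
        # Mirrors the documented behaviour: there is no NEQ operator to map to.
        raise NotImplementedError("!= is not supported")


STATUS = ToyField("status")
START = ToyField("event_start_time")

print(STATUS == "failure")
print(START >= "2024-05-01T00:00:00Z")
```

Because == returns a filter object rather than a bool, such fields should not be used in ordinary truth tests or as dict keys.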

Types

Terminals return TypedDicts — field names match the JSON the backend returns. Access them as dicts (event["status"], page.get("events", [])).
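Since most keys are optional, .get with a default is the safer access pattern when summarising results. A minimal sketch on an EventPage-shaped dict follows; status_counts and the sample page are hypothetical.

```python
from collections import Counter
from typing import Any


def status_counts(page: dict[str, Any]) -> Counter:
    """Tally events by status, tolerating events that omit the key."""
    return Counter(event.get("status", "unknown") for event in page.get("events", []))


# Hypothetical page; the last event has no status key at all.
sample_page = {
    "events": [
        {"span_id": "s1", "status": "success"},
        {"span_id": "s2", "status": "failure"},
        {"span_id": "s3", "status": "success"},
        {"span_id": "s4"},
    ]
}

counts = status_counts(sample_page)
print(dict(counts))
```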

Event

One annotated event. Common keys (all optional):
span_id
str
parent_span_id
str
trace_id
str
span_kind
str
Wire value, e.g. "pipeline.run" — see EventKind.
span_start_time
str
RFC3339 timestamp.
span_end_time
str
RFC3339 timestamp.
status
str
"success" / "failure" / "in_progress".
object_id
str
error_message
str
interface_type
str
interface_name
str
model_id
str
node_type
str
latency
float
node_latency
float
Also present when populated: object_info, span_attributes, interface_data, source, caller, session_name, session_users, session_source, session_collection, session_ai_source, project_id, execution_id.

EventPage

events
list[Event]
required
pagination_meta
PaginationMeta
default:"—"
results_per_page, page_number, offset, total_records (all optional).

EventCount

count
int
required

AggregationResult

buckets
list[AggregationBucket]
required

AggregationBucket

label
str
Slice value — ISO timestamp, model id, etc.
value
float
The aggregate for this slice.
nested
list[AggregationBucket]
Populated for multi-dimensional group-by.

DataTableResult

rows
list[DataTableRow]
required
Each row is { "values": { <column_label>: <cell> } }.
columns
list[dict[str, str]]
required
Column descriptors.
total_records
int
default:"—"
Present when return_record_count=True.

Trace

trace_id
str
required
events
list[Event]
required

RunData

Per-run interface detail. Exactly one key is populated, matching the resource type:
chatbot
dict[str, Any]
Conversation detail.
form
dict[str, Any]
voicebot
dict[str, Any]
session
dict[str, Any]

ExportTask

task_id
str
status
str
Response status — "success" / "failed".
task_status
str
Task lifecycle — "pending" / "ready" / "failed".
download_url
str
format
str
file_name
str
error
str

AggregationOperation

One (field, op) pair for raw_aggregate(operations=[…]).
field
AggregationField | str
required
op
AggregationOp | Percentile
required
An AggregationOp member or a Percentile(n) value object.

Percentile

n
int
required
One of 50, 75, 95, 99. Any other value raises ValueError.

Enums

AggregationField

runs · tokens · model_costs · latency · ai_credits · errors · node_latency · feedback · users · start_time · end_time · create · add_message

AggregationOp

SUM · MEAN · MIN · MAX · COUNT_DISTINCT (Percentile carries a value — use the Percentile(n) value object, not this enum.)

GroupBy

TIME · MODEL_ID · PIPELINE_ID · FEEDBACK_TYPE · SEARCH_TYPE · NODE_TYPE

Interval

HOUR · DAY · WEEK · MONTH · YEAR

EventField

The unified column enum. Scope fields (for Filter): OBJECT_ID · PARENT_EVENT_ID · OBJECT_INFO · EVENT_ID · TRACE_ID · SESSION_SOURCE · SESSION_COLLECTION · PROJECT_ID · EXECUTION_ID. Data fields (for FieldFilter): SOURCE · EVENT_START_TIME · EVENT_END_TIME · STATUS · EVENT_ATTRIBUTES · CALLER · INTERFACE_DATA · INTERFACE_TYPE · INTERFACE_NAME · ERROR_MESSAGE · SESSION_NAME · SESSION_LAST_MESSAGE_TIME · SESSION_USERS · SESSION_AI_SOURCE. Aggregation / group-by only (not filterable): LATENCY · MODEL_ID · NODE_TYPE.

EventKind

Span types, flattened across all eight entities. Selected members:
  • Pipeline: PIPELINE_ALL · PIPELINE_RUN · PIPELINE_BULK_RUN
  • Chatbot: CHATBOT_ALL · CHATBOT_RUN · CHATBOT_FILE_UPLOAD · CHATBOT_TERMINATE · CHATBOT_MESSAGE_LIKE · CHATBOT_FETCH_KB_DATA
  • Search: SEARCH_ALL · SEARCH_RUN · SEARCH_CHAT_WITH_DOCS · SEARCH_DOC_QNA · SEARCH_MESSAGE_FEEDBACK · SEARCH_TERMINATE · SEARCH_FETCH_KB_DATA
  • Form: FORM_ALL · FORM_RUN · FORM_CHAT
  • Voicebot: VOICEBOT_ALL · VOICEBOT_RUN
  • Bulk job: BULKJOB_RUN · BULKJOB_FETCH_KB_DATA · BULKJOB_IO_NODES
  • Session: SESSION_ALL · SESSION_CREATE · SESSION_ADD_MESSAGE
  • Portal: PORTAL_CREATE

EventStatus

SUCCESS · FAILURE · IN_PROGRESS

ExportFormat

JSON · CSV · XLSX

FieldOp

Scope-filter operators: EQ · IN

FieldFilterOp

Data-column operators: EQ · LT · LTE · GT · GTE · MATCHES · INCLUDES

LogicalOp

AND · OR

Errors

The analytics module raises a small set of typed errors. All subclass AnalyticsError, which subclasses VectorshiftError.
  • AnalyticsNotFound — a trace, event, or export-task lookup returned no result.
  • AnalyticsInvalidQuery — the query was malformed before dispatch (naive datetime, unfilterable field, bad enum, missing kwarg, unsupported aggregation range).
  • AnalyticsExportFailed — an export task ended in failed; carries .task_id and .export_error.
  • AnalyticsExportTimeout — polling for an export exceeded the timeout; also subclasses TimeoutError.
See the top-level Errors page for the broader hierarchy.
