analyzer Module

The analyzer module is the core of the CreativeDynamics library. It implements time-series analysis based on path signatures and provides functions for data acquisition, preprocessing, signature-based change-point detection, trend analysis across the detected segments, visualisation, and structured report generation.

creativedynamics.core.analyzer.analyze_all_items(item_time_series: Dict[str, DataFrame], metrics: Tuple[str, ...] = ('ctr', 'cpc'), window_size: int = 7, threshold: float = 1.5, signature_depth: int = 4, method: str = 'auto', spend_column: str = 'amount_spent_gbp', clicks_column: str = 'link_clicks', plot_output_dir: str | None = None) → Dict[str, Dict[str, Dict[str, Any]]]

Runs the core analysis pipeline on a dictionary of time-series DataFrames.

This is the main analysis function of the CreativeDynamics library. It implements a four-phase pipeline that uses rough path theory and signature methods to detect creative fatigue patterns in advertising performance data.
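As a rough illustration of what a path signature is, and of why signature_depth affects cost, the sketch below computes the first two signature levels of a piecewise-linear path using Chen's identity and counts the number of signature terms up to a given depth. It is plain Python written for this page, not the library's implementation (the library delegates signature calculations to roughpy); the names signature_depth2 and signature_size are hypothetical.

```python
def signature_depth2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    `path` is a sequence of points, each a sequence of d floats.
    Illustrative only: not the CreativeDynamics/roughpy implementation.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one linear segment:
        # new level 2 = old level 2 + (old level 1) x delta + delta x delta / 2
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        # new level 1 = old level 1 + delta (total increment so far)
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


def signature_size(dimension, depth):
    """Number of signature terms in levels 1..depth (level 0 excluded)."""
    return sum(dimension ** k for k in range(1, depth + 1))


# An L-shaped path in two dimensions, for example (time, metric).
demo_level1, demo_level2 = signature_depth2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
# Half the antisymmetric part of level 2 is the signed (Levy) area of the path.
demo_area = 0.5 * (demo_level2[0][1] - demo_level2[1][0])
```

The term count grows geometrically with depth (d + d² + ... + d^depth for a d-dimensional path), which is why higher signature_depth values cost more to compute.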

The analysis pipeline consists of:

  1. Phase 1: Change point detection using path signatures and trend classification

  2. Phase 2: Spend efficiency calculation (CPC overspending analysis)

  3. Phase 3: Engagement performance calculation (CTR decline analysis)

  4. Phase 4: Results finalisation, metadata addition, and plot generation
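The arithmetic behind Phases 2 and 3 can be pictured with a minimal sketch. This page does not state the library's exact formulas or how the benchmark period is chosen, so the code below assumes the simplest reading: overspend is the spend above what the same clicks would have cost at the benchmark CPC, and lost clicks are the shortfall against the clicks expected at the benchmark CTR. The function names and sample figures are hypothetical.

```python
def cpc_overspend(daily_spend_clicks, benchmark_cpc):
    """Sum of daily spend above (benchmark_cpc * clicks).

    Assumed formula for illustration; not taken from the library source.
    `daily_spend_clicks` is an iterable of (spend_gbp, clicks) pairs.
    """
    total = 0.0
    for spend, clicks in daily_spend_clicks:
        total += max(0.0, spend - benchmark_cpc * clicks)
    return total


def lost_clicks(daily_impressions_clicks, benchmark_ctr):
    """Clicks forgone relative to a benchmark CTR (expressed as a fraction).

    Assumed formula for illustration; not taken from the library source.
    `daily_impressions_clicks` is an iterable of (impressions, clicks) pairs.
    """
    total = 0.0
    for impressions, clicks in daily_impressions_clicks:
        total += max(0.0, benchmark_ctr * impressions - clicks)
    return int(round(total))


# Made-up sample days, purely to exercise the functions.
sample_overspend = cpc_overspend([(10.0, 20), (15.0, 20), (30.0, 30)], benchmark_cpc=0.5)
sample_lost = lost_clicks([(1000, 20), (1000, 12), (2000, 30)], benchmark_ctr=0.02)
```

Keeping the two quantities in separate functions mirrors the way the results report a financial metric in GBP and a performance metric in clicks, which are not meant to be added together.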

Parameters:
  • item_time_series – A dictionary mapping item IDs (creative identifiers) to their corresponding time-series DataFrames. Each DataFrame must be sorted by date and contain a ‘day’ column plus the relevant metric columns (‘ctr’, ‘cpc’, ‘impressions’, etc.). Typically prepared by prepare_item_time_series.

  • metrics – A tuple of metric names to analyze. Supported metrics are ‘ctr’ (click-through rate) and ‘cpc’ (cost-per-click). Each metric undergoes separate change point detection and wastage analysis.

  • window_size – The sliding window size (in days) for signature analysis. Controls the temporal resolution of change point detection. Larger values provide more stable detection but may miss short-term changes.

  • threshold – The sensitivity threshold for detecting change points in the signature distance metric. Higher values reduce sensitivity, detecting only major pattern changes. Lower values increase sensitivity.

  • signature_depth – The truncation depth for path signature calculations. Controls the complexity of patterns captured by the signature method. Higher depths capture more complex patterns but increase computational cost.

  • method – Legacy parameter maintained for backward compatibility. The roughpy library is always used for signature calculations regardless of this value.

  • spend_column – The column name containing advertising spend data in GBP, used for spend efficiency calculations. Must exist in the DataFrames.

  • clicks_column – The column name containing click count data, used for engagement performance calculations. Must exist in the DataFrames.

  • plot_output_dir – Optional directory path where analysis plots will be saved. If None, plots are saved to a default ‘output/plots’ directory structure. Individual item plots and combined wastage analyses are generated.
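To make the interaction between window_size and threshold concrete, here is a simplified stand-in for the detection step. It replaces the signature distance with the absolute difference between the means of adjacent windows, and it assumes the cutoff is the mean distance plus threshold standard deviations; this page does not say how the library derives threshold_value from threshold, so treat both choices as assumptions. All names are hypothetical.

```python
from statistics import mean, pstdev


def adjacent_window_distances(values, window_size):
    """Distance between the window ending at t and the window starting at t.

    Uses a difference of means as a stand-in for the signature distance.
    Returns a list of (t, distance) pairs.
    """
    out = []
    for t in range(window_size, len(values) - window_size + 1):
        left = values[t - window_size:t]
        right = values[t:t + window_size]
        out.append((t, abs(mean(right) - mean(left))))
    return out


def flag_change_points(distances, threshold):
    """Indices whose distance exceeds mean + threshold * std (assumed rule)."""
    dist_values = [d for _, d in distances]
    cutoff = mean(dist_values) + threshold * pstdev(dist_values)
    return [t for t, d in distances if d > cutoff], cutoff


# Synthetic series with a single level shift at index 20.
series = [1.0] * 20 + [5.0] * 20
distances = adjacent_window_distances(series, window_size=7)
flagged, cutoff = flag_change_points(distances, threshold=1.5)
```

In this toy setup a larger window_size averages over more days, so brief fluctuations contribute less to the distance, while a larger threshold raises the cutoff and leaves only the most pronounced shifts flagged.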

Returns:

A nested dictionary containing analysis results with the structure {item_id: {metric: {analysis_details}}}, where analysis_details contains:

  • change_points: List of time indices where significant pattern changes occur

  • segment_trends: List of (start_date, end_date, trend_classification) tuples

  • overall_trend: Dominant trend classification across the entire time series

  • distances: Signature distance values used for change point detection

  • threshold_value: Actual threshold used for change point detection

  • pattern_change_detected: Boolean indicating if declining patterns were found

  • spend_efficiency (for CPC analysis): Dictionary containing financial metrics:

      - actual_wastage_gbp: Excess spending vs benchmark CPC
      - benchmark_cpc: Optimal CPC benchmark value
      - benchmark_period_start/end: Date range of benchmark period
      - overspend_periods: List of periods with overspending
      - calculation_status: Status of calculation
      - metric_type: ‘financial’
      - total_spend: Total advertising spend for the item

  • engagement_performance (for CTR analysis): Dictionary containing performance metrics:

      - engagement_lost_clicks: PRIMARY - Foregone clicks vs benchmark (integer)
      - ctr_decline_percentage_points: CTR decline in percentage points
      - ctr_benchmark: Optimal CTR benchmark value
      - ctr_actual_average: Actual average CTR in period
      - benchmark_period_start/end: Date range of benchmark period
      - reference_value_gbp: OPTIONAL - GBP translation for comparison only
      - calculation_status: Status of calculation
      - metric_type: ‘performance’
      - note: Explanation that reference_value_gbp is for comparison only

  • analysis_metadata: Dictionary containing cross-metric analysis:

      - double_counting_risk: Risk level (‘Low’, ‘Medium’, ‘High’, ‘Unknown’)
      - warning: Reminder that metrics should not be combined

  • plot_file: File path of generated analysis plot

Return type:

Dict[str, Dict[str, Dict[str, Any]]]
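A small helper can flatten the nested result into one row per item and metric. The sketch below relies only on the keys documented above and runs against a hand-written placeholder dictionary rather than real output; the trend label and numbers in it are invented for illustration. Because the Example on this page reads analysis_metadata at the item level, the helper skips that key when iterating over metrics. The name summarise is hypothetical.

```python
def summarise(results):
    """Return (item_id, metric, overall_trend, n_change_points, headline) rows."""
    rows = []
    for item_id, per_metric in results.items():
        for metric, details in per_metric.items():
            if metric == "analysis_metadata":
                continue  # cross-metric metadata, not a metric result
            if metric == "cpc":
                headline = details.get("spend_efficiency", {}).get("actual_wastage_gbp")
            elif metric == "ctr":
                headline = details.get("engagement_performance", {}).get("engagement_lost_clicks")
            else:
                headline = None
            rows.append((
                item_id,
                metric,
                details.get("overall_trend"),
                len(details.get("change_points", [])),
                headline,
            ))
    return rows


# Placeholder data shaped like the documented structure (values are invented).
mock_results = {
    "CREATIVE_001": {
        "cpc": {
            "change_points": [12, 30],
            "overall_trend": "declining",
            "spend_efficiency": {"actual_wastage_gbp": 42.5},
        },
        "ctr": {
            "change_points": [14],
            "overall_trend": "declining",
            "engagement_performance": {"engagement_lost_clicks": 120},
        },
        "analysis_metadata": {"double_counting_risk": "Medium"},
    }
}

summary_rows = summarise(mock_results)
```

The headline column keeps GBP wastage and lost clicks in separate rows, in line with the warning that the two metrics should not be combined.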

Raises:
  • CreativeDynamicsError – Base exception for library-specific errors during analysis.

  • DataValidationError – When input data fails validation requirements.

  • SignatureCalculationError – When path signature calculations encounter issues.

  • ProcessingError – When analysis pipeline encounters processing failures.

Example:

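# Assumption: load_data is exported by the same loader module as prepare_item_time_series
from creativedynamics.data.loader import load_data, prepare_item_time_series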
from creativedynamics.core.analyzer import analyze_all_items

# Load and prepare time series data
data = load_data('campaign_data.csv')
items_dict, excluded = prepare_item_time_series(data)

# Run detailed analysis pipeline
results = analyze_all_items(
    items_dict,
    metrics=('ctr', 'cpc'),
    window_size=7,
    threshold=1.5,
    plot_output_dir='./analysis_plots'
)

# Access results for a specific item
item_results = results['CREATIVE_001']

# Financial metric (CPC)
actual_overspend = item_results['cpc']['spend_efficiency']['actual_wastage_gbp']

# Performance metric (CTR) - clicks primary, GBP secondary
engagement_gap = item_results['ctr']['engagement_performance']['engagement_lost_clicks']
reference_value = item_results['ctr']['engagement_performance']['reference_value_gbp']

# Overall trend and metadata
overall_trend = item_results['cpc']['overall_trend']
risk_level = item_results['analysis_metadata']['double_counting_risk']

Note

This function implements the core CreativeDynamics methodology, which combines rough path theory with practical advertising analytics. Signature-based change point detection is intended to pick up subtle creative fatigue patterns that traditional statistical methods might miss. The multi-phase approach keeps the financial (CPC) and performance (CTR) calculations separate and accounts for their interdependence through the double-counting risk assessment.

Run time scales approximately as O(n*m*w), where n is the number of items, m the number of metrics, and w the window size. For large datasets, consider processing in batches or increasing window_size to reduce computational overhead.
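One way to process a large dictionary in batches is sketched below. The helper batched_items is hypothetical and uses only the standard library; the commented-out loop shows how it might be combined with analyze_all_items, assuming the per-item results are independent so that merging the returned dictionaries is safe.

```python
from itertools import islice


def batched_items(item_time_series, batch_size):
    """Yield successive sub-dictionaries with at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(item_time_series.items())
    while True:
        batch = dict(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in mapping; in practice the values would be the prepared DataFrames.
demo_items = {f"CREATIVE_{i:03d}": None for i in range(5)}
demo_batch_sizes = [len(batch) for batch in batched_items(demo_items, 2)]

# Possible usage with the library (not executed here):
# results = {}
# for batch in batched_items(items_dict, 50):
#     results.update(analyze_all_items(batch, metrics=('ctr', 'cpc')))
```

Batching bounds how many items are held in a single call, which helps mainly with memory use and with plot generation; it does not change the total amount of computation.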