analyzer Module
The analyzer module is the central analytical framework of the CreativeDynamics library. It implements time-series analysis based on path signatures and provides functions for data acquisition, preprocessing, signature-based change-point detection, trend analysis across the detected segments, visualisation, and structured report generation.
- creativedynamics.core.analyzer.analyze_all_items(item_time_series: Dict[str, DataFrame], metrics: Tuple[str, ...] = ('ctr', 'cpc'), window_size: int = 7, threshold: float = 1.5, signature_depth: int = 4, method: str = 'auto', spend_column: str = 'amount_spent_gbp', clicks_column: str = 'link_clicks', plot_output_dir: str | None = None) -> Dict[str, Dict[str, Dict[str, Any]]]
Runs the core analysis pipeline on a dictionary of time-series DataFrames.
This is the main analysis function of the CreativeDynamics library, implementing a detailed four-phase analysis pipeline using rough path theory and signature methods to detect creative fatigue patterns in advertising performance data.
The analysis pipeline consists of:
1. Phase 1: Change point detection using path signatures and trend classification
2. Phase 2: Spend efficiency calculation (CPC overspending analysis)
3. Phase 3: Engagement performance calculation (CTR decline analysis)
4. Phase 4: Results finalisation, metadata addition, and plot generation
- Parameters:
item_time_series – A dictionary mapping item IDs (creative identifiers) to their corresponding time-series DataFrames. Each DataFrame must be sorted by date and contain a ‘day’ column plus the relevant metric columns (‘ctr’, ‘cpc’, ‘impressions’, etc.). Typically prepared by prepare_item_time_series.
metrics – A tuple of metric names to analyze. Supported metrics are ‘ctr’ (click-through rate) and ‘cpc’ (cost-per-click). Each metric undergoes separate change point detection and wastage analysis.
window_size – The sliding window size (in days) for signature analysis. Controls the temporal resolution of change point detection. Larger values provide more stable detection but may miss short-term changes.
threshold – The sensitivity threshold for detecting change points in the signature distance metric. Higher values reduce sensitivity, detecting only major pattern changes. Lower values increase sensitivity.
signature_depth – The truncation depth for path signature calculations. Controls the complexity of patterns captured by the signature method. Higher depths capture more complex patterns but increase computational cost.
method – Legacy parameter maintained for backward compatibility. The roughpy library is always used for signature calculations regardless of this value.
spend_column – The column name containing advertising spend data in GBP, used for spend efficiency calculations. Must exist in the DataFrames.
clicks_column – The column name containing click count data, used for engagement performance calculations. Must exist in the DataFrames.
plot_output_dir – Optional directory path where analysis plots will be saved. If None, plots are saved to a default ‘output/plots’ directory structure. Individual item plots and combined wastage analyses are generated.
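To make the roles of window_size and threshold concrete, here is a minimal pure-Python sketch of sliding-window change point detection with path signatures. It is not the library's implementation: it truncates the signature at depth 2 instead of using roughpy at signature_depth=4, and the cutoff rule (mean plus threshold times the standard deviation of the distances) and the suppression of repeated flags within one window are assumptions made for illustration. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path given as a list of points."""
    dim = len(path[0])
    level1 = [0.0] * dim
    level2 = [[0.0] * dim for _ in range(dim)]
    for prev, curr in zip(path, path[1:]):
        step = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one linear segment to the path so far.
        for i in range(dim):
            for j in range(dim):
                level2[i][j] += level1[i] * step[j] + 0.5 * step[i] * step[j]
        for i in range(dim):
            level1[i] += step[i]
    return level1 + [v for row in level2 for v in row]


def detect_change_points(values, window_size=7, threshold=1.5):
    """Flag indices where consecutive window signatures differ unusually much."""
    signatures = []
    for start in range(len(values) - window_size + 1):
        window = values[start:start + window_size]
        # Time-augmented path: (day offset within the window, metric value).
        path = [(float(k), float(v)) for k, v in enumerate(window)]
        signatures.append(signature_depth2(path))
    distances = [
        sqrt(sum((a - b) ** 2 for a, b in zip(s0, s1)))
        for s0, s1 in zip(signatures, signatures[1:])
    ]
    # Assumption: the cutoff is mean + threshold * standard deviation.
    cutoff = mean(distances) + threshold * stdev(distances)
    change_points = []
    for i, dist in enumerate(distances):
        idx = i + window_size  # index of the newest point entering the window
        far_enough = not change_points or idx - change_points[-1] >= window_size
        if dist > cutoff and far_enough:
            change_points.append(idx)
    return {
        "change_points": change_points,
        "distances": distances,
        "threshold_value": cutoff,
    }


# Synthetic series with one level shift at index 20.
series = [1.0] * 20 + [5.0] * 20
result = detect_change_points(series, window_size=7, threshold=1.5)
print(result["change_points"])  # [20]
```

In this sketch a larger window_size smooths the distances but delays and blurs short-lived changes, and a larger threshold raises the cutoff so that only the largest signature jumps are flagged, which matches the parameter descriptions above.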
- Returns:
A nested dictionary of analysis results with the structure {item_id: {metric: {analysis_details}}}, where analysis_details contains:
change_points: List of time indices where significant pattern changes occur
segment_trends: List of (start_date, end_date, trend_classification) tuples
overall_trend: Dominant trend classification across the entire time series
distances: Signature distance values used for change point detection
threshold_value: Actual threshold used for change point detection
pattern_change_detected: Boolean indicating if declining patterns were found
spend_efficiency (for CPC analysis): Dictionary containing financial metrics:
- actual_wastage_gbp: Excess spending vs benchmark CPC
- benchmark_cpc: Optimal CPC benchmark value
- benchmark_period_start/end: Date range of benchmark period
- overspend_periods: List of periods with overspending
- calculation_status: Status of calculation
- metric_type: ‘financial’
- total_spend: Total advertising spend for the item
engagement_performance (for CTR analysis): Dictionary containing performance metrics:
- engagement_lost_clicks: PRIMARY - Foregone clicks vs benchmark (integer)
- ctr_decline_percentage_points: CTR decline in percentage points
- ctr_benchmark: Optimal CTR benchmark value
- ctr_actual_average: Actual average CTR in period
- benchmark_period_start/end: Date range of benchmark period
- reference_value_gbp: OPTIONAL - GBP translation for comparison only
- calculation_status: Status of calculation
- metric_type: ‘performance’
- note: Explanation that reference_value_gbp is for comparison only
analysis_metadata: Dictionary containing cross-metric analysis:
- double_counting_risk: Risk level (‘Low’, ‘Medium’, ‘High’, ‘Unknown’)
- warning: Reminder that metrics should not be combined
plot_file: File path of generated analysis plot
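The two wastage figures can be illustrated with a small worked example. The formulas below are one plausible reading of "excess spending vs benchmark CPC" and "foregone clicks vs benchmark", not the library's confirmed calculation. The helper names are hypothetical, and the benchmark values are passed in directly because the way the library selects its benchmark period is not described here.

```python
def spend_wastage_gbp(daily_spend_clicks, benchmark_cpc):
    """Spend above what the same clicks would have cost at the benchmark CPC."""
    total = 0.0
    for spend, clicks in daily_spend_clicks:
        # Assumption: only days that cost more than the benchmark contribute.
        total += max(0.0, spend - clicks * benchmark_cpc)
    return total


def lost_clicks(daily_impressions_clicks, ctr_benchmark):
    """Clicks foregone relative to the benchmark CTR, as an integer."""
    total = 0.0
    for impressions, clicks in daily_impressions_clicks:
        total += max(0.0, impressions * ctr_benchmark - clicks)
    return round(total)


def ctr_decline_pp(daily_impressions_clicks, ctr_benchmark):
    """Benchmark CTR minus the actual average CTR, in percentage points."""
    impressions = sum(i for i, _ in daily_impressions_clicks)
    clicks = sum(c for _, c in daily_impressions_clicks)
    return (ctr_benchmark - clicks / impressions) * 100


# Hypothetical three-day period for one creative.
cpc_days = [(10.0, 20), (12.0, 20), (15.0, 20)]   # (spend in GBP, clicks)
ctr_days = [(1000, 20), (1000, 15), (1000, 10)]   # (impressions, clicks)

wastage = spend_wastage_gbp(cpc_days, benchmark_cpc=0.5)
foregone = lost_clicks(ctr_days, ctr_benchmark=0.02)
decline = ctr_decline_pp(ctr_days, ctr_benchmark=0.02)
print(wastage, foregone, round(decline, 2))
```

The financial figure is expressed in GBP and the engagement figure in clicks. Keeping them in separate units is consistent with the warning that the two metrics should not be combined.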
- Return type:
Dict[str, Dict[str, Dict[str, Any]]]
- Raises:
CreativeDynamicsError – Base exception for library-specific errors during analysis.
DataValidationError – When input data fails validation requirements.
SignatureCalculationError – When path signature calculations encounter issues.
ProcessingError – When analysis pipeline encounters processing failures.
Example:
    # Assumption: load_data lives in the same loader module; the original
    # example calls it without importing it.
    from creativedynamics.data.loader import load_data, prepare_item_time_series
    from creativedynamics.core.analyzer import analyze_all_items

    # Load and prepare time series data
    data = load_data('campaign_data.csv')
    items_dict, excluded = prepare_item_time_series(data)

    # Run the analysis pipeline
    results = analyze_all_items(
        items_dict,
        metrics=('ctr', 'cpc'),
        window_size=7,
        threshold=1.5,
        plot_output_dir='./analysis_plots'
    )

    # Access results for a specific item
    item_results = results['CREATIVE_001']

    # Financial metric (CPC)
    actual_overspend = item_results['cpc']['spend_efficiency']['actual_wastage_gbp']

    # Performance metric (CTR) - clicks primary, GBP secondary
    engagement_gap = item_results['ctr']['engagement_performance']['engagement_lost_clicks']
    reference_value = item_results['ctr']['engagement_performance']['reference_value_gbp']

    # Overall trend and metadata
    overall_trend = item_results['cpc']['overall_trend']
    risk_level = item_results['analysis_metadata']['double_counting_risk']
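The usage example reads fields for a single item. The following sketch shows one way to collect the same fields for every item into flat rows while keeping the financial and engagement figures in separate columns. The summarise helper and the mock results dictionary are hypothetical. The mock places analysis_metadata at item level, as the usage example does, and uses .get so that missing keys yield None rather than raising.

```python
def summarise(results):
    """Flatten per-item results into rows without combining the two metrics."""
    rows = []
    for item_id, item in results.items():
        spend = item.get("cpc", {}).get("spend_efficiency", {})
        engagement = item.get("ctr", {}).get("engagement_performance", {})
        metadata = item.get("analysis_metadata", {})
        rows.append({
            "item_id": item_id,
            "wastage_gbp": spend.get("actual_wastage_gbp"),
            "lost_clicks": engagement.get("engagement_lost_clicks"),
            "double_counting_risk": metadata.get("double_counting_risk", "Unknown"),
        })
    return rows


# Mock data shaped like the documented return value (values are invented).
mock_results = {
    "CREATIVE_001": {
        "cpc": {"spend_efficiency": {"actual_wastage_gbp": 7.0}},
        "ctr": {"engagement_performance": {"engagement_lost_clicks": 15}},
        "analysis_metadata": {"double_counting_risk": "Medium"},
    },
    "CREATIVE_002": {
        "cpc": {"spend_efficiency": {"actual_wastage_gbp": 0.0}},
    },
}

rows = summarise(mock_results)
for row in rows:
    print(row)
```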
Note
This function implements the core CreativeDynamics methodology combining rough path theory with practical advertising analytics. The signature-based change point detection is particularly effective at identifying subtle creative fatigue patterns that traditional statistical methods might miss. The multi-phase approach ensures robust wastage calculations while accounting for metric interdependencies through double-counting risk assessment.
Performance scales approximately as O(n*m*w), where n is the number of items, m is the number of metrics, and w is the window size. For large datasets, consider processing in batches or increasing window_size to reduce computational overhead.
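Batch processing can be sketched with a small wrapper that splits the item dictionary into chunks and merges the per-batch results. The helper names are hypothetical, and the analysis function is passed in as a callable so that the sketch runs without the library installed. In practice it would be analyze_all_items, on the assumption that it can be called independently on subsets of items.

```python
from itertools import islice


def iter_batches(items, batch_size):
    """Yield successive sub-dictionaries with at most batch_size items each."""
    iterator = iter(items.items())
    while True:
        batch = dict(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def analyze_in_batches(items, analyze, batch_size=50, **kwargs):
    """Run analyze on each batch and merge the per-item results."""
    merged = {}
    for batch in iter_batches(items, batch_size):
        merged.update(analyze(batch, **kwargs))
    return merged


# Stand-in for the real analysis function, used only to show the control flow.
def fake_analyze(batch, **kwargs):
    return {item_id: {"n_points": len(series)} for item_id, series in batch.items()}


items = {f"item_{i}": list(range(i + 1)) for i in range(5)}
batch_sizes = [len(b) for b in iter_batches(items, 2)]
merged = analyze_in_batches(items, fake_analyze, batch_size=2)
print(batch_sizes, len(merged))
```

Because each batch holds only a subset of the DataFrames, peak memory is bounded by the batch size rather than by the full dataset.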