System Architecture
ConversionOS is organized into six layers, each building on the previous one. Data flows left-to-right through the pipeline, from ingestion to measurement, and a feedback loop returns measurement results to feature engineering and model retraining.
Layer Breakdown
1. Data Ingestion
All source data is ingested into BigQuery via batch ETL or streaming pipelines:
- Web/App Events — GA4, server-side tracking, custom event streams
- CRM Records — Salesforce, HubSpot, or custom CRM via scheduled sync
- Ad Platform Data — Google Ads, Meta Ads via API connectors
- Call Center / Support — Disposition codes, talk time, resolution data
- Third-Party — Identity resolution (Adstra), demographic appends, credit data
2. Feature Engineering
Raw data is transformed into ML-ready features inside BigQuery:
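```sql
-- Example: engagement recency ratio, i.e. the share of a customer's
-- last-30-day events that occurred in the last 7 days
SELECT
  customer_id,
  SAFE_DIVIDE(
    COUNTIF(event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)),
    COUNTIF(event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
  ) AS engagement_recency_ratio  -- NULL when there are no events in the last 30 days
FROM events
GROUP BY customer_id
```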
Feature categories:
- Behavioral — Session frequency, page depth, content affinity
- Transactional — Purchase recency, frequency, monetary value (RFM)
- Engagement — Email open/click decay, call center interactions
- Demographic — Appended from identity resolution partners
- Competitive — Service area competition density, pricing signals
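As an illustration of the Transactional category, the RFM logic can be sketched in a few lines. In ConversionOS these features are built inside BigQuery; the Python below only shows the arithmetic, and the function name `rfm_features` and the tuple layout of the input are assumptions made for this example.

```python
from datetime import date


def rfm_features(purchases, as_of):
    """Compute recency, frequency, and monetary value per customer.

    purchases: iterable of (customer_id, purchase_date, amount) tuples
               (hypothetical input shape, not a ConversionOS schema).
    as_of:     reference date used to measure recency.
    """
    raw = {}
    for customer_id, purchase_date, amount in purchases:
        entry = raw.setdefault(
            customer_id,
            {"last_purchase": purchase_date, "frequency": 0, "monetary": 0.0},
        )
        # Keep the most recent purchase date seen so far.
        entry["last_purchase"] = max(entry["last_purchase"], purchase_date)
        entry["frequency"] += 1
        entry["monetary"] += amount

    return {
        customer_id: {
            "recency_days": (as_of - entry["last_purchase"]).days,
            "frequency": entry["frequency"],
            "monetary": entry["monetary"],
        }
        for customer_id, entry in raw.items()
    }
```

Called with two purchases for one customer on 2024-01-01 and 2024-01-20 and an `as_of` date of 2024-01-31, the sketch reports a recency of 11 days and a frequency of 2.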
3. Model Training
Models are trained either in-warehouse with BigQuery ML or in Vertex AI, which is used for advanced tuning:
- Primary model: XGBoost classifier for propensity scoring
- Challenger model: LightGBM, evaluated head-to-head against the primary model
- Baseline: Logistic regression for explainability benchmarks
- Retraining cadence: Weekly with automated drift detection
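The source does not say how drift is detected. One common approach is the Population Stability Index (PSI), which compares the score distribution at training time with the current one. The sketch below assumes scores in the 0.0 to 1.0 range, ten equal-width bins, and a retraining threshold of 0.2 (a widely used rule of thumb, not a ConversionOS setting); all function names are hypothetical.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between two samples of scores in [0, 1], using equal-width bins."""

    def bin_shares(scores):
        if not scores:
            raise ValueError("score sample must not be empty")
        counts = [0] * bins
        for score in scores:
            # Clamp so that a score of exactly 1.0 lands in the last bin.
            index = min(max(int(score * bins), 0), bins - 1)
            counts[index] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(count / len(scores), eps) for count in counts]

    expected_shares = bin_shares(expected)
    actual_shares = bin_shares(actual)
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected_shares, actual_shares)
    )


def needs_retraining(expected, actual, threshold=0.2):
    """Flag drift when PSI exceeds the (assumed) threshold."""
    return population_stability_index(expected, actual) > threshold
```

A PSI of zero means the two distributions fall into the bins in identical proportions; larger values indicate a bigger shift and a stronger case for retraining ahead of the weekly schedule.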
4. Scoring & Segmentation
Trained models score every prospect/customer on a daily cadence:
- Raw propensity scores (0.0 - 1.0) are computed
- Scores are bucketed into tiers (High / Medium / Low / Exclude)
- Tier boundaries are set using business-rule thresholds calibrated to conversion rates
- Scores are joined with customer profiles and pushed to the CDP
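The bucketing step can be sketched as a simple threshold lookup. The cut-off values below (0.7, 0.4, 0.1) and the rule that scores under the lowest cut-off map to Exclude are placeholders chosen for illustration; the source only states that boundaries are business-rule thresholds calibrated to conversion rates.

```python
def assign_tier(score, high=0.7, medium=0.4, low=0.1):
    """Map a raw propensity score to a tier.

    The default thresholds are hypothetical; real boundaries would be
    calibrated against observed conversion rates.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score >= high:
        return "High"
    if score >= medium:
        return "Medium"
    if score >= low:
        return "Low"
    # Assumption: scores under the lowest threshold are not worth targeting.
    return "Exclude"
```

Keeping the thresholds as parameters makes it straightforward to recalibrate tiers without touching the scoring logic.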
5. CDP Activation
The CDP (e.g., ActionIQ, Segment, or Treasure Data) receives scored profiles and:
- Builds 200+ audience segments from ML score combinations
- Syncs audiences to ad platforms (Google Ads, Meta) for targeting
- Feeds lifecycle triggers to marketing automation (SFMC) for journey orchestration
- Manages suppression lists and frequency capping
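Suppression and frequency capping are normally configured inside the CDP itself, so the following is only a sketch of the underlying logic. It assumes a set of suppressed IDs, a per-customer count of sends in the current capping window, and a cap of three sends; none of these names or values come from ConversionOS.

```python
def apply_suppression_and_cap(audience, suppressed_ids, sends_in_window, max_sends=3):
    """Return the audience members who may still be contacted.

    audience:        ordered list of customer IDs in the segment.
    suppressed_ids:  set of IDs that must never be contacted.
    sends_in_window: dict mapping customer ID to sends already made
                     in the current capping window (missing means 0).
    max_sends:       hypothetical frequency cap per window.
    """
    return [
        customer_id
        for customer_id in audience
        if customer_id not in suppressed_ids
        and sends_in_window.get(customer_id, 0) < max_sends
    ]
```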
6. Measurement & Feedback
Attribution and incrementality testing close the loop:
- Multi-touch attribution distributes conversion credit across channels
- Incrementality tests (geo-lift, PSA holdouts) validate true lift
- Model performance monitoring tracks score calibration drift
- Results feed back into feature engineering and model retraining
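Two of the calculations above can be made concrete. The source does not specify which attribution model is used, so the sketch assumes linear attribution (equal credit per touchpoint) as the simplest multi-touch variant, and it computes incrementality as the relative lift of a treated group over a holdout. Function names and input shapes are hypothetical.

```python
from collections import defaultdict


def linear_attribution(paths):
    """Split each conversion's value equally across its touchpoints.

    paths: list of (touchpoints, conversion_value) pairs, where
           touchpoints is a list of channel names in order of exposure.
    """
    credit = defaultdict(float)
    for touchpoints, value in paths:
        if not touchpoints:
            continue  # no channel to credit
        share = value / len(touchpoints)
        for channel in touchpoints:
            credit[channel] += share
    return dict(credit)


def relative_lift(treated_conversions, treated_size, control_conversions, control_size):
    """Relative lift of the treated group's conversion rate over the holdout's."""
    treated_rate = treated_conversions / treated_size
    control_rate = control_conversions / control_size
    if control_rate == 0:
        raise ValueError("control conversion rate is zero; lift is undefined")
    return (treated_rate - control_rate) / control_rate
```

Attribution describes how credit is divided among channels, while the lift calculation estimates how many conversions would not have happened without the campaign, which is why the two are used together.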
Three-Segment Propensity Architecture
ConversionOS uses three independent propensity models for comprehensive scoring:
| Model | Target | Scoring Refresh | Primary Use |
|---|---|---|---|
| Acquisition Propensity | Prospect → Customer conversion | Daily | Ad targeting, lead prioritization |
| Engagement Propensity | Active → Highly Engaged | Daily | Upsell campaigns, content personalization |
| Churn Propensity | Active → Churned | Daily | Retention triggers, proactive outreach |
Each model sends its scores to the CDP independently. There the scores are combined, with each other or with other predictions such as LTV, into composite audiences (e.g., "High acquisition propensity + High LTV prediction" = premium targeting tier).
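Composite audience membership can be thought of as rule matching over per-model tiers. The rule names and tier combinations below are invented for illustration and use only the three models from the table; the actual segment definitions live in the CDP.

```python
# Hypothetical rules: audience name -> required tiers per model.
AUDIENCE_RULES = {
    "priority_prospects": {"acquisition": {"High"}},
    "upsell_candidates": {"engagement": {"High"}, "churn": {"Low"}},
    "retention_outreach": {"churn": {"High"}},
}


def match_audiences(tiers, rules=AUDIENCE_RULES):
    """Return the sorted names of all audiences a profile qualifies for.

    tiers: dict mapping model name to the profile's tier for that model,
           e.g. {"acquisition": "Low", "engagement": "High", "churn": "Low"}.
           A model missing from the dict never satisfies a condition.
    """
    return sorted(
        name
        for name, conditions in rules.items()
        if all(tiers.get(model) in allowed for model, allowed in conditions.items())
    )
```

Because each rule references model tiers rather than raw scores, thresholds can be recalibrated in the scoring layer without rewriting audience definitions.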