Data Warehouse
The centralized dimensional data warehouse for the Civic platform — every module's operational data is ingested, transformed, and served as the single source of truth for analytics and regulatory reporting.
Purpose-Built for Canadian Municipalities
How It Works
Nightly ETL Pipeline Run
The warehouse orchestrates overnight data loads from all operational modules.
How it works
At 01:00, the ETL scheduler triggers pipelines in dependency order — conformed dimensions first, then fact tables. Each pipeline extracts incremental changes from source modules via integration-bus CDC, transforms the data into star schema format, validates against quality rules, and loads into the target mart. By 06:00, all marts are refreshed and quality scorecards are updated.
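The nightly run described above can be sketched in Python. The CDC record shape, the watermark format, and the rule list below are illustrative assumptions, not the platform's actual contracts:

```python
# Hypothetical shapes: a CDC change feed, an ISO-8601 watermark, and
# quality rules as predicates over transformed rows.
def run_pipeline(changes, last_watermark, quality_rules):
    """Incrementally process CDC changes newer than the last watermark:
    transform each change into a star-schema fact row, validate it,
    and collect rows for loading into the target mart."""
    loaded, rejected = [], []
    for change in changes:
        if change["ts"] <= last_watermark:
            continue  # already processed in a prior run
        row = {  # transform: flatten the operational record into fact columns
            "date_key": change["ts"][:10].replace("-", ""),
            "amount": change["payload"]["amount"],
            "source_id": change["payload"]["id"],
        }
        if all(rule(row) for rule in quality_rules):
            loaded.append(row)
        else:
            rejected.append(change)  # would be routed to the DLQ
    return loaded, rejected

rules = [lambda r: r["amount"] >= 0, lambda r: len(r["date_key"]) == 8]
changes = [
    {"ts": "2025-06-01T02:00:00Z", "payload": {"id": "a1", "amount": 125.0}},
    {"ts": "2025-06-02T02:00:00Z", "payload": {"id": "a2", "amount": -5.0}},
]
loaded, rejected = run_pipeline(changes, "2025-06-01T00:00:00Z", rules)
# one valid row loaded; the negative amount is rejected toward the DLQ
```

The watermark comparison is what makes the load incremental: only changes newer than the previous run's high-water mark are transformed and loaded.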
Purpose & Scope
What this module owns
Clear ownership boundaries prevent duplication and ensure every capability has exactly one authoritative home.
Owns: 9 capabilities
Delegated to: 5 capabilities
- Operational data collection
- Dashboard rendering
- ML model training data
- Access control
- Open data exports
These capabilities are handled by dedicated modules and consumed via stable API contracts — keeping boundaries clean and ownership unambiguous.
Core Capabilities
What it does
5 capability groups comprising 10 discrete capabilities — each with API surface, business rules, and data ownership.
Drag-and-drop ETL pipeline builder with source connectors for all platform modules, incremental and full load strategies, and pipeline versioning.
Source Connectors
Pre-built connectors for all platform modules with CDC (Debezium) support.
Load Strategies
Incremental and full load strategies — event-triggered or cron-scheduled.
Error Handling
Dead-letter queue (DLQ) for failed records with automated alerting and retry mechanisms.
Pipeline Versioning
Full version history with rollback — track, diff, and revert pipeline definitions.
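A minimal sketch of the track/diff/revert behavior, using a hypothetical in-memory definition store (the real pipeline spec format is not shown on this page):

```python
class PipelineDefinition:
    """Hypothetical versioned pipeline definition: every save appends an
    immutable version; rollback re-appends an older version as the newest."""

    def __init__(self, name, spec):
        self.name = name
        self.versions = [spec]  # version 1 is index 0

    @property
    def current(self):
        return self.versions[-1]

    def save(self, spec):
        self.versions.append(spec)
        return len(self.versions)  # new version number

    def diff(self, v_a, v_b):
        """Fields that changed between two versions, as (old, new) pairs."""
        a, b = self.versions[v_a - 1], self.versions[v_b - 1]
        return {k: (a.get(k), b.get(k))
                for k in a.keys() | b.keys() if a.get(k) != b.get(k)}

    def rollback(self, version):
        # reverting is itself a new version, so history stays append-only
        return self.save(dict(self.versions[version - 1]))

p = PipelineDefinition("finance_gl", {"load": "incremental", "schedule": "0 1 * * *"})
p.save({"load": "full", "schedule": "0 1 * * *"})
p.rollback(1)
# current definition is back to incremental load, recorded as version 3
```

Append-only history is the key design choice here: a rollback never erases the version it reverts, so an audit trail survives.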
Cron-based and event-triggered pipeline execution with dependency management and parallel processing.
Cron Scheduling
Standard cron expression scheduling with timezone support for batch ETL jobs.
Event-Triggered
Pipelines triggered by integration-bus CDC events for near-real-time data ingestion.
Parallel Execution
Configurable parallelism — up to 4 concurrent pipelines with resource-aware scheduling.
Dependency Chains
Define pipeline execution order to ensure conformed dimensions load before fact tables.
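Dependency-ordered execution with the 4-pipeline parallelism cap can be sketched with the standard library's topological sorter; the pipeline names below are illustrative:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each fact pipeline lists the conformed
# dimensions it depends on (node -> set of predecessors).
deps = {
    "dim_date": set(), "dim_person": set(), "dim_department": set(),
    "fact_revenue": {"dim_date", "dim_department"},
    "fact_service": {"dim_date", "dim_person"},
}
MAX_PARALLEL = 4  # the spec's concurrent-pipeline cap

ts = TopologicalSorter(deps)
ts.prepare()
batches = []
while ts.is_active():
    ready = list(ts.get_ready())  # pipelines whose dependencies are done
    for i in range(0, len(ready), MAX_PARALLEL):
        batch = ready[i:i + MAX_PARALLEL]
        batches.append(batch)  # each batch runs concurrently
        ts.done(*batch)
# dimension pipelines land in the first batch; fact pipelines only run
# after every dimension they reference has completed
```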
Real-World Scenarios
Who uses this, and how
4 persona-driven scenarios showing how Data Warehouse works in practice — from resident registration to privacy compliance.
Budget Analyst
Cross-Department Revenue Analysis
Sarah needs to compare revenue collection efficiency across all departments for the past fiscal year to prepare budget recommendations.
Steps
1. Sarah accesses the Revenue data mart through the BI dashboard
2. She selects a cross-department revenue comparison report spanning FY2025
3. The query joins fact_revenue with DIM_DEPARTMENT and DIM_DATE using conformed dimensions
4. Columnar storage and materialized views return results in under 5 seconds
5. She drills down into Recreation and Public Works departments to explore seasonal patterns
6. She exports the analysis to the budget-management product for inclusion in budget recommendations
Outcome
One query, consistent definitions, and instant results across all departments — made possible by conformed dimensions and pre-aggregated materialized views.
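The conformed-dimension join behind this scenario can be sketched with an in-memory star schema; the table and column names below are simplified assumptions, not the warehouse's actual DDL:

```python
import sqlite3

# Minimal in-memory star schema: one fact table, two conformed dimensions.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_department (department_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, fiscal_year TEXT);
CREATE TABLE fact_revenue (department_key INTEGER, date_key INTEGER, amount REAL);
INSERT INTO dim_department VALUES (1, 'Recreation'), (2, 'Public Works');
INSERT INTO dim_date VALUES (20250115, 'FY2025'), (20240115, 'FY2024');
INSERT INTO fact_revenue VALUES (1, 20250115, 1200.0), (2, 20250115, 800.0),
                                (1, 20240115, 950.0);
""")

# The cross-department comparison: facts joined to both dimensions,
# filtered to one fiscal year, aggregated per department.
rows = con.execute("""
    SELECT d.name, SUM(f.amount) AS total
    FROM fact_revenue f
    JOIN dim_department d ON d.department_key = f.department_key
    JOIN dim_date t ON t.date_key = f.date_key
    WHERE t.fiscal_year = 'FY2025'
    GROUP BY d.name
    ORDER BY total DESC
""").fetchall()
# → [('Recreation', 1200.0), ('Public Works', 800.0)]
```

Because every mart keys into the same DIM_DEPARTMENT and DIM_DATE, the same query shape works across departments without reconciling definitions.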
Data Steward
Quality Score Recovery
The Finance mart's quality score drops to 91% after a weekend migration. The data steward needs to investigate and restore compliance.
Steps
1. Data steward receives a warehouse.quality.violation alert with the affected mart and rules
2. She opens the quality scorecard and identifies three rules failing: GL account referential integrity, amount range checks, and date completeness
3. She uses the data catalog to trace lineage back to the source — a bulk GL re-mapping in financial-core
4. She quarantines affected records and coordinates with the finance team to correct the mapping
5. Corrected data is re-processed through the ETL pipeline with quality rules passing
6. The Finance mart quality score returns to 97%
Outcome
Quality violation detected automatically, root cause traced through lineage, remediated collaboratively, and score restored — all within the warehouse's governance framework.
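A sketch of how a mart-level quality score and its failing-rule list might be computed; the rules and the pass-rate scoring formula are assumptions for illustration:

```python
# Hypothetical quality rules as (name, predicate) pairs; the mart's score
# here is the share of rule checks that pass across all rows.
def quality_score(rows, rules):
    checks = [(name, pred(row)) for row in rows for name, pred in rules]
    passed = sum(ok for _, ok in checks)
    failing = sorted({name for name, ok in checks if not ok})
    return round(100 * passed / len(checks), 1), failing

valid_accounts = {"1000", "2000"}  # stand-in for the GL chart of accounts
rules = [
    ("gl_account_ref_integrity", lambda r: r["gl_account"] in valid_accounts),
    ("amount_range", lambda r: 0 <= r["amount"] < 1e9),
    ("date_completeness", lambda r: r.get("date_key") is not None),
]
rows = [
    {"gl_account": "1000", "amount": 50.0, "date_key": 20250101},
    {"gl_account": "9999", "amount": -10.0, "date_key": None},  # bad migration row
]
score, failing = quality_score(rows, rules)
# the bad row trips all three rules, dragging the score down and naming
# exactly the rules the steward sees on the scorecard
```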
Finance Manager
Annual FIR Submission
The Finance Manager must submit the annual Financial Information Return to the Province of Ontario with accurate, auditable data.
Steps
1. The warehouse's nightly ETL ensures the Financial mart is current through fiscal year-end
2. The FIR regulatory dataset aggregates GL data into the province's required categories
3. The Finance Manager opens the FIR template in reporting-analytics, pre-populated from the warehouse
4. She reviews flagged variances — the warehouse highlights year-over-year changes exceeding 10%
5. She validates data completeness using the quality scorecard (99.2% for Financial mart)
6. The report is exported in the province's required format and submitted
Outcome
Data pre-validated, pre-aggregated, and auditable with full lineage — reducing FIR preparation from weeks to days.
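The 10% year-over-year variance flagging from step 4 can be sketched as follows; the category names and totals are invented sample data:

```python
# Hypothetical FIR category totals by fiscal year; flag any category whose
# year-over-year change exceeds the 10% review threshold in either direction.
def flag_variances(prior, current, threshold=0.10):
    flagged = {}
    for category, prev in prior.items():
        now = current.get(category, 0.0)
        change = (now - prev) / prev
        if abs(change) > threshold:
            flagged[category] = round(change * 100, 1)  # percent change
    return flagged

fy2024 = {"taxation": 1_000_000.0, "user_fees": 200_000.0, "grants": 50_000.0}
fy2025 = {"taxation": 1_030_000.0, "user_fees": 260_000.0, "grants": 40_000.0}
flags = flag_variances(fy2024, fy2025)
# → {'user_fees': 30.0, 'grants': -20.0}   (taxation's +3% is not flagged)
```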
Senior Analyst
Self-Service Data Mart Creation
The Community Services department needs a custom data mart combining recreation, social services, and demographic data for a new equity analysis initiative.
Steps
1. The senior analyst requests a new self-service mart through the data catalog interface
2. She selects fact tables from Recreation and Community Services marts and conformed dimensions DIM_PERSON, DIM_GEOGRAPHY, and DIM_DATE
3. The warehouse validates that all selected dimensions are conformed and quality-checked
4. Access controls from security-iam are applied — only Community Services analysts can query the mart
5. ETL pipelines are auto-generated to refresh the mart on the nightly schedule
6. The new Equity Analysis mart is cataloged with business glossary entries and steward assignment
Outcome
A governed, quality-assured custom mart created in hours — not months — with full catalog integration, lineage, and access control.
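The conformance check in step 3 can be sketched as a simple gate over a dimension catalog; the catalog contents and the returned mart shape are illustrative assumptions:

```python
# Hypothetical catalog of conformed, quality-checked dimensions. A mart
# request is accepted only if every dimension it selects is in the catalog.
CONFORMED_DIMENSIONS = {"DIM_PERSON", "DIM_GEOGRAPHY", "DIM_DATE", "DIM_DEPARTMENT"}

def validate_mart_request(facts, dimensions):
    unknown = sorted(set(dimensions) - CONFORMED_DIMENSIONS)
    if unknown:
        raise ValueError(f"non-conformed dimensions: {unknown}")
    return {
        "facts": sorted(facts),
        "dimensions": sorted(dimensions),
        "schedule": "nightly",  # auto-generated refresh on the standard window
    }

mart = validate_mart_request(
    facts=["fact_recreation_visit", "fact_social_service_case"],
    dimensions=["DIM_PERSON", "DIM_GEOGRAPHY", "DIM_DATE"],
)
# a request naming a non-conformed dimension would be rejected outright
```

Rejecting non-conformed dimensions up front is what keeps self-service marts joinable with the rest of the warehouse.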
Internal Architecture
How it's built
4 architectural layers comprising 24 components — from API gateway to data quality engine.
Every module owns a single bounded context, exposes stable APIs, and can be composed into any Civic product — that's the architecture that scales.
Krutik Parikh
Creator of Civic
Data Model
Entity Architecture
4 entities with 4 relationships — the authoritative schema for this bounded context.
Entities
API Surface
Integration Endpoints
9 RESTful endpoints across 5 resource groups — plus 4 domain events for async integration.
GET /api/v1/warehouse/pipelines
List all ETL pipelines with status and last run details
POST /api/v1/warehouse/pipelines
Create a new ETL pipeline definition
POST /api/v1/warehouse/pipelines/{id}/run
Trigger immediate pipeline execution
GET /api/v1/warehouse/pipelines/{id}/status
Get pipeline run status and row counts
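A client-side sketch of calling these endpoints; the host name, request body fields, and pipeline id are hypothetical (only the paths come from the spec above):

```python
import json
from urllib.request import Request

BASE = "https://civic.example.org/api/v1/warehouse"  # hypothetical host

def trigger_pipeline_run(pipeline_id, full_refresh=False):
    """Build the POST request that triggers an immediate pipeline run.
    (Actually sending it is left to the caller's HTTP stack.)"""
    body = json.dumps({"full_refresh": full_refresh}).encode()
    return Request(
        f"{BASE}/pipelines/{pipeline_id}/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def pipeline_status_url(pipeline_id):
    # polled with GET until the run reports completion and row counts
    return f"{BASE}/pipelines/{pipeline_id}/status"

req = trigger_pipeline_run("finance-gl-nightly")
```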
Ecosystem
Products that depend on this module
7 Civic products consume Data Warehouse — making it one of the most critical platform services in the ecosystem.
Analytics & BI
Primary consumer — all BI dashboards source from warehouse data marts
Reporting & Analytics
Reads from warehouse marts for all reports and regulatory submissions
Open Data Portal
Publishes curated warehouse datasets as open data
AI Platform
ML training data sourced from warehouse dimensional models
Climate & ESG
GHG/energy data aggregated in warehouse for sustainability reporting
Budget Management
Budget vs. actuals sourced from the financial data mart
All 55 Product Specs
Every product's reporting/analytics section reads from its domain mart
Technical Specifications
Performance, Compliance & Configuration
ETL Pipeline SLA
Query Performance
Data Freshness (CDC)
Data Freshness (Batch)
Data Quality Score
Storage Efficiency
Historical Retention
Availability
FAQ
Frequently Asked Questions
Ready to Integrate
Build on Data Warehouse
Request an architecture brief, integration guide, or live demo environment for your team.