Employee Flight Risk Intelligence System
CAN North Financial loses an estimated 252 employees a year. At $78,000 per departure¹ that is $19.7 million leaving through the door — most of it invisible until someone hands in their notice. The CEO’s question was simple: tell me who is about to leave before they do. This is the system that answers it.
Demonstration project. All data is synthetically generated and does not represent real employees or organizations. CAN North Financial is a fictional company created for demonstration purposes. The model, methodology, and analytical framework are genuine and production-ready. Methodology and source code are proprietary.
¹ $78,000 replacement cost methodology: Calculated as roughly 0.72× average annual salary at CAN North Financial ($107,991), reflecting direct costs: recruiting fees (15–20% of salary), onboarding, productivity loss during the 3–6 month ramp period, and manager time. Consistent with SHRM’s published range of 50–200% of annual salary for mid-level professional roles. Applied uniformly across all departures as a conservative floor estimate.
Employees in dataset
1,400
3 years of longitudinal data per employee
Model recall (test set)
98%
49 of 50 real quitters caught on held-out data
At-risk cost identified
$19.7M
252 departures × $78,000 replacement cost
Models raced
5
Logistic Regression, RF, SVM, Tree, KNN
Python · scikit-learn · Logistic Regression · 5-Fold Cross Validation · Streamlit · Longitudinal Data · Feature Engineering · StandardScaler · People Analytics · RTO Analysis

What makes this project different

Longitudinal design

Direction of change, not just snapshot

An employee at 50 engagement today who was at 80 two years ago is more at risk than someone who has always been at 50. The trend is the signal.

Novel feature

RTO Risk Index

Commute shock — not commute distance — drives attrition. An employee forced from remote to in-person with a 60-minute commute added is fundamentally different from one who always commuted 60 minutes.

Disciplined selection

Model race, not model guess

Five models. Same data. Same metric. 5-fold cross validation. The data picks the winner. Logistic Regression won at 96.6% recall, which suggests that feature engineering mattered more here than algorithm complexity.

The headline finding: Employees pushed more into office left at 23.6%. Employees given flexibility left at 7.5%. A 3× gap driven by one policy decision. The data said so before any model ran.

The full pipeline

Annual · Generate Data · generate_data.py
Annual · Train Model · train_pipeline.py
Quarterly · Score Employees · score_pipeline.py
Quarterly · Diana Acts · app.py
↗ Open Live Demo
Before any model — what is Diana trying to solve?
Every technical decision traces back to one business conversation. Getting the problem framing right is more important than any algorithm choice.
Organization

CAN North Financial

1,400 employees · Toronto, Calgary, Vancouver
Pension · Wealth Management · Retail Banking

The problem in numbers
18%
Annual attrition rate
252
Departures per year
$78K
Cost per departure
$19.7M
Annual cost
RTO finding — before any model
More office 23.6%
More flexible 7.5%

3× higher attrition from one policy decision. Found in exploration — before any model ran.

CAN North Financial is losing people it cannot afford to lose

Diana is the Chief People Officer at CAN North Financial. Her company loses roughly 18% of its workforce every year. Each departure costs an estimated $78,000 in recruiting, onboarding, and lost productivity. Across 252 annual departures that is $19.7 million leaving through the door — most of it invisible until someone hands in their notice.

The CEO's ask was direct: "Tell me who is about to leave before they do. I want to call them."

The strategic framing: This is a classification problem — predict quit (1) or stay (0). The metric that matters is not accuracy. It is recall. Missing a quitter costs $78,000. A false alarm costs one manager conversation. The model must be optimised to catch real quitters, not to look good on paper.

Why accuracy is the wrong metric

CAN North's dataset is 82% stayers and 18% leavers. A model that predicts "stays" for every single person scores 82% accuracy — and catches zero real quitters. This is the class imbalance trap. Every subsequent decision — model choice, threshold setting, evaluation — flows from understanding this.
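The class imbalance trap can be checked with a few lines of arithmetic. The sketch below uses only the class counts stated in the dataset description (1,148 stayers, 252 leavers) and a deliberately useless "model" that predicts "stays" for everyone; it is an illustration, not part of the project's pipeline.

```python
# Class counts from the dataset description: 1,148 stayers (0), 252 leavers (1).
y_true = [0] * 1148 + [1] * 252

# A "model" that predicts "stays" for every single employee.
y_pred = [0] * len(y_true)

correct = sum(t == p for t, p in zip(y_true, y_pred))
caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

accuracy = correct / len(y_true)   # share of all predictions that are right
recall = caught / sum(y_true)      # share of real leavers that were flagged

print(f"accuracy={accuracy:.0%}  recall={recall:.0%}")
```

The accuracy looks respectable while recall is zero, which is why every later comparison in this write-up is ranked on recall.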

The right metric

Recall — catching real quitters is Diana's priority

Of all employees who actually left — what percentage did the model identify in advance? A recall of 98% means the model caught 49 of 50 real quitters on data it had never seen. One slipped through. Ten unnecessary conversations were had. That is the trade-off Diana accepts.

Why longitudinal data — and why these specific features
The most important design decision was not which model to use. It was how to structure the data. A snapshot tells you where someone stands today. A longitudinal dataset tells you where they are going.
"An employee at 50 engagement today who was at 80 two years ago is more at risk than someone who has always been at 50. The direction of travel is the signal."

Three years. One employee. 68 columns.

The dataset covers 1,400 CAN North employees across 2023, 2024, and 2025. Each employee has one row with 68 columns covering engagement survey scores, performance ratings, compensation benchmarks, commute data, manager changes, and organizational disruption flags — all tracked year over year.

The target variable — left — indicates whether this employee left by end of 2025. 252 left (18%). 1,148 stayed (82%).

Why commute shock — not commute distance: An employee who always commuted 60 minutes has adapted. An employee whose commute went from 0 to 60 minutes because of an RTO mandate has not. The delta is the signal. This insight drove the design of the RTO Risk Index — a novel composite feature not found in standard people analytics literature.

All scores on a 0–100 scale

A deliberate design choice: every survey score, every index, every composite feature lives on the same 0–100 scale. When Diana presents to the board, there is no unit conversion. An engagement score of 34 means the same thing as an RTO risk score of 34.

── Dataset preview: first 3 employees ───────────
employee_id  division           role     location
CNF0001      Wealth Management  Analyst  Calgary
CNF0002      Retail Banking     Analyst  Vancouver
CNF0003      Wealth Management  Analyst  Vancouver

eng_2023  eng_2024  eng_2025  trend
67.2      58.4      51.1      −16.1
81.3      75.6      75.1      −6.2
68.4      61.2      52.1      −16.3

rto_index  persona_direction  left
37.1       More office        0
25.3       No change          0
8.7        More flexible      0
──────────────────────────────────────────────────
Shape:     1,400 rows × 68 columns
Period:    2023 → 2025
Attrition: 18.0% (252 of 1,400)
Missing:   0 (clean dataset)
Engagement survey — longitudinal (0–100)
engagement_2023/24/25 · satisfaction_2023/24/25 · career_growth_2023/24/25 · mgr_effectiveness_2023/24/25 · wellbeing_2023/24/25
RTO and commute — the differentiator
persona_direction · commute_time_change_min · rto_risk_index · transit_dependent · commute_km
Engineered trend features (calculated)
engagement_trend · satisfaction_trend · career_growth_trend · org_disruption_score · manager_stability
What the data told us before any model ran
The most actionable insights came from exploration — not modelling. Before a single line of model code, the data produced findings Diana could take to the CEO immediately.
30.5
Point satisfaction gap
Leavers: 28.4/100. Stayers: 58.8/100. Largest single-feature gap found.
3×
RTO attrition multiplier
More office: 23.6%. More flexible: 7.5%. One policy. Three times the loss.
−20.5
Point engagement drop
Leavers dropped 20.5 pts over 3 years. Stayers only 6.4. Trend beats snapshot.
12
Critical zone employees
Eng <40 + career <40 + RTO >60 = 100% historical attrition. Called immediately.
── Top signals: leavers vs stayers ──────────────
Signal                     Stayed   Left    Gap
Satisfaction 2025          58.8     28.4    +30.5
Engagement 2025            63.2     36.4    +26.8
Satisfaction trend (3yr)   -9.9     -29.8   +19.9
Engagement trend (3yr)     -6.3     -22.0   +15.7
Career growth 2025         55.2     42.8    +12.4
RTO risk index             27.9     44.4    -16.6
Org disruption score       30.7     43.2    -12.5
──────────────────────────────────────────────────
Salary vs market           49.5     47.0    +2.5
→ Salary barely registers. Not a pay problem.

Correlation with attrition (top 5):
engagement_2025      −0.558   Higher = STAY
satisfaction_2025    −0.548   Higher = STAY
engagement_trend     −0.502   Higher = STAY
satisfaction_trend   −0.400   Higher = STAY
rto_risk_index       +0.352   Higher = LEAVE
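The critical-zone rule from the exploration (engagement below 40, career growth below 40, RTO risk above 60) can be written as a plain filter. This is a minimal sketch: the function name and the three example employees are made up for illustration and are not rows from the dataset.

```python
def in_critical_zone(engagement, career_growth, rto_risk):
    """True when all three exploration thresholds are crossed at once."""
    return engagement < 40 and career_growth < 40 and rto_risk > 60

# Illustrative employees: (engagement, career growth, RTO risk), all on 0-100.
employees = {
    "A": (35.0, 30.0, 72.0),   # all three conditions met
    "B": (35.0, 55.0, 72.0),   # career growth is above the cutoff
    "C": (62.0, 30.0, 20.0),   # engaged and low RTO risk
}

flagged = [name for name, scores in employees.items() if in_critical_zone(*scores)]
print(flagged)
```

Because the rule needs no model, it can run on the raw quarterly data before any scoring happens.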

The finding that changed the conversation

Salary vs market correlated with attrition at −0.07. Nearly nothing. This is not a compensation problem. The data said so clearly before any model was built.

When Diana presented to the CEO, the first slide was not a model output. It was the RTO chart: More office at 23.6%, More flexible at 7.5%. No statistical literacy required. Just policy action.

The exploration finding that validated the entire design: Engagement trend correlated at −0.502. Current engagement score correlated at −0.558. A gap of only 0.056. The direction of change over three years is almost as powerful as where someone stands today. This confirmed that building longitudinal trend features was the right architectural choice.
What this means for Diana

Do not wait for the score to drop to 20. Act when it starts dropping.

An employee trending from 70 → 55 → 40 is more urgent than one who has always been at 40. The trajectory is the early warning. The model knows this.

The work that happened before the model
Feature engineering is where data science actually lives. The 98% recall result was not a function of algorithm choice. It was a function of what we fed the algorithm.
1. From 68 columns to 31 meaningful features

The raw dataset had 68 columns. Feeding all 68 creates noise, because correlated features confuse the algorithm and dilute the signal. We selected 31 features based on two criteria: an absolute correlation with attrition above 0.15 (the sign only shows direction), and domain logic about what actually drives someone to leave.

Key decision: We kept current-year snapshots and engineered trends — but dropped the intermediate years. If we have engagement_2023 and engagement_2025, we do not need engagement_2024. The trend captures the history. Keeping all three gives correlated information three times over.
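The correlation half of that screen can be sketched as follows, using the correlations reported in the exploration. It assumes the 0.15 cutoff applies to the absolute value, and `salary_vs_market` is a hypothetical column name for the salary benchmark. The domain-logic criterion is a human judgment and is not modelled here.

```python
# Correlations with attrition as reported in the exploration section.
reported = {
    "engagement_2025": -0.558,
    "satisfaction_2025": -0.548,
    "engagement_trend": -0.502,
    "satisfaction_trend": -0.400,
    "rto_risk_index": 0.352,
    "salary_vs_market": -0.07,   # hypothetical column name
}

MIN_ABS_CORRELATION = 0.15  # assumption: the cutoff is on |r|, not on signed r

# Keep features whose absolute correlation clears the cutoff, strongest first.
passes_screen = sorted(
    (name for name, r in reported.items() if abs(r) > MIN_ABS_CORRELATION),
    key=lambda name: abs(reported[name]),
    reverse=True,
)
screened_out = [name for name in reported if name not in passes_screen]

print("keep:", passes_screen)
print("drop:", screened_out)
```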
2. Engineering trend features: the longitudinal power

The most important step: calculating year-over-year trends. Not collected — calculated.

# Direction of change is more predictive than snapshot
engagement_trend   = engagement_2025    - engagement_2023
satisfaction_trend = satisfaction_2025  - satisfaction_2023
career_trend       = career_growth_2025 - career_growth_2023
absence_trend      = absence_days_2025  - absence_days_2023

# engagement_trend of -32 means this person dropped
# 32 points over 3 years: the model reads freefall
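As a runnable illustration of the same subtraction, the sketch below recomputes the engagement trend for the three employees shown in the dataset preview. The helper `add_trend` is a hypothetical name; rounding to one decimal is an assumption made to match the preview's formatting.

```python
# Engagement scores for the first three employees in the dataset preview.
rows = [
    {"employee_id": "CNF0001", "engagement_2023": 67.2, "engagement_2025": 51.1},
    {"employee_id": "CNF0002", "engagement_2023": 81.3, "engagement_2025": 75.1},
    {"employee_id": "CNF0003", "engagement_2023": 68.4, "engagement_2025": 52.1},
]

def add_trend(row, metric, start=2023, end=2025):
    """Add '<metric>_trend' = end-year score minus start-year score."""
    delta = row[f"{metric}_{end}"] - row[f"{metric}_{start}"]
    row[f"{metric}_trend"] = round(delta, 1)
    return row

# Work on copies so the original rows stay untouched.
rows = [add_trend(dict(r), "engagement") for r in rows]

for r in rows:
    print(r["employee_id"], r["engagement_trend"])
```

The results match the `trend` column in the preview (−16.1, −6.2, −16.3).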
3. Building the RTO Risk Index: the novel composite

No single column captured the full return-to-office impact. We combined three signals.

RTO Risk Index =
    commute time added (0-100)    × 0.40
  + persona change direction      × 0.30
  + satisfaction drop since RTO   × 0.30

Persona direction scores:
  More office   → 80 (highest risk)
  No change     → 20 (baseline)
  More flexible → 10 (protective factor)

Final correlation with attrition: +0.352
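The weighted sum above can be sketched as a small function. The weights and persona scores come from the formula; the function name and the assumption that the commute and satisfaction components arrive already rescaled to 0-100 are mine, since the text does not describe how that rescaling is done.

```python
# Persona direction scores as listed in the formula.
PERSONA_SCORE = {"More office": 80, "No change": 20, "More flexible": 10}

def rto_risk_index(commute_added_score, persona_direction, satisfaction_drop_score):
    """Weighted sum of three 0-100 components (weights 0.40 / 0.30 / 0.30).

    Assumes commute_added_score and satisfaction_drop_score are already
    on a 0-100 scale; how they get there is not specified in the text.
    """
    return (
        0.40 * commute_added_score
        + 0.30 * PERSONA_SCORE[persona_direction]
        + 0.30 * satisfaction_drop_score
    )

# Illustrative inputs, not taken from the dataset.
pushed_to_office = rto_risk_index(50, "More office", 40)
given_flexibility = rto_risk_index(0, "More flexible", 0)
print(round(pushed_to_office, 1), round(given_flexibility, 1))
```

Because the weights sum to 1 and every component is on 0-100, the index stays on the same 0-100 scale as the survey scores.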
4. Encoding and scaling: making features compete fairly

Categorical columns (division, role level, work persona direction) were encoded as numbers. Then StandardScaler converted every feature to mean=0, std=1. Without this, engagement (0–100) would overpower tenure (0–25) purely because its numbers are larger.

Critical rule: The scaler is fitted on training data only — then applied to test data using the same parameters. Fitting on test data leaks future information into the scaling process and invalidates the evaluation. This rule is non-negotiable.
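The fit-on-train, apply-to-test rule can be shown without any library. The sketch below hand-rolls the standardization using the population standard deviation and made-up engagement scores. It stands in for the fit/transform split the rule describes and is not the project's actual scaling code.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Learn the mean and population standard deviation from training values."""
    return mean(values), pstdev(values)

def transform(values, params):
    """Standardize values with previously learned parameters."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]

# Illustrative engagement scores (made-up values).
train = [67.2, 81.3, 68.4, 45.0, 58.6]
test = [51.1, 75.1]

params = fit_scaler(train)               # fitted on training data only
train_scaled = transform(train, params)
test_scaled = transform(test, params)    # same parameters reused, never refitted

print([round(v, 2) for v in test_scaled])
```

The test values are never passed to `fit_scaler`, so nothing about the held-out employees influences the scaling.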
Five models. Same data. The data picks the winner.
Model selection is a controlled experiment. Same training data in. Same evaluation metric out. The only variable is the model. 5-fold cross validation removes luck from the evaluation.
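To make the 5-fold mechanics concrete, here is a minimal sketch of a fold assignment that keeps the share of leavers similar in every fold. Stratification is an assumption (the text only says 5-fold), and the count of 202 leavers in the training set is derived from figures reported elsewhere in this write-up: 252 leavers in total minus the 50 in the held-out test set. `stratified_folds` is a hypothetical helper, not a library function.

```python
import random

def stratified_folds(labels, k=5, seed=42):
    """Assign each row index to one of k folds, keeping the class mix similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)   # deal indices out round-robin
    return folds

# 1,120 training employees; 202 leavers is derived (252 total - 50 in test).
labels = [1] * 202 + [0] * 918
folds = stratified_folds(labels)

for n, fold in enumerate(folds, start=1):
    train_size = len(labels) - len(fold)     # the other four folds train the model
    leavers = sum(labels[i] for i in fold)   # recall is measured on these
    print(f"fold {n}: train on {train_size}, validate on {len(fold)}, leavers: {leavers}")
```

Each validation fold ends up with 40 or 41 leavers, which means a single missed quitter moves fold recall by about 2.5 points.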

The model race

5-Fold Cross Validation · Primary metric: Recall · 1,120 training employees

Logistic Regression
96.6%
Winner
SVM
91.1%
Random Forest
82.2%
Decision Tree
79.2%
KNN
39.5%
Model                   CV Recall   AUC     Std Dev
Logistic Regression ✓   96.6%       0.997   0.033
SVM                     91.1%       0.991   0.039
Random Forest           82.2%       0.976   0.046
Decision Tree           79.2%       0.853   0.064
KNN                     39.5%       0.938   0.075
── Fold-by-fold recall (consistency check) ──────
Model                 F1      F2       F3       F4      F5
Logistic Regression   97.5%   100.0%   100.0%   92.7%   92.7%
SVM                   95.0%   95.0%    92.5%    87.8%   85.4%
Random Forest         85.0%   87.5%    85.0%    75.6%   78.0%
Decision Tree         85.0%   85.0%    70.0%    73.2%   82.9%
KNN                   35.0%   27.5%    45.0%    48.8%   41.5%

Low std = consistent = trustworthy for production use

Why the simplest model won

The most sophisticated model did not win. The simplest one did. Random Forest — which builds 100 decision trees and takes a vote — came third at 82.2% recall. Logistic Regression, which draws a single straight line, hit 96.6%.

This is the most important finding of the model race. It tells us that feature engineering created clean, close to linearly separable signals. When features are this well built, a simple model can beat a complex one. Complexity was unnecessary.

Why Random Forest underperformed

Random Forest's strength is finding complex non-linear patterns when features are messy. But engagement trend, satisfaction trend, and the RTO risk index are clean, powerful, linear signals. The forest added noise, not value. Feature engineering was the higher-leverage decision.

Why KNN failed completely

KNN failed because of the curse of dimensionality. In 31-dimensional feature space, every employee appears roughly equidistant from every other. Distance calculations lose meaning. KNN needs very few features or very large data to work in high dimensions. We had neither.
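The loss of distance contrast can be illustrated with random points. The sketch below uses uniformly random points in a unit cube, which is an assumption and not the employee data, and compares how different the nearest and farthest neighbours are in 2 dimensions versus 31. `relative_contrast` is a hypothetical helper name.

```python
import math
import random

def relative_contrast(dim, n_points=500, seed=0):
    """(farthest - nearest) / nearest distance from one query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(distances) - min(distances)) / min(distances)

# A large value means "near" and "far" are clearly different;
# a small value means all neighbours look about equally distant.
for dim in (2, 31):
    print(f"{dim:>2} dimensions: relative contrast = {relative_contrast(dim):.2f}")
```

As the dimension grows the contrast shrinks, so "the k nearest employees" becomes a much weaker notion of similarity.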

Consistency is reliability

Logistic Regression's fold scores: 97.5%, 100%, 100%, 92.7%, 92.7%

Standard deviation of 0.033 — lowest of all five models. Diana needs a model that performs reliably every quarter, not one that is sometimes great and sometimes mediocre. Consistency is reliability. Reliability is trust.
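The consistency figures can be recomputed from the fold-by-fold recall table. The sketch below assumes the reported standard deviation is the population form (dividing by the number of folds), which is the form that reproduces the 0.033 reported for Logistic Regression.

```python
from statistics import mean, pstdev

# Fold-by-fold recall (%) copied from the consistency check table.
fold_recall = {
    "Logistic Regression": [97.5, 100.0, 100.0, 92.7, 92.7],
    "SVM": [95.0, 95.0, 92.5, 87.8, 85.4],
    "Random Forest": [85.0, 87.5, 85.0, 75.6, 78.0],
    "Decision Tree": [85.0, 85.0, 70.0, 73.2, 82.9],
    "KNN": [35.0, 27.5, 45.0, 48.8, 41.5],
}

# Mean recall in percent, spread as a fraction (population standard deviation).
summary = {
    name: (mean(scores), pstdev(scores) / 100)
    for name, scores in fold_recall.items()
}
most_consistent = min(summary, key=lambda name: summary[name][1])

for name, (avg, spread) in summary.items():
    print(f"{name:<20} mean={avg:.1f}%  std={spread:.3f}")
print("most consistent:", most_consistent)
```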

What the model learned — and what it means for Diana
The held-out test set is the only honest number. 280 employees the model has never seen. Training scores do not matter. Cross-validation averages do not matter. This is the real-world result.
── Final test results ───────────────────────────
Winner:    Logistic Regression
Test set:  280 held-out employees
Recall:    98.0%
Precision: 83.1%
F1 Score:  0.899
AUC:       0.999

Confusion Matrix:
                  Pred Stay   Pred Quit
Actually Stayed   220         10
Actually Left     1           49

In plain English:
→ 49 quitters CAUGHT: Diana intervenes
→ 1 quitter MISSED: walked out
→ 10 false alarms: extra conversations

Business impact ($78k per departure):
→ Potential saves: $3,822,000
→ Missed cost: $78,000
→ False alarms: 10 conversations
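Every headline metric follows from the four cells of the confusion matrix. The sketch below recomputes recall, precision, F1, and the business impact from those counts, with variable names chosen for illustration.

```python
# Confusion matrix from the held-out test set (280 employees).
tn, fp = 220, 10   # actually stayed: predicted stay / predicted quit
fn, tp = 1, 49     # actually left:   predicted stay / predicted quit

recall = tp / (tp + fn)        # of real leavers, how many were flagged
precision = tp / (tp + fp)     # of flagged employees, how many really left
f1 = 2 * precision * recall / (precision + recall)

COST_PER_DEPARTURE = 78_000
potential_saves = tp * COST_PER_DEPARTURE   # leavers Diana can act on
missed_cost = fn * COST_PER_DEPARTURE       # leavers the model did not flag

print(f"recall={recall:.1%}  precision={precision:.1%}  f1={f1:.3f}")
print(f"potential saves=${potential_saves:,}  missed cost=${missed_cost:,}")
```

AUC is not reproduced here because it needs the per-employee scores, not just the four counts.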
Intervention ROI
$3.0M
Cost if Q2 at-risk leave
38×
Return on intervention

What the model learned — coefficients

Logistic Regression produces one coefficient per feature. Negative = higher value pushes toward staying. Positive = higher value pushes toward leaving. These are what Diana shows the CEO to explain the model.

satisfaction_2025      −4.00
engagement_2025        −3.68
career_growth_2025     −3.35
engagement_trend       −1.68
org_disruption_score   +1.58
rto_risk_index         +1.00

Green bars = push toward staying · Red bars = push toward leaving
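The way coefficients turn into a score can be sketched with the six values listed above. This is a simplified illustration. The real model uses 31 features, the intercept is not reported (it is assumed to be 0 here), and the inputs are assumed to be standardized values, where 0 means average.

```python
import math

# The six coefficients reported above; the full model has 31 features.
COEFFICIENTS = {
    "satisfaction_2025": -4.00,
    "engagement_2025": -3.68,
    "career_growth_2025": -3.35,
    "engagement_trend": -1.68,
    "org_disruption_score": 1.58,
    "rto_risk_index": 1.00,
}
INTERCEPT = 0.0  # assumption: the intercept is not reported in the text

def quit_probability(z_scores, intercept=INTERCEPT):
    """Sigmoid of intercept + sum(coefficient * standardized feature value)."""
    logit = intercept + sum(COEFFICIENTS[name] * z for name, z in z_scores.items())
    return 1 / (1 + math.exp(-logit))

# An exactly average employee, and one with low satisfaction and a falling trend.
average = {name: 0.0 for name in COEFFICIENTS}
declining = dict(average, satisfaction_2025=-1.5, engagement_trend=-2.0)

print(round(quit_probability(average), 3), round(quit_probability(declining), 3))
```

Each term `coefficient * value` is that feature's contribution to the score, which is the kind of per-feature breakdown used in the individual explanation that follows.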

Individual explanation — CNF0011: Toronto · Pension Administration · Analyst. Score: 100%. Engagement dropped from 59 to 15 over 3 years (−44 points). RTO risk: 65/100. The engagement freefall alone contributed +8.6 to the model score. This person was not going to stay.

Scoring 200 current employees — Q2 2026

Critical
36
Immediate action
High
3
This quarter
Medium
5
Monitor monthly
Stable
156
No action needed
#   ID        Division   Role       Loc       Score   Action
1   CNF1401   Pension    Manager    Calgary   100%    Urgent 1:1 HR BP
2   CNF1408   Wealth     Sr Anlst   Calgary   100%    Flexibility + career
3   CNF1413   Wealth     Director   Toronto   100%    Flexibility + career
4   CNF1460   Wealth     Manager    Toronto   100%    Comp review
5   CNF1446   Wealth     VP         Vanc.     100%    Flexibility + career

39 at risk · Est. cost if all leave: $3,042,000
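The step from scores to tiers to a cost estimate can be sketched as below. The tier cutoffs are hypothetical because the text does not state where the boundaries sit. The tier counts and the $78,000 figure are the ones reported above, and treating "at risk" as Critical plus High is inferred from 36 + 3 = 39.

```python
# Hypothetical cutoffs: the text does not give the actual tier boundaries.
TIER_CUTOFFS = [(0.75, "Critical"), (0.50, "High"), (0.25, "Medium")]

def risk_tier(score):
    """Map a 0-1 quit probability to a tier using the cutoffs above."""
    for cutoff, name in TIER_CUTOFFS:
        if score >= cutoff:
            return name
    return "Stable"

# Tier counts reported for the Q2 2026 scoring run.
tier_counts = {"Critical": 36, "High": 3, "Medium": 5, "Stable": 156}

COST_PER_DEPARTURE = 78_000
at_risk = tier_counts["Critical"] + tier_counts["High"]
estimated_cost = at_risk * COST_PER_DEPARTURE

print(risk_tier(1.00), risk_tier(0.10))
print(f"{at_risk} at risk, estimated cost if all leave: ${estimated_cost:,}")
```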
From model to tool Diana actually uses
A model living in a notebook is not a product. A tool Diana opens on Monday morning, uploads her quarterly data, and hands a ranked list to HR business partners — that is a product.

The production pipeline

Three scripts. Three jobs. The data scientist's work is entirely separate from the HR team's work. Diana never sees the training code. The data scientist never touches Diana's quarterly upload.

Annual · Generate Data · generate_data.py
Annual · Train Model · train_pipeline.py
Quarterly · Score Employees · score_pipeline.py
Quarterly · Diana Acts · app.py
Mode 1 — Annual

Training

Data scientist runs train_pipeline.py on confirmed historical data. Three files saved: model.pkl, scaler.pkl, features.pkl. Model is frozen.

Mode 2 — Quarterly

Scoring

Diana's team uploads fresh employee data. Frozen model scores everyone. Returns the ranked list. No retraining. No data scientist needed.
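The freeze-then-load handoff between the two modes can be sketched with Python's `pickle` module. The three filenames are the ones named above. The contents are plain stand-in dictionaries with made-up values, not fitted model objects, and the real pipeline's serialization details are not public.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-ins for the real artifacts (made-up values, not fitted objects).
artifacts = {
    "model.pkl": {"coef": {"engagement_2025": -3.68}, "intercept": 0.0},
    "scaler.pkl": {"mean": {"engagement_2025": 58.0}, "std": {"engagement_2025": 15.0}},
    "features.pkl": ["engagement_2025"],
}

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # Annual mode: train once, then freeze everything to disk.
    for filename, obj in artifacts.items():
        with open(folder / filename, "wb") as fh:
            pickle.dump(obj, fh)

    # Quarterly mode: load the frozen artifacts, no retraining.
    loaded = {}
    for filename in artifacts:
        with open(folder / filename, "rb") as fh:
            loaded[filename] = pickle.load(fh)

print(loaded["features.pkl"])
```

Saving the feature list next to the model and scaler presumably lets the scoring step check that a quarterly upload has the same columns, in the same order, as the training data.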

Data privacy and proprietary methodology

This demonstration runs on synthetic data only. The training pipeline, feature engineering logic, and source code are not publicly available. The live demo shows the methodology in action. Deployment for a real organization requires a separate engagement to retrain on actual HRIS data within the client's environment. Your data never leaves your infrastructure.

Try the demo: Download the sample dataset from the app. Open it in Excel. Change engagement scores, RTO risk, or satisfaction trends. Upload your modified version and watch the predictions change in real time. The model is responding to your inputs.
↗ Open Live Demo
can-north-flight-risk.streamlit.app
🏦 CAN North Financial
Employee Flight Risk Intelligence System
DEMONSTRATION VERSION — Synthetic data. Download the sample dataset, tweak values in Excel, upload and see predictions change in real time.
✓ Model loaded — ready to score employees
36
Critical
3
High
5
Medium
156
Stable
39 employees at risk — estimated cost if all leave: $3,042,000