Employee Flight Risk Intelligence System
CAN North Financial loses an estimated 252 employees a year. At $78,000 per
departure¹ that is $19.7 million leaving through the door — most of it invisible
until someone hands in their notice. The CEO’s question was simple: tell me who
is about to leave before they do. This is the system that answers it.
Demonstration project. All data is synthetically generated and
does not represent real employees or organizations. CAN North Financial is a
fictional company created for demonstration purposes. The model, methodology,
and analytical framework are genuine and production-ready.
Methodology and source code are proprietary.
¹ $78,000 replacement cost methodology: Calculated as 0.5× average
annual salary at CAN North Financial ($107,991), reflecting direct costs — recruiting
fees (15–20% of salary), onboarding, productivity loss during the 3–6 month ramp
period, and manager time. Consistent with SHRM’s published range of 50–200% of annual
salary for mid-level professional roles. Applied uniformly across all departures as a
conservative floor estimate.
Employees in dataset: 1,400 (3 years of longitudinal data per employee)
Model recall (test set): 98% (49 of 50 real quitters caught on held-out data)
At-risk cost identified: $19.7M (252 departures × $78,000 replacement cost)
Models raced: 5 (Logistic Regression, RF, SVM, Tree, KNN)
Python · scikit-learn · Logistic Regression · 5-Fold Cross Validation · Streamlit · Longitudinal Data · Feature Engineering · StandardScaler · People Analytics · RTO Analysis
What makes this project different
Longitudinal design
Direction of change, not just snapshot
An employee at 50 engagement today who was at 80 two years ago is more at risk than someone who has always been at 50. The trend is the signal.
Novel feature
RTO Risk Index
Commute shock — not commute distance — drives attrition. An employee forced from remote to in-person with a 60-minute commute added is fundamentally different from one who always commuted 60 minutes.
Disciplined selection
Model race, not model guess
Five models. Same data. Same metric. 5-fold cross validation. The data picks the winner. Logistic Regression won at 96.6% recall, which suggests that feature engineering mattered more than algorithm complexity.
The headline finding: Employees pushed toward more office time left at a rate of 23.6%.
Employees given more flexibility left at 7.5%. A 3× gap driven by one policy decision.
The data said so before any model ran.
The full pipeline
Annual · Generate Data · generate_data.py
Annual · Train Model · train_pipeline.py
Quarterly · Score Employees · score_pipeline.py
Quarterly · Diana Acts · app.py
Section 01 — The Business Problem
Before any model — what is Diana trying to solve?
Every technical decision traces back to one business conversation.
Getting the problem framing right is more important than any algorithm choice.
Organization
CAN North Financial
1,400 employees · Toronto, Calgary, Vancouver
Pension · Wealth Management · Retail Banking
The problem in numbers
Annual attrition rate: 18%
RTO finding — before any model
More office: 23.6%
More flexible: 7.5%
3× higher attrition from one policy decision.
Found in exploration — before any model ran.
CAN North Financial is losing people it cannot afford to lose
Diana is the Chief People Officer at CAN North Financial. Her company loses
roughly 18% of its workforce every year. Each departure costs an estimated
$78,000 in recruiting, onboarding, and lost productivity. Across 252 annual
departures that is $19.7 million leaving through the door — most of it
invisible until someone hands in their notice.
The CEO's ask was direct: "Tell me who is about to leave before they do. I want to call them."
The strategic framing: This is a classification problem — predict
quit (1) or stay (0). The metric that matters is not accuracy. It is recall.
Missing a quitter costs $78,000. A false alarm costs one manager conversation.
The model must be optimised to catch real quitters, not to look good on paper.
Why accuracy is the wrong metric
CAN North's dataset is 82% stayers and 18% leavers. A model that predicts
"stays" for every single person scores 82% accuracy — and catches zero real
quitters. This is the class imbalance trap. Every subsequent decision —
model choice, threshold setting, evaluation — flows from understanding this.
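The trap can be reproduced with plain arithmetic. The sketch below uses only the class counts quoted in this document (1,148 stayers, 252 leavers) and a deliberately useless "model" that predicts "stays" for everyone; the variable names are illustrative.

```python
# Class counts come from the text: 1,148 stayers (0) and 252 leavers (1).
n_stayed, n_left = 1148, 252

# A "model" that predicts "stays" for every employee.
y_true = [0] * n_stayed + [1] * n_left
y_pred = [0] * len(y_true)

correct = sum(t == p for t, p in zip(y_true, y_pred))
caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

accuracy = correct / len(y_true)  # share of all predictions that are right
recall = caught / n_left          # share of real leavers that were flagged

print(f"accuracy = {accuracy:.0%}, recall = {recall:.0%}")
# accuracy = 82%, recall = 0%
```

The accuracy looks respectable while the number Diana cares about is zero, which is why recall is the primary metric in every later section.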
The right metric
Recall — catching real quitters is Diana's priority
Of all employees who actually left — what percentage did the model identify
in advance? A recall of 98% means the model caught 49 of 50 real quitters
on data it had never seen. One slipped through. Ten conversations turned out
to be unnecessary. That is the trade-off Diana accepts.
Section 02 — The Data
Why longitudinal data — and why these specific features
The most important design decision was not which model to use. It was how to
structure the data. A snapshot tells you where someone stands today.
A longitudinal dataset tells you where they are going.
"An employee at 50 engagement today who was at 80 two years ago is more at risk
than someone who has always been at 50. The direction of travel is the signal."
Three years. One row per employee. 68 columns.
The dataset covers 1,400 CAN North employees across 2023, 2024, and 2025.
Each employee has one row with 68 columns covering engagement survey scores,
performance ratings, compensation benchmarks, commute data, manager changes,
and organizational disruption flags — all tracked year over year.
The target variable, left, indicates whether the employee had left by the end
of 2025. 252 left (18%). 1,148 stayed (82%).
Why commute shock — not commute distance:
An employee who always commuted 60 minutes has adapted. An employee whose
commute went from 0 to 60 minutes because of an RTO mandate has not.
The delta is the signal. This insight drove the design of the RTO Risk Index —
a novel composite feature not found in standard people analytics literature.
All scores on a 0–100 scale
A deliberate design choice: every survey score, every index, every composite
feature lives on the same 0–100 scale. When Diana presents to the board,
there is no unit conversion. An engagement score of 34 and an RTO risk score
of 34 sit at the same point on the same scale, even though one measures a
strength and the other a risk.
── Dataset preview — first 3 employees ──────────
employee_id  division           role     location
CNF0001      Wealth Management  Analyst  Calgary
CNF0002      Retail Banking     Analyst  Vancouver
CNF0003      Wealth Management  Analyst  Vancouver

eng_2023  eng_2024  eng_2025  trend
67.2      58.4      51.1      −16.1
81.3      75.6      75.1      −6.2
68.4      61.2      52.1      −16.3

rto_index  persona_direction  left
37.1       More office        0
25.3       No change          0
8.7        More flexible      0
──────────────────────────────────────────────────
Shape: 1,400 rows × 68 columns
Period: 2023 → 2025
Attrition: 18.0% (252 of 1,400)
Missing: 0 — clean dataset
Engagement survey — longitudinal (0–100)
engagement_2023/24/25
satisfaction_2023/24/25
career_growth_2023/24/25
mgr_effectiveness_2023/24/25
wellbeing_2023/24/25
RTO and commute — the differentiator
persona_direction
commute_time_change_min
rto_risk_index
transit_dependent
commute_km
Engineered trend features (calculated)
engagement_trend
satisfaction_trend
career_growth_trend
org_disruption_score
manager_stability
Section 03 — Data Exploration
What the data told us before any model ran
The most actionable insights came from exploration — not modelling. Before a single
line of model code, the data produced findings Diana could take to the CEO immediately.
Satisfaction gap: 30.5 points. Leavers: 28.4/100. Stayers: 58.8/100. Largest single-feature gap found.
RTO attrition multiplier: 3×. More office: 23.6%. More flexible: 7.5%. One policy. Three times the loss.
Engagement drop: −20.5 points. Leavers dropped 20.5 pts over 3 years. Stayers only 6.4. Trend beats snapshot.
Critical zone employees: 12. Eng <40 + career <40 + RTO >60 = 100% historical attrition. Called immediately.
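The critical-zone rule is a plain three-condition filter. The sketch below assumes that "Eng", "career" and "RTO" refer to the engagement_2025, career_growth_2025 and rto_risk_index columns; the four sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical mini-sample; column names follow the dataset description.
df = pd.DataFrame({
    "employee_id": ["A", "B", "C", "D"],
    "engagement_2025": [35.0, 38.0, 72.0, 39.0],
    "career_growth_2025": [30.0, 45.0, 60.0, 22.0],
    "rto_risk_index": [71.0, 80.0, 15.0, 64.0],
})

# Critical zone: low engagement AND low career growth AND high RTO risk.
critical = df[
    (df["engagement_2025"] < 40)
    & (df["career_growth_2025"] < 40)
    & (df["rto_risk_index"] > 60)
]
critical_ids = critical["employee_id"].tolist()
print(critical_ids)  # ['A', 'D']
```

Employee B misses the zone only on career growth, which shows how strict the conjunction is: all three conditions have to hold at once.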
── Top signals — leavers vs stayers ─────────────
Signal                     Stayed   Left    Gap
Satisfaction 2025          58.8     28.4    +30.5
Engagement 2025            63.2     36.4    +26.8
Satisfaction trend (3yr)   -9.9     -29.8   +19.9
Engagement trend (3yr)     -6.3     -22.0   +15.7
Career growth 2025         55.2     42.8    +12.4
RTO risk index             27.9     44.4    -16.6
Org disruption score       30.7     43.2    -12.5
──────────────────────────────────────────────────
Salary vs market           49.5     47.0    +2.5
→ Salary barely registers. Not a pay problem.
Correlation with attrition (top 5):
engagement_2025 −0.558 Higher = STAY
satisfaction_2025 −0.548 Higher = STAY
engagement_trend −0.502 Higher = STAY
satisfaction_trend −0.400 Higher = STAY
rto_risk_index +0.352 Higher = LEAVE
The finding that changed the conversation
Salary vs market correlated with attrition at −0.07. Nearly nothing. This is
not a compensation problem. The data said so clearly before any model was built.
When Diana presented to the CEO, the first slide was not a model output. It was
the RTO chart: More office at 23.6%, More flexible at 7.5%. No statistical
literacy required. Just policy action.
The exploration finding that validated the entire design:
Engagement trend correlated at −0.502. Current engagement score correlated
at −0.558. A gap of only 0.056. The direction of change over three years is
almost as powerful as where someone stands today. This confirmed that building
longitudinal trend features was the right architectural choice.
What this means for Diana
Do not wait for the score to drop to 20. Act when it starts dropping.
An employee trending from 70 → 55 → 40 is more urgent than one who has
always been at 40. The trajectory is the early warning. The model knows this.
Section 04 — Feature Engineering
The work that happened before the model
Feature engineering is where data science actually lives. The 98% recall result
owed more to what we fed the algorithm than to which algorithm we chose.
1. From 68 columns to 31 meaningful features
The raw dataset had 68 columns. Feeding all 68 creates noise, because correlated
features confuse the algorithm and dilute the signal. We selected 31 features
based on two criteria: an absolute correlation with attrition above 0.15, and
domain logic about what actually drives someone to leave.
Key decision: We kept current-year snapshots and engineered
trends — but dropped the intermediate years. If we have engagement_2023 and
engagement_2025, we do not need engagement_2024. The trend captures the history.
Keeping all three gives correlated information three times over.
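The correlation criterion can be sketched in a few lines of pandas. The data here is randomly generated as a stand-in, the 0.15 cut-off is the one stated above, and the second criterion (domain logic) is a human judgement that no code captures.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2000
left = rng.integers(0, 2, size=n)

# Hypothetical stand-in data, not the CAN North dataset.
df = pd.DataFrame({
    "left": left,
    # Strongly related to the target: leavers score about 25 points lower.
    "engagement_2025": 60 - 25 * left + rng.normal(0, 10, n),
    # Unrelated to the target by construction.
    "salary_vs_market": rng.normal(50, 10, n),
})

# Correlate every candidate feature with the target, then keep the
# ones whose absolute correlation clears the threshold.
corr = df.drop(columns="left").corrwith(df["left"])
selected = corr[corr.abs() > 0.15].index.tolist()
print(corr.round(3).to_dict())
print(selected)
```

Taking the absolute value matters because the strongest signals in this project are negative correlations: higher engagement means lower attrition.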
2. Engineering trend features: the longitudinal power
The most important step: calculating year-over-year trends. Not collected — calculated.
# Direction of change adds signal beyond a single snapshot.
# Assumes df is a pandas DataFrame with one row per employee.
df["engagement_trend"] = df["engagement_2025"] - df["engagement_2023"]
df["satisfaction_trend"] = df["satisfaction_2025"] - df["satisfaction_2023"]
df["career_growth_trend"] = df["career_growth_2025"] - df["career_growth_2023"]
df["absence_trend"] = df["absence_days_2025"] - df["absence_days_2023"]
# An engagement_trend of -32 means this person dropped 32 points
# between 2023 and 2025, which the model reads as freefall.
3. Building the RTO Risk Index: the novel composite
No single column captured the full return-to-office impact. We combined three signals.
RTO Risk Index =
commute time added (0-100) × 0.40
persona change direction × 0.30
satisfaction drop since RTO × 0.30
Persona direction scores:
More office → 80 (highest risk)
No change → 20 (baseline)
More flexible → 10 (protective factor)
Final correlation with attrition: +0.352
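As a rough illustration, the weighted sum above can be written as a small function. The weights and persona scores are taken from the formula; how commute minutes and the satisfaction drop are rescaled to 0-100 is not stated in this document, so the function assumes both arrive already scaled, and the function and argument names are hypothetical.

```python
# Persona direction scores as listed above.
PERSONA_SCORES = {"More office": 80, "No change": 20, "More flexible": 10}


def rto_risk_index(commute_added_score, persona_direction, satisfaction_drop_score):
    """Weighted composite of three signals.

    Assumption: commute_added_score and satisfaction_drop_score are
    already expressed on a 0-100 scale.
    """
    return (
        0.40 * commute_added_score
        + 0.30 * PERSONA_SCORES[persona_direction]
        + 0.30 * satisfaction_drop_score
    )


# Example: moderate added commute, pushed toward more office time,
# moderate satisfaction drop since the RTO change.
example = rto_risk_index(50, "More office", 30)
print(round(example, 1))  # 53.0
```

Because persona direction carries a fixed score, two employees with identical commutes can land far apart on the index depending on which way their work arrangement moved.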
4. Encoding and scaling: making features compete fairly
Categorical columns (division, role level, work persona direction) were
encoded as numbers. Then StandardScaler converted every feature to mean=0,
std=1. Without this, engagement (0–100) would overpower tenure (0–25) purely
because its numbers are larger.
Critical rule: The scaler is fitted on training data only —
then applied to test data using the same parameters. Fitting on test data
leaks future information into the scaling process and invalidates the evaluation.
This rule is non-negotiable.
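A minimal sketch of that rule with scikit-learn follows. The two features are random stand-ins for tenure and engagement, and the stratified 80/20 split is an assumption chosen to match the 1,120 / 280 row counts reported later.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Hypothetical stand-in features: tenure (0-25 years) and engagement (0-100).
X = np.column_stack([rng.uniform(0, 25, 1400), rng.uniform(0, 100, 1400)])
y = (rng.random(1400) < 0.18).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from training rows only
X_test_scaled = scaler.transform(X_test)        # reuse those parameters, never refit here

print(X_train_scaled.mean(axis=0).round(6), X_train_scaled.std(axis=0).round(6))
```

The training columns come out at mean 0 and std 1 exactly. The test columns come out close to that but not exact, which is the expected sign that no test information reached the scaler.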
Section 05 — Model Selection
Five models. Same data. The data picks the winner.
Model selection is a controlled experiment. Same training data in. Same evaluation
metric out. The only variable is the model. 5-fold cross validation removes luck
from the evaluation.
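The shape of such a race can be sketched with scikit-learn's cross_val_score. Everything below is an assumption for illustration: the data is generated by make_classification, and the hyperparameters are library defaults rather than the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 1,120 rows x 31 features, roughly 18% positives.
X, y = make_classification(
    n_samples=1120, n_features=31, n_informative=8,
    weights=[0.82, 0.18], random_state=0,
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
}

# Same folds for every model, so the model is the only variable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # The scaler sits inside the pipeline, so each fold fits it on
    # that fold's training portion only.
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="recall")
    results[name] = (scores.mean(), scores.std())

for name, (mean, std) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:<20} recall={mean:.3f} std={std:.3f}")
```

On synthetic data the ranking will not match the results reported here; the point is the protocol, where every model sees identical folds and is judged on the same metric.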
The model race
5-Fold Cross Validation · Primary metric: Recall · 1,120 training employees
Logistic Regression
96.6%
Winner
| Model | CV Recall | AUC | Std Dev |
| Logistic Regression ✓ | 96.6% | 0.997 | 0.033 |
| SVM | 91.1% | 0.991 | 0.039 |
| Random Forest | 82.2% | 0.976 | 0.046 |
| Decision Tree | 79.2% | 0.853 | 0.064 |
| KNN | 39.5% | 0.938 | 0.075 |
── Fold-by-fold recall (consistency check) ──────
Model                 F1      F2      F3      F4      F5
Logistic Regression   97.5%   100.0%  100.0%  92.7%   92.7%
SVM                   95.0%   95.0%   92.5%   87.8%   85.4%
Random Forest         85.0%   87.5%   85.0%   75.6%   78.0%
Decision Tree         85.0%   85.0%   70.0%   73.2%   82.9%
KNN                   35.0%   27.5%   45.0%   48.8%   41.5%
Low std = consistent = trustworthy for production use
Why the simplest model won
The most sophisticated model did not win. The simplest one did. Random Forest,
which builds 100 decision trees and takes a vote, came third at 82.2% recall.
Logistic Regression, which fits a single linear decision boundary, hit 96.6%.
This is the most important finding of the model race. It suggests that feature
engineering created clean, close to linearly separable signals. When features
are well built, a simple model can beat a complex one. Complexity was unnecessary here.
Why Random Forest underperformed
Random Forest's strength is finding complex non-linear patterns when features
are messy. But engagement trend, satisfaction trend, and the RTO risk index
are clean, powerful, linear signals. The forest added noise, not value.
Feature engineering was the higher-leverage decision.
Why KNN failed completely
KNN most likely failed because of the curse of dimensionality. In 31-dimensional
feature space, employees tend to look roughly equidistant from one another, so
distance calculations lose much of their meaning. KNN needs either very few
features or a very large dataset to work well. We had neither.
Consistency is reliability
Logistic Regression's fold scores: 97.5%, 100%, 100%, 92.7%, 92.7%.
Standard deviation of 0.033, the lowest of all five models. Diana needs a model
that performs reliably every quarter, not one that is sometimes great and
sometimes mediocre. Consistency is reliability. Reliability is trust.
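The mean and standard deviation can be checked directly from the five Logistic Regression fold scores. The sketch uses the population standard deviation (dividing by n), since that is the convention that reproduces the reported 0.033.

```python
import statistics

# Logistic Regression fold recalls as reported in this section.
lr_folds = [0.975, 1.000, 1.000, 0.927, 0.927]

mean_recall = statistics.fmean(lr_folds)
std_recall = statistics.pstdev(lr_folds)  # population std: divide by n, not n - 1

print(f"mean={mean_recall:.3f} std={std_recall:.3f}")
# mean=0.966 std=0.033
```

With only five folds the sample standard deviation (dividing by n - 1) would come out slightly higher, so it helps to state which convention a comparison table uses.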
Section 06 — Model Results
What the model learned — and what it means for Diana
The held-out test set is the only honest number. 280 employees the model has never
seen. Training scores do not matter. Cross-validation averages do not matter.
This is the real-world result.
── Final test results ───────────────────────────
Winner: Logistic Regression
Test set: 280 held-out employees
Recall: 98.0%
Precision: 83.1%
F1 Score: 0.899
AUC: 0.999
Confusion Matrix:
                   Pred Stay   Pred Quit
Actually Stayed    220         10
Actually Left      1           49
In plain English:
→ 49 quitters CAUGHT — Diana intervenes
→ 1 quitter MISSED — walked out
→ 10 false alarms — extra conversations
Business impact ($78k per departure):
→ Potential saves: $3,822,000
→ Missed cost: $78,000
→ False alarms: 10 conversations
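Every headline number in this panel follows from the four cells of the confusion matrix and the $78,000 replacement cost. The short check below recomputes them; note that "potential saves" treats every caught quitter as retained, so it is an upper bound.

```python
# Confusion matrix cells from the test results above.
tn, fp, fn, tp = 220, 10, 1, 49
cost_per_departure = 78_000

recall = tp / (tp + fn)        # caught leavers / all real leavers
precision = tp / (tp + fp)     # caught leavers / everyone flagged
f1 = 2 * precision * recall / (precision + recall)

potential_saves = tp * cost_per_departure  # upper bound: assumes every flag is retained
missed_cost = fn * cost_per_departure

print(f"recall={recall:.1%} precision={precision:.1%} f1={f1:.3f}")
print(f"potential saves=${potential_saves:,} missed=${missed_cost:,}")
# recall=98.0% precision=83.1% f1=0.899
# potential saves=$3,822,000 missed=$78,000
```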
Intervention ROI
Cost if Q2 at-risk leave: $3.0M
Return on intervention: 38×
What the model learned — coefficients
Logistic Regression produces one coefficient per feature. Negative = higher
value pushes toward staying. Positive = higher value pushes toward leaving.
These are what Diana shows the CEO to explain the model.
Coefficient chart (green bars push toward staying, red bars push toward leaving):
org_disruption_score +1.58
Individual explanation — CNF0011:
Toronto · Pension Administration · Analyst. Score: 100%.
Engagement dropped from 59 to 15 over 3 years (−44 points).
RTO risk: 65/100. The engagement freefall alone contributed +8.6
to the model score. This person was not going to stay.
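One common way to produce a per-feature explanation for Logistic Regression is to multiply each coefficient by the employee's standardized feature value; the products add up, with the intercept, to the log-odds. The sketch below assumes that is what "contributed" means here. Apart from org_disruption_score (+1.58), every number is hypothetical.

```python
import math

# Hypothetical coefficients, except org_disruption_score, which is quoted above.
coefficients = {
    "engagement_trend": -2.0,
    "rto_risk_index": 1.0,
    "org_disruption_score": 1.58,
}
# Hypothetical standardized values (z-scores) for one employee.
z_values = {
    "engagement_trend": -3.0,   # far below the average trend: a steep drop
    "rto_risk_index": 1.5,
    "org_disruption_score": 0.5,
}
intercept = -2.0  # hypothetical

# Contribution of each feature to the log-odds of leaving.
contributions = {name: coefficients[name] * z_values[name] for name in coefficients}
log_odds = intercept + sum(contributions.values())
probability = 1 / (1 + math.exp(-log_odds))

for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:<22} {value:+.2f}")
print(f"probability of leaving = {probability:.1%}")
```

A negative coefficient applied to a strongly negative trend gives a large positive contribution, which is how a steep engagement drop can dominate an individual score.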
Scoring 200 current employees — Q2 2026
Critical: 36 (immediate action)
Stable: 156 (no action needed)
#  ID       Division  Role      Loc      Score  Action
1  CNF1401  Pension   Manager   Calgary  100%   Urgent 1:1 HR BP
2  CNF1408  Wealth    Sr Anlst  Calgary  100%   Flexibility + career
3  CNF1413  Wealth    Director  Toronto  100%   Flexibility + career
4  CNF1460  Wealth    Manager   Toronto  100%   Comp review
5  CNF1446  Wealth    VP        Vanc.    100%   Flexibility + career
39 at risk · Est. cost if all leave: $3,042,000
Section 07 — Deployment
From model to tool Diana actually uses
A model living in a notebook is not a product. A tool Diana opens on Monday morning,
uploads her quarterly data, and hands a ranked list to HR business partners — that is a product.
The production pipeline
Three scripts. Three jobs. The data scientist's work is entirely separate from
the HR team's work. Diana never sees the training code. The data scientist never
touches Diana's quarterly upload.
Annual · Generate Data · generate_data.py
Annual · Train Model · train_pipeline.py
Quarterly · Score Employees · score_pipeline.py
Quarterly · Diana Acts · app.py
Mode 1 — Annual
Training
Data scientist runs train_pipeline.py on confirmed historical data.
Three files saved: model.pkl, scaler.pkl, features.pkl. Model is frozen.
Mode 2 — Quarterly
Scoring
Diana's team uploads fresh employee data. Frozen model scores everyone.
Returns the ranked list. No retraining. No data scientist needed.
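The two modes can be sketched as a freeze-then-score round trip. This is not the project's pipeline code, which is proprietary: the data, the toy label and the helper fake_employees are hypothetical, pickle stands in for whatever serializer is actually used, and only the three file names come from the description above.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

features = ["engagement_2025", "engagement_trend", "rto_risk_index"]
rng = np.random.default_rng(0)


def fake_employees(n):
    """Hypothetical stand-in data; real inputs would come from the HRIS."""
    return pd.DataFrame({
        "employee_id": [f"E{i:04d}" for i in range(n)],
        "engagement_2025": rng.uniform(10, 90, n),
        "engagement_trend": rng.uniform(-40, 10, n),
        "rto_risk_index": rng.uniform(0, 90, n),
    })


artifact_dir = Path(tempfile.mkdtemp())

# Mode 1 (annual): fit scaler and model, then freeze them to disk.
train = fake_employees(400)
y = (train["engagement_2025"] < 40).astype(int)  # toy label for illustration only
scaler = StandardScaler().fit(train[features])
model = LogisticRegression(max_iter=1000).fit(scaler.transform(train[features]), y)
for name, obj in [("model", model), ("scaler", scaler), ("features", features)]:
    with open(artifact_dir / f"{name}.pkl", "wb") as fh:
        pickle.dump(obj, fh)

# Mode 2 (quarterly): load the frozen artefacts and score new rows without refitting.
loaded = {}
for name in ("model", "scaler", "features"):
    with open(artifact_dir / f"{name}.pkl", "rb") as fh:
        loaded[name] = pickle.load(fh)

new = fake_employees(200)
X_new = loaded["scaler"].transform(new[loaded["features"]])
new["risk_score"] = loaded["model"].predict_proba(X_new)[:, 1]
ranked = new.sort_values("risk_score", ascending=False).reset_index(drop=True)
print(ranked[["employee_id", "risk_score"]].head())
```

Saving the feature list alongside the model and scaler guards against a quarterly upload whose columns arrive in a different order. Pickle files should only ever be loaded from a trusted source.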
Data privacy and proprietary methodology
This demonstration runs on synthetic data only. The training pipeline, feature
engineering logic, and source code are not publicly available. The live demo
shows the methodology in action. Deployment for a real organization requires a
separate engagement to retrain on actual HRIS data within the client's
environment. Your data never leaves your infrastructure.
Try the demo: Download the sample dataset from the app.
Open it in Excel. Change engagement scores, RTO risk, or satisfaction trends.
Upload your modified version and watch the predictions change in real time.
The model is responding to your inputs.
can-north-flight-risk.streamlit.app