The Scenario
A growing tech company called NovaTech receives 100 job applications but can only interview some of them. To save time, they build a screening algorithm to score each applicant and decide who gets an interview.
The applicants come from two parts of the city:
Greenfield
- NovaTech's HQ is located here
- Several other tech companies nearby
- Many current NovaTech employees live here
Lakeview
- On the other side of the city
- Fewer tech companies in the area
- Few current NovaTech employees live here
The Algorithm
NovaTech's algorithm scores each applicant from 0 to 100 using four factors. It does not know which neighborhood anyone lives in — it only sees these four things:
Years of relevant experience
Score on a standardized skills test
Referred by a current employee
Prior internship at a tech company
Meet the Applicants
Here are all 100 applicants and their qualifications — the raw information the algorithm will use to compute scores. Browse the data and see what you notice.
| Name | Neighborhood | Experience (yrs) | Skills Assessment | Referral | Tech Internship |
|---|---|---|---|---|---|
The Scoring Formula
The algorithm converts each applicant's qualifications into a single score from 0 to 100 by adding up the points from the four factors, each worth up to 25 points.
Applicants who score at or above the cutoff get an interview. Everyone below is screened out. Use the slider to decide where to set the cutoff.
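To make the scoring and cutoff rule concrete, here is a minimal Python sketch. The 25-point cap per factor comes from the factor list later on this page; everything else is an assumption for illustration: the `Applicant` fields, the linear scaling, the 0 to 100 test scale, and the hypothetical `MAX_YEARS` cap are not NovaTech's actual formula.

```python
from dataclasses import dataclass

# Assumption: the page does not say how years of experience or test results
# turn into points. This sketch scales both linearly and caps experience
# at a hypothetical MAX_YEARS.
MAX_YEARS = 10


@dataclass
class Applicant:
    name: str
    neighborhood: str        # kept for later analysis; never used in scoring
    experience_years: float
    skills_score: float      # assumed to be on a 0-100 scale
    referred: bool
    internship: bool


def score(a: Applicant) -> float:
    """Add up the four factors, each worth at most 25 points."""
    experience_pts = 25 * min(a.experience_years, MAX_YEARS) / MAX_YEARS
    skills_pts = 25 * a.skills_score / 100
    referral_pts = 25 if a.referred else 0
    internship_pts = 25 if a.internship else 0
    return experience_pts + skills_pts + referral_pts + internship_pts


def gets_interview(a: Applicant, cutoff: float = 50) -> bool:
    """Applicants at or above the cutoff get an interview."""
    return score(a) >= cutoff


# Hypothetical applicant: solid experience and test score, no referral or internship.
example = Applicant("Example", "Lakeview", 5, 80, False, False)
print(score(example), gets_interview(example))  # 32.5 False
```

Notice that `neighborhood` is stored on each applicant but never read by `score`, which mirrors the claim that the algorithm does not see where anyone lives.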
Set the Interview Cutoff
Score Distribution (All Applicants)
Each bar shows how many applicants fall in that score range. The red line is your cutoff: everyone at or to the right of it gets an interview.
Who Did the Algorithm Select?
With the cutoff set at 50, the algorithm selects every applicant who scores 50 or higher for an interview. Let's see how those selections break down by neighborhood.
The algorithm uses the same formula for every applicant and never sees neighborhood. Try turning each factor on and off to see which ones affect the gap between neighborhoods.
💼 Work Experience (up to 25 pts)
Years of relevant work experience
📝 Skills Assessment (up to 25 pts)
Score on a standardized skills test
🤝 Employee Referral (25 pts)
Was referred by a current NovaTech employee
💻 Tech Internship (25 pts)
Had a prior internship at a tech company
Score Distribution by Neighborhood
Toggle features above to see how the distributions shift. Adjust the cutoff to explore different thresholds.
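The toggle experiment can be sketched in a few lines of Python: score applicants using only the enabled factors, then compare the share selected in each neighborhood. The four applicants and their points below are invented for illustration, and the function name `selection_rates` is hypothetical. The page also does not say whether the cutoff is rescaled when factors are turned off, so the cutoff is simply passed in.

```python
from collections import defaultdict

# Hypothetical applicants with points per factor already computed.
# These values are made up; they are not the page's 100 applicants.
applicants = [
    {"hood": "Greenfield", "experience": 15, "skills": 18, "referral": 25, "internship": 25},
    {"hood": "Greenfield", "experience": 10, "skills": 20, "referral": 25, "internship": 0},
    {"hood": "Lakeview",   "experience": 15, "skills": 20, "referral": 0,  "internship": 0},
    {"hood": "Lakeview",   "experience": 14, "skills": 16, "referral": 0,  "internship": 25},
]


def selection_rates(applicants, enabled, cutoff):
    """Share of each neighborhood selected when only `enabled` factors count."""
    selected = defaultdict(int)
    total = defaultdict(int)
    for a in applicants:
        total[a["hood"]] += 1
        if sum(a[factor] for factor in enabled) >= cutoff:
            selected[a["hood"]] += 1
    return {hood: selected[hood] / total[hood] for hood in total}


all_four = ["experience", "skills", "referral", "internship"]
print(selection_rates(applicants, all_four, cutoff=50))
# {'Greenfield': 1.0, 'Lakeview': 0.5}

print(selection_rates(applicants, ["experience", "skills"], cutoff=30))
# {'Greenfield': 1.0, 'Lakeview': 1.0}
```

In this toy data the gap between neighborhoods appears only when referral and internship points are counted, which is the kind of pattern the toggles are meant to help you spot.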
What the Algorithm Can't See
Each applicant also has an actual job performance score — a measure of how well they would really do at the job if hired. The algorithm doesn't know this number. But we do.
Ready to see how well the algorithm's choices match actual ability?
Average Actual Job Performance
The Algorithm's Mistakes
Using a cutoff score of 50:
High performers from Lakeview who were REJECTED
Low performers from Greenfield who were ACCEPTED
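One way to find these two kinds of mistakes is to compare each applicant's algorithm score with their actual job performance. The Python sketch below does that on invented data. The `HIGH` and `LOW` performance thresholds are assumptions, since the page does not define what counts as a high or low performer, and the function names are hypothetical.

```python
# Hypothetical applicants: "score" is the algorithm's output and "performance"
# is the hidden actual job performance. All values are invented.
applicants = [
    {"name": "Ana", "hood": "Lakeview",   "score": 40, "performance": 85},
    {"name": "Ben", "hood": "Greenfield", "score": 75, "performance": 30},
    {"name": "Cy",  "hood": "Greenfield", "score": 80, "performance": 90},
    {"name": "Di",  "hood": "Lakeview",   "score": 30, "performance": 35},
]

CUTOFF = 50
HIGH = 70  # assumed threshold for a "high performer"
LOW = 40   # assumed threshold for a "low performer"


def rejected_high_performers(applicants, hood):
    """Applicants below the cutoff who would have performed well."""
    return [a["name"] for a in applicants
            if a["hood"] == hood and a["score"] < CUTOFF and a["performance"] >= HIGH]


def accepted_low_performers(applicants, hood):
    """Applicants at or above the cutoff who would have performed poorly."""
    return [a["name"] for a in applicants
            if a["hood"] == hood and a["score"] >= CUTOFF and a["performance"] <= LOW]


print(rejected_high_performers(applicants, "Lakeview"))   # ['Ana']
print(accepted_low_performers(applicants, "Greenfield"))  # ['Ben']
```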
Naming the Pattern: Proxy Variables
A proxy variable is a factor that, without anyone intending it to, effectively stands in for a different piece of information. In this simulation, two of the four factors are proxies for neighborhood — they are strongly correlated with where someone lives because of how the city is structured, even though the algorithm never asks about neighborhood directly.
Work Experience and Skills Assessment are distributed similarly across both neighborhoods. They measure relevant qualifications without favoring one group.
Employee Referral — current employees mostly live in Greenfield and refer people they know.
Tech Internship — tech companies cluster in Greenfield, making internships far more accessible to Greenfield residents.
These two factors don't measure job ability. They measure access to professional networks and proximity to the tech industry, which differ between neighborhoods for structural reasons. But the algorithm treats them as qualifications.
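A simple check for a possible proxy is to compare how often a factor shows up in each neighborhood. The Python sketch below computes that gap on invented data. The `strong_skills` flag is a hypothetical stand-in for a neutral factor, and a large gap is only a hint that something deserves a closer look, not proof that a factor is a proxy.

```python
# Hypothetical applicants; every value here is made up for illustration.
rows = [
    {"hood": "Greenfield", "referral": True,  "strong_skills": True},
    {"hood": "Greenfield", "referral": True,  "strong_skills": False},
    {"hood": "Greenfield", "referral": True,  "strong_skills": True},
    {"hood": "Greenfield", "referral": False, "strong_skills": False},
    {"hood": "Lakeview",   "referral": False, "strong_skills": True},
    {"hood": "Lakeview",   "referral": False, "strong_skills": False},
    {"hood": "Lakeview",   "referral": False, "strong_skills": False},
    {"hood": "Lakeview",   "referral": True,  "strong_skills": True},
]


def factor_rate(rows, factor, hood):
    """Share of applicants in one neighborhood who have the factor."""
    group = [r for r in rows if r["hood"] == hood]
    return sum(r[factor] for r in group) / len(group)


def rate_gap(rows, factor):
    """Greenfield rate minus Lakeview rate; near zero means no obvious skew."""
    return factor_rate(rows, factor, "Greenfield") - factor_rate(rows, factor, "Lakeview")


print(rate_gap(rows, "referral"))       # 0.5
print(rate_gap(rows, "strong_skills"))  # 0.0
```

In this toy data, referrals are three times as common in Greenfield as in Lakeview, while the skills flag is evenly spread, which matches the contrast described above.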