Fair Hiring?

Exploring how algorithms can have unequal impact

The Scenario

A growing tech company called NovaTech receives 100 job applications but can only interview some of them. To save time, they build a screening algorithm to score each applicant and decide who gets an interview.

NovaTech Inc.
Headquarters in the Greenfield district

The applicants come from two parts of the city:

Greenfield

  • NovaTech's HQ is located here
  • Several other tech companies nearby
  • Many current NovaTech employees live here

Lakeview

  • On the other side of the city
  • Fewer tech companies in the area
  • Few current NovaTech employees live here

The Algorithm

NovaTech's algorithm scores each applicant from 0 to 100 using four factors. It does not know which neighborhood anyone lives in — it only sees these four things:

  • 💼 Work Experience: up to 25 points (years of relevant experience)
  • 📝 Skills Assessment: up to 25 points (score on a standardized test)
  • 🤝 Employee Referral: 25 points if yes (referred by a current employee)
  • 💻 Tech Internship: 25 points if yes (prior internship at a tech company)

Key point: The algorithm uses the same formula for every applicant. It never asks about neighborhood, and it treats a Greenfield applicant and a Lakeview applicant with identical qualifications exactly the same way.

Meet the Applicants

Here are all 100 applicants and their qualifications — the raw information the algorithm will use to compute scores. Browse the data and see what you notice.

[Interactive table of all 100 applicants. Columns: Name, Neighborhood, Experience (yrs), Skills Assessment, Referral, Tech Internship]

The Scoring Formula

The algorithm converts each applicant's qualifications into a single score from 0 to 100:

Score = (Experience × 2.5) + (Assessment ÷ 4) + (Referral × 25) + (Internship × 25)

Applicants who score at or above the cutoff get an interview. Everyone below is screened out. Use the slider to decide where to set the cutoff.
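
As a minimal sketch, the formula and cutoff rule above could be written as follows. The function and parameter names are hypothetical. The sketch also assumes that experience is capped at 10 years and that the assessment is scored out of 100; the 25-point maximums imply this, but the page does not state it outright.

```python
def score(experience_yrs, assessment, referral, internship):
    """Score = (Experience x 2.5) + (Assessment / 4) + (Referral x 25) + (Internship x 25)."""
    # Assumption: cap experience at 10 years so this term never exceeds 25 points.
    experience_points = min(experience_yrs, 10) * 2.5
    # Assumption: the assessment is on a 0-100 scale, so this term tops out at 25.
    assessment_points = assessment / 4
    # Referral and internship are yes/no factors worth 25 points each.
    return experience_points + assessment_points + 25 * bool(referral) + 25 * bool(internship)


def gets_interview(applicant_score, cutoff=50):
    # "At or above the cutoff" means the comparison is inclusive.
    return applicant_score >= cutoff
```

For example, two applicants with 4 years of experience and an assessment score of 80 get 55 points with a referral and 30 points without one, so at a cutoff of 50 only the referred applicant is interviewed.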

Set the Interview Cutoff

[Cutoff slider, set to 50. The page reports how many of the 100 applicants will get interviews at the chosen cutoff.]

Score Distribution (All Applicants)

Each bar shows how many applicants fall in that score range. The red line is your cutoff — everyone to the right gets an interview.

[Histogram: number of applicants per score range; the horizontal axis runs from 0 to 100 in steps of 10]

Once you've chosen a cutoff, move on to the Investigate tab to see who the algorithm selected.

Who Did the Algorithm Select?

You set the cutoff at 50. The algorithm selected applicants for interviews. Let's see how those selections break down by neighborhood.

Greenfield Interview Rate
Lakeview Interview Rate

The algorithm uses the same formula for every applicant and never sees neighborhood. Try turning each factor on and off to see which ones affect the gap between neighborhoods.

💼 Work Experience (up to 25 pts)

Years of relevant work experience

📝 Skills Assessment (up to 25 pts)

Score on a standardized skills test

🤝 Employee Referral (25 pts)

Was referred by a current NovaTech employee

💻 Tech Internship (25 pts)

Had a prior internship at a tech company
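
The toggles above can be imitated with a short sketch that scores applicants using only the enabled factors and then compares interview rates by neighborhood. Everything here is an assumption for illustration: the six rows are made-up examples, not the page's 100 applicants, and the names `partial_score` and `interview_rates` are hypothetical. The rows are built in pairs with identical experience and assessment so that only referral and internship differ.

```python
from collections import defaultdict

# Hypothetical rows, NOT the page's data:
# (neighborhood, experience_yrs, assessment, referral, internship)
APPLICANTS = [
    ("Greenfield", 4, 80, True, True),
    ("Greenfield", 6, 60, True, False),
    ("Greenfield", 2, 72, False, True),
    ("Lakeview", 4, 80, False, False),
    ("Lakeview", 6, 60, False, False),
    ("Lakeview", 2, 72, False, False),
]


def partial_score(row, enabled):
    """Score one applicant using only the factors named in `enabled`."""
    _, experience, assessment, referral, internship = row
    points = {
        "experience": min(experience, 10) * 2.5,  # assumed 10-year cap
        "assessment": assessment / 4,             # assumed 0-100 scale
        "referral": 25 * referral,
        "internship": 25 * internship,
    }
    return sum(value for name, value in points.items() if name in enabled)


def interview_rates(applicants, enabled, cutoff):
    """Return the share of each neighborhood scoring at or above the cutoff."""
    selected, total = defaultdict(int), defaultdict(int)
    for row in applicants:
        neighborhood = row[0]
        total[neighborhood] += 1
        selected[neighborhood] += partial_score(row, enabled) >= cutoff
    return {hood: selected[hood] / total[hood] for hood in total}
```

With all four factors enabled and a cutoff of 50, this made-up sample gives Greenfield a rate of 2/3 and Lakeview a rate of 0. With only experience and assessment enabled (and the cutoff lowered to 30, since those two factors total at most 50 points), both neighborhoods come out at 2/3.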

Interview Rate with Current Feature Selection
Greenfield
Lakeview

Score Distribution by Neighborhood

Toggle features above to see how the distributions shift. Adjust the cutoff to explore different thresholds.

[Histogram: score distributions for Greenfield and Lakeview, with the cutoff at 50; the horizontal axis runs from 0 to 100 in steps of 10]

What the Algorithm Can't See

Each applicant also has an actual job performance score — a measure of how well they would really do at the job if hired. The algorithm doesn't know this number. But we do.

Ready to see how well the algorithm's choices match actual ability?

Average Actual Job Performance

Greenfield
average performance
Lakeview
average performance
Performance is essentially the same. People from both neighborhoods are equally capable of doing the job well. The differences the algorithm found were about circumstances — where the company is located, who knows whom, which neighborhood has tech companies — not about ability.

The Algorithm's Mistakes

Using cutoff score of 50:

High performers from Lakeview who were REJECTED

Low performers from Greenfield who were ACCEPTED
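
One way to count these two kinds of mistakes is sketched below. The page does not say how "high" and "low" performers are defined, so the 0-100 performance scale and the `high` and `low` thresholds are assumptions, as are the function name and the dictionary keys.

```python
def count_mistakes(applicants, cutoff=50, high=70, low=40):
    """Count the two error types named above.

    `applicants` is an iterable of dicts with the hypothetical keys
    "neighborhood", "score", and "performance" (assumed 0-100 scale).
    """
    # High performers from Lakeview whose score fell below the cutoff.
    rejected_high = sum(
        1
        for a in applicants
        if a["neighborhood"] == "Lakeview"
        and a["score"] < cutoff
        and a["performance"] >= high
    )
    # Low performers from Greenfield whose score met the cutoff.
    accepted_low = sum(
        1
        for a in applicants
        if a["neighborhood"] == "Greenfield"
        and a["score"] >= cutoff
        and a["performance"] < low
    )
    return rejected_high, accepted_low
```

The function returns a pair: the number of high performers from Lakeview who were rejected, then the number of low performers from Greenfield who were accepted.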

Naming the Pattern: Proxy Variables

A proxy variable is a factor that, without anyone intending it to, effectively stands in for a different piece of information. In this simulation, two of the four factors are proxies for neighborhood — they are strongly correlated with where someone lives because of how the city is structured, even though the algorithm never asks about neighborhood directly.
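
A rough way to check whether a yes/no factor behaves like a proxy is to compare how often it occurs in each neighborhood. The sketch below is only an illustration under assumed names: `feature_rate_by_neighborhood` and the dictionary keys are hypothetical, and the page does not say here which two factors are the proxies.

```python
def feature_rate_by_neighborhood(applicants, feature):
    """Return the share of applicants in each neighborhood who have `feature`.

    `applicants` is an iterable of dicts with a "neighborhood" key and a
    truthy/falsy value stored under `feature` (for example "referral").
    """
    counts = {}
    for a in applicants:
        have, total = counts.get(a["neighborhood"], (0, 0))
        counts[a["neighborhood"]] = (have + bool(a[feature]), total + 1)
    return {hood: have / total for hood, (have, total) in counts.items()}
```

A large gap between the two shares suggests the factor carries information about neighborhood even though neighborhood is never an input. A similar share in both neighborhoods suggests it does not.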