Beyond Single-Axis Analysis
Most bias audits examine one protected characteristic at a time: race, then gender, then age. This approach, while necessary, is fundamentally insufficient. Intersectional bias — discrimination that occurs at the intersection of multiple protected characteristics — can be entirely invisible to single-axis analysis.
This article provides a technical deep dive into how intersectional bias works, why it is especially prevalent in AI hiring tools, and the statistical methods required to detect and measure it.
How Intersectional Bias Hides in Plain Sight
Consider an AI resume screening tool with the following selection rates:
By Race:
- White: 45%
- Black: 40%
- Hispanic: 42%
- Asian: 44%
By Gender:
- Male: 44%
- Female: 41%
All impact ratios are above 0.80. A single-axis audit would report no adverse impact. But the intersectional picture tells a different story:
Intersectional Selection Rates:
- White Male: 48%
- White Female: 42%
- Black Male: 46%
- Black Female: 32%
- Hispanic Male: 45%
- Hispanic Female: 38%
- Asian Male: 47%
- Asian Female: 40%
Black women have a selection rate of 32% against a highest-group rate of 48% (White men), producing an impact ratio of 0.67, well below the 0.80 threshold; Hispanic women, at 0.79, also fall just under it. The tool disadvantages Black women most severely, but this is invisible when race and gender are analyzed separately.
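As a sketch, the impact-ratio calculation above takes only a few lines. The rates below are the hypothetical figures from this example:

```python
# Hypothetical selection rates from the example above.
rates = {
    ("White", "Male"): 0.48, ("White", "Female"): 0.42,
    ("Black", "Male"): 0.46, ("Black", "Female"): 0.32,
    ("Hispanic", "Male"): 0.45, ("Hispanic", "Female"): 0.38,
    ("Asian", "Male"): 0.47, ("Asian", "Female"): 0.40,
}

highest = max(rates.values())  # 0.48 (White men)

# Impact ratio: each subgroup's rate divided by the highest subgroup rate.
impact_ratios = {group: rate / highest for group, rate in rates.items()}

# Flag subgroups below the four-fifths (0.80) threshold.
flagged = {g: round(r, 2) for g, r in impact_ratios.items() if r < 0.80}
print(flagged)  # {('Black', 'Female'): 0.67, ('Hispanic', 'Female'): 0.79}
```

Note that with these rates, Hispanic women also land marginally below the threshold, which a single-axis audit would likewise miss.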
Why AI Tools Produce Intersectional Bias
AI hiring tools are particularly prone to intersectional bias for several reasons:
Training Data Patterns
Historical hiring data often reflects compound discrimination. If an industry historically hired few Black women in senior roles, the AI learns that pattern as a feature, not a bug. The model encodes the intersection of race and gender as a signal, even without explicit demographic variables.
Proxy Variable Interaction
Individual proxy variables may have mild effects, but their interaction can be severe. For example, "name" is a weak proxy for gender, and "zip code" is a weak proxy for race. But the combination of name and zip code becomes a strong predictor of the intersection of race and gender, enabling the model to discriminate against specific intersectional groups.
Feature Space Geometry
In high-dimensional feature spaces, intersectional subgroups occupy distinct regions. A model that learns decision boundaries optimized for overall accuracy may draw those boundaries in ways that disproportionately exclude specific intersectional subgroups, even when aggregate performance across single axes appears balanced.
Statistical Methods for Intersectional Analysis
Cross-Tabulated Impact Ratios
The most straightforward approach: calculate selection rates for every combination of protected characteristics, then compute impact ratios relative to the highest-performing intersectional group.
For k racial groups crossed with 2 gender categories, this produces k × 2 = 2k intersectional groups. Adding age bins multiplies the count further.
Challenge: Small subgroup sample sizes can produce unreliable ratios.
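A minimal sketch of this cross-tabulation, assuming applicant-level records in a pandas DataFrame with hypothetical race, gender, and selected columns:

```python
import pandas as pd

# Hypothetical applicant-level records; 'selected' is 1 if the
# candidate passed the screen, 0 otherwise.
df = pd.DataFrame({
    "race":     ["White", "White", "Black", "Black", "White", "Black"],
    "gender":   ["Male", "Female", "Male", "Female", "Male", "Female"],
    "selected": [1, 0, 1, 0, 1, 1],
})

# Selection rate for every race-by-gender combination.
rates = df.groupby(["race", "gender"])["selected"].mean()

# Impact ratio relative to the highest-performing intersectional group.
impact_ratios = rates / rates.max()
print(impact_ratios.round(2))
```

In practice the same two lines of groupby arithmetic scale to any number of protected characteristics; only the list of grouping columns changes.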
Fisher's Exact Test for Small Subgroups
When intersectional subgroup sizes are small (n < 40), chi-squared approximations become unreliable. Fisher's exact test calculates the exact probability of the observed distribution under the null hypothesis of no discrimination.
For intersectional analysis, Fisher's exact test is applied to each subgroup-vs-reference pair. A multiple comparison correction should then be applied: Bonferroni to control the family-wise error rate, or Benjamini-Hochberg to control the false discovery rate.
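The steps above can be sketched with scipy and statsmodels; all counts below are hypothetical:

```python
from scipy.stats import fisher_exact
from statsmodels.stats.multitest import multipletests

# Hypothetical (selected, rejected) counts for the reference group
# and for each small intersectional subgroup.
reference = (48, 52)                 # e.g. 48 of 100 reference candidates selected
subgroups = {
    "Black Female":    (8, 17),      # 8 of 25 selected
    "Hispanic Female": (11, 18),
    "Asian Female":    (12, 18),
}

names, p_values = [], []
for name, (sel, rej) in subgroups.items():
    table = [[sel, rej], list(reference)]
    # One-sided test: is the subgroup's selection rate lower than the reference's?
    _, p = fisher_exact(table, alternative="less")
    names.append(name)
    p_values.append(p)

# Benjamini-Hochberg adjustment controls the false discovery rate
# across all subgroup-vs-reference comparisons.
reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
for name, p, sig in zip(names, p_adj, reject):
    print(f"{name}: adjusted p = {p:.3f}, flagged = {sig}")
```

The one-sided alternative reflects the audit question (is the subgroup selected less often?); a two-sided test is the more conservative default.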
Log-Linear Models
Log-linear models can test for interaction effects between protected characteristics. A significant race-by-gender interaction term indicates that the effect of race on selection varies by gender (or vice versa), which is the statistical signature of intersectional bias.
Regression-Based Approaches
Logistic regression with interaction terms provides another lens. The coefficient on the interaction term captures the additional effect of belonging to a specific intersectional group beyond the additive effects of each characteristic alone. A statistically significant negative interaction term for Black × Female indicates intersectional discrimination against Black women specifically.
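A sketch with statsmodels, using simulated hypothetical data in which Black women are selected less often than the additive race and gender effects alone would predict:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 10_000
race = rng.choice(["White", "Black"], size=n)
gender = rng.choice(["Male", "Female"], size=n)

# Additive penalties plus an extra intersectional penalty for Black women.
p = (0.45
     - 0.02 * (race == "Black")
     - 0.03 * (gender == "Female")
     - 0.10 * ((race == "Black") & (gender == "Female")))
df = pd.DataFrame({"race": race, "gender": gender,
                   "selected": rng.binomial(1, p)})

# White men as the reference category, so the interaction coefficient
# describes Black women specifically.
model = smf.logit(
    "selected ~ C(race, Treatment(reference='White'))"
    " * C(gender, Treatment(reference='Male'))",
    data=df,
).fit(disp=0)

# The only coefficient naming both levels is the interaction term.
interaction = [k for k in model.params.index
               if "Black" in k and "Female" in k][0]
print(interaction, round(model.params[interaction], 3))
```

With the reference category set this way, a negative interaction coefficient (on the log-odds scale) is the intersectional penalty beyond the two main effects.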
Minimum Sample Size Considerations
Intersectional analysis divides your data into many subgroups, which can create sample size challenges:
- For chi-squared tests: Each cell should have an expected count of at least 5
- For Fisher's exact test: Works with any sample size but has low power with very small samples
- Practical minimum: At least 20 candidates per intersectional subgroup for meaningful analysis
When subgroup sizes are too small for reliable analysis, document the limitation and consider aggregating data across longer time periods.
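The rules of thumb above can be encoded as a simple triage step; the thresholds follow the guidance in this section, and the subgroup counts are illustrative:

```python
MIN_PER_SUBGROUP = 20       # practical floor for meaningful analysis
FISHER_THRESHOLD = 40       # below this n, prefer Fisher's exact test
MIN_EXPECTED_COUNT = 5      # chi-squared expected-count rule of thumb
overall_rate = 0.43         # hypothetical overall selection rate

# Hypothetical intersectional subgroup sizes.
subgroup_sizes = {
    ("Black", "Female"): 25,
    ("Asian", "Female"): 12,
    ("White", "Male"):   180,
}

decisions = {}
for group, n in subgroup_sizes.items():
    # Smallest expected cell count (selected vs rejected) in a 2x2 table.
    expected = min(n * overall_rate, n * (1 - overall_rate))
    if n < MIN_PER_SUBGROUP:
        decisions[group] = "too small: aggregate more data"
    elif n < FISHER_THRESHOLD or expected < MIN_EXPECTED_COUNT:
        decisions[group] = "use Fisher's exact test"
    else:
        decisions[group] = "chi-squared test"

print(decisions)
```

This keeps the test-selection logic explicit and auditable rather than buried in an analysis script.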
Remediation Strategies
When intersectional bias is detected:
- Identify contributing features: Use SHAP values or feature importance analysis to determine which input features drive the disparate outcomes for the affected intersectional group
- Examine proxy interactions: Check whether combinations of features serve as proxies for the intersectional identity
- Consider fairness constraints: Implement constraints that enforce minimum selection rates for intersectional subgroups, not just single-axis groups
- Evaluate alternative models: Test whether different model architectures or feature sets reduce intersectional bias
- Continuous monitoring: Intersectional bias can emerge or shift as applicant demographics change, so ongoing monitoring is essential
How OnHirely Performs Intersectional Analysis
OnHirely automatically generates intersectional subgroups from your demographic data, calculates impact ratios for each, selects the appropriate statistical test based on subgroup size (chi-squared for large samples, Fisher's exact for small), applies multiple comparison correction, and flags any intersectional group with statistically significant adverse impact. This analysis is included in the Pro plan and runs automatically with every audit.