Big Five Personality Tests for Hiring: The Compliance Guide

How to use the Big Five in hiring without EEOC adverse-impact risk: the validity evidence, what is legal in the US and EU, and a defensibility checklist.

Ricardo de Jong · Co-founder, Team Building Bot · 22 May 2026

[Image: a Big Five OCEAN radar chart balanced on a set of justice scales]

You can use a Big Five personality assessment in hiring, and from an adverse-impact standpoint it is one of the safer instruments on the table. That is the part most compliance-nervous HR teams have backwards. The legal risk in personality testing is real, but it sits in how a test is deployed, not in the Big Five itself.

This post is the practical version of that argument. It covers what the validity evidence actually supports, where the legal lines sit in the US and the EU, and a checklist you can hold a vendor to. It is written for HR, talent acquisition, and L&D buyers rather than lawyers, and it is general guidance, not legal advice. For the underlying model, our Big Five pillar guide walks through the five traits in depth.

The short answer

A Big Five assessment is legally usable in pre-employment selection in both the US and the EU. Used well, it clears the bar that sinks most other selection tools: it barely moves the needle on adverse impact, because Big Five trait scores show almost no average gap between demographic groups.

The risk is not the instrument. It is the deployment. A personality test becomes a liability when it is the sole automated reason a candidate is rejected, when it is applied as a generic profile across unrelated roles, when it strays into clinical territory, or when it is bolted onto an opaque AI screening system with no human review. Each of those is avoidable.

Practice	Defensible
Conscientiousness as one input among several	Yes
A personality score as an automated, sole disqualifier	No
Job-relevant traits backed by a validation study	Yes
One generic personality profile applied to every role	No
A normal-range Big Five instrument	Yes
A clinical instrument used before a job offer	No
Continuous adverse-impact monitoring of your funnel	Yes
Adjusting scores by demographic group to mask impact	No

The rest of this post is the evidence and the legal reasoning behind that table.

[Image: a job applicant funnel running through a personality assessment, with two paths labelled defensible and not defensible]

Why the Big Five is the lower-risk choice for adverse impact

Adverse impact is the heart of US selection law, and it is driven by subgroup differences in test scores. The standard research metric for those differences is Cohen’s d, the gap between two groups’ average scores expressed in standard-deviation units. A d of 1.0 means a full standard deviation between group means.
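
As a rough illustration of what that number means, here is a minimal Python sketch of Cohen’s d using the pooled standard deviation. The function name and the scores are invented for this example; published estimates come from meta-analyses with corrections this sketch does not attempt.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference between two groups, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical trait scores on a 1-5 scale, invented for illustration only.
group_a = [3.0, 3.5, 4.0, 4.5, 5.0]
group_b = [2.5, 3.0, 3.5, 4.0, 4.5]

d = cohens_d(group_a, group_b)
print(f"d = {d:.2f}")
```

A positive d means the first group scores higher on average, a negative d means the second does, and a value near zero means the two distributions largely overlap.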

This is where the Big Five has a quiet, decisive advantage. Cognitive ability tests are strong predictors of job performance, but they carry a large Black-White subgroup difference, with estimates historically clustering near d = 0.8. A selection process that leans heavily on cognitive testing in a top-down way almost guarantees adverse impact, and then forces the employer to defend the test as job-related and consistent with business necessity.

The Big Five behaves differently. Conscientiousness, the trait most tied to job performance, shows a subgroup difference close to zero. Updated meta-analytic work puts the Black-White difference for conscientiousness at roughly d = -0.07, which in practical terms means the two groups score essentially the same. The other four traits show similarly small gaps. A selection input that predicts performance and produces almost no group gap is rare, and it is exactly what makes the Big Five attractive to a compliance team.

For years the field treated this as a trade-off: weight cognitive tests to maximise prediction and accept the diversity cost, or weight personality to protect diversity and accept weaker prediction. Recent reanalysis of the selection-method evidence has softened that trade-off considerably. Once cognitive ability’s validity is corrected downward, a battery anchored by structured interviews and conscientiousness can reach near-optimal prediction while staying well clear of adverse-impact thresholds. The Big Five is no longer the consolation prize in that equation.

What conscientiousness actually predicts

The credibility of the Big Five in hiring rests mostly on one trait. Conscientiousness, the tendency toward organisation, follow-through, and goal-directed self-discipline, is the Big Five dimension with the broadest evidence base in work psychology.

Barrick and Mount (1991), in Personnel Psychology, established conscientiousness as the one Big Five trait that predicted job performance across virtually every occupational group they examined. That was the finding that moved personality testing back into mainstream selection after decades of scepticism. Hurtz and Donovan (2000), revisiting the question in the Journal of Applied Psychology with criteria built specifically around the Big Five, reported the same pattern: conscientiousness was the most consistent predictor, while the other four traits predicted more narrowly and more conditionally.

The headline numbers have moved over time, and it is worth being honest about that. The Sackett, Zhang, Berry and Lievens (2022) reanalysis in the Journal of Applied Psychology recalculated the operational validity of common selection methods and argued that several had been overestimated, mostly because of how earlier work corrected for range restriction. In that reanalysis structured interviews came out near the top, cognitive ability tests were revised sharply downward, and conscientiousness on its own sat lower than older estimates suggested. Exact coefficients are still debated in the literature, so treat any single number with caution.

What survives that debate is the part that matters for a buyer. Conscientiousness is rarely meant to be used alone. Because personality and cognitive ability capture different mechanisms and barely correlate, they reinforce each other in a combined model. Conscientiousness adds genuine incremental validity on top of cognitive ability and on top of a structured interview, and it does so without adding adverse impact. That combination, real predictive lift at near-zero subgroup cost, is the defensible reason to put it in a selection process at all.
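
To see why a weakly correlated second predictor adds lift, here is a small sketch of the standard two-predictor multiple-correlation formula. The three input values are hypothetical, chosen only to show the mechanics, and are not estimates from Sackett et al. or any other cited study.

```python
from math import sqrt


def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2: each predictor's correlation with the criterion.
    r12:    the correlation between the two predictors.
    """
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return sqrt(r_squared)


# Illustrative values only (assumptions for this sketch).
r_first = 0.40    # hypothetical validity of an existing predictor
r_consc = 0.20    # hypothetical validity of conscientiousness
r_between = 0.10  # hypothetical low correlation between the two predictors

combined = multiple_r(r_first, r_consc, r_between)
incremental = combined - r_first
print(f"combined R = {combined:.3f}, incremental = {incremental:.3f}")
```

The lower the correlation between the two predictors, the more of the second predictor’s validity carries over into the combined model. That is the mechanism behind the incremental-validity argument.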

The US legal framework

Title VII of the Civil Rights Act of 1964 makes it unlawful to use employment practices that discriminate on the basis of race, colour, religion, sex, or national origin. It recognises two kinds of discrimination. Disparate treatment is intentional. Disparate impact is a facially neutral practice, such as a test, that disproportionately harms a protected group without a justified business necessity. Personality testing lives almost entirely in the disparate-impact world.

Disparate impact is policed through the Uniform Guidelines on Employee Selection Procedures, adopted by the EEOC and other federal agencies in 1978. The Guidelines use the four-fifths rule as a screening threshold: if the selection rate for a protected group falls below 80 percent of the rate for the group with the highest selection rate, that is generally treated as evidence of adverse impact. Once adverse impact is shown, the burden shifts to the employer to show the test is job-related and consistent with business necessity, which in practice means a validation study tying the assessment to the actual work.
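
The four-fifths check itself is simple arithmetic. The sketch below assumes you already have applicant and selection counts per group; the group labels and counts are made up, and a real analysis would usually add significance testing and attention to small samples.

```python
def impact_ratios(selected, applicants):
    """Each group's selection rate divided by the highest group selection rate."""
    rates = {group: selected[group] / applicants[group] for group in applicants}
    top_rate = max(rates.values())
    if top_rate == 0:
        raise ValueError("no group had any selections, so ratios are undefined")
    return {group: rate / top_rate for group, rate in rates.items()}


# Hypothetical counts, invented for illustration only.
applicants = {"group_a": 200, "group_b": 100}
selected = {"group_a": 60, "group_b": 21}

ratios = impact_ratios(selected, applicants)
# Groups whose ratio falls below 0.8 warrant a closer look.
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
print(ratios, flagged)
```

Here group_a is selected at 30 percent and group_b at 21 percent, so group_b’s ratio is about 0.70 and it is flagged for review.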

The Big Five sits in a comfortable position here. Because conscientiousness and the other traits produce negligible subgroup differences, they rarely trip the four-fifths rule in the first place, which means the employer often never reaches the burden-shifting stage. That statistical reality is the single biggest legal argument for the Big Five over cognitive testing. One rule has no exceptions, though: never adjust scores or cut-offs by demographic group to manufacture a clean result. Section 106 of the Civil Rights Act of 1991 explicitly banned within-group norming. The raw scores have to be used the same way for everyone.

The ADA line: normal traits, not clinical ones

The Americans with Disabilities Act draws a second line, and this is the one personality tests most often cross. The ADA prohibits employers from running medical examinations or making medical inquiries before a conditional job offer.

The boundary between a lawful personality test and an unlawful medical exam was set by the Seventh Circuit in Karraker v. Rent-A-Center (411 F.3d 831, 2005). The employer had used the Minnesota Multiphasic Personality Inventory, an instrument that includes scales for clinical conditions such as depression and paranoia, to screen candidates for promotion. The court held that because the MMPI is designed in part to reveal mental illness, it counts as a medical examination under the ADA, which made its use as a routine screen unlawful.

The lesson is clean. A Big Five assessment is built to measure variation in normal personality, behavioural tendencies, preferences, and work style, not psychopathology. As long as it cannot be used to diagnose a mental health condition, it is treated as a standard behavioural test and is lawful before an offer. Clinical instruments are not, so they have no place in routine applicant screening. Employers also have to provide reasonable accommodations where the format of a test disadvantages a candidate with a disclosed disability.

When the test is run by an algorithm

As assessments move from questionnaires to algorithmic and AI-driven scoring, the question of who is liable has widened. The ongoing Mobley v. Workday litigation tests whether an AI screening vendor can be treated as an agent of the employer and sued directly under federal anti-discrimination law. Courts have allowed parts of that theory to proceed, and the case has not reached a final ruling.

The practical takeaway does not depend on the outcome. The EEOC’s position is already that an employer cannot delegate its anti-discrimination duty to a software vendor. If an AI-scored personality test produces disparate impact, the employer using it is on the hook. Outsourcing the screening does not outsource the liability.

[Image: a two-panel schematic contrasting the US and the EU regulatory approach to personality testing in hiring]

The EU and UK framework

A multinational employer cannot just port a US process across the Atlantic. Europe regulates personality data through privacy and algorithmic-transparency law, which is a different philosophy from the US litigation model.

The EU General Data Protection Regulation governs how psychometric data is processed. Article 22 gives candidates the right not to be subject to a decision based solely on automated processing that significantly affects them, and recruitment clearly qualifies. Employers used to sidestep this with a token human in the loop. The Court of Justice of the European Union’s 2023 SCHUFA ruling (Case C-634/21) narrowed that escape route: where a decision-maker draws strongly on an automated score, generating that score can itself count as an automated decision. In practice, if a human defers to the score without a real, substantive review, the process is still treated as automated. For a Big Five assessment in the EU, that means the score has to function as advisory input inside a genuine human review, not as a threshold a recruiter rubber-stamps.
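
One way to picture "advisory input inside a human review" is a record that cannot hold an outcome without a named reviewer and written reasoning. The sketch below is purely illustrative: the class, function, and minimum-length threshold are assumptions, and a filled-in field does not prove that the review was substantive.

```python
from dataclasses import dataclass
from typing import Optional

MIN_RATIONALE_CHARS = 40  # hypothetical threshold, an assumption for this sketch


@dataclass
class ScreeningRecord:
    candidate_id: str
    trait_scores: dict  # advisory input only, e.g. {"conscientiousness": 3.8}
    outcome: Optional[str] = None
    reviewer: Optional[str] = None
    rationale: Optional[str] = None


def record_human_decision(record, outcome, reviewer, rationale):
    """Store an outcome only when a named reviewer supplies written reasoning."""
    if outcome not in ("advance", "reject"):
        raise ValueError("outcome must be 'advance' or 'reject'")
    if not reviewer.strip():
        raise ValueError("a named human reviewer is required")
    if len(rationale.strip()) < MIN_RATIONALE_CHARS:
        raise ValueError("rationale too short to document a substantive review")
    record.outcome = outcome
    record.reviewer = reviewer
    record.rationale = rationale
    return record


record = ScreeningRecord("c-001", {"conscientiousness": 3.8})
record_human_decision(
    record,
    "advance",
    "A. Reviewer",
    "Structured interview evidence outweighs the moderate trait score.",
)
print(record.outcome, record.reviewer)
```

Note that nothing in this sketch sets an outcome from the trait score itself. The score sits in the record as context, and only the human step can close the decision.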

The EU AI Act, adopted in 2024, classifies AI systems used to recruit, screen, filter applications, or evaluate candidates as high-risk under Annex III. Vendors and deployers of AI-driven assessments face obligations before the tool can be used: technical documentation, data-governance controls to keep training data free of historical bias, transparent human-oversight design, and formal conformity assessment. If you are buying an AI-scored personality tool for use in Europe, those obligations are part of the procurement conversation.

The UK kept the Equality Act 2010 after Brexit. It mirrors Title VII but uses the language of direct and indirect discrimination. Indirect discrimination is the relevant risk here: a rule such as a rigid personality cut-off score that puts a protected group at a disadvantage, where the employer cannot show it is a proportionate means of achieving a legitimate aim. That proportionality test is the UK cousin of the US business-necessity defence.

The sharpest UK enforcement area is neurodiversity. In Government Legal Service v Brooks (2017), a candidate with Asperger’s syndrome failed a multiple-choice situational judgement test. The Employment Appeal Tribunal held that the rigid format placed her at a substantial disadvantage linked to her disability, and that the employer’s refusal to offer an alternative format was discriminatory. Updated guidance from the British Psychological Society pushes test users in the same direction: standardised administration has to be balanced against equitable conditions for neurodivergent candidates, and a test should measure the job-relevant competency rather than the disability. A Big Five process that screens out a neurodivergent candidate on a trait profile unrelated to the actual role, with no accommodation offered, is exposed under the Equality Act.

The 2024 to 2026 regulatory wave

The last two years have added a layer of state and local rules in the US, aimed squarely at automated hiring tools. Three are worth knowing.

New York City’s Local Law 144 requires employers using an automated employment decision tool to commission an independent bias audit each year, publish a summary of the results, and tell candidates in advance that an automated tool is in use, with a route to request an alternative.

California’s Civil Rights Council finalised regulations on automated decision systems under the Fair Employment and Housing Act, and those regulations took effect in 2025. They prohibit using automated systems, including algorithmically scored assessments, in a way that discriminates, and they require employers to retain system records and run ongoing anti-bias testing, so that a defence is actually available if a claim lands.

Illinois already regulates AI analysis of video interviews and, under recent amendments, requires employers to notify candidates when AI is used in hiring decisions and bars the use of demographic proxies such as ZIP codes to reject candidates automatically.

The exact effective dates and obligations shift, and a few of these rules were still settling as of mid-2026, so confirm the current text with counsel before you rely on any single date. The direction of travel is not in doubt: notice, auditing, and human oversight are becoming baseline expectations, not optional extras.
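
For a continuous score such as a trait scale, a Local Law 144 bias audit does not need a pass/fail outcome. As we understand the implementing rules, it can compare each category’s "scoring rate", the share of people scoring above the overall sample median. The sketch below assumes that definition; the category names and scores are invented, and the current rule text should be checked before relying on it.

```python
from statistics import median


def scoring_rate_impact_ratios(scores_by_category):
    """Impact ratios based on the share of each category above the sample median."""
    all_scores = [s for scores in scores_by_category.values() for s in scores]
    cutoff = median(all_scores)
    rates = {
        category: sum(s > cutoff for s in scores) / len(scores)
        for category, scores in scores_by_category.items()
    }
    top_rate = max(rates.values())
    if top_rate == 0:
        raise ValueError("no scores above the median, so ratios are undefined")
    return {category: rate / top_rate for category, rate in rates.items()}


# Hypothetical assessment scores, invented for illustration only.
scores = {
    "category_x": [55, 60, 65, 70, 75, 80],
    "category_y": [50, 58, 62, 64, 72, 78],
}

audit_ratios = scoring_rate_impact_ratios(scores)
print(audit_ratios)
```

In this made-up sample the median is 64.5, four of six category_x scores and two of six category_y scores sit above it, and category_y’s impact ratio comes out at 0.5.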

[Image: a compliance checklist dashboard for a personality assessment, with audit, validation, and oversight rows]

How to keep a personality test defensible

If you are buying or running a Big Five assessment for hiring, a handful of moves cover most of the risk.

Demand real validation evidence. Ask the vendor for the technical manual, not a marketing claim. It should show criterion-related validity for job families like yours, and prove the tool measures normal-range Big Five traits rather than acting as a disguised clinical test or a type indicator.
Validate locally for the role. Traits are job-specific. High extraversion may help in enterprise sales and do nothing for a careful audit role. Run a job analysis and select the traits that are genuinely job-related. A single generic profile applied to every role is hard to defend.
Never let the test be the sole gate. Use the personality score as one input in a multi-step process anchored by a structured interview. It should inform a human decision, never be an automated disqualifier on its own.
Monitor your own funnel. Do not stop at the vendor’s normative data. Track selection rates by race, sex, and ethnicity in your live pipeline against the four-fifths rule, and against local rules like Local Law 144.
Keep it out of medical territory. The instrument should ask about workplace behaviour and preferences, with nothing probing mental health history or pathology. That is the Karraker line.
Offer accommodations and notice. Tell candidates a psychometric tool is being used, explain what it measures, and give a clear, non-punitive route to request an alternative format. Accommodation duties arise under the ADA and the UK Equality Act, and notice duties under the newer US state and local rules.
Audit the algorithm if the tool is AI-scored. Move past the four-fifths rule and check that the system is not using communication style, employment gaps, or location as proxies for protected traits. Update vendor contracts to include audit rights and clear data obligations.
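
The "validate locally for the role" step in the list above can be sketched as a lookup that only exposes traits a job analysis has tied to the role. The role names and trait mapping here are hypothetical placeholders; in practice they would come out of your own job analysis and validation work.

```python
# Hypothetical output of a job analysis: role -> traits judged job-related.
JOB_RELATED_TRAITS = {
    "enterprise_sales": ("conscientiousness", "extraversion"),
    "audit": ("conscientiousness",),
}


def job_related_scores(role, trait_scores):
    """Keep only the traits a job analysis has tied to this role."""
    if role not in JOB_RELATED_TRAITS:
        # Refuse to fall back to a generic profile for unanalysed roles.
        raise KeyError(f"no job analysis on file for role {role!r}")
    return {trait: trait_scores[trait] for trait in JOB_RELATED_TRAITS[role]}


# Invented scores on a 1-5 scale, for illustration only.
candidate = {
    "openness": 3.1,
    "conscientiousness": 4.2,
    "extraversion": 2.4,
    "agreeableness": 3.9,
    "neuroticism": 2.0,
}

audit_view = job_related_scores("audit", candidate)
sales_view = job_related_scores("enterprise_sales", candidate)
print(audit_view, sales_view)
```

Raising an error for a role with no job analysis is a deliberate choice in this sketch: it makes the generic one-profile-fits-all shortcut impossible by construction.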

None of this is exotic. It is what a defensible selection process has looked like for years, applied with a little discipline and written down.

Where this fits for a team, not just a hire

Selection is one use of the Big Five. It is not the only one, and it is not the highest-frequency one for most L&D and people teams. The same validated framework that makes the Big Five defensible in hiring also makes it useful long after the hire, when the question shifts from “should we bring this person in” to “how does this team actually work together.”

That second question is the one Team Building Bot is built for. The bot joins online team sessions, listens for the behavioural signals the Big Five describes, and produces a Big Five-based report the team and the facilitator can debrief against. It is a development tool, not a pre-employment screen, and keeping that distinction clean is part of using the framework responsibly. Selection has its own legal weight, as this post has covered. Team development does not carry the same burden, and it is where a trait-based read tends to pay off most, because it gives a team a shared, non-judgemental vocabulary for the rest of the year.

FAQ

Can you legally use the Big Five for hiring in the US? Yes. A normal-range Big Five assessment is lawful in pre-employment selection. It is subject to Title VII disparate-impact rules and the EEOC’s Uniform Guidelines, but because Big Five traits produce negligible subgroup differences, the assessment rarely triggers adverse impact in the first place. The main conditions are that it stays out of clinical territory and is not used as a sole automated disqualifier.

Does a Big Five test cause adverse impact? Very little, compared with the main alternatives. Cognitive ability tests carry a large Black-White subgroup difference, with estimates historically clustering near d = 0.8. Conscientiousness and the other Big Five traits show subgroup differences close to zero. That is the central compliance argument for using the Big Five rather than leaning on cognitive testing alone.

Is a Big Five personality test a medical exam under the ADA? No, provided it measures normal personality and cannot be used to diagnose a mental health condition. Clinical instruments are different. In Karraker v. Rent-A-Center, the MMPI was ruled a medical examination because it screens for conditions such as depression, which made its use as a routine screen unlawful. A standard Big Five instrument does not have that problem.

Is the Big Five better than MBTI for hiring? For selection, yes. The Big Five has decades of peer-reviewed evidence linking traits, especially conscientiousness, to job performance. MBTI was not designed for performance prediction, and its publisher advises against using it for hiring decisions. A comparison of the frameworks is in our Big Five versus MBTI versus DISC post.

Do we need a validation study to use a personality test in hiring? If the test ever causes adverse impact, yes. Once the four-fifths rule is tripped, the employer has to show the test is job-related and consistent with business necessity, and that means a validation study. Even where adverse impact is unlikely, a local validation study is the cleanest evidence that the traits you score are genuinely relevant to the role.

What extra rules apply to AI-scored personality assessments? Several. The EEOC holds the employer liable for disparate impact even when a vendor’s algorithm produces it. NYC Local Law 144 requires annual independent bias audits and candidate notice. California and Illinois have added record-keeping, notice, and anti-proxy requirements. In the EU, AI-driven recruitment tools are high-risk systems under the AI Act, with documentation and human-oversight obligations.

Sources

Barrick, M. R., and Mount, M. K. (1991). The big five personality dimensions and job performance: a meta-analysis. Personnel Psychology, 44(1), 1–26.
Hurtz, G. M., and Donovan, J. J. (2000). Personality and job performance: the big five revisited. Journal of Applied Psychology, 85(6), 869–879.
Sackett, P. R., Zhang, C., Berry, C. M., and Lievens, F. (2022). Revisiting meta-analytic estimates of validity in personnel selection: addressing systematic overcorrection for restriction of range. Journal of Applied Psychology, 107(11), 2040–2068.
Schmidt, F. L., and Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262–274.
Karraker v. Rent-A-Center, Inc., 411 F.3d 831 (7th Cir. 2005).
Government Legal Service v Brooks [2017] UKEAT/0302/16/RN. UK Employment Appeal Tribunal.
EEOC. Uniform Guidelines on Employee Selection Procedures (1978), 29 CFR Part 1607.
Society for Industrial and Organizational Psychology. Coverage of the Sackett et al. (2022) validity reanalysis, The Industrial-Organizational Psychologist.
Regulation (EU) 2016/679 (General Data Protection Regulation), Article 22; Court of Justice of the EU, SCHUFA Holding (Case C-634/21, 2023).
Regulation (EU) 2024/1689 (Artificial Intelligence Act), Annex III.
