When it comes to hiring, AI has the potential to create greater fairness. However, realizing that potential requires users of AI systems to understand the technology’s current limitations and the major drawbacks of using it to make judgments about people. Affirmity Principal Business Consultant Patrick McNiel, PhD, considers the “dystopian now” of AI use for the selection and assessment of new hires.
1) AI Makes Decisions in a Black Box
In many cases, how AI arrives at the outputs, decisions, and recommendations it makes in selection scenarios is unknown. The web of mathematical operations and weights these algorithms apply to data to gain predictive utility is so enormously complex, and in many cases so unintuitive, that even experts find it difficult, if not impossible, to determine how and why the systems produce the outputs they do. These black box decisions and recommendations are deeply unsatisfying and risky: explainable decisions are more defensible in court, more palatable to test takers, and in some cases a legal requirement.
The EEOC has offered the following guidance on this subject: “Employers may wish to avoid using algorithmic decision-making tools that do not directly measure necessary abilities and qualifications for performing a job, but instead make inferences about those abilities and qualifications based on characteristics that are correlated with them.” Unfortunately, AI systems often do just that.
To address the problems of the black box, efforts have been made to create explainable AI, which would allow users to understand how AI recommendations are reached. However, explainable AI is in its infancy and has only recently been attempted in the candidate assessment space.
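To make the idea concrete, here is a minimal sketch of one common post-hoc explanation technique, permutation importance, using scikit-learn on synthetic data. The feature names and data are hypothetical, and this illustrates the general approach rather than how any particular vendor’s tool works.

```python
# A rough sketch of post-hoc explanation via permutation importance.
# All data and feature names here are synthetic/hypothetical.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))  # stand-ins for assessment-derived features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature and measure the drop in accuracy: a coarse,
# after-the-fact window into an otherwise opaque model.
result = permutation_importance(model, X_test, y_test, n_repeats=20,
                                random_state=0)
for name, mean_drop in zip(["feature_a", "feature_b", "feature_c", "feature_d"],
                           result.importances_mean):
    print(f"{name}: accuracy drop {mean_drop:.3f}")
```

Even this kind of output only ranks inputs by influence; it doesn’t explain why a given candidate received a given score, which is the level of explanation courts and candidates often want.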
DATA QUALITY ASSISTANCE FROM THE BLOG | ‘4 Signs That Your Applicant Data Won’t Pass an OFCCP Audit’
2) AI May Utilize Information That Has No Job Relevance
Because AI systems may render black box decisions, it can be exceedingly unclear how the information they use to create scores or make recommendations relates to the job or is consistent with business necessity. Additionally, new laws are making it increasingly important in the candidate assessment context to be able to specify what constructs are being measured and how those constructs are conceptually relevant to the job.
So, if an AI system that operates in a black box is used, it had better not result in adverse impact. If it does, the defense available to more traditional assessments, namely a demonstrated statistical association with important job outcomes, may not be enough. With more traditional hiring assessments it’s easy to see what is driving the relationships, whereas the black box nature of AI scoring can prevent this clarity.
For a simple example of how this might go wrong, consider birth location. An AI system might find it can use this information to predict job performance, so it weights this information subtly and in a non-obvious way when making recommendations. However, birth location is very likely to contribute to adverse impact against various groups and isn’t at all relevant to most jobs. In such a case, the AI would be acting in a discriminatory fashion that’s indefensible even if its recommendations correlate strongly with positive job outcomes.
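One practical safeguard, however opaque the scoring, is to monitor outcomes directly. Below is a minimal sketch of the four-fifths (80%) rule described in the UGESP, a common first screen for adverse impact; the applicant counts are invented for illustration.

```python
# A minimal sketch of the four-fifths (80%) rule screening heuristic.
# Applicant and selection counts below are invented for illustration.

def selection_rate(selected: int, applicants: int) -> float:
    """Fraction of applicants in a group who were selected."""
    return selected / applicants

# Hypothetical outcomes from an AI-screened applicant pool
rate_highest = selection_rate(selected=48, applicants=100)  # highest-rate group
rate_compare = selection_rate(selected=30, applicants=100)  # comparison group

impact_ratio = rate_compare / rate_highest
print(f"Impact ratio: {impact_ratio:.2f}")
if impact_ratio < 0.8:
    print("Below 0.8: potential adverse impact; examine what drives the scores.")
```

A ratio below 0.8 doesn’t prove discrimination, but it’s the point at which an unexplainable scoring process becomes very hard to defend.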
What Are “Constructs” in the Context of Psychology and Testing?
A “construct” represents something that can’t be directly observed in the real world, like gravity, consciousness, or intelligence. In psychology and testing, a construct typically refers to something abstract that exists in the mind, such as a skill, personality trait, or motivational system.
3) AI Systems Used to Make Candidate Assessments Have Tended to Lack Required Documentation
In the past, documentation has been a real problem for AI tools used in a selection context. The Uniform Guidelines on Employee Selection Procedures (UGESP) specify minimum documentation standards, and recent reviews have found that many, if not most, AI-based selection procedures had inadequate documentation. This is changing as I-O psychologists are now more widely consulted or included by organizations creating these tools. However, it’s still prudent to ensure AI candidate assessment tools have the documentation required by UGESP.
Of particular note is documentation that would be required as AI systems change their algorithms through a dynamic scoring process. Each time an AI changes the algorithm used to score information when making an assessment, a new form of a test is essentially created. This must be noted along with how such changes affect the validity of the assessment. If this is not done, then the “new” test form may lack defensibility.
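As an illustration, documentation for each algorithm change might capture fields like the ones below. This structure is hypothetical, not a regulatory template, but it conveys the kind of record that would help defend each “new” test form.

```python
# A hypothetical sketch of the kind of record that might document each
# scoring algorithm change. Field names and values are invented; this is
# not a UGESP template.
from dataclasses import dataclass
from datetime import date

@dataclass
class ScoringModelRecord:
    model_version: str
    deployed_on: date
    training_data_summary: str        # what the model was retrained on
    constructs_measured: list[str]    # what the scores are claimed to reflect
    validity_evidence: str            # e.g., criterion-related validity study
    adverse_impact_review: str        # results of the latest impact analysis

record = ScoringModelRecord(
    model_version="2024.2",
    deployed_on=date(2024, 6, 1),
    training_data_summary="12,400 applications, Jan 2022 - Dec 2023",
    constructs_measured=["conscientiousness", "job knowledge"],
    validity_evidence="r = .31 with 6-month supervisor ratings (hypothetical)",
    adverse_impact_review="four-fifths rule passed for all groups analyzed",
)
print(record)
```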
4) AI Systems Are Being Used Without Transparency
Another issue with the use of AI systems is a lack of transparency. AI systems often work in the background at various phases of a hiring process or as scoring algorithms that create a match to some unknown set of criteria. As a result, applicants may not know if or when they’re being assessed, what they’re being assessed for, or how and when AI is involved in the assessment.
Ethically, these practices violate several well-established standards for testing and may provoke pushback from candidates. When people are formally judged in a way that will have meaningful consequences for them, they almost universally feel they have the right to know what they’re being judged on. Governments agree with the need for transparency in how AI tools operate in candidate selection and assessment processes, and laws are beginning to address this issue.
HAND-PICKED FOR YOU | ‘4 Reporting Views That Will Help You Better Monitor Your Affirmative Action Outreach Efforts’
5) AI Use May Violate the Social Expectations of Hiring
Use of AI may disrupt common social expectations and lead to poorer candidate experiences. First, people generally expect the right to appeal. Since AI procedures are often hidden, their recommendations might not even be visible to candidates and therefore cannot be challenged. Second, candidates expect a reasonable opportunity to perform. If AI focuses on items of questionable job relevance, such as the facial expressions a person makes when no other person is present, candidates will likely question the relevance of the assessment.
Third, two-way communication is expected. Talking to or at an AI is contextually different from talking to another human, and if AI replaces humans, the opportunity for meaningful two-way communication is withheld. And fourth, use of AI may violate moral expectations. Many people believe that letting machines make critical decisions about people is simply wrong.
6) AI’s Effectiveness Is Undermined by a Lack of Representation in Its Training Data
The lack of representation in training data is the Achilles heel of machine learning systems used for candidate selection. Sampling bias or insufficient sampling in these systems leads to recommendation bias, and there is no known way around this. It’s also very difficult to get an adequate sample for many groups. Even large groups, such as Black and Asian populations, have been sampled insufficiently during the development of AI systems in the past. This is evidenced by several studies:
- A study of 189 facial recognition algorithms found they were anywhere from 10 to 100 times more likely to misidentify African American and Asian faces.
- A study of three facial recognition AIs with a gender classification function found a 1% error rate for light-skinned female faces versus 35% for dark-skinned female faces.
- The likelihood of a false positive when using AI programs to match crime suspects to focal photos may be as high as 100 times greater for Black and Asian faces as compared to White faces.
These examples come from facial recognition because it is one of the most well-studied and advanced areas of AI. Note that, per McLaughlin & Castro, newer and best-in-class systems have been shown to be far less biased than these statistics indicate, possibly because sufficient data now exists for different races/ethnicities and genders. Unfortunately, AI use for hiring selection is not as well studied, and bias in even best-in-class systems is likely to be greater in general. This is especially true for very small groups whose characteristics might affect assessment results (such as individuals with various uncommon disabilities).
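A basic hygiene step for any team building or auditing such a system is to compare each group’s share of the training sample against a benchmark, such as the relevant labor market. The sketch below uses invented counts and benchmark shares.

```python
# A minimal sketch of a training-data representation check: compare each
# group's share of the training sample to a benchmark share. Group names,
# counts, and benchmarks are invented for illustration.
from collections import Counter

training_labels = (["group_a"] * 700) + (["group_b"] * 250) + (["group_c"] * 50)
benchmark_share = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}

counts = Counter(training_labels)
total = sum(counts.values())
for group, target in benchmark_share.items():
    actual = counts[group] / total
    flag = "  <-- under-represented" if actual < 0.8 * target else ""
    print(f"{group}: sample {actual:.0%} vs benchmark {target:.0%}{flag}")
```

A check like this catches only gross under-sampling; it says nothing about whether the data collected for a small group is itself representative of that group.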
7) AI Selection Tools May Not Be More Accurate or Effective Than Non-AI Tools
Comparative evidence pitting AI-based selection tools against more traditional and time-tested tools is sparse. However, a 2018 study found that AI-based scoring tools were no better than linear regression models at predicting social outcomes (and an argument can be made that job performance is largely a social outcome). Regression scoring algorithms also have the advantage of being understandable and transparent.
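The kind of comparison the study describes is easy to reproduce in spirit: fit a transparent linear model and a more opaque ensemble on the same data and cross-validate both. The sketch below uses synthetic data with a mostly linear signal, a situation in which the simpler model often holds its own.

```python
# A rough sketch comparing a transparent linear model with a more opaque
# ensemble on the same synthetic data. Data and signal are invented.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(600, 5))                  # stand-in predictor scores
signal = X @ np.array([1.0, 0.6, 0.3, 0.0, 0.0])  # mostly linear signal
y = (signal + rng.normal(scale=1.5, size=600) > 0).astype(int)

for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("gradient boosting", GradientBoostingClassifier())]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

When the opaque model doesn’t clearly outperform the transparent one, the extra legal and ethical risk of the black box buys you nothing.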
Additionally, the way machine learning works could fold many potentially undesirable characteristics into the predictive models AI systems use, simply because those characteristics tend to be associated with success or successful people. Characteristics such as a lack of agreeableness, overconfidence, dishonesty, self-focus, dominance tendencies, a high power motive, and sociopathic tendencies may be inadvertently measured and counted as success indicators by AI systems. This is another reason it’s critical to know what these systems base their predictions on.
ALSO ON THE BLOG | ‘Affirmative Action Scope: When Is an AAP Required, Who Should Be Included, and How?’
Continue Reading: Consider the Potential Positives of AI for Employee Selection and Assessment
Now that you understand the “dystopian now” of AI use, it’s important to circle back and consider what AI could be, and its potential to create positive outcomes in the candidate selection and assessment space. You can get the full picture by downloading our white paper, “The Influence of Artificial Intelligence on Organizational Diversity and Hiring Regulations: The Possibilities and Dangers of the New Tech Frontier”.
In the other chapters of this white paper, we consider not only the potential positives of AI, but also current trends in the usage of AI and machine learning technologies. We also take time to examine the progress that has been made on updating state, national, and international regulatory frameworks to account for AI.
Download the full White Paper today.
Root out bias and adverse impact at key decision points in your talent acquisition processes. Contact Affirmity today to learn more about our talent acquisition process reviews.
About the Author
Patrick McNiel, PhD, is a principal business consultant for Affirmity. Dr. McNiel advises clients on issues related to workforce measurement and statistical analysis, diversity and inclusion, OFCCP and EEOC compliance, and pay equity. Dr. McNiel has over ten years of experience as a generalist in the field of Industrial and Organizational Psychology and has focused on employee selection and assessment for most of his career. He received his PhD in I-O Psychology from the Georgia Institute of Technology. Connect with him on LinkedIn.