Ben Schwencke, Business Psychologist, Test Partnership


This interview is with Ben Schwencke, Business Psychologist at Test Partnership.


Can you tell us about your background in psychometric testing and employee selection, and how you became interested in the implications of AI-based cheating in these areas?

My name is Ben, and I am a business psychologist and psychometrician at Test Partnership. I followed the traditional route early in my career: I completed my BSc in Psychology, my MSc in Occupational Psychology, and started working as a business psychologist during my master's degree.

I have always been fascinated by individual differences between people, i.e., what makes some people excel where others struggle. I have always been frustrated by the predominantly top-down approach to organizational performance, which suggests that high performance is contingent entirely on effective management practices and leadership.

In reality, it's the performance of individual employees that drives organizational success, which means hiring high-potential applicants is paramount to success at every level. As a result, employee selection and assessment very quickly piqued my interest, and I found myself specializing in this area.

However, with the advent and proliferation of AI and LLMs, I began to see candidates undermine this process, using these tools to cheat on their assessments. This represents a major threat to the validity of employee selection decisions, and thus to organizational performance in general.

What was the pivotal moment in your career that made you realize the potential impact of AI on recruitment processes, particularly in terms of cheating?

I consider myself to be an early adopter of ChatGPT and LLM use. As soon as I realized its potential, I began incorporating it into my work as comprehensively as possible. Proofreading, idea generation, systematizing, data analysis — I have absolutely no qualms with people using AI as part of their role whenever it makes sense to do so.

However, when it became apparent that AI could be used to complete someone's assessment, generate a script for an interview, or completely rewrite a person's résumé on the fly, I arrived at two conclusions:

1) AI can be used to obfuscate low potential, reducing the validity of the selection process. Either low-potential candidates actually end up getting hired, dragging down organizational performance, or they get caught out later in the selection process, by which point a genuinely high-potential candidate has already missed out on an invitation to the final stages.

2) Cheating with AI in employee selection requires almost no skill with AI whatsoever. No prompt engineering, no personalization, no effort of any kind. This matters, because many apologists for AI use in selection say, "Well, they will use AI in the role, so why not in selection?" In reality, AI use does not imply AI skill; it just means that they have broken the process and slipped through the cracks.

The whole point of employee selection is to identify the best person for the role — the one with the individual characteristics which underpin performance, fit, engagement, and satisfaction in the workplace. Unfortunately, AI-based cheating nullifies that and interferes with a mission-critical organizational process.

Based on your experience, what are the most vulnerable aspects of current psychometric tests and employee selection methods to AI-based cheating?

By far the most vulnerable stage is the initial application stage, using résumés or application forms. A great many organizations still use traditional résumé sifting or grade application form responses as if they were essays. Popular AI models like ChatGPT and Gemini are literally called Large Language Models (LLMs), and their ability to create written content is by far their strongest suit. If an organization is weighing up candidates based on how well-written their personal statements, cover letters, or application form responses are, AI has completely destroyed this method of assessment.

Many organizations have convinced themselves that they can detect when a candidate is using AI, and in many cases you can tell if a candidate has been lazy about it. However, the reality is that virtually every résumé, cover letter, or application form response will be AI-written, AI-enabled, or AI-checked, making the whole process an exercise in futility.

What's worse is that the incentives are very much in favor of AI-based cheating for written content. Writing a cover letter was — and let's be honest — an artificial barrier to test whether or not someone was serious. Now, you can mass-produce cover letters and make them extremely role-specific, removing this barrier. This only rewards people for using AI to mass-produce résumés and cover letters, helping them to easily 10x their number of applications.

Can you share a specific instance where you encountered or addressed AI-based cheating in a recruitment process? How did you handle it?

The form of AI-based cheating that most impacts me is cheating on psychometric assessments. Psychometrics are powerful tools to measure the characteristics underpinning performance and fit in the workplace, but only if the actual candidate completes them. Realistically, just uploading a screenshot of a question is enough to enable ChatGPT to provide an answer, removing candidate potential from the equation.

To combat this, we at Test Partnership developed a suite of highly dynamic assessments which are resistant to AI-based cheating by design. By including game mechanics in the assessment tools, we render screenshots completely inadequate for answering questions, nullifying this avenue of cheating. Moreover, game mechanics allow us to significantly reduce the administration time, leaving candidates with no spare time to print-screen, upload, and request answers from ChatGPT.

Many providers have opted to merely detect or punish AI use in psychometric testing, which is a very suboptimal solution. Not only will many cheaters slip through the cracks, but many of these detection methods are deeply invasive, worsening the candidate experience. Using dynamic assessments, however, makes AI irrelevant again, representing a fairer and more complete solution.

How do you see the role of gamified assessments evolving in response to the threat of AI-based cheating? Can you provide an example of a particularly effective gamified assessment you've encountered?

I can definitely see gamified, dynamic, and interactive assessments becoming the norm for pre-employment testing, especially as AI gets more effective at cheating. Although it saddens me to say, I think the traditional format of assessment is on the way out, with newer formats of assessment gradually surpassing them.

When it comes to cognitive assessments, the benefits of gamified and interactive assessments go far beyond just AI-cheating protection. These assessments are more fun, more interesting, less anxiety-provoking, and more mobile-friendly than traditional assessments — all huge advantages. Moreover, game mechanics allow us to create assessments that are more cognitively complex than traditional ones, as their interactivity simply gives more options for complexity. This means that gamified assessments can be better measures of cognitive ability, and thus predict workplace performance more precisely.

Test Partnership's suite of interactive assessments very much follows this format, and we have worked extremely hard to develop assessments that are fair, effective, and valid. Every assessment in our suite is highly robust against AI-based cheating, nullifying any advantage that cheating candidates would hope to enjoy.

In your opinion, how can recruitment professionals balance the need for AI-proof assessments with ensuring fairness and accessibility for all candidates?

It's a great question, and the answer depends on the approach you are taking to AI-proofing. Many providers opt for a detection and deterrent approach, which can negatively impact accessibility and fairness. For example, preventing third-party extensions would limit the impact of AI, but would also prevent candidates from using screen reader software, harming fairness. Increasingly, organizations are considering neurodiversity as part of their talent management processes—and rightly so.

Test Partnership's suite of dynamic and gamified assessments bypasses the need for invasive detection and deterrence, while also being especially useful from a neurodiversity perspective. Traditional assessments are particularly verbally loaded, requiring candidates to read huge amounts of written information. This can be particularly hard for candidates with dyslexia, for example, while also being boring and difficult to focus on for candidates with ADHD. Gamified assessments use game mechanics instead of walls of text, making them more dyslexia-friendly. Similarly, gamified assessments are quicker and more engaging, making them a better option for candidates with ADHD.

As a result, when assessments are chosen specifically to be AI-cheating-proof, you don't need to compromise on fairness and accessibility. Instead, the two can go hand in hand, allowing you to maximize fairness in every sense.

What advice would you give to organizations that are currently relying heavily on traditional, text-based pre-employment tests? How can they transition to more cheat-resistant methods without compromising the quality of their hiring process?

Ultimately, gamified and dynamic assessments are psychometric tests in the truest sense of the term and should be treated the same. Organizations should, therefore, use the same criteria to evaluate the quality of more innovative methods of assessment. You should ensure that the assessment is reliable, valid, and fair, and demand evidence of these three things in the form of a technical manual.

Although gamification is an improvement overall compared to traditional assessments, that doesn't mean that every provider offers equally advanced assessments. Indeed, I have seen many gamified assessments which look and feel advanced, but are psychometrically very primitive. They will still use classical test theory (CTT), give each candidate the same questions, or even forgo all traditional assessment design processes altogether and just make a game with no actual formal assessment.

Test Partnership's dynamic assessments, however, are designed using item response theory (IRT), employ computer-adaptive testing (CAT), and have undergone rigorous reliability, validity, and fairness testing. They enjoy all of the modern advances in psychometrics that traditional assessments do, while also having all the benefits of gamification — the best of both worlds.
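To make the IRT and CAT terminology concrete, here is a rough illustrative sketch — not Test Partnership's actual implementation — of the two ideas: a two-parameter logistic (2PL) IRT model for the probability of a correct answer, and a CAT step that picks the most informative unadministered item at the candidate's current ability estimate. The item bank values are invented for the example.

```python
import math

def p_correct(theta, a, b):
    """2PL IRT model: probability that a candidate with ability theta
    answers an item with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta; higher means
    the item tells us more about candidates near this ability level."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def pick_next_item(theta, items, administered):
    """CAT step: choose the unadministered item that is most
    informative at the current ability estimate."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))

# Hypothetical item bank: (discrimination a, difficulty b) pairs.
bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5)]

theta_estimate = 0.4          # current ability estimate for the candidate
next_item = pick_next_item(theta_estimate, bank, administered={0})
```

Because each candidate's ability estimate evolves as they answer, different candidates receive different items at different difficulties — which is also part of what makes adaptive tests harder to cheat than a fixed question set.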

Looking ahead, what do you believe will be the next major challenge in preventing AI-based cheating in recruitment, and how can the industry prepare for it?

The thing that worries me most in recruitment is the use of photorealistic avatars in video interviews, both one-way and two-way. Imagine outsourcing the entire interview to an AI, allowing you to interview for hundreds of jobs simultaneously. That would completely kill video interviewing as an assessment method, which would represent a major blow to selection and assessment.

Video interviews themselves are a phenomenal way of assessing candidates across the globe, opening organizations up to talent pools far larger and more diverse than they would otherwise access. They are also a major boon to disabled candidates who would struggle to attend a face-to-face interview, representing a very reasonable accommodation. However, none of this matters if you can't trust the interview itself — and AI interviewee avatars would represent the final death blow.

Initially, I suspect that AI avatars would be relatively easy to spot, perhaps even using an AI model trained to detect them. But eventually, once the avatars become sufficiently advanced, detection may not be reliable, and video interviews may end up becoming obsolete. Instead, face-to-face interviews may go back to being the default, and it's hard to predict what the implications of this would be.

Thanks for sharing your knowledge and expertise. Is there anything else you'd like to add?

The last thing I would like to add is that most candidates aren't going to cheat, especially if told not to. Most candidates just want clarity over what they can and can't do, and the vast majority will try to act fairly and with integrity. My concerns about AI cheating apply to a minority of unscrupulous candidates who would go against the grain and act unfairly.

Generating a tailored résumé using ChatGPT isn't the same as using an AI avatar to complete an interview for you, and most people wouldn't even dream of cheating this blatantly. However, a small proportion of candidates will — and this is a big enough concern to warrant some action. I do not advocate dropping all online screening tools in favor of nothing but face-to-face interviews, even if AI cheating were as rife as people often claim.

Instead, organizations just need to be aware of the risks and the possible solutions to this problem. In practice, the issue of cheating and test integrity has always been a concern, and there have always been people attempting to cheat. So in many ways, nothing has fundamentally changed — you just need to do everything reasonable to protect the integrity of the assessment, and thus preserve its utility in selection.