How to Identify AI Survey Fraud

by Andrew Hunt, Melia Whiteside, Jeff Whiteside, Shep Zoutewelle | May 16, 2024 | Blogs

Surveys are a cornerstone of market research, but what happens when the data itself becomes unreliable? AI has revolutionized many aspects of our lives, and research is no exception. But with this progress comes a new challenge: AI survey fraud. Have you ever suspected a survey respondent might not be who they seem? Perhaps they rushed through questions or gave nonsensical answers. AI has introduced a whole new level of sophistication to survey fraud. This blog post dives into the growing issue of AI survey fraud, exploring how it works, how it impacts research, and, most importantly, what you can do to protect the integrity of your data.

In this article, we’ll cover:

AI Survey Fraud: What is it?
How AI survey fraud impacts research findings
How to identify AI survey fraud
Strategies to prevent survey fraud

Quality in, quality out. We all know that a study’s results are only as good as the data behind it, which is why The Link Group has always put a huge emphasis on data quality. We have an internal team focused on advancing our data quality protection strategies, we’re involved in an industry-wide data quality task force, and we’re often told by our panel partners that we have one of the most rigorous processes for data cleaning that they’ve seen. Due to the rise in use and capabilities of Artificial Intelligence (AI) programs, we’ve seen an uptick in more sophisticated forms of survey fraud. The type of “AI Survey Fraud” that we’ve encountered is much harder to detect, since it doesn’t often get caught up in typical speeding, straight lining, or logic trap quality control checks.

What is “AI Survey Fraud”?

AI survey fraud is a newer type of survey fraud that utilizes some form of AI to aid in completing the survey. While we don’t have full clarity on exactly how the fraud is being executed, we’ve learned a lot about it from several recent studies and conversations with others in the industry. We know that it is happening at a large scale and that often these fraudsters will find a path through your screener through trial and error to allow for easier bulk qualifying. They also seem to have ways to avoid typical quality control traps within the survey and have AI-generated open-end responses that are on topic and sound knowledgeable when you review them at first glance. And from talking with our panel partners, we’ve learned that this type of fraud has been on the rise across the entire industry in recent months.

How does AI survey fraud impact me?

This type of fraud is especially concerning because it’s harder to detect and happening on a much larger scale. Automated quality checks will only get you so far against this more sophisticated fraud; catching it requires hands-on data review. This is why we’ve adopted several new strategies to deep dive into the data, on top of our already rigorous standard of closely reading every open end for every respondent.

In any given survey we know we’ll have a small percentage of people who we clean out for a multitude of reasons, whether it’s not paying attention, speeding through to get the incentive, or trying to game the system in a one-off survey. AI fraud, however, is coming in at a higher volume and can take over a chunk of your survey responses and quickly fill up your quotas.

In a recent study across multiple countries and target types, we cleaned out ~40% of our responses due to suspicions of “AI Survey Fraud.” While this is certainly on the higher end, this fraudulent data can have significant impacts on your findings. Below is a blinded question from that study where you can see the significant impact this new type of bulk fraud could have on the findings if it was not caught. On this question there was a nearly 60% difference in Top 2 Box ratings between good and bad respondents.

So, what can we do about this issue?

As with learning about all the positive capabilities of AI, learning how to detect and prevent its use in surveys is a constantly evolving process. While there is currently no magic bullet for stopping AI from making its way into your survey, the good news is that we know it exists: we can keep an eye out for the patterns in the data and look even more closely when reviewing data during fielding.

Tips for Identifying AI Survey Fraud

Before Fielding:

Select your survey panels / data sources carefully. The company that you partner with to recruit respondents and send out your survey is the first line of defense. You want to find a partner who is implementing measures to improve data quality (e.g., reCAPTCHA, fraud scoring systems, de-duplication).
- If you’re using multiple panel sources, ensure that you have a variable to track the data coming from each one. Doing so will help with detecting patterns and potentially identifying a single data source causing most of your quality issues.
Include at least one emotional/empathy-evoking open end (e.g., what was your experience when you were diagnosed with XYZ disease?).
Disable copy-paste functionality through programming or at least track if a respondent uses it.
Include reCAPTCHA, quality control check questions that don’t consistently have the same correct answer, and honey pot questions (i.e., questions that are hidden on a page that a bot would answer but a real human respondent would not be able to see and answer).
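
The honey pot and varying quality-check ideas above can be sketched in a few lines of Python. This is only an illustration, not a description of any particular survey platform: the field names (`hp_hidden_q`, `attention_check`) and both helper functions are hypothetical, and the sketch assumes your platform lets you export a hidden field and randomize which option a check question asks for.

```python
import random

# Hypothetical export field for a question hidden from human respondents.
HONEYPOT_FIELD = "hp_hidden_q"

def make_attention_check(rng):
    """Build a check question whose correct answer varies per respondent."""
    options = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
    target = rng.choice(options)
    return {
        "prompt": f'To show you are reading carefully, please select "{target}".',
        "options": options,
        "correct": target,
    }

def flag_response(response, check):
    """Return the reasons (if any) a response should get a closer look."""
    reasons = []
    # A human never sees the hidden field, so any value in it suggests a bot.
    if str(response.get(HONEYPOT_FIELD) or "").strip():
        reasons.append("honeypot_answered")
    # The expected answer differs by respondent, so it cannot be memorized.
    if response.get("attention_check") != check["correct"]:
        reasons.append("attention_check_failed")
    return reasons
```

Because the correct answer changes from one respondent to the next, a path through the survey learned by trial and error on one attempt does not carry over to the next.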

Strategies for Preventing AI Survey Fraud

During Fielding:

Clean daily. The AI fraud we have seen comes into the survey in large batches, filling quotas and preventing real respondents from getting through. High frequency cleaning will help avoid this.
Review your data from multiple angles. Start by sorting data by survey start date to aid in recognizing patterns. You can also consider sorting open ends alphabetically, by panel source, by IP address, and by any other device identification your survey platform collects.
When reading open ends, look for these signs of AI survey fraud:
- Longer-than-usual open ends (you can use Excel formulas to help spot these). Most real respondents try to get their point across in as few words as possible.
- Re-phrasing of the question in the answers. Again, most humans don’t take the time to do this.
- Substantial responses for optional OEs, particularly the one at the end where you may ask for feedback on the survey.
- Respondents suddenly seem to have more knowledge on a topic than past respondents (e.g., they name several drugs in development unaided and spell them correctly).
- Responses written in the third person when they should be in the first person.
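
Some of the open-end signs above can be pre-screened with a rough script before a researcher reads the responses. The sketch below is a set of simple heuristics, not a detector: the thresholds (`length_ratio`, `echo_ratio`), the word lists, and the function names are all assumptions chosen for illustration, and anything it flags should still go to a human reviewer rather than being removed automatically.

```python
import re
from statistics import median

# Illustrative word lists; tune them to your own survey and language.
STOPWORDS = {"what", "was", "your", "when", "you", "were", "the", "a", "an",
             "of", "with", "to", "did", "do", "how", "is", "are", "and", "in"}
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our"}
THIRD_PERSON = {"he", "she", "they", "his", "her", "their", "patients"}

def words(text):
    """Lowercase word tokens, treating curly apostrophes like straight ones."""
    return re.findall(r"[a-z']+", text.lower().replace("\u2019", "'"))

def typical_word_count(answers):
    """Median word count across a batch of open ends for the same question."""
    return median(len(words(a)) for a in answers)

def flag_open_end(question, answer, typical, length_ratio=3.0, echo_ratio=0.6):
    """Return heuristic flags for one open-end answer."""
    flags = []
    tokens = words(answer)
    token_set = set(tokens)
    question_terms = set(words(question)) - STOPWORDS
    # Much longer than what most respondents write for this question.
    if typical and len(tokens) > length_ratio * typical:
        flags.append("long")
    # Most of the question's content words are repeated back in the answer.
    if question_terms and len(question_terms & token_set) / len(question_terms) >= echo_ratio:
        flags.append("echoes_question")
    # Third-person wording with no first-person wording at all.
    if token_set & THIRD_PERSON and not token_set & FIRST_PERSON:
        flags.append("third_person")
    return flags
```

In use, you would compute `typical_word_count` from responses you already trust for that question, then run `flag_open_end` on each new response and sort the cleaning file so flagged rows are read first.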
Have multiple researchers review open ends. We recommend having more than one set of eyes on at least the open ends within your cleaning file.
Track incidence rate (IR) on a daily basis. Create a calculation to track IR, since a sudden spike may be a tip-off for a batch of suspicious data. In our experience, the AI fraud tends to be set up to ensure the respondent qualifies for the survey, which would increase incidence rates.
Let your panel providers know if you suspect AI fraud. This helps the panel providers investigate further and pushes them to be part of the solution of finding ways to prevent AI fraud from entering panels.
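
The daily incidence-rate check described above can be automated with a short script. This is a minimal sketch under stated assumptions: IR is taken here to be the share of respondents who started the screener and qualified, the input is a list of `(date, qualified)` pairs from your own export, and the 15-point `jump` threshold is an arbitrary illustration rather than a recommended value.

```python
from collections import defaultdict

def daily_incidence(records):
    """Return {date: IR} from (date_str, qualified) pairs.

    Dates are assumed to be ISO strings (YYYY-MM-DD) so they sort correctly.
    """
    started = defaultdict(int)
    qualified = defaultdict(int)
    for day, ok in records:
        started[day] += 1
        qualified[day] += int(ok)
    return {day: qualified[day] / started[day] for day in sorted(started)}

def flag_spikes(ir_by_day, jump=0.15):
    """Flag days whose IR exceeds the average of all earlier days by `jump`."""
    flagged, history = [], []
    for day, ir in ir_by_day.items():
        if history and ir - sum(history) / len(history) > jump:
            flagged.append(day)
        history.append(ir)
    return flagged
```

Running the same calculation separately for each panel source, using the tracking variable recommended earlier, can help show whether a spike is coming from a single provider.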

We know that survey fraudsters tend to adapt quickly and create workarounds for our countermeasures, so we are constantly expanding our toolbox to stay ahead of the curve. That means regularly reassessing how successful each tactic is at catching AI and brainstorming new tactics to implement.

AI Survey Fraud is an industry-wide issue, and our goal is to drive awareness and to be a part of the collective push to improve data quality. We will continue to advance our tools to catch this type of fraud and share our learnings as we conduct research on the effectiveness of our new tactics.

If you’d like to keep up with the latest in the industry when it comes to data quality, you can check out https://globaldataquality.org/. Working together as a full market research industry is the best way to push for a future with higher data quality!

If you would like to chat more about anything we discussed in this blog post, please don’t hesitate to reach out.