
5 must-know AI concepts for fighting fraud

To understand how AI is impacting financial institutions, we must first examine it in the context of financial fraud prevention


Organizations worldwide are bracing themselves for the changes caused by artificial intelligence (AI). As the race to dominate AI continues, Gartner predicts that 70% of enterprises will consider the sustainable and ethical application of AI one of their primary concerns by 2025.

In reality, the AI boom has been anything but sudden: it's the long-awaited payoff of years of start-and-stop progress and "AI winters" characterized by underfunding. But the hype surrounding AI in fraud prevention can sometimes blur the line between reality and exaggeration. As a result, it's hard to know what it even means when a product claims to use AI. This lack of clarity can lead to confusion and skepticism among decision-makers at banks, credit unions, and fintechs.

Today, market intelligence provider Gitnux reports that 85% of financial businesses use some form of AI. Knowing the possibilities and limitations of AI — what it can and can't do — can help bank, fintech, and credit union leaders engage in meaningful conversations about AI with their peers. It can also help decision-makers better discern which AI capabilities are groundbreaking and which are marketing buzzwords.

In this blog post, we’ll break down our AI concepts into three AI techniques: rule-based systems, machine learning (ML), and deep learning. We’ll also discuss natural language processing (NLP) and Generative AI — two applications of AI currently impacting the fraud landscape by empowering fraudsters.

What is artificial intelligence?

Artificial intelligence (AI) is a field of computer science encompassing any technique or method enabling computer systems to perform human tasks. AI includes various approaches designed to mimic human cognitive functions like recognizing patterns, understanding natural language, making decisions, and learning from experience. 

In fraud prevention, banks, credit unions, and fintechs can use AI to help analyze vast amounts of identity and transaction data to spot patterns and anomalies that may indicate fraud. Once fraud is detected, the bank, credit union, or fintech can automatically enact additional protection measures like document verification to secure the account.

If trained on incomplete or unrepresentative data, AI systems can also be vulnerable to biases and errors. AI in fraud prevention requires careful design, training, and monitoring to ensure effectiveness and fairness. 

Each subset of AI has its strengths and limitations and may be used individually or in combination to enhance a financial organization's fraud detection and prevention capabilities. Understanding the differences between these AI subsets is crucial for making informed decisions about leveraging AI effectively in the fight against increasingly sophisticated fraudsters. 

1. Rule-based systems 

There’s some debate about whether rule-based systems belong under the AI umbrella. However, we include them here for two key reasons:

  • To provide a snapshot of how decisioning has progressed
  • To compare rule-based systems against ML 

Rule-based systems represent an early attempt at codifying human intelligence using computer programming and if-then statements: If a specific condition (X) is met and a particular action (Y) is performed, then a predetermined result (Z) will occur. 

The difference between rule-based systems and other modern forms of AI is that rule-based systems rely on explicitly programmed rules from human experts and can't learn information through experience the way today's AI systems might. 

Banks, credit unions, and fintechs have traditionally relied on rule-based fraud detection systems that identify fraud based on a customer's personally identifiable information (PII) and historical fraud data. While these systems can be effective at detecting known fraud patterns and typologies, they may struggle to adapt to new and more sophisticated fraud schemes that emerge over time.

In fraud, rule-based decisioning systems are manually created by fraud and risk experts to identify specific patterns, behaviors, or characteristics indicative of fraudulent activities, such as unusual spending patterns or high-risk locations. When a transaction or event occurs, the rule-based program analyzes the relevant data using the predefined rules. If a rule's condition is satisfied, the program executes the corresponding action, such as flagging the transaction as suspicious, blocking it, or prompting additional verification.
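As a minimal sketch of the if-then logic described above, a rule-based check might look like the following. The rule conditions, actions, and transaction fields are hypothetical, not any particular vendor's implementation:

```python
def evaluate_transaction(txn, rules):
    """Apply each if-then rule in order; return the actions whose conditions fire."""
    return [action for condition, action in rules if condition(txn)]

# Hypothetical rules a fraud team might define
rules = [
    (lambda t: t["amount"] > 5000, "flag_for_review"),
    (lambda t: t["country"] not in t["home_countries"], "require_verification"),
    (lambda t: t["attempts_last_hour"] > 3, "block"),
]

txn = {"amount": 7200, "country": "RU", "home_countries": {"US"}, "attempts_last_hour": 1}

# Two of the three rule conditions are satisfied, so two actions fire
assert evaluate_transaction(txn, rules) == ["flag_for_review", "require_verification"]
```

Note that the program does exactly what the rules say and nothing more: a transaction of $4,999 from a home country would pass untouched, which is the rigidity fraudsters exploit.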

Biggest advantage: Rule-based systems are transparent

Rule-based systems make it easy to understand why a particular transaction or account was flagged. Because rules are explicitly defined, users of rule-based systems benefit from their transparency and interpretability. 

Transparency is crucial for compliance and auditing purposes, as it allows organizations to explain their fraud detection decisions to regulators and customers. Because rules can be designed to identify specific AML typologies, rule-based systems also make it easy for banks, credit unions, and fintechs to demonstrate that they meet regulatory requirements.

Biggest drawback: Rule-based systems don't learn on their own

Rule-based systems may struggle to detect subtle or complex fraud patterns that don't fit neatly into predefined rules. As a result, rule-based systems may miss sophisticated fraud attempts that do not match the predefined patterns. Fraudsters can exploit the rigidity of these systems by carefully crafting their activities to avoid triggering the existing rules, leaving financial institutions vulnerable to losses.

Rule-based systems are static and require manual updates to keep pace with evolving fraud tactics. To maintain the effectiveness of these systems, fraud experts need to constantly monitor fraud trends, analyze new patterns, and manually change rulesets. For this reason, financial organizations often turn to ML to supplement their rule-based systems. 

2. Machine learning 

ML is a subset of AI that uses data and algorithms to imitate human learning, gradually improving its accuracy over time. These algorithms can learn from experience, grow their capacity to identify patterns, and make decisions, often without requiring that specific conditions be neatly met. Computer scientists may develop ML models through techniques like supervised learning, unsupervised learning, and reinforcement learning.

Applications of ML in fraud detection and prevention include predictive modeling, unsupervised learning for clustering, and anomaly detection. Unlike rule-based models, ML models can handle huge amounts of data and make predictions in real time, making them suitable for processing high volumes of transactions or applications. ML models can also be trained on specific customer segments or behaviors, enabling more personalized fraud detection strategies.


By leveraging ML with identity data, organizations can maximize the number of good application approvals while reducing manual reviews and fraudulent application approvals.

Biggest advantage: ML is adaptable and comfortable with complexity

ML allows financial organizations to extract patterns from data automatically. ML models can adapt to new fraud patterns faster than traditional rule-based systems. These models can also identify complex relationships between variables that may be difficult for humans to discern. 

For example, banks, credit unions, and fintechs may use transaction monitoring that leverages isolation forests, an unsupervised ML algorithm used for advanced anomaly detection. Isolation forests can detect deviations from normal transaction patterns by assigning higher anomaly scores to transactions with unusual purchase amounts, locations, or frequencies. This gives fraud prevention teams insight they may not otherwise have access to, allowing for a more proactive approach to fraud risk.
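To illustrate the intuition behind isolation forests, here is a heavily simplified, stdlib-only sketch over one-dimensional transaction amounts. A real implementation (for example, scikit-learn's IsolationForest) builds random trees over many features; the core idea is the same: anomalous points are isolated by random splits in fewer steps, so a shorter average path length signals an anomaly. The data is made up:

```python
import random

def isolation_path_length(data, point, max_depth=10):
    """Number of random splits needed to isolate `point` (shorter = more anomalous)."""
    lo, hi = min(data), max(data)
    pts = list(data)
    depth = 0
    while len(pts) > 1 and depth < max_depth and lo < hi:
        split = random.uniform(lo, hi)
        if point < split:
            hi = split
            pts = [p for p in pts if p < split]
        else:
            lo = split
            pts = [p for p in pts if p >= split]
        depth += 1
    return depth

def anomaly_score(data, point, trees=100):
    """Average isolation depth over many random 'trees'; lower means more anomalous."""
    random.seed(0)  # fixed seed so the sketch is reproducible
    return sum(isolation_path_length(data, point) for _ in range(trees)) / trees

amounts = [25, 30, 28, 32, 27, 29, 31, 26, 5000]  # one unusual purchase amount

# The $5,000 outlier is isolated in far fewer random splits than a typical amount
assert anomaly_score(amounts, 5000) < anomaly_score(amounts, 29)
```

Because the outlier sits far from the cluster of normal amounts, almost any random split separates it immediately, while typical amounts take many splits to single out.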

ML models can learn from historical data, automatically adapting as new information is introduced. Unlike rule-based systems, ML models evolve and improve upon their decision-making capabilities as their data supply grows. This reduces the need for manual rule creation and updates. With their time-saving capabilities, ML models allow fraud experts to focus on higher-level strategies and investigations.

Biggest drawback: ML lacks transparency and may perpetuate bias automatically

The downside of ML models is that they can be opaque, making it difficult to determine how they arrived at their conclusions. A lack of transparency can be problematic for compliance and auditing purposes, as regulators often require clear explanations for fraud detection decisions. 

ML models can inadvertently learn and perpetuate biases present in the historical data used to train them. For example, if an ML model’s analysis of historical data reveals that certain zip codes are more prone to fraud, then the ML model might learn to exclude those automatically. This is not unlike the rule-based approach, where “risky” zip codes might instead be designated to a list used to deny applications. 

It’s in the nature of data and labels to introduce some amount of bias, whether an organization uses ML models or a rule-based approach to fraud risk management. However, rule-builders are more likely to be consciously aware of the potential for other biases when using attributes like zip codes. In contrast, ML systems may act on these biases without human intervention or awareness. 

To avoid unintentional bias in datasets, organizations must take a thoughtful approach to selecting and preprocessing training data. They must also implement bias detection and correction techniques, regular monitoring, and model validation to ensure fairness and compliance with regulations. This applies to both rule-based systems and ML models, as the quality and representativeness of the data play a crucial role in the fairness of the outcomes.

3. Deep learning 

Deep learning is a subset of ML inspired by the structure and function of the human brain. It consists of "neural networks," which attempt to model high-level abstractions in data by using multiple layers of processing units, also called neurons, to progressively extract higher-level features. Deep learning differs from classical ML in its use of neural networks with multiple hidden layers stacked between input and output.
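A toy forward pass makes the layered structure concrete. The fraud features and hand-picked weights below are purely illustrative assumptions; a real network learns its weights from training data rather than having them written by hand:

```python
import math

def dense(x, weights, bias):
    """One fully connected layer: each output neuron is a weighted sum of inputs plus a bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    """Nonlinearity applied between layers so the network can model complex patterns."""
    return [max(0.0, z) for z in v]

def sigmoid(z):
    """Squash the final score into a 0-1 probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical input features: [amount z-score, new device?, foreign IP?]
x = [2.5, 1.0, 1.0]

# Two hidden layers feeding a single output neuron (toy weights)
h1 = relu(dense(x, [[0.8, 0.2, 0.1], [0.1, 0.9, 0.4]], [0.0, -0.1]))
h2 = relu(dense(h1, [[0.6, 0.5], [0.3, -0.2]], [0.1, 0.0]))
fraud_prob = sigmoid(dense(h2, [[1.2, -0.7]], [-1.0])[0])

assert 0.0 < fraud_prob < 1.0  # a probability-like fraud score
```

Each layer transforms the previous layer's output, which is what "progressively extracting higher-level features" means in practice.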

While ML excels at tackling structured data and providing interpretable insights, it may struggle with unstructured data like images, audio, and text because it requires extensive manual feature engineering. Deep learning models often undergo pre-training on large, general-purpose datasets and then are fine-tuned for specific tasks with smaller datasets. Pre-training teaches models to understand and represent complex patterns in data, including language structure, visual elements, and other intricacies.

Deep learning also enables end-to-end learning, where a single model can directly map raw input data to the desired output without requiring manual feature extraction or multiple processing stages. End-to-end learning has simplified the development of complex systems like speech recognition, enabling the instant conversion of speech signals into text transcriptions. In fraud prevention, end-to-end learning can power real-time fraud detection, significantly enhancing detection accuracy while reducing the deep learning model's developmental complexity. 

Biggest advantage: Deep learning is advanced and replicable

Deep learning can extract features from unstructured data and process complex patterns. As a result, deep learning can detect subtle and intricate fraud patterns that may be difficult to capture with traditional ML techniques. Because of their transfer learning capabilities, deep learning models can carry over knowledge from one domain and apply it to another, reducing the need for extensive data labeling across applications. 

Pre-trained models (PTMs) have become powerful building blocks for various downstream tasks in fraud prevention. PTMs, like deep residual learning networks specialized in categorizing images, can be trained on data from pictures of real and fake IDs to identify hard-to-catch red flags associated with synthetic identities. In this case, deep PTMs would analyze the visual information in an ID image to extract relevant details that verify the document's authenticity. 

By analyzing vast amounts of historical data, deep learning models can learn how to recognize complex patterns quickly and flag deviations as potential fraud.

Biggest drawback: Deep learning is expensive and a black box

Deep learning models are more computationally intensive than traditional ML models. They require significant computational resources to achieve optimal performance, making them more expensive. They may also take longer to develop and require large amounts of training data. 

Additionally, the limited interpretability of deep learning models (their black-box nature) can lead to compliance challenges. As the Harvard Business Review explains: "Cognitive load theory has shown that humans can only comprehend models with up to about seven rules or nodes, making it functionally impossible for observers to explain the decisions made by black-box systems." Financial organizations need to be able to explain to their regulators why they decide to approve or deny credit and other financial services to their clients, meaning that black-box systems may be untenable for fraud applications. 

4. Natural language processing 

NLP and generative AI are best understood as applications of AI that may leverage rule-based systems, machine learning, and deep learning (the techniques described in the sections above) to power both fraud prevention and fraud itself.

NLP is a subset of AI focused on equipping computers with the ability to comprehend, interpret, and produce human language. It serves various purposes like sentiment analysis, text categorization, named entity identification, and language translation. Rather than fight fraud with NLP, financial institutions and fintechs are more likely to leverage this form of AI for customer service applications.

Biggest advantage: NLP interprets unstructured data

While NLP's near-real-time language processing capabilities aren’t precise enough for real-time fraud detection (at least not currently), NLP stands to help alternative data providers gain a fast snapshot of an applicant’s financial health and expedite decisioning. With NLP, alternative data providers may be able to extract key information from unstructured data, including social media platforms, online forums, and customer chat histories. While use cases are still limited, this is important given the increasing relevance of alternative data in areas like fraud detection and credit underwriting.
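As a rough illustration of pulling structured signals out of unstructured text, here is a regex-based sketch. This is a deliberately simple stand-in for real NLP techniques like named entity recognition, and the extracted fields are hypothetical:

```python
import re

def extract_signals(text):
    """Extract simple financial signals from free-form text.
    A regex stand-in for real NLP entity extraction."""
    # Dollar amounts like "$1,200.00" or "$300"
    amounts = [float(m.replace(",", ""))
               for m in re.findall(r"\$([\d,]+(?:\.\d{2})?)", text)]
    # Flag language suggesting missed or late payments
    mentions_late = bool(re.search(r"\b(late|missed|overdue)\b", text, re.I))
    return {"amounts": amounts, "mentions_late_payment": mentions_late}

chat = "I missed my $1,200.00 rent payment last month but paid $300 toward my card."
signals = extract_signals(chat)

assert signals["amounts"] == [1200.0, 300.0]
assert signals["mentions_late_payment"] is True
```

A production NLP pipeline would handle context, negation, and many languages, but the goal is the same: turning chat logs and other free text into features a decisioning system can use.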

[Figure omitted. Source: CFA Institute]

Biggest drawback: NLP enables AI-driven scams

NLP has significantly cut the costs of committing fraud while helping fraudsters widen their reach. Bad actors working with a strong VPN or a collection of international SIM cards may choose to target victims according to their country of origin, often opting for countries with stronger currencies. 

Phishing, romance, and social engineering fraud attacks all hit differently when a fraudster can seamlessly translate their communications across regions using tools like ChatGPT. And once funds from an AI-enabled scam are lost to an overseas account, they may be nearly impossible to recover.

Fraudsters can enhance their operations by leveraging NLP to automate and scale their activities using chatbots and bulk email templates. This significantly increases their profit margins, reduces their chances of getting caught, and minimizes the effort needed to carry out far-reaching schemes.

5. Generative AI

AI was built to detect or generate complex patterns. Generative AI was designed to create human-like content using algorithms trained on human-generated data. This can include generating images, text, or music and even simulating human interactions. 

Examples of generative AI models include generative pre-trained transformer (GPT) models and diffusion models like Stable Diffusion. GPT models specialize in processing sequential data for language tasks, using a transformer architecture to learn patterns and generate human-like text. Diffusion models, on the other hand, use a fundamentally different approach tailored to image generation. They learn to denoise images gradually, starting from random noise and iteratively refining the output to create realistic and diverse images. Each type of generative model excels in different domains: LLMs like GPT are best for text, and diffusion models work best for images.

Biggest advantage: Generative AI improves data aggregation

One use case for generative AI in fraud prevention is using these models to augment existing datasets and improve the performance of fraud detection ML models. By training generative models on historical data that includes examples of fraudulent patterns, generative AI models can learn to create realistic synthetic data that mimics the characteristics of fraudulent activities. Synthetic data can supplement the original dataset, providing more extensive and diverse examples for training fraud detection models.
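As a deliberately minimal illustration of data augmentation, a per-feature Gaussian fit can act as a stand-in "generative model" that produces synthetic fraud examples. Real systems would use far richer models (GANs, diffusion models, or LLM-based generators), and the transaction amounts below are made up:

```python
import random
import statistics

def fit_gaussian(samples):
    """Fit a simple one-feature Gaussian 'generative model' to known fraud examples."""
    return statistics.mean(samples), statistics.stdev(samples)

def generate_synthetic(mean, stdev, n, seed=1):
    """Sample n synthetic examples from the fitted distribution (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stdev) for _ in range(n)]

# A handful of known fraudulent transaction amounts (hypothetical)
fraud_amounts = [980.0, 1010.0, 995.0, 1020.0, 990.0]
mu, sigma = fit_gaussian(fraud_amounts)

# Augment the scarce fraud class with 50 synthetic examples for model training
synthetic = generate_synthetic(mu, sigma, 50)
augmented = fraud_amounts + synthetic

assert len(augmented) == 55
```

The point of the sketch is the workflow, not the model: fit a generator to the scarce fraud class, sample from it, and train the detector on the enlarged dataset.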

For instance, consider a scenario where an organization wants to train a computer vision model to distinguish between genuine and fraudulent identity documents. If the available dataset contains a limited number of examples, or some of the document images are of poor quality (blurry or incomplete), the organization can employ generative AI to enhance the dataset. Training your own model is expensive and not always necessary; instead, you can leverage pre-trained, open-source models or APIs backed by generative AI. 

[Figure omitted. Source: ResearchGate]

By incorporating synthetic examples into the training dataset, the fraud detection model is exposed to a broader range of scenarios, enabling it to learn more robust and generalized patterns. Augmenting data through generative AI helps improve the model's ability to accurately identify fraudulent documents, even in cases where the input images are of suboptimal quality or contain previously unseen variations of fraudulent characteristics.

Biggest drawback: Generative AI works better for fraudsters than fraud-fighters

Generative AI approaches like large language models and diffusion models are incredibly powerful at generating rich, human-like content such as text, images, and audio. However, their strengths lie more in content creation rather than in classification, anomaly detection, or decision support, all of which are key to effective fraud prevention.

While generative AI can augment datasets to improve fraud detection models, as discussed earlier, applying it directly to detect fraud patterns is challenging. Generative models often lack the built-in guardrails and control mechanisms to ensure the quality, reliability, and compliance of their outputs for high-stakes fraud decisions.

However, fraudsters are already leveraging the power of generative AI to execute increasingly sophisticated schemes. With some fine-tuning, bad actors can use generative models to create convincing fake identities, forge documents, bypass authentication systems, or mimic legitimate user behaviors to evade detection — and do so at scale. 

Prevent AI-assisted fraud attacks with Alloy's Identity Risk Solution

In Alloy's 2024 State of Fraud Benchmark Report, our fraud prevention experts found that nearly one in four respondents named AI-driven fraud their most pressing concern in the coming year. The same report found that 75% of the bank, credit union, and fintech leaders surveyed were interested in investing in an Identity Risk Solution — an end-to-end platform that manages identity, fraud, credit, and compliance risk throughout the customer lifecycle.

Alloy enables financial organizations to develop a holistic, unified view of customer risk by using traditional and alternative identity data from over 200 trusted sources. 

Our fraud capabilities help clients proactively detect fraud to reduce losses, reduce operational overhead, and minimize disruption to their digital channels. Our advanced Entity Fraud Model harnesses ML to help predict the likelihood of fraud across an entity's entire lifecycle. This model was trained to identify fraud patterns quickly and proactively using anonymous insights across the Alloy platform. 

Coming soon: Alloy's Fraud Attack Radar. It’s an ML model that helps predict the likelihood of a fraud attack against an organization at origination. Alloy customers can opt in to proactive notifications of anomalous activity so they can take prompt action to counteract and contain those risks.

Experience Alloy's fraud prevention firsthand
