AI vishing: Mastering the art of voice deception

Picture an Italian entrepreneur receiving a phone call from their country’s Defense Minister, Guido Crosetto. The politician has an important but challenging ask: he needs the wealthy individual to wire around €1 million to a Hong Kong-based bank account, claiming it’s necessary to free kidnapped Italian journalists in the Middle East. Except it wasn’t really the Defense Minister calling Giorgio Armani and other wealthy Italians. It was a vishing attack.

When hackers get vishing right, it can be a highly lucrative scam, which makes it highly risky for both individuals and organizations to fall for. We’ll walk through how vishing attacks work, how cybercriminals are using AI to make them more believable than ever, and offer some advice on how to stay protected.

What is AI vishing?

Vishing essentially means “voice phishing,” covering social engineering attacks where fraudsters use phone calls to deceive people into giving up sensitive information, such as passwords or banking details. They may also attempt to get someone to make a fraudulent payment, as in the Italian example we just looked at. In the past, this would always have involved a human voice on the other end of the line. In 2025, it’s possible for the voice to be AI-generated, and even an AI clone of a real person’s voice.

Whether using their own voice or an AI-generated one, vishing attackers often pose as representatives from legitimate organizations, such as banks, government agencies, or tech support, to gain the victim’s trust. They may use tactics like urgency, fear, or authority to manipulate their targets into complying with their requests. Vishing can be particularly effective because it leverages the human element, making it harder for automated systems to detect and prevent.

What’s the difference between vishing, phishing, smishing, and quishing?

Phishing, vishing, smishing, and quishing are all forms of social engineering attacks designed to trick individuals into divulging sensitive information. They differ only in the methods used to carry out the attacks. They’re most effective when used together: for example, a phishing email may be followed up with a vishing call.

  • Phishing: fraudulent emails that either carry malicious attachments that install malware on the victim’s device when opened, or contain links to fake websites where victims are prompted to enter their login credentials or other sensitive data. Targeted phishing campaigns with specific victims in mind are known as spear phishing.
  • Vishing: the attacker uses voice calls to deceive victims into giving up confidential information over the phone or making a fraudulent payment. The caller ID may also be manipulated to display a trusted or familiar number.
  • Smishing: SMS phishing, where text messages contain malicious links disguised as legitimate ones. The links may download spyware onto the mobile device or take the victim through to a fake login page on their browser.
  • Quishing: less common, quishing is where a deceptive QR code is used. Someone may think they’re scanning a QR code for a legitimate source but will instead be taken through to a malicious website.

Classification of Vishing & Comparison of Social Engineering Attack Methods

| Attack Type | Method | Common Tactics | Primary Target | Risk Level |
|---|---|---|---|---|
| Vishing | Voice Calls | Impersonation of trusted entities to extract sensitive info | Individuals & businesses | High |
| Phishing | Email | Fraudulent emails with malicious attachments or fake login pages | Individuals & organizations | High |
| Smishing | SMS (Text Messages) | Links to fake sites or malware downloads | Mobile users | Medium-High |
| Quishing | QR Codes | Malicious QR codes leading to phishing sites | Public & businesses | Medium |

How does AI generate a voice?

AI can be used to generate a voice through a technology called text-to-speech (TTS) synthesis. TTS systems convert written text into spoken words, making it possible to create realistic-sounding voices that can read out text in a natural manner. This is something computers have been able to do on a basic level for many years. However, it’s now possible to use advanced machine learning algorithms (particularly deep neural networks) to analyze and mimic human speech patterns, including intonation, pitch, and rhythm.

One of the key techniques used in modern TTS systems is called WaveNet, developed by Google DeepMind. WaveNet generates audio by predicting the raw waveform of the audio signal one sample at a time, which results in highly natural and expressive speech. Other techniques focus on generating mel-spectrograms, which are then converted into audio using a vocoder. These advancements have made it possible to create voices that can be used in various applications, from virtual assistants and customer service bots to more nefarious uses like vishing attacks.
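To make “one sample at a time” more concrete, here is a toy Python sketch of autoregressive generation. It is not WaveNet: the neural network is swapped for a fixed two-tap linear predictor that can only produce a pure tone, and the names `next_sample` and `generate`, the 440 Hz tone, and the 16 kHz sample rate are all assumptions made for illustration. The only point is the shape of the loop, where each new sample is computed from the samples already generated.

```python
import math


def next_sample(history, omega):
    """Toy stand-in for the model: predict the next sample from the previous two.

    A real system would use a trained neural network here. This fixed linear
    rule can only continue a sine wave.
    """
    return 2.0 * math.cos(omega) * history[-1] - history[-2]


def generate(num_samples, freq_hz=440.0, sample_rate=16000):
    """Build a waveform one sample at a time, feeding each output back in."""
    omega = 2.0 * math.pi * freq_hz / sample_rate
    samples = [0.0, math.sin(omega)]  # two seed samples to start the recursion
    while len(samples) < num_samples:
        samples.append(next_sample(samples, omega))
    return samples[:num_samples]


# 160 samples is 10 ms of audio at the assumed 16 kHz sample rate.
waveform = generate(160)
```

Because every sample depends on the ones before it, generation is inherently sequential, which is one reason approaches that produce a mel-spectrogram first and hand it to a vocoder are also widely used.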

AI can make vishing calls, automated robocalls, and voicemail scams sound very human-like, making it difficult for listeners to distinguish them from a real person. For example, in the case at the beginning of this article, it’s unclear whether Crosetto’s voice was a pre-recorded message or a filter created with AI that allowed the scammer to modify their voice in real time. Microsoft claims a person’s voice can be cloned from just three seconds of audio. It could be possible for a scammer to make a short “can you hear me” call and then clone your voice based on your short reply.

How does a vishing attack work?

A vishing attack typically goes the same way, whether AI is being used to help or not. It starts with a phone call from a fraudster who pretends to be from a trusted organization. They may use convincing language and background noise to make the call seem more authentic. For example, the attacker might say, “This is John from your bank’s fraud department. We’ve detected some suspicious activity on your account, and we need to verify your identity to secure it.”

The victim, feeling anxious or concerned, might then provide the requested information, such as their account number, password, or PIN. Once the attacker has this information, they can use it to commit identity theft, make unauthorized transactions, or gain access to other sensitive accounts.

Typical vishing attack process

This is what a typical vishing attack looks like:

  1. Researching the Victim:

    The attacker conducts research on the victim to gather personal information.

  2. Making the Call:

    They make the phone call, often using spoofed caller IDs to appear legitimate. They may use AI to modify their voice, or play a recorded message with an AI-cloned voice.

  3. Creating Urgency:

    The attacker might claim there is an urgent issue, such as a security breach, an overdue payment, or a problem with an account.

  4. Emotional Manipulation:

    They will try to appeal to the victim’s emotions, such as trust, fear, or greed, to trick them into providing sensitive information or making a payment. For example, pretending to be a senior colleague who needs an urgent task completing.

  5. Follow-Up Tactics:

    The attacker may leave a voicemail or provide a number to call back, further legitimizing the scam. They could also follow up with a phishing email or smishing SMS.

  6. Exploiting Information:

    Once the victim provides the requested information, the attacker uses it to commit financial fraud or identity theft.

AI Vishing-as-a-Service (VaaS)

Vishing-as-a-Service (VaaS) is a type of cybercrime where fraudsters offer vishing (voice phishing) attacks to other criminals for a fee. This service model allows individuals or groups with less technical expertise to carry out sophisticated vishing attacks by leveraging the skills and resources of more experienced fraudsters. They may also offer AI services to clone specific voices.

These cybercrime models are a significant concern because they lower the barrier to entry for cybercriminals, making it easier for a wider range of individuals to engage in these fraudulent activities. The internet is loaded with videos, images, and audio recordings that cybercrime gangs can use for deepfake scams. How hard would it be for an attacker to find a few seconds of audio from the leader of your organization?

Real-world examples of voice-clone vishing attacks

The impersonation of Guido Crosetto was far from an isolated incident. Here are a few other notable vishing and voice cloning attacks from recent times:

How can you defend against AI vishing attacks?

Signs an employee is being targeted by vishing

Some signs of vishing your end users should watch out for include:

  • Receiving a suspicious robocall in advance of a later call
  • A caller pressuring you to make a payment or give up confidential information
  • Poor audio quality or unusual background sounds
  • Overly persuasive language or any other language that doesn’t line up with how the caller would normally communicate
  • Being called at an unusual time
  • Calls from unknown or unusual numbers, or a caller leaving a number that doesn’t match the organization’s listed number
  • Calls from colleagues, your boss, human resources, or partner companies asking for sensitive information or urgent actions in a way they wouldn’t normally
  • Being asked to do something unusual like download software or make a payment outside of normal processes

Tips for end users to follow if they think they’re being vished

If someone suspects they’re receiving a vishing call, here are some tips to follow:

  • Do not provide or confirm any personal information, workplace, or home address over the phone
  • Let calls from unknown numbers go to voicemail and assess the legitimacy of the message before responding
  • Listen carefully to the caller’s voice for anomalies or odd background noise
  • Take a moment to pause before responding to urgent requests
  • Ask the caller for their name and company phone number to verify their identity
  • Register work phone numbers with the Do Not Call Registry
  • Be cautious of calls from technical support asking for remote access or software updates
  • Implement an authentication process for work calls involving sensitive information
  • Explore and enable protection features on work phones to block or filter out spam calls
  • If it’s someone you know, call them back on a number you know is legitimate
  • Verify unusual requests via an additional form of communication
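
For teams that want to turn the call-back tip into a repeatable check, here is a minimal Python sketch. Everything in it is hypothetical: the `TRUSTED_DIRECTORY` entries, the phone numbers, and the `check_callback` helper are invented for illustration. The idea is simply that the number a caller leaves is never the number you dial, and a mismatch with the listed number is treated as a red flag.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical directory of numbers your organization has verified independently
# (for example from an official website or internal phone list). Values are made up.
TRUSTED_DIRECTORY = {
    "bank-fraud-team": "+1 202-555-0143",
    "it-service-desk": "+1 202-555-0178",
}


@dataclass
class CallbackDecision:
    call_back_on: Optional[str]  # the only number it is safe to dial, if any
    red_flag: bool               # True when something about the call looks off
    reason: str


def normalize(number: str) -> str:
    """Keep digits only so formatting differences do not cause false mismatches."""
    return "".join(ch for ch in number if ch.isdigit())


def check_callback(claimed_identity: str, number_left_by_caller: str) -> CallbackDecision:
    listed = TRUSTED_DIRECTORY.get(claimed_identity)
    if listed is None:
        return CallbackDecision(None, True, "no independently verified number on file")
    if normalize(listed) != normalize(number_left_by_caller):
        return CallbackDecision(listed, True, "caller left a number that does not match the listed one")
    return CallbackDecision(listed, False, "number matches; still call back on the listed number")


decision = check_callback("bank-fraud-team", "+1 (202) 555-0199")
print(decision.red_flag, decision.call_back_on, decision.reason)
```

Note that even when the numbers match, the function returns the listed number rather than the one supplied by the caller, since caller ID and voicemail details can be spoofed.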

Protect service desk employees from AI vishing and other risks

The vishing attack on MGM Resorts could have been stopped earlier with stronger authentication protocols. Specifically, they could have enforced end-user identity verification at the service desk, requiring the caller to provide additional proof of identity. Robust identity verification and authentication protocols, including multi-factor authentication, would have ensured that service desk agents had a reliable method to verify the caller’s identity, blocking the initial entry point and preventing the attack.

With Specops Secure Service Desk, your agents can securely enforce caller verification instead of relying on insecure processes that are prone to human error. Secure Service Desk customers can use authentication methods that remove the opportunity for user impersonation, by requiring verification with something the user has, not just something the user or an attacker may know.
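
To illustrate the general idea of verifying with “something the user has,” here is a generic Python sketch of a one-time code check. It is not how Secure Service Desk is implemented: the `ServiceDeskVerifier` class, the `send_to_device` callback, and the 120-second lifetime are assumptions made for this example. The agent triggers a code to a device already registered to the user, and the caller must read it back before a reset or unlock goes ahead.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 120  # assumed lifetime for a code; tune to your own policy


class ServiceDeskVerifier:
    """Issues single-use codes to a user's registered device and checks them."""

    def __init__(self, send_to_device, clock=time.monotonic):
        # send_to_device(username, code) is a hypothetical delivery hook,
        # e.g. a push notification or SMS to a pre-registered device.
        self._send = send_to_device
        self._clock = clock
        self._pending = {}

    def start(self, username: str) -> None:
        code = f"{secrets.randbelow(10**6):06d}"  # 6-digit code from a CSPRNG
        self._pending[username] = (code, self._clock() + CODE_TTL_SECONDS)
        self._send(username, code)

    def confirm(self, username: str, code_read_by_caller: str) -> bool:
        # pop() makes every code single-use, even after a wrong guess.
        entry = self._pending.pop(username, None)
        if entry is None:
            return False
        code, expires_at = entry
        if self._clock() > expires_at:
            return False
        # Constant-time comparison avoids leaking how many digits matched.
        return hmac.compare_digest(code.encode(), code_read_by_caller.encode())


# Demo with a fake delivery channel that just records what was "sent".
delivered = {}
verifier = ServiceDeskVerifier(lambda user, code: delivered.update({user: code}))
verifier.start("alice")
print(verifier.confirm("alice", delivered["alice"]))
```

A cloned voice alone does not pass this check, because the attacker would also need the registered device. In this sketch a wrong guess consumes the code, so the agent has to issue a fresh one rather than allowing repeated attempts.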

If you want to reduce the risk of vishing at the service desk by enforcing user verification before allowing a password reset or account unlock to be completed, get in touch to see how Secure Service Desk could work in your environment.

FAQs

What’s vishing?

Vishing, or voice phishing, is a type of scam where fraudsters use phone calls to trick individuals into revealing personal information or making financial transactions.

What’s the difference between vishing and phishing?

Phishing involves fraudulent emails or messages, while vishing uses phone calls to deceive victims, often making the scam seem more immediate and personal.

How is AI used in vishing?

AI is used in vishing to create fake voices, make realistic impersonations, and automate calls. This makes it easier for scammers to target a large number of people with convincing and personalized messages.

Can you detect vishing?

You can detect vishing by being wary of unsolicited calls, verifying the caller’s identity through independent means, and never sharing sensitive information over the phone without confirming the caller’s legitimacy.

(Last updated on April 22, 2025)

Written by

Marcus White

Marcus is a Specops cybersecurity specialist based in the UK, with 8+ years of experience in the tech and cyber sectors. He writes about authentication, password security, password management, and compliance.
