DTMF payments are transactions where the customer enters their card details on a phone keypad during a call — and the digits are captured securely so they never reach the agent, the call recording, or the business systems.
When you enter a card number on a phone, each key press generates a tone. That's DTMF (Dual-Tone Multi-Frequency). For decades those tones travelled openly through the call, exposing card data to anyone listening, recording, or transcribing. The job of a secure DTMF payment flow is to make sure the customer's real digits are captured inside a certified payment environment and never reach the business, while the audio that reaches the agent and the recording carries no decodable card data.
Three terms dominate the category: DTMF clamping, DTMF masking, and DTMF suppression. Marketing copy uses them interchangeably, but they are technically distinct and carry different PCI implications. This guide covers what each one actually does, when you'd pick one over another, and what to look for in a DTMF payment processing solution.
What Is DTMF?
DTMF stands for Dual-Tone Multi-Frequency. Every key on a telephone keypad emits two simultaneous frequencies — one from a row, one from a column. The combination uniquely identifies the digit. These tones were designed in the 1960s for telephone signalling and have been the universal language of phone systems ever since.
DTMF is what lets you "press 1 for sales" on an IVR. It's also how a customer types in their card number when an automated system asks for it. The tones are audible by design — anyone on the call hears them, and anyone with a basic decoder can read the digits back out of a call recording.
That's the problem DTMF payment systems exist to solve.
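To make the "basic decoder" point concrete, here is a minimal sketch in Python. It synthesises keypad tones from the standard DTMF row and column frequencies and reads the digits back with the Goertzel algorithm. The 8 kHz sample rate, 50 ms tone length, and function names are illustrative assumptions; real call audio adds noise, gaps, and codec effects that a production decoder would have to handle.

```python
import math

# Standard DTMF grid: each key is one row frequency plus one column frequency (Hz).
ROWS = (697, 770, 852, 941)
COLS = (1209, 1336, 1477, 1633)
KEYS = ("123A", "456B", "789C", "*0#D")


def tone(key, rate=8000, duration=0.05):
    """Synthesise the two-frequency tone for one key as a list of samples."""
    for r, row in enumerate(KEYS):
        if key in row:
            f_row, f_col = ROWS[r], COLS[row.index(key)]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(rate * duration)
    return [
        0.5 * math.sin(2 * math.pi * f_row * i / rate)
        + 0.5 * math.sin(2 * math.pi * f_col * i / rate)
        for i in range(n)
    ]


def goertzel_power(samples, freq, rate=8000):
    """Signal power at a single frequency, computed with the Goertzel algorithm."""
    coeff = 2 * math.cos(2 * math.pi * freq / rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def decode(samples, rate=8000):
    """Pick the strongest row and column frequency and map them back to a key."""
    r = max(range(4), key=lambda i: goertzel_power(samples, ROWS[i], rate))
    c = max(range(4), key=lambda i: goertzel_power(samples, COLS[i], rate))
    return KEYS[r][c]


# Anyone holding the audio can run the same few lines over a recording.
recovered = "".join(decode(tone(k)) for k in "4111")
print(recovered)
```

Nothing here is specialised: the decoder is about twenty lines of standard-library code, which is why unprotected tones in a recording have to be treated as readable card data.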
The DTMF Payment Problem
If you ask a customer to read their card number aloud to an agent, three things happen. The agent hears it. The call recording captures it. And the business is now in scope for some of the most onerous PCI DSS requirements — your contact centre is processing card data, your call recordings store card data, and your agents have access to card data.
DTMF was meant to solve part of this. The customer types instead of speaking, so the agent no longer hears spoken digits, but the tones still travel through the same audio path. They're still in the call recording. They can still be decoded by anyone with the audio. PCI DSS treats DTMF tones the same as spoken card numbers: cardholder data, in scope, full requirement set applies.
The fix is to intercept the DTMF before it reaches the agent or the recording, and that's where the three approaches diverge.
DTMF Clamping vs Masking vs Suppression
The terminology has drifted in the market, but here's what each technically means.
DTMF Clamping
Clamping is the most aggressive intervention. The system replaces the customer's actual DTMF tones with flat, neutral tones (typically a single low-frequency hum) before they reach the agent's audio path or the recording. The customer's real digits are forwarded to the payment gateway via a separate, secured channel; everyone else hears a string of identical, undecodable beeps.
Best for: PCI DSS Level 1 contact centres where audit defensibility is critical. Clamping leaves zero residue of the real tones in the recording, which is the cleanest scope-reduction outcome.
Trade-off: Requires session-level audio control. Usually deployed at the SIP carrier or session border controller (SBC) — not something you bolt on to a typical CCaaS platform without help.
DTMF Masking
Masking replaces the real tones with substitute tones — often a fixed digit like "0" or a different tone entirely — but unlike clamping, the agent and the recording typically still hear something in the same temporal pattern. The substitution preserves call cadence (so the agent can tell the customer is typing) without exposing the real digits.
Best for: Agent-assisted flows where the agent needs to confirm the customer is making progress, but shouldn't see or hear the actual numbers.
Trade-off: Some implementations leave timing-based side channels. A determined attacker analysing the recording's tone-spacing could in theory infer card length or segment boundaries. Strong masking implementations randomise spacing to defeat this.
DTMF Suppression
Suppression removes DTMF tones from the agent and recording paths entirely — no substitute, just silence. The audio gap is the only signal that the customer typed something. Some providers use the term interchangeably with clamping; others distinguish suppression (silent drop) from clamping (neutral substitute).
Best for: Fully automated IVR or AI voice flows where there's no agent to keep informed and the recording doesn't need conversational continuity.
Trade-off: Less natural in agent-assisted calls. Long silences during card entry can confuse agents and customers; some agents may drop the call thinking the line is dead.
Quick comparison
Approach | What the recording hears | Best fit | PCI scope outcome
Clamping | Flat substitute tones | Live-agent contact centres needing maximum audit defence | Card data fully out of scope
Masking | Substituted digits / patterned tones | Agent-assisted flows with conversational continuity | Card data out of scope (caveat: implementation quality)
Suppression | Silence | Pure IVR / AI voice / automated flows | Card data out of scope
In practice, most production systems blend approaches — clamping for the digit capture, suppression for downstream metadata, masking applied at the recording layer as belt-and-braces.
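The distinction between the three approaches can be sketched as a function that splits each key press into two paths: the digits sent to the certified capture environment, and whatever the agent and the recording receive. This is a conceptual model under stated assumptions, not any vendor's implementation. Key presses are modelled as events rather than audio, and `KeyPress`, `Mode`, `NEUTRAL`, `SUBSTITUTE`, and `split_paths` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    CLAMP = "clamp"
    MASK = "mask"
    SUPPRESS = "suppress"


@dataclass(frozen=True)
class KeyPress:
    digit: str   # the real key the customer pressed
    at_ms: int   # when it was pressed, relative to call start


NEUTRAL = "~"     # stands in for a flat, non-DTMF tone
SUBSTITUTE = "0"  # fixed substitute digit used for masking


def split_paths(presses, mode):
    """Return (secure_digits, agent_path_events).

    secure_digits goes only to the certified capture environment;
    agent_path_events is what the agent and the recording would receive.
    """
    secure = "".join(p.digit for p in presses)
    if mode is Mode.CLAMP:
        agent = [(p.at_ms, NEUTRAL) for p in presses]
    elif mode is Mode.MASK:
        agent = [(p.at_ms, SUBSTITUTE) for p in presses]
    else:  # Mode.SUPPRESS: nothing at all reaches the agent path
        agent = []
    return secure, agent


presses = [KeyPress(d, 1000 + 400 * i) for i, d in enumerate("4242")]
for mode in Mode:
    secure, agent = split_paths(presses, mode)
    print(mode.value, agent)
```

In this model the clamped and masked outputs still carry the original timestamps. That is the timing side channel mentioned under masking, and it is why some implementations randomise spacing before the substitute tones reach the recording.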
PCI Compliance for DTMF Payments
PCI DSS does not name DTMF specifically, but card digits carried as DTMF count as cardholder data the moment they enter your environment. The decisive question for compliance is whether the DTMF tones ever traverse a system you operate, are stored by you, or reach any point where you could decode them.
If a customer enters their card via DTMF and the tones pass through your CCaaS platform, your call recording system, or your agents' headsets — even briefly — the entire path is in PCI scope. That means PCI DSS controls apply to your network, your storage, your access management, your audit logs, and the call recordings themselves.
If the card is captured before it enters your environment, by a PCI Level 1 service provider handling the secure capture at the point of payment, and the decodable digits never reach your systems, you can claim significant scope reduction. Your contact centre is no longer processing card data. The card is handled in the service provider's certified environment and forwarded to the gateway over a protected channel.
The common SAQ for merchants using a fully descoped DTMF service is SAQ A, the lightest of the self-assessment questionnaires, applicable when all card data handling is outsourced. SAQ A-EP is defined for partially outsourced e-commerce channels, so it usually comes into play only where a web component you host is part of the payment flow; confirm which SAQ applies with your acquirer or QSA. The full SAQ D (the heaviest) is what you're trying to avoid.
For documentation: ask any DTMF payment vendor for their Attestation of Compliance (AOC) as a Level 1 Service Provider. If they can't produce one, they cannot give you scope reduction — and you remain in full PCI scope regardless of what their marketing claims.
How DTMF Payment Processing Actually Works
In a typical PCI-compliant DTMF payment flow:
1. The customer is on a call with a live agent, an IVR, or an AI voice agent.
2. The payment step is triggered, handing card capture to the payment provider's secure, PCI-certified session at the point of payment.
3. The customer enters their card number on their keypad.
4. The provider captures the digits inside its certified environment, so the call's main audio path carries no decodable card data, and forwards the card to the payment gateway over a protected channel.
5. The gateway processes the transaction, runs the authorisation, and returns a token plus result.
6. The agent or system sees only the result, approved or declined, with a tokenised reference. No card number ever appears on screen, in logs, or in the recording.
7. The call continues, with the result confirmed back to the customer.
Compared with reading a card number aloud, secure capture adds a short step to the call, and the business never touches the card data.
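The data boundary in the flow above can be sketched in a few lines of Python. Every class and function here (`SecureCaptureSession`, `FakeGateway`, `PaymentResult`, `agent_view`) is a hypothetical stand-in rather than a real provider or gateway API. The point is only which side of the boundary ever holds the card number.

```python
from dataclasses import dataclass
import secrets


@dataclass(frozen=True)
class PaymentResult:
    approved: bool
    token: str  # gateway-issued reference, not the card number


class FakeGateway:
    """Stand-in for a payment gateway: authorises and returns a token."""

    def authorise(self, pan, amount_pence):
        # Toy approval rule for the sketch; a real gateway contacts the issuer.
        approved = len(pan) == 16 and pan.isdigit() and amount_pence > 0
        return PaymentResult(approved, "tok_" + secrets.token_hex(8))


class SecureCaptureSession:
    """Stand-in for the provider's certified environment.

    The card number exists only inside this object; nothing it returns
    to the caller contains the digits.
    """

    def __init__(self, gateway):
        self._gateway = gateway

    def capture_and_pay(self, keypad_digits, amount_pence):
        pan = "".join(keypad_digits)  # capture happens inside the boundary
        return self._gateway.authorise(pan, amount_pence)  # forwarded to the gateway


def agent_view(result):
    """What the agent desktop or call log is allowed to show."""
    status = "approved" if result.approved else "declined"
    return f"Payment {status} (ref {result.token})"


session = SecureCaptureSession(FakeGateway())
result = session.capture_and_pay("4242424242424242", amount_pence=2500)
print(agent_view(result))
```

The merchant-side code only ever receives a `PaymentResult`, so there is no variable on that side of the boundary that could leak the card number into a screen, log, or recording.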
What to Look For in a DTMF Payment Provider
The market has consolidated around half a dozen serious providers and several dozen white-label resellers. Picking among them comes down to five questions.
1. PCI DSS Level 1 Service Provider designation? Non-negotiable. Anything less means scope-reduction claims won't hold up at audit. Ask for the AOC, not just a marketing claim.
2. How and where is the card captured? The strongest model captures the card inside the provider's own PCI-certified environment at the point of payment, so the decodable digits never enter your stack. Be wary of approaches that lean on stripping or post-processing card data inside your own systems, which tends to reintroduce scope. Ask the provider to be specific about what their environment captures and what, if anything, touches yours.
3. Which carrier or platform does it actually run on today? Secure voice capture usually depends on a specific telephony provider rather than plugging natively into any CCaaS platform. Shuttle's voice capture, for example, runs on Twilio Pay today, so using it for voice means being a Twilio customer; a carrier-agnostic version is on the roadmap for later in 2026. For other platforms (Genesys, Five9, Talkdesk, Avaya, Cisco, NICE CXone, Amazon Connect, Twilio Flex), integrating a given platform may require technical work and, for a packaged build, a paid project. Don't assume a native, finished integration exists for your platform; confirm it.
4. Which payment gateways does it route to? Some DTMF providers are tied to a single gateway (their own or a parent company's). Others are gateway-agnostic. If you have an existing PSP relationship — or want PSP optionality — gateway-agnostic is materially safer.
5. Does it support AI voice agents? A DTMF system designed for live-agent calls may not fit cleanly into an AI voice agent flow where there's no human in the loop. Newer providers explicitly support automated voice channels; legacy ones often don't.
DTMF and the Move to AI Voice Agents
The DTMF problem changes shape when the agent isn't human. AI voice agents (built on platforms like PolyAI, Retell AI, Cresta, or custom LLM stacks) need secure capture for different reasons than live agents do. There's no human listening; the issue is the recording, the transcript, and the AI's own ability to hear and process card data it shouldn't have.
Two patterns dominate:
Pattern A — DTMF in the AI flow. The AI agent invokes a "secure payment" tool that hands control to a DTMF capture provider. The customer types digits as normal. The AI receives only the result (success/fail/token) and continues the conversation.
Pattern B — Voice transcription suppression. The AI agent stays in control but the speech-to-text layer is configured to strip card-shaped sequences before they reach the LLM, with payment capture handled out of band via a side-channel (SMS link mid-call, IVR transfer, agent escalation).
Pattern A maps cleanly onto existing DTMF infrastructure. Pattern B is newer and trickier — it requires confidence that no card data residue survives in transcripts, embeddings, or training data. For most AI voice payment deployments today, Pattern A is the safer architectural choice.
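To show what the transcript-stripping step in Pattern B involves, here is a minimal sketch that redacts card-shaped digit runs from text before it would reach an LLM. It assumes the speech-to-text layer outputs digits rather than words. The regular expression, the placeholder, and the function names are illustrative choices, and the Luhn checksum is used only to reduce false positives on other long numbers.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_SHAPED = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_transcript(text, placeholder="[REDACTED CARD]"):
    """Replace Luhn-valid card-shaped digit runs before text reaches the LLM."""

    def _replace(match):
        digits = re.sub(r"\D", "", match.group())
        return placeholder if luhn_valid(digits) else match.group()

    return CARD_SHAPED.sub(_replace, text)


print(redact_transcript("my card is 4242 4242 4242 4242 thanks"))
print(redact_transcript("order number 1234567890123 please"))
```

A filter like this misses numbers transcribed as words ("four two four two"), digits split across utterances, and expiry dates or security codes. Those gaps illustrate why Pattern B demands more confidence in the whole pipeline than Pattern A does.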
Shuttle's Approach to DTMF Payments
Shuttle is a PCI DSS Level 1 Service Provider. When it's time to pay, the secure card capture takes over at the point of payment, the customer enters their card on the keypad inside Shuttle's certified environment, and the card is forwarded to the merchant's chosen payment gateway over a protected channel. The card never reaches the merchant's platform, recordings, or agents.
How it runs today. Shuttle's voice card capture runs on Twilio Pay today, where Shuttle is Twilio's preferred payments partner. Using Shuttle for voice therefore means being a Twilio customer. The secure capture is scoped to the point of payment; having Shuttle stay on the full live call, or cleanly return the caller to the same agent afterwards, is not yet turnkey. A carrier-agnostic version that removes the Twilio requirement is landing later in 2026. For gateways or platforms where voice capture isn't available, payment links (sent by SMS or email, including mid-call) are the turnkey path.
Two further design choices matter for buyers comparing options:
No card storage. Shuttle does not operate a card vault. Tokenisation is handled by the underlying gateway. Shuttle hands the gateway's token back to the merchant but never holds the card data itself. This keeps the blast radius small and avoids putting the merchant in the position of trusting Shuttle as a card storage vendor as well as a transaction router.
Gateway-agnostic. Shuttle integrates with 40+ PSPs, so the capture isn't tied to a specific processor. Merchants can switch gateways without re-implementing the payment layer, and platforms can offer payments to merchants on different PSPs without forcing a switch. One caveat: a few gateways (such as Braintree) won't allow raw card data over voice, so they work for payment links but not for keypad capture.
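The gateway caveat amounts to a simple routing rule, sketched below. This is an illustration of the decision only, not Shuttle's API: `GatewayProfile`, `capture_paths`, and the path names are hypothetical, and the capability flag for any real gateway would need to be confirmed with the provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayProfile:
    name: str
    allows_raw_card_over_voice: bool


def capture_paths(gateway, voice_capture_available=True):
    """Return the capture methods usable for a gateway, preferred first."""
    paths = []
    if voice_capture_available and gateway.allows_raw_card_over_voice:
        paths.append("keypad_capture")
    paths.append("payment_link")  # sent by SMS or email, including mid-call
    return paths


print(capture_paths(GatewayProfile("example_psp", True)))
print(capture_paths(GatewayProfile("braintree", False)))
```

Modelling the restriction as a per-gateway capability keeps the fallback to payment links explicit, rather than leaving it to be discovered when a keypad capture fails.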
The full architectural detail is at docs.shuttleglobal.com/docs/twilio-intro.
Frequently Asked Questions
What is DTMF masking? DTMF masking replaces the real keypad tones a customer enters with substitute tones in the call audio path, so the agent and the call recording can't decode the original digits. The real digits are forwarded separately to the payment gateway.
Is DTMF payment processing PCI compliant? DTMF payment processing is PCI compliant when the card is captured inside a certified environment, typically a PCI DSS Level 1 Service Provider handling the secure capture at the point of payment, and the merchant's systems never receive the decodable card data. Implementation matters more than the label.
What's the difference between DTMF clamping and masking? Clamping replaces the customer's tones with flat, neutral audio (no digit pattern preserved). Masking replaces them with substitute tones that preserve some call cadence. Clamping is more aggressive and typically gives stronger PCI scope reduction.
Can AI voice agents take DTMF payments? Yes — modern DTMF capture providers integrate with AI voice agent platforms by exposing a "secure payment" tool the AI invokes mid-conversation. The AI receives only the result; the card data never enters the LLM pipeline.
Do I still need PCI compliance if I use a DTMF service? Yes, but the scope drops dramatically. With a Level 1 service provider handling card capture and your systems never touching the tones, you typically qualify for SAQ A (the lightest self-assessment questionnaire) instead of full SAQ D.
Can DTMF payments work over softphones and VoIP? Yes, with caveats. The interception still has to happen before the tones reach the call recording or agent — which is harder over softphones than over traditional SIP trunks. Confirm with your DTMF provider that they certify your specific softphone/VoIP stack.