#tech-nugget - D365 / Copilot Studio - Speech & DTMF - I want control...

Contact Centre

September 6, 2025

Disclosure: All information is accurate at the time of writing this article, things change, we change,
vendors change (and we all love them for it)… take everything with a pinch of salt, if you like salt…

You might also spot some AI-generated images made with Copilot. It’s so fun to use, how can I not do that….

Now, call me old-fashioned, but I usually live by the principle that if you can do something, it doesn’t always mean you should. What we should be doing is gathering the requirements first and then ensuring what’s possible is the right thing to do, and involving everyone who needs to be!

DTMF tones will always be around, despite what many would have you believe about them being replaced by new ways of intent detection; they are a fundamental fallback for either generational or technological requirements.

This is one thing that divides people’s understanding of Copilot Studio and what it is capable of. It’s a powerful piece of kit and many fun things can be done with it, but it does still require some out-of-the-box thinking. Many assume it is a ‘click – finish’ creator, but we still need to understand variable management, best practice and basic development logic flow to really make use of it.

Like most content, you may have seen this around a few times on blogs or forums, but I do like to note down my context around the stuff I put on here, and it gives me an excuse to have a beer while I’m writing – every cloud has a silver lining.

So, onto the fun bits. In the voice world, without wanting to teach anyone to suck eggs, voice is the primary driver of contact centre interactions. We are seeing shifts culturally and generationally to other forms of communication, granted, but the primary inbound channel is still voice…

IVRs are still in use today, some more complex than others, and can be the most frustrating thing you encounter. However, having the ability to press a button for an option is key, as mentioned above.

In D365CC, we use Copilot Studio bots to craft the customer journey: directing customers to the necessary steps, collecting data to inform our actions, and classifying it. DTMF is no exception.

Currently, the issue we have is that, as DTMF is classified as a tone, it comes under the speech way of working in the flow designer. This means you can, and are forced to, use a mixture of Speech and DTMF in your Question nodes. I won’t go into too much detail on how you do this. I may do a series of videos later on, but for now, this is how it works in the studio.

The question I often encounter is whether to use speech recognition in conjunction with IVR within the same flow. This combination can lead to problems, as background noises, such as tones or coughing, can disrupt the recognition process, frustrating users. While there are options in the flow designer for silence suppression and noise adjustments, these options aren’t perfect, especially in areas with poor signal. Therefore, I prefer to use pure DTMF tones.

Personal View: There is a wave going on with the next level of technology and what we can do with it. Ideally, AI intent recognition and Unified Routing should be the way forward, but with so many places just not having the data sets to work from, this ‘old’ way is still a great step to build things up before moving on to the new golden eggs. I can assure you, regardless of what you see and hear about what’s being done, only 10% of it is valid. We’re still learning and, importantly, the vendors are still trying to match the market. Slow and steady is the way forward, Rome wasn’t built in a day!

I find it frustrating that Copilot Studio and parts of the Microsoft stack are marketed as user-friendly when, really, a basic understanding of variables and some development work is still required. This often leads to searching online for specific answers, highlighting the importance of informative blog posts (like this one….).

One key capability is a variable that enforces DTMF-only input, essentially stopping voice input in a Question node while allowing the flow to continue, with interruption and barge-in working from the tones and keypad only. We can also make adjustments mid-flow, starting with DTMF-only input and then re-enabling voice input for the customer. This flexibility helps fine-tune the user experience.

Here’s how to do it:

Using Set variable value, we can change what we like as part of the conversation. This is not the only variable available, and it’s worth a look around to see what other delights are on offer, but for now we only need one slice of the cake: we are looking for OnlyAllowDTMF.

NOTE: It is a conversation flow variable, which means it will only affect the flow you are currently in. If we leave or go outside, it won’t have the same value. Setting it therefore makes the flow accept only DTMF in the remaining logic, removing the voice-input side of the process. We can still use the voice read-outs in the Questions without accepting voice from the user, “huzzah”, clever fun time stuff.

If you set this immediately after the primary trigger, it applies for the entire conversation. You can then unset it (set it to false) at the point where you want to allow voice again, which gives you flexibility when nesting flows.
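To make the set-then-unset idea a bit more concrete, here is a rough Python sketch of the behaviour. This is not Copilot Studio code and not how the platform is implemented: `Conversation`, `ask` and the `only_allow_dtmf` field are hypothetical names I've made up purely to model a flag that is set early, gates voice input, and is switched off again later.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical stand-in for conversation-scoped state (not a real API)."""
    only_allow_dtmf: bool = False
    log: list = field(default_factory=list)


def ask(conv, prompt, inputs):
    """Return the first caller input this question would accept.

    `inputs` is a list of (kind, value) tuples, where kind is "dtmf" or "speech".
    Speech is skipped while conv.only_allow_dtmf is True.
    """
    conv.log.append(f"PROMPT: {prompt}")  # the prompt is still read out by voice
    for kind, value in inputs:
        if kind == "speech" and conv.only_allow_dtmf:
            conv.log.append(f"ignored speech: {value}")
            continue
        return kind, value
    return None  # nothing usable was captured


conv = Conversation()

# Straight after the trigger: keypad only, so the cough is ignored.
conv.only_allow_dtmf = True
first = ask(conv, "Press 1 for sales, 2 for support.",
            [("speech", "(cough)"), ("dtmf", "2")])

# Later in the flow: set it back to false so voice is accepted again.
conv.only_allow_dtmf = False
second = ask(conv, "Tell us briefly what you need.",
             [("speech", "billing query")])

print(first, second)
```

The point of the model is simply that the prompt is spoken either way, while the flag decides whether anything the caller says (or coughs) counts as an answer.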

No more coughing in the background, or random voices to disrupt the speech side of things… it happens, believe me, not that I usually call nested IVRs while in the pub…

Two more tips:

1. Each node will have a barge-in option to allow customers to stop the flow when they get/select their choice. Don’t forget these, as they can easily be missed in your workflow and may lead to complaints if you have lengthy text in a single node.

2. Using the Speech & DTMF option will automatically add a read-out of the options for voice at the end of your own text. The setting to turn it off is tucked away, and it’s just another thing to remember, but you need to do it if you want voice input disabled. Why offer the options by voice if the caller can’t use them? It’s set per node (at the moment), so check them all!
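Because both tips are per-node checks, a little checklist can help when a flow gets big. The sketch below assumes you have jotted your nodes down by hand as a list of dictionaries; the field names (`barge_in`, `reads_out_voice_options`) and the 200-character threshold are my own invention for illustration, not anything exported by Copilot Studio.

```python
def audit_nodes(nodes, dtmf_only=True, long_text=200):
    """Return (node name, warning) pairs for a hand-written list of nodes.

    All field names here are hypothetical; adapt them to your own notes.
    """
    warnings = []
    for node in nodes:
        name = node["name"]
        # Tip 1: lengthy text with no barge-in is a complaint waiting to happen.
        if len(node["message"]) > long_text and not node.get("barge_in", False):
            warnings.append((name, "long message without barge-in"))
        # Tip 2: no point reading out voice options if voice input is disabled.
        if dtmf_only and node.get("reads_out_voice_options", False):
            warnings.append((name, "voice options still read out"))
    return warnings


nodes = [
    {"name": "MainMenu", "message": "Press 1 for sales. " * 20,
     "barge_in": False, "reads_out_voice_options": True},
    {"name": "Confirm", "message": "Press 1 to confirm.",
     "barge_in": True, "reads_out_voice_options": False},
]

result = audit_nodes(nodes)
for name, warning in result:
    print(f"{name}: {warning}")
```

It won’t replace clicking through the designer, but it is a cheap way to keep track of which nodes you have actually checked.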

Also, hats off to Neil Parkhurst, who really goes the extra mile with this stuff as well; many blogs just have different twists!

Until the next #tech-nugget, people!

Ben