In today’s fast-paced digital world, customer experience is paramount. Businesses are constantly seeking innovative ways to engage with their customers and provide seamless, personalized interactions on self-service channels. One of the common challenges in customer interactions is the mismatch between customer sentiment and the voice agent’s response. This can lead to frustration and dissatisfaction, ultimately impacting customer loyalty. For instance, a customer expressing frustration might receive a neutral or overly cheerful response from the voice agent, exacerbating the situation. 

One such innovation that addresses this issue is the use of high definition (HD) voices from Azure AI Speech in Dynamics 365 Contact Center, which revolutionizes the way we interact with technology. 

What are high definition voices? 

High definition voices (HD voices) use neural text-to-speech technology to generate highly natural and human-like speech. These voices are trained on millions of hours of multilingual data, enabling them to accurately interpret input text. Then, they can generate speech with the appropriate emotion, pace, and rhythm without manual adjustments. This results in more engaging and lifelike interactions for users. 

Key features of HD voices:

  • Human-like speech generation: HD voices can generate speech that closely mimics natural human conversation, including spontaneous pauses and emphasis. This makes interactions feel more authentic and comfortable for users. 
  • Emotion detection and tonal adjustment: HD voices can automatically detect emotions in the input text and adjust their speaking tone in real-time to match the sentiment. This ensures that the voice agent’s response is empathetic and contextually appropriate. 
  • Consistent voice persona: HD voices keep the same voice persona as their neural (non-HD) counterparts while producing more natural-sounding output.

Generative responses and dynamic content 

As customers embark on their journey towards agentic architectures, they increasingly rely on generative responses to provide answers grounded in comprehensive knowledge sources. Moreover, the ability to generate dynamic content on the fly allows businesses to address a wide range of customer needs without relying on extensive pre-programmed responses. This flexibility is particularly valuable when customer inquiries are complex or varied, because the voice agent can provide tailored solutions that are both relevant and timely.

This shift represents a significant evolution in the information available on self-service channels. Generative responses use advanced AI algorithms to dynamically generate rich, accurate content. On voice channels, this personalization is complemented by emotionally aware, engaging voice responses from HD voices. As businesses continue to prioritize agentic architectures, the adoption of generative responses will play a crucial role in delivering more engaging, empathetic, and effective interactions.

Getting started 

When configuring a voice channel in the Copilot Service admin center, select one of the newly available HD voices, which are denoted by the model type in parentheses at the end of the voice name.

Figure: Voice agent setup in the Dynamics 365 Copilot Service admin center.

When you build a voice agent in Copilot Studio, any message you send, any question you play to a caller, and any generative answer is spoken with a more emotionally aware voice. To increase or decrease the variance in the speech response, adjust the temperature parameter (ranging from 0 to 1) using Speech Synthesis Markup Language (SSML) where you define a message for the caller. Increase the temperature for more variable responses; decrease it for more stable responses.

  Here is a test  
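As a rough illustration, the sample message above might be wrapped in SSML with a temperature setting along the following lines. This is a minimal sketch, not the product's required format: it assumes the `parameters="temperature=..."` attribute on the `<voice>` element used for HD voices in Azure AI Speech, the voice name shown is only an example, and the helper `build_hd_voice_ssml` is hypothetical. Whether Copilot Studio expects the full `<speak>` envelope or only the inner tags is not stated here, so check the voice channel documentation before relying on it.

```python
from xml.sax.saxutils import escape, quoteattr


def build_hd_voice_ssml(text, voice_name, temperature, lang="en-US"):
    """Build an SSML string for an HD voice (hypothetical helper).

    Assumes the temperature is passed through a `parameters` attribute
    on the <voice> element, in the range 0 to 1 described in the article.
    """
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")

    # Higher values give more variable speech, lower values more stable speech.
    params = f"temperature={temperature:g}"

    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="http://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice_name)} parameters={quoteattr(params)}>"
        f"{escape(text)}"  # escape &, <, > so the message stays valid XML
        "</voice></speak>"
    )


if __name__ == "__main__":
    # Example voice name; substitute the HD voice selected for your channel.
    print(build_hd_voice_ssml("Here is a test", "en-US-Ava:DragonHDLatestNeural", 0.8))
```

Validating the range before building the string keeps an out-of-range temperature from reaching the speech service, and escaping the message text prevents characters such as `&` from breaking the markup.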

HD voices represent a significant advancement in text-to-speech technology. By providing human-like speech generation, emotion detection, and tonal adjustment, HD voices can transform customer interactions and address the challenge of mismatched sentiment. As businesses continue to prioritize customer experience, the adoption of HD voices will undoubtedly play a crucial role in delivering more engaging, empathetic, and effective interactions. 

Learn more

Watch a quick video introduction.

To learn more, read the Azure AI Foundry Blog post "New HD voices preview in Azure AI Speech: contextual and realistic output evolved," or read the documentation: "Introduction to the voice channel" on Microsoft Learn.
