As organizations navigate the ongoing COVID-19 pandemic and map out the new normal of work, we’ve had many reach out for guidance on how to deliver voice and audio as an important part of their remote work strategies and help employees maintain productivity. Call center agents, doctors performing telemedicine visits, information workers: so many people, across so many industries, rely on audio communications in their work (and are becoming even more reliant on them).

With more than 15 years with Citrix Consulting, most recently as a Customer Success Enterprise Architect, I’ve interacted with hundreds of customers as a technical SME, guiding and building environments for all types of scenarios. Over the years, the view of voice traffic has shifted from “How can we avoid this?” to “How do we make this successful?” For each customer there are unique architecture and configuration specifics to consider, and one size does not fit all.

In this blog post, I’ll take a high-level look at architectures behind delivering voice audio to remote workers and provide guidance on how to enable your users to work productively in the new era of remote work.

Some Considerations

When tackling an audio-heavy use case, I usually start with the following elements, which will help define requirements and constraints.

  • The voice application(s): Which collaboration software or unified communication solutions do people use? Is it Microsoft Teams, Zoom, Avaya, Cisco, or something else with a specific feature to work with Citrix technology?
  • Security posture: Should the apps and data that users interact with be wholly contained within the data center? Are users permitted to connect their endpoint devices from the internet to hosted communications servers?
  • Network bandwidth to users: Are users geographically dispersed (international or domestic) and coming from well-connected sources?
  • Endpoints: Is this a BYOD or corporate-managed solution? Are employees using thin clients, Windows, Mac, or something else?

Now that we’ve outlined those critical elements, let’s look at some common architectures I’ve seen.

VPN

In the rush to enable remote access, VPN has exploded in popularity. The reason is simple: It’s often the quickest solution to deploy. Give your users instructions to install the VPN client software, enable MFA, connect to the network, and let employees find their resources to work.

Some organizations enable a security posture check for endpoints, but many don’t. Often, they’re moving fast and don’t have time to gather requirements for the level of network access each app requires, so users get it all. As a result, users can freely transfer and store data on their local devices as a byproduct of connecting through a VPN tunnel. The diagram above depicts this scenario: apps and browsers installed on the local endpoint connect through the VPN to applications, data, and communications servers. That data can then live on the endpoint, on the user’s unsecured network.

While it’s possible to create a secure and performant VPN solution, it generally does not yield the best long-term solution; controlling data is nearly impossible, and all apps running at the endpoint need to talk through the VPN to back ends in the data center, which can result in poor user performance.

Out-of-Band Audio through HDX RealTime Optimization

Citrix Virtual Apps and Desktops offers secure remote access where apps and data stay within the data center (on-prem or in a public cloud). For the rest of this post, we’ll focus on Citrix Virtual Apps and Desktops. There are two main deployment options when it comes to audio.

The first deployment method is preferred. It uses a media engine on the endpoint to offload voice traffic in an optimized deployment, sometimes labeled “HDX RealTime Optimized Support.” As the diagram above shows, the application runs within the virtual desktop (or app), which the user continues to interact with through the Citrix Workspace app.

However, when users make or receive a call, the media engine processes that traffic locally. This works only with specific communications systems, such as Microsoft Teams, Avaya, Cisco Jabber, and Five9. You can find a compatibility list here. This is typically the best-performing solution because the endpoint processes the audio directly and talks straight to the communications back end, which is sometimes a cloud service anyway, avoiding the additional latency and hops of traversing back into the data center. The audio is also delivered in the app’s native format, avoiding additional encoding or decoding.

On the flip side, the redirection is not without drawbacks. First, client software is required on the endpoint. In the landscape of remote work, it can be a challenge to distribute software to corporate-managed devices or to give end users instructions for installing yet another piece of software on their BYOD devices.

There are also usually varying degrees of functionality among OS platforms (Windows, Mac, and thin clients), so the applicability of the solution may vary across the target user base.

Finally, some unified communications systems require internet-facing edge voice servers. Because the audio is coming from the endpoint device (on the internet), there would need to be some network connectivity back to the data center.

Generic Audio Redirection

Generic audio redirection is an alternative architecture where voice traffic traverses within the Citrix ICA protocol. The virtual desktop takes audio and delivers it to the endpoint through the bi-directional Citrix audio virtual channel, which compresses the audio from the microphone and speakers through preset codecs based on Citrix policy settings. The main benefit is that no additional network connections or software components are needed on top of what’s already required to connect to Citrix: the Citrix Workspace app, the Citrix Virtual Apps and Desktops infrastructure, and Citrix Gateway to secure external access.

There are several configurations and knobs to provide a quality experience, but I see this going wrong for most customers! It’s not uncommon to hear complaints of choppy or digitized audio through Citrix. Ultimately, we are bound by the network characteristics between the user and the virtual desktop, but there are good lessons to learn on improving quality. Here are some tips that can yield significant UX gains:

  • Set audio quality to Medium: By default this Citrix policy is set to High. For voice traffic the recommendation is Medium, which drops the bi-directional limits to 16,000 Hz and 64 kbps of network bandwidth. That’s generally sufficient for speech.
  • Enable UDP audio: This is often missed, and just enabling this configuration should make a significant difference. The trickier part is that UDP needs to be enabled at several levels: the Citrix policy; the VDA component; DTLS on the Gateway to allow UDP 443 from the endpoint; and all the firewall ports in between (UDP 443 externally, UDP 16500-16509 to the VDA, etc.).
  • EDT: Some may think they’re already covered because they enabled Adaptive Transport for the whole session, which already uses UDP traffic, but that is not necessarily the case. While EDT runs over UDP, it is still a reliable transport mechanism (although less chatty than TCP, which is why it performs well on low-bandwidth, high-latency links). As a result, UDP audio still delivers better quality than EDT alone, but both can and should be combined.
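Because UDP audio has to be enabled at every layer listed above, it can help to track the layers as an explicit checklist. Here is a minimal Python sketch of that idea. The class, field, and function names are hypothetical and purely illustrative; they are not Citrix settings or APIs, and the values would have to be filled in by hand after checking each layer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UdpAudioPath:
    """Hypothetical checklist of the layers that must allow UDP audio."""
    policy_udp_audio_enabled: bool        # Citrix policy
    vda_udp_audio_enabled: bool           # VDA component
    gateway_dtls_enabled: bool            # DTLS on the Gateway
    firewall_udp_443_open: bool           # endpoint to Gateway (external)
    firewall_udp_16500_16509_open: bool   # toward the VDA


def missing_layers(path: UdpAudioPath) -> List[str]:
    """Return the names of the layers that are not yet enabled."""
    return [name for name, ok in vars(path).items() if not ok]


# Example: DTLS and the VDA-side firewall range were overlooked.
path = UdpAudioPath(True, True, False, True, False)
print(missing_layers(path))
# ['gateway_dtls_enabled', 'firewall_udp_16500_16509_open']
```

The point of the sketch is that a single unchecked layer is enough to keep audio off UDP, so an empty list is the goal before judging audio quality.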

There are many other options, including multi-port ICA, QoS, generic USB redirection, and more. My colleague Jeff Qiu details some configuration options in his blog post on optimizing VoIP performance. He calls out using “Low quality,” but I would say that’s an “it depends” setting. If you can get away with it, Medium quality is my preference and will deliver better fidelity. Check out this great Citrix Docs resource on audio features.

SD-WAN

The last architecture I want to highlight can increase reliability for remote workers. For business-critical workloads, where lost time means lost revenue, some customers are considering an SD-WAN solution that uses multiple internet paths, so if a user’s landline goes down, there’s a redundant link like wireless LTE. This is shown in the example diagram above, where the remote SD-WAN has a wireless internet connection through LTE and is hard-wired through the user’s ISP (e.g., Comcast or AT&T).

Citrix SD-WAN can transmit traffic across both links and process only whichever of the duplicate packets arrives first, which promotes sustained performance. It can also be layered with virtualization technology to keep the data out of the home network.
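To make the duplicate-packet idea concrete, here is a minimal Python sketch of the general technique: send each packet over both links, then keep the first copy of each sequence number and discard the later one. This is an illustration under my own assumptions, not Citrix SD-WAN’s actual implementation; the `Packet` type, the link labels, and the timings are all hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Packet(NamedTuple):
    seq: int           # sequence number shared by both copies
    link: str          # hypothetical link label, e.g. "isp" or "lte"
    arrival_ms: float  # arrival time at the receiving side


def first_arrivals(packets: Iterable[Packet]) -> Iterator[Packet]:
    """Yield the earliest copy of each sequence number; drop later duplicates."""
    seen = set()
    for pkt in sorted(packets, key=lambda p: p.arrival_ms):
        if pkt.seq not in seen:
            seen.add(pkt.seq)
            yield pkt


received = [
    Packet(1, "isp", 10.0), Packet(1, "lte", 35.0),
    Packet(2, "lte", 45.0),                           # ISP copy was lost
    Packet(3, "isp", 50.0), Packet(3, "lte", 75.0),
]
for pkt in first_arrivals(received):
    print(pkt.seq, pkt.link)
# 1 isp
# 2 lte
# 3 isp
```

Note how packet 2 still gets through even though its copy on the ISP link was lost, which is the reliability benefit for voice traffic.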

Citrix SD-WAN is packaged as a physical device in the home that connects to an SD-WAN appliance in the data center that contains a stateful firewall to permit/restrict access to resources. It’s also possible to deliver voice traffic out of band from the Citrix session and bypass challenges involved with generic audio redirection. Check out the Citrix Tech Zone for more information on how Citrix SD-WAN can support your remote work strategy.

Summary

As is typical with Citrix technology, there are many ways to deliver voice to remote workers. The best solution will depend on factors including app workload, security posture, end-user segmentation, current IT capabilities, business requirements, and more.

Looking ahead, I expect virtualization solutions to play a large role in securing data from exfiltration, providing better performance for chatty apps, and reducing help desk calls through better manageability. There are a couple of configuration options, including delivering the audio stream out of band, which can provide a more native effect. But I’m finding that, due to situational constraints (security, manageability, etc.), organizations are looking to figure out how to deliver audio within the Citrix session. The good news? We have the expertise and the technology, and we have helped many customers succeed in adapting to our new normal.

Learn More

Visit the Citrix TIPs page for tips, tricks, and technical takeaways from our Citrix Customer Success engagements on topics ranging from virtualization and networking to mobility and cloud. There, you’ll find links to all our previous webinars, available on demand. And reach out to your Citrix sales rep to learn how we can help you to deploy work-from-home solutions successfully.