Bank Systems & Technology is part of the Informa Tech Division of Informa PLC



Sounds of Success

Automated speech recognition technology is clearing the way for voice-generated financial transactions over the phone

Interactive voice response systems, a staple offering in banks, frequently prompt callers to speak up. But hurling expletives at an inflexible, interminable list of menu choices doesn't count for much.

In response, many financial services firms are turning to automated speech recognition, or ASR, a technology that goes beyond the usual rhetoric about listening to the customer. Sophisticated speech recognition software has already taken hold among brokerages, and now has started to wend its way into retail banking applications.

But ASR isn't just about making customers less prone to profanity. Callers can perform tasks in fewer steps because speech recognition permits a larger number of options at each prompt. This shaves one minute from a typical five-minute inquiry, saving banks approximately $0.07 per call at current rates, said Jerry Silva, senior analyst for retail banking at TowerGroup. "The customer service aspects are almost secondary to the cost savings aspects of it," he said.
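The arithmetic behind Silva's estimate can be sketched roughly as follows. The article gives only the one-minute reduction and the $0.07 saving per call. The per-minute cost below is inferred by assuming the whole saving comes from that one minute, and the annual call volume is a purely hypothetical figure chosen for illustration.

```python
# Figures quoted in the article.
TYPICAL_CALL_MINUTES = 5
MINUTES_SAVED_PER_CALL = 1
SAVING_PER_CALL_USD = 0.07


def implied_cost_per_minute(saving_per_call, minutes_saved):
    """Per-minute call cost, ASSUMING the saving is due only to shorter calls."""
    return saving_per_call / minutes_saved


def fraction_of_call_saved(minutes_saved, typical_minutes):
    """Share of a typical call's duration that ASR removes."""
    return minutes_saved / typical_minutes


def annual_saving(calls_per_year, saving_per_call=SAVING_PER_CALL_USD):
    """Total yearly saving for a given (hypothetical) automated call volume."""
    return calls_per_year * saving_per_call


# Hypothetical volume, not taken from the article.
HYPOTHETICAL_CALLS_PER_YEAR = 10_000_000

rate = implied_cost_per_minute(SAVING_PER_CALL_USD, MINUTES_SAVED_PER_CALL)
share = fraction_of_call_saved(MINUTES_SAVED_PER_CALL, TYPICAL_CALL_MINUTES)
total = annual_saving(HYPOTHETICAL_CALLS_PER_YEAR)

print(f"Implied cost per minute: ${rate:.2f}")
print(f"Share of call time saved: {share:.0%}")
print(f"Saving on {HYPOTHETICAL_CALLS_PER_YEAR:,} calls: ${total:,.0f}")
```

Under these assumptions, a small saving per call only becomes material at high call volumes, which fits Silva's point that cost, rather than customer service, drives the business case.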

Much of the savings stems from capturing callers who would have "zeroed out" to a live customer service representative after exhausting the capabilities of the touch-tone system, of their own rotary-dial phones or of their own patience.

First Union National Bank, headquartered in Charlotte, N.C., plans to deploy an ASR system in the fall, using technology from SpeechWorks, Boston, and an IVR system from InterVoice-Brite, Dallas, over a Sprint network. "We're supplementing our staff with experts in user interface design and development from SpeechWorks and InterVoice-Brite, but it's our business analysts and our developers working as a team with these others," said Wendy Beaupre, vice president of vChannels at First Union. "We have an army of people right now testing it."

The bank examined speech recognition technology when upgrading its existing call center hardware and software. "We were going to have to make an investment anyway," said Beaupre. "Nonetheless, the business case still stood on its own."


Capabilities of the First Union system include general banking inquiries and transactions on DDA, savings, credit card and consumer loan accounts. Customers can also order statement copies, reorder checks, and hear "mini-statements" or the location of the nearest branch or ATM. At the same time, touch-tone users can continue to press away. "We're going to promote and hope that they use the speech application, but we're not going to go cold turkey on our touch-tone callers and force them to use it either," said Beaupre. "They're two separate applications; we didn't bother to make a single mixed application to start, as it would suboptimize the speech application."

Although customers can speak into First Union's virtual ear, they won't be able to say whatever comes to mind just yet. "The majority of our applications will be a directed dialogue," said Nancy Staley, vice president and IVR systems manager at First Union. "If you aren't successful at a prompt, it'll give you a better description of what it's looking for so that you have a better chance of answering it correctly the second time."

A directed dialogue prompts the caller to say one of a list of possible options. "We're talking tens of thousands of words in these vocabularies," said TowerGroup's Silva. For example, people can now state the name of a movie to find out the next show time, or the name of a listed company to find out its last traded share price. "ASR has been a killer app for brokerage companies," said Silva. "You know how hard it is to type 'alpha' on the keypad?"
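The directed-dialogue behavior described above, in which a caller who misses a prompt hears a fuller description on the second try, can be sketched as follows. This is an illustrative sketch only, not First Union's implementation: `recognize` is a hypothetical stand-in for a speech recognition engine, and the prompt wording and option names are invented for the example.

```python
def directed_prompt(recognize, options, short_prompt, detailed_prompt,
                    max_attempts=2):
    """Ask the caller to say one of `options`.

    `recognize` is a HYPOTHETICAL callable standing in for an ASR engine:
    it plays the prompt text and returns the caller's utterance as a string.
    After a failed attempt, the more descriptive prompt is used instead.
    Returns the matched option, or None if every attempt fails (at which
    point a real system might hand the caller to a representative).
    """
    prompt = short_prompt
    for _ in range(max_attempts):
        heard = recognize(prompt).strip().lower()
        if heard in options:
            return heard
        prompt = detailed_prompt  # give a better description next time
    return None


def scripted_caller(utterances, prompts_heard):
    """Fake recognizer for demonstration: replays canned caller utterances
    and records each prompt it was asked to play."""
    remaining = iter(utterances)

    def recognize(prompt):
        prompts_heard.append(prompt)
        return next(remaining)

    return recognize


OPTIONS = {"balance", "transfer", "statement"}  # invented option names
SHORT = "What would you like to do?"
DETAILED = "Please say 'balance', 'transfer' or 'statement'."

heard_prompts = []
caller = scripted_caller(["um, my money", "Balance"], heard_prompts)
choice = directed_prompt(caller, OPTIONS, SHORT, DETAILED)
print(choice)         # the option matched on the second attempt
print(heard_prompts)  # the short prompt, then the detailed one
```

Restricting each prompt to a fixed option set is what distinguishes this from natural-language recognition, where the caller's phrasing is not constrained to a list.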


"Natural language" speech recognition lets callers phrase requests in a relatively unstructured fashion. While First Union will use natural language capabilities for some functions, such as at the prompt for moving money between accounts, other banks are pushing the technology into other areas.

Lloyds TSB in the United Kingdom operates in a much different consumer banking environment, with far higher usage of electronic payment systems. "Customers actually pay their bills to their utility companies and things like that using telephone banking, which is not something that occurs in the States," said Anne Gunther, director of telephony and electronic channels. "Therefore, our telephone banking systems are somewhat more comprehensive than you typically find in the U.S."

Lloyds has about 1.3 million customers who regularly speak to a "PhoneBank" agent, 900,000 customers on the "PhoneBank Express," the natural language IVR, and over 1.5 million Internet banking users. "If they want a quick balance, the tendency is to use PhoneBank Express, because it's very, very quick and it's very easy," said Gunther.

"However, customers who want to pay bills are starting to use PC-based banking, because it's easier if you can see what you're doing instead of just hearing what you're doing," added Gunther. "If they've got some queries, or they want to talk about products and services that they need, they're more likely to speak to a person by phone over the PhoneBank manned service."

Lloyds has been working since 1997 with Nuance, Menlo Park, Calif., and Periphonics, Brampton, Ontario, specifically on natural language speech recognition, with customer usability testing from the Center for Communications Interface Research at Scotland's Edinburgh University. "We knew that we had to get beyond the limited vocabulary and/or touch tone because of the things that we wanted our customers to be able to do," said Gunther. "The whole point, of course, is that you can actually speak to it almost as if you were speaking to another human being."

At first, customers aren't sure what the system can do. "When customers first start using the express service they will actually break out to a human, to check their understanding of how something works," said Gunther. "But once they become regular users, then the occasions when they actually need to break out or when the service doesn't work for them are negligible."

Customer service representatives are trained to sell the ASR service as they would any other revenue-generating product. "We've actually sold PhoneBank Express to customers who phone PhoneBank regularly, who just ask for a balance," said Gunther. "If you use Express, it takes about half the length of time if all you want is a very simple, quick bit of information."

Once placed into service, the system only requires occasional maintenance. "Essentially, the technology is robust once you have it tuned to your set of accents and so forth," said Gunther. "About every two to three years you need to go through and take a long, hard look at all your recordings and make sure that they're still up to date and that the language is still right."

Adjustments can also be made to deal with the increasing sophistication of the user base. "For instance, as more customers become expert users, you want it to talk faster," said Gunther. The system's recognition capabilities can likewise be tuned to better recognize customers who speak at a faster clip.

While it would be possible to provide separate recognition engines for beginning and advanced users, Lloyds TSB will allow customers to take the "Fast Path" themselves based on the speed with which they interrupt the prompt. "Over the next six to 12 months, we will increasingly be in a position to allow customers to 'Fast Path,' and let the system check with the customer for bits of information that it's got missing," said Gunther.
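One way to picture the "Fast Path" idea is a small tracker that switches to terse prompts once a caller has interrupted several prompts in a row quickly. The article does not describe how Lloyds TSB implements this, so everything below is an assumption for illustration: the class name, the 1.5-second threshold and the three-prompt window are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PromptPacer:
    """Hypothetical tracker that decides when a caller gets terse prompts."""
    fast_threshold_s: float = 1.5  # assumed cutoff for a "quick" interruption
    window: int = 3                # assumed number of recent prompts considered
    barge_ins: List[Optional[float]] = field(default_factory=list)

    def record(self, seconds_before_interrupt: Optional[float]) -> None:
        """Store how long the prompt played before the caller spoke.

        None means the caller listened to the whole prompt.
        Only the most recent `window` observations are kept.
        """
        self.barge_ins.append(seconds_before_interrupt)
        self.barge_ins = self.barge_ins[-self.window:]

    def use_fast_path(self) -> bool:
        """True only if every prompt in a full window was interrupted quickly."""
        if len(self.barge_ins) < self.window:
            return False
        return all(t is not None and t <= self.fast_threshold_s
                   for t in self.barge_ins)

    def choose_prompt(self, full: str, terse: str) -> str:
        """Pick the prompt wording suited to the caller's recent behavior."""
        return terse if self.use_fast_path() else full


pacer = PromptPacer()
for seconds in (1.0, 0.8, 1.2):  # caller cuts in quickly three times
    pacer.record(seconds)
print(pacer.choose_prompt("Which account would you like to hear about?",
                          "Account?"))
```

Because the decision rests on the caller's own behavior, a single recognition engine can serve both beginners and expert users, which matches the approach Gunther describes.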


The natural complement to natural language speech recognition would be the ability to perform identity verification using voice, as in, "Hi, this is (your name here). Pay my bills, please."

Lloyds TSB and First Union currently authenticate caller identity using a membership number and a PIN. First Union also provides the option to enter a specific account number along with the caller's Social Security number. But both banks are taking steps to incorporate voiceprint analysis into the customer experience.

"We've been looking at voice authentication for quite some time, and next year we'll be looking at it again much more closely, trying to build a business case for it," said First Union's Beaupre. "What we have learned from doing our own analysis and by talking to others is that it makes sense to start with advanced speech and then go to voiceprint, rather than vice versa."

"We're well down the track," said Lloyd TSB's Gunther. "We're about to do some prototyping."

And there's still plenty of room for innovation in speech recognition technology, particularly for services geared to mobile customers.

"Speech recognition has the potential to accelerate the adoption of the wireless telephone as the preferred mobile device for many financial services transactions," said Dennis Behrman, an analyst from Meridien Research, Newton, Mass. "But before the technology can do this, we believe that applications must be able to successfully respond to and resolve complex queries-such as, 'I lost my wallet and I'm two thousand miles from my home. What should I do?'"
