Analysis of Automatic Speech Recognition Failures in the Car
Ford Motor Company

SAE Technical Papers (1906-current) Available online

Format:
Book
Conference/Event
Author/Creator:
Rangarajan, Rangarajan, author.
Contributor:
Amman, Scott
Busch, Leah
Conference Name:
WCX SAE World Congress Experience (2019-04-09 : Detroit, Michigan, United States)
Language:
English
Physical Description:
1 online resource
Place of Publication:
Warrendale, PA: SAE International, 2019
Summary:
This paper explores an approach to analyzing voice recognition data to understand how customers use voice recognition systems. The analysis helps identify automatic speech recognition (ASR) failures and usability-related issues that customers encounter while using the voice recognition system. The paper also examines the impact of these failures on the individual speech domains (media control, phone, navigation, et cetera). Such information can be used to improve the current voice recognition system and direct the design of future systems. Infotainment system logs, audio recordings of the voice interactions, their transcriptions, and CAN bus data were identified as rich sources of data for analyzing voice recognition usage. Infotainment logs help show how the system interpreted or responded to customer commands and at what confidence level. The audio recordings of the voice interactions and their transcriptions show what command the customer issued and whether it adheres to the grammar of the voice recognition system. The system's interpretation of the command from the logs can be compared with the command actually issued to determine whether the system recognized it correctly. CAN bus data can help determine whether voice recognition failures occur due to noise sources in the car such as HVAC blower noise, engine noise, et cetera. These data sources can also be tied together to detect commands that are incorrectly recognized by the system. When the causes of failures by domain were studied, the navigation domain was found to be the most prone to errors. Natural language understanding and single command navigation would improve the success of the navigation domain. The media control and phone domains were significantly less error-prone. Errors that occurred were largely due to core speech recognition, and a majority of those errors could be handled by examining the participant's habits.
Notes:
Vendor supplied data
Publisher Number:
2019-01-0397
Access Restriction:
Restricted for use by site license
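
Illustration:
The Summary describes comparing the system's interpretation of a command (from the infotainment logs) with the transcribed command to find misrecognitions by speech domain. The following is a minimal sketch of that comparison, not the authors' implementation. The record layout, the field names (utterance_id, domain, recognized, confidence), the exact-match rule, and the sample values are all assumptions made up for illustration; the paper's actual data format and matching criteria are not given in this record.

```python
from collections import defaultdict

# Hypothetical sample data, invented for illustration only.
# Each log entry is what the system believed it heard, per utterance.
log_entries = [
    {"utterance_id": 1, "domain": "navigation", "recognized": "find address", "confidence": 0.41},
    {"utterance_id": 2, "domain": "navigation", "recognized": "cancel route", "confidence": 0.88},
    {"utterance_id": 3, "domain": "media", "recognized": "play radio", "confidence": 0.93},
    {"utterance_id": 4, "domain": "phone", "recognized": "call home", "confidence": 0.90},
]

# Hypothetical human transcriptions of the audio, keyed by utterance id.
# Utterance 4 has no transcription and is skipped below.
transcriptions = {
    1: "find an address",
    2: "Cancel  route",
    3: "play radio",
}


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(text.lower().split())


def error_rate_by_domain(entries, transcripts):
    """Return {domain: fraction of transcribed utterances the system got wrong}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for entry in entries:
        spoken = transcripts.get(entry["utterance_id"])
        if spoken is None:
            continue  # no ground truth available for this utterance
        totals[entry["domain"]] += 1
        # Assumed rule: a mismatch after normalization counts as a recognition error.
        if normalize(entry["recognized"]) != normalize(spoken):
            errors[entry["domain"]] += 1
    return {domain: errors[domain] / totals[domain] for domain in totals}


rates = error_rate_by_domain(log_entries, transcriptions)
for domain, rate in sorted(rates.items()):
    print(f"{domain}: {rate:.0%} of transcribed commands misrecognized")
```

Exact string matching is the simplest possible rule; a real analysis would likely need a looser comparison (for example, matching on the resolved intent) and could join in CAN bus signals such as blower speed to examine noise-related failures.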