paper

Using voice input/output technologies to support hand-busy execution of onboard ISS crew procedures.

Paper number

IAC-10.B3.7.9

Author

Mr. John Melody, SyberNet Ltd., Ireland

Coauthor

Mr. Christian Knorr, Astrium UK, Germany

Coauthor

Mr. Paul Kiernan, Skytek Ltd, Ireland

Coauthor

Mr. Mikael Wolff, European Space Agency (ESA), The Netherlands

Coauthor

Mr. Frank Plassmeier, Astrium GmbH, Germany

Year

2010

Abstract

Developing multi-modal applications is a new and challenging area for software developers, especially in situations where new modes of interaction need to be integrated into existing operational systems. This paper describes the addition of a voice commanding interface using ASR (Automated Speech Recognition), a user identification interface using SV (Speaker Verification) and a text-to-audio output interface using TTS (Text-To-Speech) to operational components of Lapap Mk II, the command and control interface for crewmembers on board the Columbus module of the International Space Station.

The ODF procedure execution components, deployed on Lapap Mk II, provide astronauts with access to detailed procedure documentation on Columbus sub-systems and experiments. The intent of the project was to provide effective, alternative interfaces for astronauts who need to access this information in “hands-busy” and/or “eyes-busy” situations.

In addition to a Voice User Interface (VUI) for the procedure execution components, a new component, called the Voice Assistant, was developed. The Voice Assistant provides an interface between off-the-shelf ASR, TTS and SV resources and components implementing a VUI. The procedure execution components provide the Voice Assistant with continuous context updates so that dynamic, context-sensitive speech grammars can be activated in the ASR engine and relevant procedure text prepared for read-out by the TTS engine. In turn, the Voice Assistant returns interpreted voice commands to the appropriate components for processing. Users can use all available modes of interaction, in any sequence, and synchronization is maintained across the multi-modal interface.
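The context-update and command-return flow described above can be illustrated with a minimal sketch. This is not the implementation from the paper: the class name, method names, context names, phrases and sample step text are all hypothetical, and the ASR engine is stood in for by an exact match on already-transcribed text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# A handler receives the interpreted command identifier (hypothetical type).
Handler = Callable[[str], None]


@dataclass
class VoiceAssistant:
    """Hypothetical mediator between speech resources and VUI components."""

    # context name -> {spoken phrase -> command identifier}
    grammars: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # context name -> component callback that processes commands
    handlers: Dict[str, Handler] = field(default_factory=dict)
    active_context: Optional[str] = None
    readout_text: str = ""

    def register(self, context: str, grammar: Dict[str, str], handler: Handler) -> None:
        """A component declares the phrases it understands in a given context."""
        self.grammars[context] = {p.lower(): cmd for p, cmd in grammar.items()}
        self.handlers[context] = handler

    def update_context(self, context: str, readout_text: str) -> None:
        """Context update: activate that context's grammar, stage text for TTS."""
        if context not in self.grammars:
            raise KeyError(f"no grammar registered for context {context!r}")
        self.active_context = context
        self.readout_text = readout_text

    def on_utterance(self, utterance: str) -> Optional[str]:
        """Interpret an utterance against the active grammar only.

        Assumption: exact text matching replaces a real ASR engine here.
        """
        if self.active_context is None:
            return None
        command = self.grammars[self.active_context].get(utterance.strip().lower())
        if command is not None:
            # Return the interpreted command to the owning component.
            self.handlers[self.active_context](command)
        return command


# Usage with invented contexts and phrases.
received = []
va = VoiceAssistant()
va.register("procedure_step", {"next step": "NEXT", "read step": "READ"}, received.append)
va.register("procedure_list", {"open procedure": "OPEN"}, received.append)

va.update_context("procedure_step", "Step 1: Verify the power switch is off.")
result = va.on_utterance("Next step")        # phrase is in the active grammar
ignored = va.on_utterance("open procedure")  # phrase belongs to an inactive context
```

Restricting matching to the active context's grammar is the point of the sketch: a phrase that is valid elsewhere in the application is ignored until the owning component reports that its context is active.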

Testing was carried out over two iterations using actual procedure documents and real-world use-case scenarios. The user test community included male and female testers with a variety of native-language backgrounds, whose expertise in relation to Columbus ranged from novice to serving astronaut. The testers' performance in executing procedure steps was recorded, as were all their audio interactions with the system. This data was analyzed in conjunction with qualitative feedback provided by the testers.

The implemented system reached an operational recognition success rate of more than 90 percent without the need to change any of the existing operational products (i.e. procedures), and users regarded the voice interface as intuitive and helpful. The main conclusions identified four key areas of focus for future work: consistency of voice commands across components, better handling of command synonyms, more elaborate real-time visual feedback on speech engine results, and more pre-processing of procedure content for TTS presentation.

Abstract document

IAC-10.B3.7.9.brief.pdf

Manuscript document

IAC-10.B3.7.9.pdf (authorized access only).

To get the manuscript, please contact IAF Secretariat.