Noisy interactive voice data cumbersome for extracting categorical information

Posted on September 15, 2011

We recently conducted a training session and an exercise with Sarvodaya Community Emergency Response Team (CERT) members in Colombo, Matara, Nuwara-eliya, and Ratnapura Districts. This was an activity of the feasibility study to enable Freedom Fone with voice-based emergency data exchange (FF4EDXL). The training introduced the CERT members to the Freedom Fone (FF) interactive voice response system. In the exercise, the participating CERT members used the Freedom Fone system to supply answers to a survey. Each response was recorded as an audio file (MP3) during the telephone call and stored in the FF system.

The researchers analyzed the audio files for their quality and for how accurately the CERT members had recorded a complete response. Every voice recording had some kind of noise, caused either by the electronics (mechanical sound) or by background interference (human voices and other environmental noise). This made it quite cumbersome to extract the information from the recordings. On several occasions the researchers had to listen to an audio recording more than once to determine the answer the participant had provided.

Figure 1 shows that, on average, only 85% of the information pieces could be recovered. A piece of information here means an answer to a question; there were ten questions in total. The intent is to use FF with voice for exchanging emergency information such as incident situational reports, so incomplete information may cause delays and inappropriate assignment of response resources. Typically an emergency coordinator at the incident management hub (in this case the Sarvodaya Hazard Information Hub) would have to call back the CERT member to recover the missing pieces. If the incidents were life threatening, such uncertainties would be intolerable. Furthermore, automatic transformation of the speech to text would be unreliable, if not impossible.
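To make the recovery figure concrete, here is a minimal Python sketch of how such an average could be computed. It assumes each recording has been transcribed by hand into a list of ten answers, with None marking an answer that could not be extracted. The function names and the sample data are hypothetical illustrations, not the researchers' actual method or results.

```python
from typing import Optional, Sequence

QUESTIONS_PER_SURVEY = 10  # the survey had ten questions


def recovery_rate(answers: Sequence[Optional[str]]) -> float:
    """Fraction of questions in one recording with a recoverable answer."""
    if len(answers) != QUESTIONS_PER_SURVEY:
        raise ValueError("expected one entry per survey question")
    recovered = sum(1 for answer in answers if answer is not None)
    return recovered / len(answers)


def mean_recovery_rate(recordings: Sequence[Sequence[Optional[str]]]) -> float:
    """Average recovery rate across all recordings."""
    if not recordings:
        raise ValueError("no recordings supplied")
    rates = [recovery_rate(answers) for answers in recordings]
    return sum(rates) / len(rates)


# Hypothetical transcriptions (NOT the study data); None = unrecoverable.
sample_recordings = [
    ["yes"] * 10,                # all ten answers recovered
    ["yes"] * 8 + [None, None],  # two answers lost to noise
    ["no"] * 7 + [None] * 3,     # three answers lost to noise
]

if __name__ == "__main__":
    print(f"Mean recovery rate: {mean_recovery_rate(sample_recordings):.0%}")
```

Averaging per recording, as done here, weights every participant equally. Since every survey has the same ten questions, this gives the same result as dividing the total number of recovered answers by the total number of questions asked.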

The outcomes of the training and survey are discussed in the workshop report.
