Speech is the most important form of communication in our daily lives. At times, our voices can even save us, whether we’re lost in a dense forest or trapped in the wake of a natural disaster. Now, researchers from California State University, Fresno, have developed a way to detect voices over long distances or through barriers by combining Doppler radar with deep convolutional neural networks.
“The vibrations of a human’s vocal cords generate unique micro-Doppler signatures,” explained Youngwook Kim, a researcher from California State University, Fresno. “These signatures can then be used to classify and recognize different words and letters.”
Voiced sounds are produced by vibrations of the vocal cords, which are stretched horizontally across the larynx. These vibrations also cause the skin of the neck to vibrate, as depicted in Figure 1.
Figure 1: Structure of the larynx and vocal cords.
The wave transmitted by a Doppler radar is reflected off the skin, allowing these vibrations to be measured and analyzed. For unvoiced sounds, micro-Doppler signatures are instead produced by the movement of the lips and tongue. Once the researchers established that the radar could recognize words and sounds, they used a wooden plank as a barrier to evaluate its through-barrier capability and created three measurement scenarios for the study.
In the first scenario, the subject spoke with no barrier. In the second, the barrier was placed in front of the subject with the radar pressed directly against it. In the final scenario, the barrier remained in place and the radar was moved farther away.
Figure 2: Subject in different measurement scenarios.
For each scenario, the subject produced different musical notes and pronounced each letter of the alphabet 50 times. The Doppler radar captured these sounds as shown in Figure 3 below.
Figure 3: Micro-Doppler signatures.
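Micro-Doppler signatures like those in Figure 3 are time-frequency images, typically obtained by applying a short-time Fourier transform (STFT) to the radar’s baseband signal. The sketch below illustrates the general idea on a synthetic vibrating target; the radar wavelength, vibration rate, and displacement amplitude are illustrative values, not parameters from the study.

```python
# Illustrative sketch: how a skin vibration phase-modulates a
# continuous-wave radar's baseband signal, and how an STFT turns that
# signal into a micro-Doppler spectrogram. All numbers are made up.
import numpy as np
from scipy.signal import stft

fs = 2000                        # baseband sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)    # one second of data

wavelength = 0.0125              # e.g. a 24 GHz radar -> 12.5 mm wavelength
f_vib = 120                      # vocal-fold vibration rate (Hz), illustrative
amp = 1e-4                       # skin displacement of 0.1 mm, illustrative

# Target displacement x(t) modulates the echo phase: phi(t) = 4*pi*x(t)/lambda
displacement = amp * np.sin(2 * np.pi * f_vib * t)
baseband = np.exp(1j * 4 * np.pi * displacement / wavelength)

# The STFT magnitude is the time-frequency "signature" fed to the classifier.
freqs, frames, Z = stft(baseband, fs=fs, nperseg=256, return_onesided=False)
spectrogram = np.abs(Z)

print(spectrogram.shape)         # (frequency bins, time frames)
```

Each column of `spectrogram` shows the instantaneous Doppler content of the echo, so a periodic skin vibration appears as a repeating pattern over time.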
To interpret the captured images, the researchers used a deep convolutional neural network (DCNN). Since DCNNs rely on large datasets to train the model’s parameters, the researchers opted for transfer learning, which repurposes an established neural network that has already been trained and tested on other datasets.
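The core idea of transfer learning is to freeze the layers learned on the original dataset and train only a small task-specific head on the new data. The NumPy-only schematic below is not the researchers’ model: a fixed random projection stands in for the frozen pretrained layers, and only a final softmax classifier is trained; all shapes and data are invented for illustration.

```python
# Schematic transfer learning: frozen "pretrained" features + trained head.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden, n_classes = 64, 32, 26   # e.g. 26 letter classes

# "Pretrained" feature extractor: weights are FROZEN (never updated).
W_frozen = rng.normal(size=(n_features, n_hidden))
def extract_features(x):
    return np.maximum(x @ W_frozen, 0.0)       # frozen layer + ReLU

# New task head: the only part we train.
W_head = np.zeros((n_hidden, n_classes))

# Toy labelled data standing in for micro-Doppler spectrograms.
X = rng.normal(size=(200, n_features))
y = rng.integers(0, n_classes, size=200)

# Train the head alone with softmax-regression gradient descent.
H = extract_features(X)
for _ in range(200):
    logits = H @ W_head
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0             # dL/dlogits for cross-entropy
    W_head -= 0.1 * H.T @ p / len(y)

train_acc = np.mean((H @ W_head).argmax(axis=1) == y)
print(f"head-only training accuracy: {train_acc:.2f}")
```

Because only the small head is trained, far less labelled data is needed than training a full DCNN from scratch, which is what makes the approach attractive here.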
After implementing the DCNN, the researchers were able to classify the letters with 99% accuracy without a barrier and 96% accuracy with a barrier in place.
“This result points to the possibility that a radar system could be used for the detection of human voices over a long distance,” said Youngwook Kim. “This could be a life-saving tool used in search and rescue operations.”
While the researchers achieved strong results, practical deployment of this technology would need careful consideration, as factors such as transmit power, antenna range, and operating frequency significantly affect the radar’s capability.
To learn more about radar, visit the IEEE Xplore digital library.