[ad_1]
Too Long; Didn’t Read
wav2vec2 is a leading machine-learning model for the design of automatic speech recognition (ASR) systems. It is composed of three general components: a Feature Encoder, a Quantization Module, and a Transformer. The model is pretrained on audio-only data to learn basic speech units. The model is then finetuned on labeled data where speech units are mapped to text.
[ad_2]
Source link