A New Corpora Revolution: AI Versus Language Barriers With Parallel Data For Machine Translation Systems
by NLPC Team Machine Translation
Parallel data, also known as parallel corpora, are collections of translation pairs: sentences in one language matched with their translations in another. These datasets are used to train and evaluate machine translation models. Parallel data can be created manually, automatically, or synthetically from monolingual data.
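To make the idea of a translation pair concrete, here is a minimal Python sketch of how a parallel corpus might be represented and split for training and evaluation. The names `TranslationPair`, `build_parallel_corpus`, and `split_train_eval` are hypothetical, and the English-French sentences are illustrative examples rather than entries from any real dataset. The sketch assumes the two sides are already aligned line by line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationPair:
    """One entry of a parallel corpus (hypothetical name for illustration)."""
    source: str  # sentence in the source language
    target: str  # its translation in the target language


def build_parallel_corpus(source_lines, target_lines):
    """Pair line i of the source text with line i of the target text."""
    if len(source_lines) != len(target_lines):
        raise ValueError("source and target must have the same number of lines")
    pairs = []
    for src, tgt in zip(source_lines, target_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:  # skip pairs where either side is empty
            pairs.append(TranslationPair(src, tgt))
    return pairs


def split_train_eval(pairs, eval_size):
    """Hold out the last eval_size pairs for evaluation."""
    if not 0 <= eval_size <= len(pairs):
        raise ValueError("eval_size must be between 0 and len(pairs)")
    cut = len(pairs) - eval_size
    return pairs[:cut], pairs[cut:]


# Illustrative sentences only, assumed to be aligned line by line.
english = ["Hello.", "Thank you.", "Where is the station?"]
french = ["Bonjour.", "Merci.", "Où est la gare ?"]

corpus = build_parallel_corpus(english, french)
train_pairs, eval_pairs = split_train_eval(corpus, eval_size=1)

for pair in train_pairs:
    print(f"{pair.source} -> {pair.target}")
```

Keeping each pair as a single immutable record means a sentence and its translation cannot drift apart when the corpus is filtered or split, which is one reasonable design choice among several.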