Please use this identifier to cite or link to this item: http://hdl.handle.net/1843/65891
Type: Conference paper
Title: Building the first English-Brazilian Portuguese corpus for automatic post-editing
Authors: Felipe de Almeida Costa
Thiago Castro Ferreira
Adriana Silvina Pagano
Wagner Meira Junior
Abstract: This paper introduces the first corpus for Automatic Post-Editing of English and a low-resource language, Brazilian Portuguese. The source English texts were extracted from the WebNLG corpus and automatically translated into Portuguese using a state-of-the-art industrial neural machine translator. Post-edits were then obtained in an experiment with native speakers of Brazilian Portuguese. To assess the quality of the corpus, we performed error analysis and computed complexity indicators measuring how difficult the APE task would be. We report preliminary results of Phrase-Based and Neural Machine Translation Models on this new corpus. Data and code are publicly available in our repository.
Subject: Corpus linguistics
Machine translation
language: eng
metadata.dc.publisher.country: Brazil
Publisher: Universidade Federal de Minas Gerais
Publisher Initials: UFMG
metadata.dc.publisher.department: FALE - FACULDADE DE LETRAS
Rights: Open Access
metadata.dc.identifier.doi: https://doi.org/10.18653/v1/2020.coling-main.533
URI: http://hdl.handle.net/1843/65891
Issue Date: Dec-2020
metadata.dc.relation.ispartof: International Conference on Computational Linguistics
Appears in Collections: Conference papers

Files in This Item:
File: Building the first English-Brazilian Portuguese corpus for automatic post-editing.pdf
Size: 161.91 kB
Format: Adobe PDF

