Enriching the E2E dataset
Publisher
Universidade Federal de Minas Gerais
Description
Type
Conference paper
Abstract
This study introduces an enriched version of the E2E dataset, one of the most popular language resources for data-to-text NLG. We extract intermediate representations for popular pipeline tasks such as discourse ordering, text structuring, lexicalization and referring expression generation, enabling researchers to rapidly develop and evaluate their data-to-text pipeline systems. The intermediate representations are extracted by aligning non-linguistic and text representations through a process called delexicalization, which consists of replacing the referring expressions to input entities/attributes with placeholders. The enriched dataset is publicly available.
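The delexicalization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `delexicalize`, the `__ATTRIBUTE__` placeholder format, and the exact-string matching strategy are all assumptions made for illustration, and the sample meaning representation is invented in the style of E2E restaurant data.

```python
import re


def delexicalize(meaning_representation, text):
    """Replace each attribute value found in the text with a placeholder.

    Hypothetical helper: placeholder format and matching are assumptions.
    """
    template = text
    # Replace longer values first so a short value cannot overwrite part of a longer one.
    pairs = sorted(meaning_representation.items(), key=lambda item: -len(item[1]))
    for attribute, value in pairs:
        placeholder = "__" + attribute.upper() + "__"
        # Case-insensitive exact match of the attribute value in the text.
        template = re.sub(re.escape(value), placeholder, template, flags=re.IGNORECASE)
    return template


# Invented example in the style of an E2E meaning representation.
mr = {"name": "The Eagle", "eatType": "coffee shop", "area": "riverside"}
sentence = "The Eagle is a coffee shop located in the riverside area."

print(delexicalize(mr, sentence))
# prints: __NAME__ is a __EATTYPE__ located in the __AREA__ area.
```

Exact string matching only covers values that appear verbatim in the text. Paraphrased or pronominal references (for example "it" for the restaurant name) would need a more elaborate alignment, which this sketch does not attempt.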
Subject
Computer Science, Corpus linguistics, Natural language processing (Computing)
External link
https://aclanthology.org/2021.inlg-1.18.pdf