Enriching the E2E dataset

Use this identifier to cite or link to this item: http://hdl.handle.net/1843/57496

Type:	Conference Paper
Title:	Enriching the E2E dataset
Author(s):	Thiago Castro Ferreira; Helena Vaz; Brian Davis; Adriana Silvina Pagano
Abstract:	This study introduces an enriched version of the E2E dataset, one of the most popular language resources for data-to-text NLG. We extract intermediate representations for popular pipeline tasks such as discourse ordering, text structuring, lexicalization and referring expression generation, enabling researchers to rapidly develop and evaluate their data-to-text pipeline systems. The intermediate representations are extracted by aligning nonlinguistic and text representations through a process called delexicalization, which consists in replacing input referring expressions to entities/attributes with placeholders. The enriched dataset is publicly available.
Subject:	Computer science; Corpus linguistics; Natural language processing (Computer science)
Language:	eng
Country:	Brazil
Publisher:	Universidade Federal de Minas Gerais
Institution Acronym:	UFMG
Department:	FALE - FACULDADE DE LETRAS; ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
Access Type:	Open Access
URI:	http://hdl.handle.net/1843/57496
Document date:	2021
External URL:	https://aclanthology.org/2021.inlg-1.18.pdf
Is part of:	International Conference on Natural Language Generation
Appears in collections:	Conference Paper

Files associated with this item:

File	Description	Size	Format
Enriching the E2E dataset.pdf		173.86 kB	Adobe PDF
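
The abstract describes delexicalization as replacing referring expressions to entities/attributes with placeholders. As a rough illustration only, the sketch below applies that idea to an E2E-style meaning representation of the form `attribute[value]`. The function names `parse_mr` and `delexicalize`, the `__ATTRIBUTE__` placeholder format, and the exact-string matching are all assumptions made for this example; they are not taken from the paper or from the enriched dataset, whose alignment procedure may differ.

```python
import re


def parse_mr(mr: str) -> dict[str, str]:
    """Parse an E2E-style meaning representation, e.g. 'name[The Eagle], food[French]'.

    Hypothetical helper: returns a mapping from attribute name to value.
    """
    return {
        m.group(1).strip(): m.group(2).strip()
        for m in re.finditer(r"([^,\[\]]+)\[([^\]]*)\]", mr)
    }


def delexicalize(mr: str, text: str) -> str:
    """Replace attribute values that occur verbatim in the text with placeholders.

    Assumption: the placeholder is the upper-cased attribute name wrapped in
    double underscores. Values that do not appear verbatim are left untouched.
    """
    slots = parse_mr(mr)
    # Process longer values first so a short value cannot split a longer one.
    for attr, value in sorted(slots.items(), key=lambda kv: -len(kv[1])):
        if not value:
            continue
        text = text.replace(value, f"__{attr.upper()}__")
    return text


if __name__ == "__main__":
    mr = "name[The Eagle], eatType[coffee shop], food[French]"
    sentence = "The Eagle is a coffee shop that serves French food."
    print(delexicalize(mr, sentence))
```

Exact matching keeps the sketch short, but it misses values that are paraphrased in the text (for example, a `familyFriendly[yes]` slot realized as "family friendly"), so a real alignment would need more than string replacement.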