Enriching the E2E dataset

Thiago Castro Ferreira; Helena Vaz; Brian Davis; Adriana Silvina Pagano

Please use this identifier to cite or link to this item: http://hdl.handle.net/1843/57496

Full metadata record

DC Field	Value	Language
dc.creator	Thiago Castro Ferreira	pt_BR
dc.creator	Helena Vaz	pt_BR
dc.creator	Brian Davis	pt_BR
dc.creator	Adriana Silvina Pagano	pt_BR
dc.date.accessioned	2023-08-04T20:47:34Z	-
dc.date.available	2023-08-04T20:47:34Z	-
dc.date.issued	2021	-
dc.citation.issue	14	pt_BR
dc.citation.spage	177	pt_BR
dc.citation.epage	183	pt_BR
dc.identifier.isbn	978195408510	pt_BR
dc.identifier.uri	http://hdl.handle.net/1843/57496	-
dc.description.resumo	This study introduces an enriched version of the E2E dataset, one of the most popular language resources for data-to-text NLG. We extract intermediate representations for popular pipeline tasks such as discourse ordering, text structuring, lexicalization and referring expression generation, enabling researchers to rapidly develop and evaluate their data-to-text pipeline systems. The intermediate representations are extracted by aligning nonlinguistic and text representations through a process called delexicalization, which consists in replacing input referring expressions to entities/attributes with placeholders. The enriched dataset is publicly available.	pt_BR
dc.description.sponsorship	CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico	pt_BR
dc.description.sponsorship	FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais	pt_BR
dc.description.sponsorship	CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior	pt_BR
dc.format.mimetype	pdf	pt_BR
dc.language	eng	pt_BR
dc.publisher	Universidade Federal de Minas Gerais	pt_BR
dc.publisher.country	Brazil	pt_BR
dc.publisher.department	FALE - FACULDADE DE LETRAS	pt_BR
dc.publisher.department	ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO	pt_BR
dc.publisher.initials	UFMG	pt_BR
dc.relation.ispartof	International Conference on Natural Language Generation	pt_BR
dc.rights	Open Access	pt_BR
dc.subject.other	Computer science	pt_BR
dc.subject.other	Corpus linguistics	pt_BR
dc.subject.other	Natural language processing (Computer science)	pt_BR
dc.title	Enriching the E2E dataset	pt_BR
dc.type	Conference paper	pt_BR
dc.url.externa	https://aclanthology.org/2021.inlg-1.18.pdf	pt_BR
dc.identifier.orcid	https://orcid.org/0000-0003-0200-3646	pt_BR
dc.identifier.orcid	https://orcid.org/0000-0001-9754-1425	pt_BR
dc.identifier.orcid	https://orcid.org/0000-0002-5759-2655	pt_BR
dc.identifier.orcid	https://orcid.org/0000-0002-3150-3503	pt_BR
Appears in Collections:	Conference paper
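
The abstract describes delexicalization as replacing referring expressions to entities/attributes with placeholders. The following is a minimal sketch of that idea only, assuming exact string matching between attribute values and the text. The function name `delexicalize`, the `__ATTRIBUTE__` placeholder format, and the example restaurant record are illustrative assumptions; this record does not describe the authors' actual alignment procedure.

```python
def delexicalize(text, meaning_representation):
    """Replace each attribute value found verbatim in the text with a placeholder.

    Assumption: values appear in the text exactly as in the input record.
    Real texts often paraphrase values, which this sketch does not handle.
    """
    template = text
    # Replace longer values first so a short value cannot overwrite
    # part of a longer one.
    ordered = sorted(meaning_representation.items(), key=lambda kv: -len(kv[1]))
    for attribute, value in ordered:
        placeholder = "__" + attribute.upper() + "__"  # hypothetical format
        template = template.replace(value, placeholder)
    return template


# Hypothetical input record and reference text in the style of the E2E data.
mr = {"name": "The Eagle", "food": "French", "area": "riverside"}
text = "The Eagle serves French food in the riverside area."
template = delexicalize(text, mr)
print(template)
```

A template of this kind keeps the sentence structure while abstracting away the specific values, which is what allows the text to be aligned with the non-linguistic input.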

Files in This Item:

File	Description	Size	Format
Enriching the E2E dataset.pdf		173.86 kB	Adobe PDF
