Face attribute representation across the layers, channels and neurons of face recognition neural networks

Matheus Alves Diniz

Please use this identifier to cite or link to this item: http://hdl.handle.net/1843/45143

Type:	Dissertation
Title:	Face attribute representation across the layers, channels and neurons of face recognition neural networks
Other Titles:	Representação de atributos faciais nas camadas, canais e neurônios de redes neurais de reconhecimento facial
Authors:	Matheus Alves Diniz
First Advisor:	William Robson Schwartz
First Referee:	David Menotti Gomes
Second Referee:	Adriano Alonso Veloso
Abstract:	Deeply learned representations are the state-of-the-art descriptors for face recognition methods. These representations encode latent features that are difficult to explain, which undermines confidence in their predictions and limits their interpretability. Most attempts to explain deep features are visualization techniques, which are often open to interpretation. Instead of relying only on visualizations, we use the outputs of hidden layers to predict face attributes. The performance obtained is an indicator of how well the attribute is implicitly learned in that layer of the network. Using a variable selection technique, we also analyze how these semantic concepts are distributed inside each layer, establishing the precise location of the neurons relevant to each attribute. According to our experiments, gender, eyeglasses and hat usage can each be predicted with over 96% accuracy even when only a single neural output is used. This performance is less than 3 percentage points below that of deep supervised face attribute networks, which indicates that there exist neurons inside face recognition DCNNs that encode face attributes almost as accurately as DCNNs optimized specifically for these attributes.
Abstract:	Representations learned by deep networks are the state-of-the-art descriptors for face recognition methods. These representations encode latent features that are difficult to explain, which compromises confidence in their predictions and their interpretability. Most attempts to explain these features are visualization techniques, whose main limitation is their subjectivity. Instead of visualizations, this work proposes using the intermediate layers of the network to classify face attributes. The performance obtained by these classifiers is used as an indicator of how well an attribute is implicitly learned in a given layer. This analysis can also be combined with a variable selection technique to establish precisely the location of the neurons relevant to each attribute. According to the experiments, attributes encoding gender and the wearing of eyeglasses and hats can be predicted with an accuracy above 96% from the output of a single neuron. This performance is only 3 percentage points lower than that of state-of-the-art methods supervised to predict these attributes, which indicates that these attributes are very well defined inside the face recognition network.
Subject:	Computing – Theses. Visual perception – Theses. Face recognition – Theses. Computer vision – Theses.
Language:	eng
metadata.dc.publisher.country:	Brazil
Publisher:	Universidade Federal de Minas Gerais
Publisher Initials:	UFMG
metadata.dc.publisher.department:	ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
metadata.dc.publisher.program:	Programa de Pós-Graduação em Ciência da Computação
Rights:	Open Access
metadata.dc.rights.uri:	http://creativecommons.org/licenses/by-sa/3.0/pt/
URI:	http://hdl.handle.net/1843/45143
Issue Date:	31-Mar-2021
Appears in Collections:	Master's Dissertations
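
The abstract describes predicting a face attribute from the output of a single neuron in a hidden layer. The sketch below illustrates that idea only; it is not the dissertation's code or its variable selection technique. It assumes synthetic activations in place of real hidden-layer outputs, a binary attribute, and a simple one-threshold classifier per neuron. All names (fit_threshold, predict, planted, acts) are hypothetical.

```python
import numpy as np


def fit_threshold(x, y):
    """Find the cut point and polarity on one activation that best separate y (0/1)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    n = len(ys)
    # Cut k: the k smallest activations are predicted 0, the rest are predicted 1.
    pos_left = np.concatenate(([0], np.cumsum(ys)))
    neg_left = np.concatenate(([0], np.cumsum(1 - ys)))
    correct = neg_left + (ys.sum() - pos_left)
    flipped = n - correct  # same cut, opposite polarity
    k = int(np.argmax(np.maximum(correct, flipped)))
    polarity = 1 if correct[k] >= flipped[k] else -1
    if k == 0:
        threshold = -np.inf
    elif k == n:
        threshold = np.inf
    else:
        threshold = (xs[k - 1] + xs[k]) / 2.0  # midway between neighbours
    return threshold, polarity


def predict(x, threshold, polarity):
    """Predict 1 above the threshold (polarity 1) or below it (polarity -1)."""
    above = x > threshold
    return (above if polarity == 1 else ~above).astype(int)


# Assumption: random numbers stand in for hidden-layer outputs, and one
# "planted" neuron is shifted by the attribute so that it encodes it.
rng = np.random.default_rng(0)
n_samples, n_neurons, planted = 2000, 64, 17
labels = rng.integers(0, 2, size=n_samples)
acts = rng.normal(size=(n_samples, n_neurons))
acts[:, planted] += 4.0 * labels

# Fit each single-neuron classifier on one half, score it on the other half.
train, test = slice(0, 1000), slice(1000, None)
test_acc = np.empty(n_neurons)
for j in range(n_neurons):
    thr, pol = fit_threshold(acts[train, j], labels[train])
    test_acc[j] = np.mean(predict(acts[test, j], thr, pol) == labels[test])

best_neuron = int(np.argmax(test_acc))
print(best_neuron, round(float(test_acc[best_neuron]), 3))
```

Ranking neurons by held-out accuracy is one simple way to locate where an attribute is encoded. With real data, acts would hold the outputs of a chosen layer for a set of face images and labels the annotated attribute.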

Files in This Item:

File	Description	Size	Format
dissertacao_matheusdiniz_final.pdf		1.63 MB	Adobe PDF

This item is licensed under a Creative Commons License