Feature selection strategies for anomaly detection in electronic transactions

Rafael Alexandre França de Lima

Use this identifier to cite or link to this item: http://hdl.handle.net/1843/ESBF-ACXKTA

Type:	Master's Dissertation
Title:	Estratégias de seleção de atributos para detecção de anomalias em transações eletrônicas (Feature selection strategies for anomaly detection in electronic transactions)
Author(s):	Rafael Alexandre França de Lima
Advisor:	Adriano César Machado Pereira
First committee member:	Anisio Mendes Lacerda
Second committee member:	Gisele Lobo Pappa
Third committee member:	Wagner Meira Junior
Summary:	A crucial task in anomaly detection is feature selection. However, the high imbalance between classes poses a new challenge for this task. In this work, we therefore analyze feature selection strategies for anomaly detection. The first approach applies 7 resampling methods, including one created in this work, to reduce the imbalance before selection. The second approach evaluates 8 feature selection methods considered insensitive to class imbalance, and also creates a method for combining the metrics. The effectiveness of the methods was validated by building fraud detection models, formed by 3 different classification techniques applied to the attributes selected by the different approaches. To validate these models, we carried out case studies with real data, detecting fraud in 2 popular electronic payment systems.
Abstract:	Anomaly detection refers to the problem of finding patterns in data that deviate from the expected average behavior. One of the classic scenarios in this area is fraud detection, which consists of learning fraudulent behavior from a set of observations. In electronic transactions, a large amount of information could be used to detect fraud. Filtering this information and choosing the most representative parts of it is therefore a crucial task, known as Feature Selection. The best Feature Selection methods use the class information to perform this task. However, an important characteristic of fraud detection problems is the high imbalance between the classes. This imbalance poses a new challenge to Feature Selection techniques, which tend to select features in favor of the dominant class. Therefore, in this work we analyzed feature selection strategies for anomaly detection in electronic transactions. These strategies were divided into two distinct approaches. In the first approach, we applied 7 resampling methods, including one created in this work, to reduce the imbalance between classes before the feature selection step. In the second approach, we evaluated 8 feature selection methods considered insensitive to class imbalance, and we also created a method that uses the concept of the Pareto frontier to combine metrics. The effectiveness of the methods was validated by building fraud detection models, applying 3 different classification techniques to the attributes selected by the different approaches. To validate these models, we performed fraud detection case studies on 2 real datasets from electronic payment systems. We evaluated these models using 3 different metrics. Through these experiments, we validated our research hypothesis, providing contributions to the feature selection area for fraud detection. The best models achieved economic gains of up to 57% compared to the current scenario of the company.
Subject:	Internet fraud; Anomaly detection (Computing); Computing; Data mining (Computing)
Language:	Portuguese
Publisher:	Universidade Federal de Minas Gerais
Institution acronym:	UFMG
Access type:	Open Access
URI:	http://hdl.handle.net/1843/ESBF-ACXKTA
Issue date:	15-Jun-2016
Appears in collections:	Master's Dissertations

Files associated with this item:

File	Description	Size	Format
rafaelfrancalima.pdf		3.35 MB	Adobe PDF
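
The abstract mentions a method that uses the concept of the Pareto frontier to combine feature selection metrics, but this record does not describe how it works. The sketch below is only a generic illustration of that concept, not the dissertation's method: it assumes each feature has one score per metric, that higher scores are better, and it keeps every feature that no other feature dominates. The function names, feature names, and scores are all hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every metric
    and strictly better on at least one (higher is assumed better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the features whose score vectors are not dominated
    by the score vector of any other feature."""
    return [
        name
        for name, vec in scores.items()
        if not any(
            dominates(other_vec, vec)
            for other_name, other_vec in scores.items()
            if other_name != name
        )
    ]


# Hypothetical scores: (metric_1, metric_2) for each feature.
example_scores = {
    "amount": (0.9, 0.2),
    "hour": (0.5, 0.5),
    "country": (0.4, 0.4),  # dominated by "hour" on both metrics
    "device": (0.1, 0.8),
}

selected = pareto_front(example_scores)
print(selected)
```

In this toy input, "country" is dropped because "hour" scores higher on both metrics. The other three features each win on at least one trade-off, so none of them is dominated.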
