Please use this identifier to cite or link to this item: http://hdl.handle.net/1843/ESBF-ACXKTA
Type: Master's Dissertation
Title: Estratégias de seleção de atributos para detecção de anomalias em transações eletrônicas [Feature selection strategies for anomaly detection in electronic transactions]
Authors: Rafael Alexandre França de Lima
First Advisor: Adriano César Machado Pereira
First Referee: Anisio Mendes Lacerda
Second Referee: Gisele Lobo Pappa
Third Referee: Wagner Meira Junior
Abstract: A crucial task in anomaly detection is feature selection. However, the high imbalance between the classes poses a new challenge for this task. In this work we therefore analyze feature selection strategies for anomaly detection. The first approach applies 7 resampling methods, including one created in this work, to reduce the imbalance before selection. The second approach evaluates 8 feature selection methods considered insensitive to class imbalance, and also creates a method for combining the metrics. The effectiveness of the methods was validated by building fraud detection models, using 3 different classification techniques on the features selected by the different approaches. To validate these models, we carried out case studies with real data, detecting fraud in 2 popular electronic payment systems.
Abstract: Anomaly detection refers to the problem of finding patterns in data that deviate from the expected average behavior. One of the classic scenarios in this area is fraud detection, which consists of learning fraudulent behavior from a set of observations. In electronic transactions, there is a large amount of information that could be used to detect fraud. Filtering this information and choosing the most representative parts of it is therefore a crucial task, known as Feature Selection. The best Feature Selection methods use the class information to perform this task. However, an important characteristic of fraud detection problems is the high imbalance between the classes. This imbalance poses a new challenge to Feature Selection techniques, which tend to select features in favor of the dominant class. Therefore, in this work we analyzed feature selection strategies for anomaly detection in electronic transactions. These strategies were divided into two distinct approaches. In the first approach we applied 7 resampling methods, including one created in this work, to reduce the imbalance between the classes before the feature selection step. In the second approach we evaluated 8 feature selection methods considered insensitive to the imbalance between the classes, and we also created a method that uses the concept of the Pareto Frontier to combine metrics. The effectiveness of the methods was validated by building fraud detection models, applying 3 different classification techniques to the features selected by the different approaches. To validate these models we performed case studies on fraud detection in 2 real datasets from electronic payment systems. We evaluated these models using 3 different metrics. Through these experiments, we validated our research hypothesis, providing contributions to the area of feature selection for fraud detection. The best models achieved economic gains of up to 57% compared to the company's current scenario.
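The abstract mentions combining feature selection metrics through the concept of a Pareto Frontier but does not describe the procedure. The following is only a minimal sketch of the general idea, not the dissertation's method: each feature gets one score per metric, and a feature is kept if no other feature is at least as good on every metric and strictly better on at least one. The feature names, the two metrics, and all score values are hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every metric and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the features whose score vectors are not dominated by any other feature."""
    return [
        name
        for name, vec in scores.items()
        if not any(
            dominates(other_vec, vec)
            for other_name, other_vec in scores.items()
            if other_name != name
        )
    ]


# Hypothetical scores per feature: (metric_1, metric_2), higher is better.
feature_scores = {
    "amount": (0.90, 0.40),
    "hour_of_day": (0.60, 0.70),
    "country": (0.50, 0.50),      # dominated by hour_of_day
    "card_age": (0.20, 0.80),
    "browser": (0.10, 0.10),      # dominated by every other feature
}

selected = pareto_front(feature_scores)
print(selected)
```

In this sketch a feature that ranks first on only one metric still survives, which is the usual motivation for a Pareto-based combination over averaging the scores.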
Subject: Internet fraud
Anomaly detection (Computing)
Computing
Data mining (Computing)
Language: Portuguese
Publisher: Universidade Federal de Minas Gerais
Publisher Initials: UFMG
Rights: Open Access
URI: http://hdl.handle.net/1843/ESBF-ACXKTA
Issue Date: 15-Jun-2016
Appears in Collections: Master's Dissertations

Files in This Item:
File                  Size     Format
rafaelfrancalima.pdf  3.35 MB  Adobe PDF

