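Uma abordagem baseada em tipicidade e excentricidade para agrupamento e classificação de streams de dados (An approach based on typicality and eccentricity for clustering and classification of data streams)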

Bibliographic details
Main author: Bezerra, Clauber Gomes
Other authors: Oliveira, Luiz Affonso Henderson Guedes de
Format: doctoralThesis
Language: Portuguese (por)
Published in: Brazil
Item URL: https://repositorio.ufrn.br/jspui/handle/123456789/24360
Description
Abstract: In this thesis we propose a new approach to unsupervised data clustering and classification. The proposed approach is based on the concepts of typicality and eccentricity. These concepts are used by the recently introduced TEDA algorithm for outlier detection. To perform data clustering and classification, we propose a new statistical algorithm called Auto-Cloud. The data samples analyzed by Auto-Cloud are grouped into units called data clouds, which are structures without a predefined shape or boundaries. Auto-Cloud allows each data sample to belong to multiple data clouds simultaneously. Auto-Cloud is an autonomous and evolving algorithm that does not require previous training or any prior knowledge about the data set. It creates and merges data clouds autonomously as data samples arrive, without any human intervention. The algorithm is suitable for clustering and classification of online data streams and for applications that require real-time response. Auto-Cloud is also recursive, which makes it fast and computationally inexpensive. The classification process works like a fuzzy classifier, using the degree of membership of the analyzed data sample in each data cloud created during clustering. The class to which a data sample belongs is determined by the cloud with the highest activation with respect to that sample. To validate the proposed method, we apply it to several existing datasets for data clustering and classification. The method was also applied to fault detection in industrial processes, using data obtained from a real-world industrial plant.
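The recursive statistics and the "highest activation" rule described in the abstract can be illustrated with a minimal Python sketch. It is not the Auto-Cloud algorithm from the thesis: the creation and merging of data clouds are omitted, the eccentricity formula is an assumption based on the form commonly given in the TEDA literature, and taking typicality (one minus eccentricity) as the activation is also an assumption. The names `DataCloud` and `classify` are hypothetical.

```python
import numpy as np


class DataCloud:
    """Running statistics for one data cloud (illustrative, not the thesis code)."""

    def __init__(self, dim):
        self.k = 0                  # number of samples absorbed so far
        self.mean = np.zeros(dim)   # recursively updated mean
        self.var = 0.0              # recursively updated scalar variance

    def _stats_with(self, x):
        """Return (k, mean, var) as they would be after absorbing x."""
        k = self.k + 1
        if k == 1:
            return k, x.copy(), 0.0
        mean = (k - 1) / k * self.mean + x / k
        dist2 = float(np.sum((x - mean) ** 2))
        var = (k - 1) / k * self.var + dist2 / (k - 1)
        return k, mean, var

    def update(self, x):
        """Absorb sample x; only k, mean and var are stored, never past samples."""
        self.k, self.mean, self.var = self._stats_with(np.asarray(x, dtype=float))

    def eccentricity(self, x):
        """Eccentricity x would have if added to this cloud (assumed TEDA-style form)."""
        x = np.asarray(x, dtype=float)
        k, mean, var = self._stats_with(x)
        if var == 0.0:
            return 1.0 / k          # all samples coincide with x
        return 1.0 / k + float(np.sum((mean - x) ** 2)) / (k * var)


def classify(clouds, x):
    """Index of the cloud with the highest activation (assumed: typicality) for x."""
    activations = [1.0 - cloud.eccentricity(x) for cloud in clouds]
    return int(np.argmax(activations))


# Two hand-built clouds; in Auto-Cloud these would be created autonomously.
cloud_a, cloud_b = DataCloud(2), DataCloud(2)
for p in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    cloud_a.update(p)
for p in [(10, 10), (11, 10), (10, 11), (11, 11)]:
    cloud_b.update(p)

label_near_a = classify([cloud_a, cloud_b], (0.5, 0.5))
label_near_b = classify([cloud_a, cloud_b], (10.5, 10.5))
```

Because each cloud keeps only a sample count, a mean, and a scalar variance, the cost per sample is constant, which is the property the abstract attributes to the recursive formulation.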