Uma abordagem baseada em tipicidade e excentricidade para agrupamento e classificação de streams de dados
Main author: | |
---|---|
Other authors: | |
Format: | doctoralThesis |
Language: | por |
Published in: | Brazil |
Subjects: | |
Item address: | https://repositorio.ufrn.br/jspui/handle/123456789/24360 |
Abstract: | In this thesis we propose a new approach to unsupervised data clustering and classification.
The proposed approach is based on the concepts of typicality and eccentricity. These
concepts are used by the recently introduced TEDA algorithm for outlier detection. To perform
data clustering and classification, we propose a new statistical algorithm called
Auto-Cloud. The data samples analyzed by Auto-Cloud are grouped into units
called data clouds, which are structures without predefined shape or boundaries.
Auto-Cloud allows each data sample to belong to multiple data clouds simultaneously.
Auto-Cloud is an autonomous and evolving algorithm, which does not require previous
training or any prior knowledge about the data set. Auto-Cloud is able to create and merge
data clouds autonomously, as data samples are obtained, without any human interference.
The algorithm is suitable for clustering and classifying online data streams and
for applications that require real-time response. Auto-Cloud is also recursive, which makes it
fast and computationally inexpensive. The data classification process works like a fuzzy
classifier, using the degree of membership of the analyzed data sample in each data
cloud created during the clustering process. The class to which each data sample belongs is determined
by the cloud with the highest activation with respect to that sample. To validate
the proposed method, we apply it to several existing datasets for data clustering and classification.
Moreover, the method was also applied to fault detection in industrial processes,
using real data obtained from a real-world industrial plant. |
---|
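
The abstract describes per-cloud recursive statistics and assignment to the cloud with the highest activation. The following is only a rough sketch of that idea and is not taken from the thesis. It assumes the recursive mean, variance, and eccentricity formulas commonly given for TEDA, and it assumes that "activation" can be approximated by typicality (one minus eccentricity). The names `DataCloud` and `classify` are hypothetical, and the creation and merging of clouds are not modeled.

```python
class DataCloud:
    """Hypothetical data cloud holding recursively updated statistics."""

    def __init__(self, x):
        self.k = 1                          # samples absorbed so far
        self.mean = [float(v) for v in x]   # running mean vector
        self.var = 0.0                      # running scalar variance

    def _stats_with(self, x):
        # Statistics the cloud would have if x were absorbed
        # (assumed TEDA-style recursive updates).
        k = self.k + 1
        mean = [(k - 1) / k * m + v / k for m, v in zip(self.mean, x)]
        dist2 = sum((v - m) ** 2 for m, v in zip(mean, x))
        var = (k - 1) / k * self.var + dist2 / (k - 1)
        return k, mean, var, dist2

    def eccentricity(self, x):
        # Assumed form: 1/k + ||mean - x||^2 / (k * var).
        k, _, var, dist2 = self._stats_with(x)
        if var == 0.0:
            # x coincides with every sample seen so far.
            return 1.0 / k
        return 1.0 / k + dist2 / (k * var)

    def typicality(self, x):
        # Typicality is taken as the complement of eccentricity.
        return 1.0 - self.eccentricity(x)

    def update(self, x):
        # Commit x to the cloud; no past samples are stored.
        self.k, self.mean, self.var, _ = self._stats_with(x)


def classify(clouds, x):
    """Return the index of the cloud with the highest typicality for x."""
    scores = [cloud.typicality(x) for cloud in clouds]
    return scores.index(max(scores))


# Two illustrative clouds built from made-up points.
cloud_a = DataCloud([0.0, 0.0])
for p in ([0.1, 0.0], [0.0, 0.1], [-0.1, 0.0]):
    cloud_a.update(p)

cloud_b = DataCloud([5.0, 5.0])
for p in ([5.1, 5.0], [5.0, 5.1], [4.9, 5.0]):
    cloud_b.update(p)

print(classify([cloud_a, cloud_b], [0.05, 0.05]))
print(classify([cloud_a, cloud_b], [5.0, 5.05]))
```

Because each cloud keeps only a count, a mean, and a variance, every update costs the same regardless of how many samples have already been seen, which is the property the abstract refers to as recursive.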