Using the k-medians problem as a criterion for semi-supervised data clustering

Bibliographic details
Main author: Randel, Rodrigo Alves
Other authors: Aloise, Daniel
Format: Dissertation
Language: Portuguese
Published in: Brazil
Item address: https://repositorio.ufrn.br/jspui/handle/123456789/22569
Description
Abstract: Clustering is a powerful tool for automated analysis of data. It addresses the following general problem: given a set of entities, find subsets, or clusters, which are homogeneous and/or well separated. The biggest challenge of data clustering is to find a criterion that yields a good separation of the data into homogeneous groups, so that these groups bring useful information to the user. To address this problem, the user can be allowed to provide a priori information about the data set. Clustering under this assumption is called semi-supervised clustering. This work explores the semi-supervised clustering problem using a new model: the data is clustered by solving the k-medians problem. Results show that this new approach was able to efficiently cluster the data in many different domains.
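
To make the clustering criterion named in the abstract concrete, here is a minimal sketch of the plain (unsupervised) k-medians problem. It rests on several assumptions not stated in the record: medians are chosen among the data points themselves, dissimilarity is the Manhattan distance, and the instance is small enough for exhaustive search. The function names `manhattan` and `k_medians_brute_force` are hypothetical. This is not the dissertation's method, and it does not model the a priori user information that makes the problem semi-supervised.

```python
from itertools import combinations


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length tuples (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medians_brute_force(points, k, dist=manhattan):
    """Pick k data points as medians so that the sum of distances from every
    point to its nearest median is minimal. Exhaustive search: toy sizes only."""
    best_cost, best_medians = float("inf"), None
    for candidate in combinations(range(len(points)), k):
        # Cost of a candidate: each point pays the distance to its closest median.
        cost = sum(min(dist(p, points[m]) for m in candidate) for p in points)
        if cost < best_cost:
            best_cost, best_medians = cost, candidate
    # Label each point with the index of its nearest selected median.
    labels = [min(best_medians, key=lambda m: dist(p, points[m])) for p in points]
    return best_medians, labels, best_cost


# Two visibly separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medians, labels, cost = k_medians_brute_force(points, k=2)
print(medians, labels, cost)
```

Exhaustive search examines every k-subset of the points, so it is only useful for illustrating the objective; realistic instances need exact or heuristic optimization methods.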