A distributed data mining platform for big data: a case study applied to the Tax Administration of Rio Grande do Norte
Main author: | |
---|---|
Other authors: | |
Format: | Dissertation |
Language: | pt_BR |
Published in: | Brazil |
Subjects: | |
Item URL: | https://repositorio.ufrn.br/jspui/handle/123456789/27508 |
Abstract: | The volume of data stored and accessed daily is growing geometrically. About 2.5
billion gigabytes are generated every day, and 90% of the world’s data has been
produced in the last two years. Many terms have been used to describe this giant volume
of stored data, whether structured or unstructured; Big Data is one of them. For
many researchers, Big Data is the phenomenon in which data is produced in various formats
and stored by a large number of devices and equipment. Some efforts have been made
to offer open source tools and frameworks that can handle and mine this huge amount
of data. However, because the nature of the data is
quite diverse, choosing or developing tools to deal with it is a non-trivial
problem. In addition, few tools are able to extract knowledge from the data.
Knowledge extraction becomes more difficult due to specific characteristics of the data,
such as product descriptions that are entirely free-form and unvalidated. For
this reason, in certain problem domains, it is necessary to apply data mining techniques
to text attributes to extract standardized values. The main objective of this work is to
propose a distributed data mining platform for the Tax Administration of Rio Grande do
Norte that can extract knowledge in a variety of ways, taking into account the specific
characteristics of electronic invoices (NFC-e). |