Development of a virtual tank environment for training intelligent agents

Bibliographic details
Main author: Machado, Kaíque Gomes
Other authors: Dória Neto, Adrião Duarte
Format: bachelorThesis
Language: pt_BR
Published in: Universidade Federal do Rio Grande do Norte
Item URL: https://repositorio.ufrn.br/handle/123456789/53481
Description
Abstract: In reinforcement learning, an agent is implemented to learn a specified task in a given environment through the experience it gains from interacting with that environment. The environment is the essential structure for this learning, since it is where the fundamental settings for training agents are defined. Among these settings are the choice of reward criteria and the definition of action spaces. When the environment is a system of two coupled tanks and the task is to control the level of tank 1, training an agent on the real plant requires great care to avoid accidents in the laboratory, such as tank overflow, incorrect voltages sent to the pump, and possible loss of this equipment. The development of virtual environments is therefore essential for training agents on this type of problem. The objective of this work is to implement, with the Gymnasium (Gym) library, a virtual environment of a coupled-tank system that avoids possible accidents in the laboratory and, through its graphical interface, facilitates comparing the performance of trained agents. To this end, system identification was used to model the tank system with two LSTM (Long Short-Term Memory) neural networks: one with a single LSTM layer for level prediction (single network) and another with one LSTM layer per level (split network). Finally, the training results for the single and split networks are presented, along with the results of the developed Gym environment. The split network is also shown to model the tank system with an error of a few millimeters.
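As a rough illustration of the kind of environment the abstract describes, the sketch below shows a minimal two-tank level-control environment in plain Python. It only mirrors the `reset`/`step` return convention used by Gymnasium and does not import the library. It is not the thesis implementation: the thesis models the plant with identified LSTM networks, whereas this sketch substitutes a simple first-principles (Torricelli outflow) model integrated with Euler steps. The class name `CoupledTanksEnv`, every physical constant, the reward (negative absolute level error on tank 1), and the overflow termination rule are all assumptions chosen for illustration.

```python
import math
import random


class CoupledTanksEnv:
    """Toy coupled-tank environment with a Gymnasium-style reset/step API.

    Hypothetical sketch: all constants below are assumed values, and the
    physics model stands in for the LSTM-identified model of the thesis.
    """

    def __init__(self, setpoint_cm=15.0, dt=0.1, max_steps=500):
        self.area = 15.5         # tank cross-section, cm^2 (assumed)
        self.orifice = 0.18      # outlet orifice area, cm^2 (assumed)
        self.pump_gain = 12.0    # pump flow per volt, cm^3/(s*V) (assumed)
        self.gravity = 981.0     # gravitational acceleration, cm/s^2
        self.max_level = 30.0    # tank height, cm (assumed)
        self.max_voltage = 4.0   # pump voltage limit, V (assumed)
        self.setpoint = setpoint_cm
        self.dt = dt
        self.max_steps = max_steps
        self.rng = random.Random()
        self.h1 = 0.0
        self.h2 = 0.0
        self.steps = 0

    def _obs(self):
        # Observation: levels of tank 1 and tank 2 in cm.
        return (self.h1, self.h2)

    def reset(self, seed=None):
        if seed is not None:
            self.rng.seed(seed)
        self.h1 = self.rng.uniform(1.0, 5.0)
        self.h2 = self.rng.uniform(1.0, 5.0)
        self.steps = 0
        return self._obs(), {}

    def step(self, action):
        # Action: pump voltage, clipped to the safe range so that an
        # incorrect voltage never reaches the simulated pump.
        voltage = min(max(float(action), 0.0), self.max_voltage)
        q_in = self.pump_gain * voltage
        q12 = self.orifice * math.sqrt(2.0 * self.gravity * self.h1)
        q_out = self.orifice * math.sqrt(2.0 * self.gravity * self.h2)
        # Explicit Euler update: tank 1 drains into tank 2.
        self.h1 += self.dt * (q_in - q12) / self.area
        self.h2 += self.dt * (q12 - q_out) / self.area
        overflow = self.h1 >= self.max_level or self.h2 >= self.max_level
        self.h1 = min(max(self.h1, 0.0), self.max_level)
        self.h2 = min(max(self.h2, 0.0), self.max_level)
        self.steps += 1
        reward = -abs(self.setpoint - self.h1)  # penalize tank 1 level error
        terminated = overflow                   # episode ends on overflow
        truncated = self.steps >= self.max_steps
        return self._obs(), reward, terminated, truncated, {"voltage": voltage}
```

An agent, or a hand-written controller used as a baseline, would call `reset()` once and then `step(voltage)` in a loop until `terminated` or `truncated` is true. Subclassing `gymnasium.Env` and declaring `action_space` and `observation_space` would be the natural next step for use with standard reinforcement learning libraries.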