In the reinforcement learning paradigm, learning happens in a loop: the agent reads the state of the environment and then executes an action.
The environment then returns its new state and a reward signal that indicates how good or bad the outcome of the action was. The loop continues until the environment reaches a terminal condition or a maximum number of iterations is reached.
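To make the loop concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: `CorridorEnv` is a hypothetical toy environment (a corridor of five cells with the goal at the last one), and the `reset`/`step` method names, the reward values, and `run_episode` are assumptions, not the API of any particular library.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..length-1, the goal is the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start every episode at the left end and return the initial state.
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); walls clamp the position.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1   # terminal condition
        reward = 1.0 if done else -0.1            # assumed reward scheme
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one agent-environment loop and return (total_reward, steps)."""
    state = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):                    # maximum number of iterations
        action = policy(state)                    # agent reads state, picks action
        state, reward, done = env.step(action)    # environment answers
        total_reward += reward
        steps += 1
        if done:                                  # terminal condition reached
            break
    return total_reward, steps


def random_policy(state):
    # A placeholder agent that ignores the state and moves at random.
    return random.choice([-1, 1])


if __name__ == "__main__":
    total, steps = run_episode(CorridorEnv(), random_policy)
    print(f"episode finished after {steps} steps, total reward {total:.1f}")
```

The agent here does not learn anything yet. A learning algorithm would replace `random_policy` with a policy that is updated from the rewards it observes.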
These are some of the main concepts in Reinforcement Learning:
The environment is a representation of the context that our agent will interact with. It can represent an aspect of the real world, like the stock market or a street, or a completely virtual setting, like a game.
States are observations that the agent receives from the environment. They are how the agent gets all available information about the environment.
Actions are performed by the agent and may change the state of the environment. All the rules for how an action changes the state are internal to the environment. For a given state, the agent can choose its next action, but it does not control how that action will affect the environment.
Rewards are numeric signals that tell the agent how good or bad the outcome of an action was.
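The point that the agent picks the action but the environment decides its effect can be sketched with a small, made-up example. `SlipperyCorridor` below is a hypothetical environment, and the `slip` parameter and the reward values are assumptions chosen only for illustration: with probability `slip`, the environment moves the agent in the opposite direction to the one it asked for.

```python
import random


class SlipperyCorridor:
    """Hypothetical environment where the requested move can be reversed."""

    def __init__(self, length=5, slip=0.2, seed=None):
        self.length = length
        self.slip = slip                  # probability that the move is reversed
        self.rng = random.Random(seed)    # private random source of the environment
        self.position = 0

    def step(self, action):
        # The agent requests `action` (-1 or +1), but the transition rule
        # lives here, inside the environment, and may override the request.
        actual = -action if self.rng.random() < self.slip else action
        self.position = min(max(self.position + actual, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1    # assumed reward scheme
        return self.position, reward, done


if __name__ == "__main__":
    env = SlipperyCorridor(slip=0.2, seed=0)
    for _ in range(5):
        state, reward, done = env.step(+1)   # the agent always asks to go right
        print(state, reward, done)
```

From the agent's side, only the returned state and reward are visible. It has to discover from experience how reliably its actions lead where it wants.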
So, if you want to learn more about Reinforcement Learning through a practical example, take a look at the following blog post: