Basic

There are two basic concepts in reinforcement learning: the environment (namely, the outside world) and the agent (namely, the algorithm you are writing). The agent sends actions to the environment, and the environment replies with observations and rewards (that is, a score).

The core gym interface is Env, which is the unified environment interface. There is no interface for agents; that part is left to you. The following are the Env methods you should know:

  • reset(self): Reset the environment's state. Returns observation.

  • step(self, action): Step the environment by one timestep. Returns observation, reward, done, info.

  • render(self, mode='human'): Render one frame of the environment. The default mode will do something human friendly, such as pop up a window.

Last updated