Reinforcement Learning
A paradigm of ML:
How an intelligent agent should take actions to maximize its cumulative reward
Given an action by the agent,
- Environment changes state
- Environment gives reward
- Agent chooses its next action based on what it has learned so far
- Physically based
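The action, state change, reward cycle above can be sketched as a short loop. This is a minimal illustration only: `LineWorld`, `choose_action`, and `run_episode` are hypothetical names made up for this sketch, the policy is hard-coded rather than learned, and none of it is part of any real library.

```python
class LineWorld:
    """Hypothetical toy environment: the agent walks along a line toward a goal."""

    def __init__(self, goal=3):
        self.goal = goal
        self.state = 0

    def reset(self):
        # Put the agent back at the start of the line.
        self.state = 0
        return self.state

    def step(self, action):
        # The environment changes state in response to the action...
        self.state += action
        # ...and gives a reward (1.0 only when the goal is reached).
        done = self.state == self.goal
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def choose_action(state, goal):
    # Stand-in for a learned policy (assumption): always step toward the goal.
    return 1 if state < goal else -1


def run_episode(env, max_steps=10):
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = choose_action(state, env.goal)   # agent acts
        state, reward, done = env.step(action)    # environment responds
        total_reward += reward                    # reward accumulates
        if done:
            break
    return total_reward


total = run_episode(LineWorld())
print(total)
```

In a real setup the hard-coded `choose_action` would be replaced by a policy that is updated from the rewards it collects.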
UnityML
Agent:
- Collection of observations from environment
- Action execution
Brain:
- Decision making for linked agents
Academy:
- Tracks iterations
- Sets simulation speed and resets the environment
- Global environment variables
Proximal Policy Optimization:
- Policy-gradient RL algorithm used to train the model that drives the agents
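The notes do not go into how Proximal Policy Optimization works. As a rough sketch, its core idea is a clipped surrogate objective that stops a single update from moving the policy too far. The function below computes that term for one sample; the name `clipped_surrogate` and the default `epsilon=0.2` are choices made for this illustration, not values taken from these notes.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample PPO clipped objective term.

    ratio     -- new_policy_prob / old_policy_prob for the action taken
    advantage -- estimate of how much better the action was than average
    epsilon   -- clip range (illustrative default, an assumption here)
    """
    # Restrict the probability ratio to [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


# A large ratio with a positive advantage is capped by the clip.
capped = clipped_surrogate(ratio=1.5, advantage=1.0)
# A ratio already inside the clip range passes through unchanged.
unchanged = clipped_surrogate(ratio=1.0, advantage=2.0)
```

During training this quantity is averaged over a batch and maximized with respect to the policy parameters.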
Model Locomotion
- Force applied is based on acceleration: $F = ma$
- Each limb is subjected to a force scaled by that limb's mass
- For a velocity change from $v_0$ to $v_1$ over time $t$: $F = m(v_1 - v_0)/t = ma$
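The force formula above can be checked with a small worked example. This is a sketch under assumptions: `limb_force` is a hypothetical helper, the quantities are treated as scalars in SI units (kg, m/s, s), and the numbers are made up for illustration.

```python
def limb_force(mass, v0, v1, dt):
    """Force needed to take a limb of the given mass from v0 to v1 in time dt."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    acceleration = (v1 - v0) / dt   # a = (v1 - v0) / t
    return mass * acceleration      # F = m * a


# Illustrative values: a 2 kg limb accelerated from rest to 3 m/s in 0.5 s.
force = limb_force(mass=2.0, v0=0.0, v1=3.0, dt=0.5)
print(force)  # 12.0 (newtons)
```

A heavier limb needs proportionally more force for the same velocity change, which is why the force is computed per limb.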