Sim-to-Real Transfer of Robotic Control with Dynamics Randomization


Created: 12 Jan 2023, 02:37 PM | Modified: `=dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")` | Tags: knowledge, KnowledgeSharing


https://arxiv.org/pdf/1710.06537.pdf

Sim2Real

Sim-to-real errors normally come from either a dynamics mismatch or an input mismatch (the features the policy sees differ between sim and real)

  • Action space needs to be the same, i.e. the same type of robot with the same DOF

Randomisation is done in the dynamics, i.e. the physical properties of the robot and environment (e.g. masses, friction, damping). Each parameter is sampled between a lower and upper bound with some scale, and these bounds are chosen from prior experience / knowledge of the physics
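
A minimal sketch of how such per-episode sampling might look (the parameter names, ranges, and the `env.set_dynamics` hook are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

# Illustrative (lower, upper) bounds for each randomised dynamics parameter.
# These ranges are assumptions for the sketch, not the paper's exact values.
DYNAMICS_RANGES = {
    "link_mass_scale": (0.5, 1.5),   # multiplicative scale on link masses
    "joint_damping":   (0.1, 2.0),
    "friction_coeff":  (0.5, 1.2),
    "action_latency":  (0.0, 0.04),  # seconds
}

def sample_dynamics(rng: np.random.Generator) -> dict:
    """Sample one set of dynamics parameters, uniformly within the bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in DYNAMICS_RANGES.items()}

# At the start of every training episode:
rng = np.random.default_rng(0)
params = sample_dynamics(rng)
# env.set_dynamics(params)  # hypothetical hook that applies the sampled values to the simulator
```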

The policy is conditioned on the current state, the robot's past action (which gives it some implicit knowledge of the robot dynamics), and the goal

The history of past actions is fed through an LSTM, so the policy can adapt online (e.g. if the previous step behaved as if friction were high, the next step can shift the policy accordingly); see the sketch after the bullet below

  • e.g. at test time, the past state and action are fed in at every step, so the policy can match the current situation to the closest dynamics it experienced during training
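
A rough sketch of such a recurrent policy (layer sizes, activations, and the exact input layout are assumptions for illustration, not the paper's architecture):

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Policy conditioned on state, previous action, and goal, with an LSTM
    over the history so it can implicitly identify the current dynamics."""

    def __init__(self, state_dim: int, action_dim: int, goal_dim: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Linear(state_dim + action_dim + goal_dim, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, action_dim)

    def forward(self, state, prev_action, goal, hidden_state=None):
        # state / prev_action / goal: tensors of shape (batch, seq_len, dim)
        x = torch.cat([state, prev_action, goal], dim=-1)
        x = torch.relu(self.encoder(x))
        out, hidden_state = self.lstm(x, hidden_state)  # hidden state carries dynamics info across steps
        action = torch.tanh(self.head(out))             # actions bounded in [-1, 1]
        return action, hidden_state
```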

What is on-policy and off-policy in RL?

  • Online vs offline policy?

What is DDPG?

What is PPO?

Why need for deep RL in robot arm?

Another way is to learn the robot's inverse dynamics from its inputs and outputs via function approximation

  • Problem: if you don't sample enough data, the model can be queried at points outside the training distribution (out of bounds)
  • This is the lazy way of doing inverse dynamics; the proper way is to derive the values of the functions from the robot's skeleton (its kinematic/dynamic model)
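
A minimal sketch of the function-approximation idea (a learned inverse dynamics model mapping state transitions to the action that produced them; the data shapes, placeholder data, and model choice are assumptions for illustration):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Illustrative dataset collected on the robot:
# each row pairs (state_t, state_{t+1}) with the action_t that produced the transition.
states      = np.random.randn(5000, 7)   # e.g. joint positions/velocities (placeholder data)
next_states = np.random.randn(5000, 7)
actions     = np.random.randn(5000, 3)   # e.g. joint torques (placeholder data)

X = np.hstack([states, next_states])     # input: current state + desired next state
inverse_model = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=500)
inverse_model.fit(X, actions)

# At run time: "which action takes me from s to s_desired?"
s, s_desired = states[0], next_states[0]
predicted_action = inverse_model.predict(np.hstack([s, s_desired])[None, :])
# Caveat from the note above: if (s, s_desired) lies far from the training data,
# this is extrapolation and the prediction may be unreliable.
```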