Sim-to-Real Transfer of Robotic Control with Dynamics Randomization
Created: 12 Jan 2023, 02:37 PM | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge, KnowledgeSharing
https://arxiv.org/pdf/1710.06537.pdf
Sim2Real
Sim-to-real errors: these normally come from either a dynamics mismatch (the simulated physics differ from the real robot) or an input mismatch (the observed features differ between sim and real).
- Action space needs to be the same → i.e. the same type of robot with the same DOF
Randomisation is done on the dynamics, i.e. the physical properties of the robot and environment (masses, friction, damping, latency, etc.). Each parameter is sampled between a lower and upper bound on an appropriate scale, and those bounds are chosen from prior experience / known physics (see the sketch below).
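A minimal sketch of what the dynamics randomisation could look like in code. The parameter names, bounds, and the `env.set_dynamics` hook are my own illustrative assumptions, not the paper's actual setup:

```python
import numpy as np

# Assumed per-parameter bounds, chosen from prior knowledge of the robot's
# physics (these specific names and numbers are illustrative, not from the paper).
DYNAMICS_BOUNDS = {
    "link_mass_scale":    (0.5, 1.5),   # multiplicative scale on each link's mass
    "joint_damping":      (0.01, 1.0),  # N·m·s/rad
    "friction_coeff":     (0.5, 1.2),   # contact friction
    "action_delay_steps": (0, 3),       # discrete control latency in steps
}

def sample_dynamics(rng: np.random.Generator) -> dict:
    """Draw one random set of dynamics parameters for the next episode."""
    params = {}
    for name, (low, high) in DYNAMICS_BOUNDS.items():
        if name == "action_delay_steps":
            params[name] = int(rng.integers(low, high + 1))
        else:
            params[name] = float(rng.uniform(low, high))
    return params

# Typical training loop: re-sample the dynamics at the start of every episode,
# so the policy never sees one fixed simulator and must be robust to the range.
rng = np.random.default_rng(0)
for episode in range(3):
    dynamics = sample_dynamics(rng)
    # env.set_dynamics(dynamics)   # hypothetical hook into the simulator
    print(episode, dynamics)
```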
The policy is conditioned on the state, the past actions/states of the robot (which give it some implicit knowledge of the robot's dynamics), and the goal.
The history is fed through an LSTM, so the hidden state models the previous actions and observations (e.g. if the recent steps behaved as if friction were high, the next step's actions can be adjusted accordingly).
- e.g. at test time the past states and actions are available at every step, so the policy can implicitly match the current situation to the closest dynamics it saw during training (a toy recurrent policy is sketched below)
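A toy sketch of the recurrent policy idea (PyTorch; the dimensions and layer sizes are my own assumptions, not the paper's architecture). The LSTM takes (state, previous action, goal) at each step, and its hidden state acts as an online estimate of the underlying dynamics:

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Policy conditioned on state, previous action and goal.

    The LSTM hidden state summarises the recent state/action history,
    which implicitly identifies the (randomised) dynamics online.
    """
    def __init__(self, state_dim=10, action_dim=7, goal_dim=3, hidden_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(state_dim + action_dim + goal_dim, hidden_dim,
                            batch_first=True)
        self.head = nn.Linear(hidden_dim, action_dim)

    def forward(self, state, prev_action, goal, hidden=None):
        # state: (batch, T, state_dim), prev_action: (batch, T, action_dim),
        # goal: (batch, T, goal_dim)
        x = torch.cat([state, prev_action, goal], dim=-1)
        out, hidden = self.lstm(x, hidden)
        action = torch.tanh(self.head(out))   # bounded actions
        return action, hidden

# One control step at test time: feed the latest (state, prev_action, goal)
# and carry the hidden state forward, so the policy "remembers" the history.
policy = RecurrentPolicy()
state = torch.zeros(1, 1, 10)
prev_action = torch.zeros(1, 1, 7)
goal = torch.zeros(1, 1, 3)
action, hidden = policy(state, prev_action, goal)
```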
What is on-policy and off-policy in RL?
- Is this the same as online vs offline RL? (No: on/off-policy is about whether updates use data generated by the current policy, while online/offline is about whether the agent can keep interacting with the environment during training.)
What is DDPG?
What is PPO?
Why is deep RL needed for a robot arm?
- Why not just forward and inverse kinematics? (See the IK sketch below for what kinematics alone gives you.)
- Because we want the motion itself to be planned via RL ⇒ reach the goal state from the start state via a learned policy
- Then why RL rather than something like path planning?
- Because RL can be seen as an advancement of classical path-planning algorithms - https://fab.cba.mit.edu/classes/865.21/topics/path_planning/robotic.html
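To make the kinematics contrast concrete: plain inverse kinematics only gives a joint configuration for a target pose, and says nothing about how to move there or about the dynamics along the way. A standard closed-form solution for a hypothetical 2-link planar arm (the link lengths here are illustrative):

```python
import numpy as np

def two_link_ik(x, y, l1=0.5, l2=0.5, elbow_up=True):
    """Closed-form inverse kinematics for a 2-link planar arm.

    Returns joint angles (theta1, theta2) placing the end-effector at (x, y).
    This yields a single static configuration; it does not plan a trajectory,
    avoid obstacles, or account for dynamics, which is where RL / planning come in.
    """
    c2 = (x**2 + y**2 - l1**2 - l2**2) / (2 * l1 * l2)
    if abs(c2) > 1:
        raise ValueError("Target is outside the arm's reachable workspace")
    s2 = np.sqrt(1 - c2**2) * (1 if elbow_up else -1)
    theta2 = np.arctan2(s2, c2)
    theta1 = np.arctan2(y, x) - np.arctan2(l2 * s2, l1 + l2 * c2)
    return theta1, theta2

print(two_link_ik(0.6, 0.4))
```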
Another approach is to learn the robot dynamics in an inverse way from inputs and outputs, i.e. pure function approximation (a toy example is sketched below).
- Problem: if you don't sample the workspace densely enough, the learned model can be queried at points outside the region it was trained on.
- This is a "lazy" way of doing inverse dynamics; the proper way is to derive the function values analytically from the robot's skeleton / physical model.
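A toy version of the "learn inverse dynamics from inputs and outputs" idea. Everything here (the fake data, the model choice, the dimensions) is an illustrative assumption, not taken from the paper:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Pretend these came from logging the robot: joint positions (q), velocities
# (qdot), desired accelerations (qddot), and the torques that produced them.
n, dof = 5000, 2
q     = rng.uniform(-np.pi, np.pi, size=(n, dof))
qdot  = rng.uniform(-2.0, 2.0, size=(n, dof))
qddot = rng.uniform(-5.0, 5.0, size=(n, dof))
# Stand-in "ground truth" torque, just so the example runs end to end.
torque = 1.3 * qddot + 0.4 * qdot + 0.2 * np.sin(q)

# Fit a regressor mapping (q, qdot, qddot) -> torque: inverse dynamics
# learned purely by function approximation.
X = np.hstack([q, qdot, qddot])
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
model.fit(X, torque)

# The learned model is only reliable inside the region the data covered;
# querying far outside those bounds (the "out of bounds" problem above)
# gives arbitrary extrapolation.
query = np.array([[0.1, -0.2, 0.5, 0.0, 1.0, -1.0]])  # (q1, q2, qdot1, qdot2, qddot1, qddot2)
print(model.predict(query))
```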