Image Source: OpenAI Blog
Last year we were amazed by the dexterity of OpenAI's Dactyl system, which learned to manipulate a cube block so that it displayed any commanded face. If you missed that article, read about it here.
OpenAI then set itself a harder task: teaching the robotic hand to solve a Rubik's Cube. This is a daunting task, made no easier by the fact that the robot uses only one hand, something most humans would find hard to do. OpenAI used neural networks trained entirely in simulation. One of the main challenges, however, was making the simulations realistic enough, because physical factors such as friction and elasticity are very hard to model.
The solution they came up with is a new method called Automatic Domain Randomization (ADR), which endlessly generates progressively more difficult simulated environments in which the network must solve the Rubik's Cube. The aim is for the spectrum of generated environments to become broad enough to cover real-world physics, which bypasses the need for a highly accurate model of the real environment.
One of the randomized parameters is the size of the Rubik’s Cube. ADR begins with a fixed cube size and gradually widens the randomization range as training progresses. The same technique is applied to all other parameters, such as the mass of the cube, the friction of the robot fingers, and the visual surface materials of the hand. The neural network thus has to learn to solve the Rubik’s Cube under all of those increasingly difficult conditions.
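The idea of starting from a fixed value and widening the range over time can be sketched in a few lines of Python. This is a simplified illustration, not OpenAI's implementation: the `ParamRange` class, the `adr_loop` function, the success threshold, and every numeric value below are hypothetical, and the rule "widen one range whenever the policy scores well enough" is an assumption made for the sketch.

```python
import random


class ParamRange:
    """Randomization range for one simulated parameter (hypothetical helper)."""

    def __init__(self, nominal, step):
        # The range starts with zero width: every sample equals the nominal value.
        self.low = nominal
        self.high = nominal
        self.step = step  # how far each bound moves per expansion (assumed)

    def sample(self, rng):
        return rng.uniform(self.low, self.high)

    def widen(self):
        # Physical quantities such as size, mass, and friction stay non-negative.
        self.low = max(0.0, self.low - self.step)
        self.high = self.high + self.step


def adr_loop(ranges, evaluate, iterations, threshold, seed=0):
    """Sample an environment, score the policy in it, widen a range on success."""
    rng = random.Random(seed)
    for _ in range(iterations):
        env = {name: r.sample(rng) for name, r in ranges.items()}
        score = evaluate(env)  # stand-in for running the policy in simulation
        if score >= threshold:
            # Assumed rule: make one randomly chosen parameter harder.
            ranges[rng.choice(sorted(ranges))].widen()
    return ranges


# Placeholder values, not taken from the article.
ranges = {
    "cube_size": ParamRange(nominal=0.057, step=0.001),
    "cube_mass": ParamRange(nominal=0.09, step=0.005),
    "finger_friction": ParamRange(nominal=1.0, step=0.05),
}

# A dummy evaluator that always succeeds, so every iteration widens one range.
adr_loop(ranges, evaluate=lambda env: 1.0, iterations=10, threshold=0.8)
for name, r in ranges.items():
    print(name, r.low, r.high)
```

In a real training setup the `evaluate` callback would run the current policy in a simulator configured with the sampled parameters. Here it is a dummy function, so the loop only demonstrates how the ranges grow.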
Here is an uncut version of the robot hand solving the Rubik's cube:
To test the limits of this method, they experimented with a variety of perturbations while the hand was solving the Rubik’s Cube. This tests the robustness of not only the control network but also the vision network, which is used to estimate the cube’s position and orientation. The system trained with ADR turned out to be surprisingly robust: the robot can successfully perform most flips and face rotations under all tested perturbations, though not at peak performance.
The impressive robustness of the robot hand to perturbations can be seen in this video: