In my other blog post, I showed my optimism about a world of Neural Network (NN) based autonomous robots by listing the current pain points of such software systems and their potential lines of solution. The development of NN-infused robots has not been as fast as in fields like computer vision and NLP, where curated datasets and simulators abound. While most of the reasons behind the low adoption of data-driven AI on real robots add up in my head (for example, the tedious nature of robotic experiments), I would like to argue that this can be changed for good. In this blog post, I further exhibit my enthusiasm and optimism for a world of NN-based autonomous robots by summarizing my main takeaways from my experiences with robots and from the literature published by companies like Kindred AI.
What is so hard about infusing AI into real robot hardware?
- Robots are expensive to buy and maintain: Doing a robotic experiment can burn a hole in your pocket. The cost of robots has come down drastically but needs to go down further. For example, just building a miniature car laden with a minimalistic sensor suite for autonomous driving costs about $250. This is particularly important for fields like swarm robotics, which need many robots. Robots also go through frequent wear and tear, which further increases their cost.
- Robot experiments are tedious: Robot learning techniques like reinforcement learning (RL) and imitation learning need the robot to engage with the real world, which is painstakingly slow. Cases where a manual reset is required make it even more tedious. What makes it worse is that the sample requirements of such highly capable techniques run into the millions (imagine the plight of the operator who is asked to do this in a time-sensitive manner). This makes off-the-shelf implementations of such algorithms infeasible on a real robot. Not to mention the hassles of overheating and wire-tangling that prolong the agony of a real-robot experiment.
- Concurrent nature of the world versus the sequential nature of simulators: One main reason why the impressive, hard-to-engineer solutions of RL in simulators do not transfer to real robots is that, unlike reality, simulators make the robot's computation and the world's evolution sequential. Simulators comply with the MDP definition by not accounting for the computation delay between sensing and acting. On a real robot, the controller therefore operates on delayed sensorimotor information, and even SOTA techniques developed in simulators fail on real robots.
- Non-synchronization: Depending on the setup, data transfers can take an arbitrary, variable amount of time to flow from one component to another. Hence, unlike in simulators, the real world moves at an inconsistent rate. This hurts robot performance because the robot's processes fail to synchronize with each other. For example, if transferring observation data from the robot to a remote processing unit takes longer than the action cycle time, the controller ends up processing stale observations even though the robot has since moved. These latencies are more pronounced when the communication between units is wireless.
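To make the two points above concrete, here is a minimal sketch of how the sensing-acting delay could be modeled in an otherwise sequential simulator: wrap the environment so that each action takes effect one step late, as it effectively does on a real robot whose world keeps moving while the controller computes. The `DelayedActionEnv` wrapper and its tiny `step` interface are hypothetical names I am using for illustration, not any particular library's API.

```python
class DelayedActionEnv:
    """Wraps a step-based environment so each action takes effect one
    step late, mimicking the real-world compute delay between sensing
    and acting. Assumes a minimal env interface: env.step(a) -> obs."""

    def __init__(self, env, default_action=0):
        self.env = env
        self.pending_action = default_action  # action chosen last cycle

    def step(self, action):
        # The world advances using the *previously* chosen action,
        # because in reality the world moved on while the controller
        # was still computing `action`.
        obs = self.env.step(self.pending_action)
        self.pending_action = action
        return obs
```

Training against such a wrapper forces the policy to cope with the one-step-old information it will actually receive on hardware, rather than the instantaneous observations a vanilla simulator provides.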
These problems, to a high degree, are responsible for the lower adoption of data-driven AI in robotics compared to its usage with canned datasets and simulators. Other reasons include the obsession with quickly iterating toward AGI, but that would be a discussion for another blog post. One consequence is that AI research with the potential to bring about a revolution in robot automation either does not work or is not yet repeatable on real robots. But is it really that bad? Next, I will walk the reader through some ways that I believe promise to cast some light at the end of this autonomous-robot tunnel.
- Reporting of the set-up parameters: Performance is sensitive to the set-up. Robotics research should not fail to disclose the nitty-gritty details under which the robots did or did not work. Some of those hardware configuration parameters are:
- The sensorimotor interface between the robot and the learning agent.
- Frequency of operations/cycle time: the different cycle times, such as the sensorimotor information transfer frequency, the robot actuation cycle time, and the action cycle time, should be in sync with each other. They should be slow enough to work on perceptible changes and fast enough for responsive control.
- Choice of actuation type: common actuation types are position control, velocity control, and torque control. The rule of thumb is that the control variable of choice should be the one that has a strong causal relationship with the robot's state in the immediate future.
- Whether the medium of data transfer is wired or wireless. Usually, limited compute resources are available on-board, so data has to be transferred to a remote compute facility, which makes it prone to variability and delays in the arrival of both sensorimotor packets and actuation commands. Progress on edge computing seems promising here, as frozen models are usually small in size.
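The cycle-time considerations above can be sketched as a fixed-rate control loop: instead of sleeping for a fixed interval, sleep for whatever budget remains in the cycle, so the action cycle time stays consistent even when the controller's compute time varies. The callback names and the 50 Hz default below are my own illustrative choices, not a prescribed interface.

```python
import time

def control_loop(read_sensors, compute_action, send_command,
                 cycle_s=0.02, steps=100):
    """Run a fixed-rate control loop (hypothetical 50 Hz default).
    Sleeping for the *remaining* cycle budget keeps the action cycle
    time consistent even when compute_action's duration varies."""
    for _ in range(steps):
        start = time.monotonic()
        obs = read_sensors()
        act = compute_action(obs)
        send_command(act)
        elapsed = time.monotonic() - start
        if elapsed > cycle_s:
            # Overrun: the controller is too slow for this cycle time,
            # which is exactly the mismatch worth reporting in a paper.
            print(f"warning: cycle overran by {elapsed - cycle_s:.4f}s")
        else:
            time.sleep(cycle_s - elapsed)
```

Logging the overruns, rather than silently absorbing them, is one cheap way to make the "did the cycle times actually hold?" question answerable when reporting a set-up.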
- Standardization of the robot set-up: Performance is sensitive to the set-up. To make progress in real-robot AI, we need to create MNIST-like benchmarks for robotic tasks. Hence, standardization of set-up details like the ones I discussed in the previous point is warranted. Note that in the case of RL, the interleaving between training and experience gathering requires special care.
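One common way to handle the interleaving caveat above is to decouple experience gathering from learning, so gradient updates never stall the robot's action cycle. Here is a minimal thread-based sketch of that pattern; the callback names (`gather_experience`, `update_policy`) are hypothetical placeholders for whatever your agent actually does.

```python
import queue
import threading

def run_async(gather_experience, update_policy, n_transitions=100):
    """Gather experience and update the policy concurrently, so that
    learning does not block the robot's action cycle."""
    buf = queue.Queue()
    done = threading.Event()

    def learner():
        # Keep consuming until gathering is finished and the buffer is drained.
        while not done.is_set() or not buf.empty():
            try:
                transition = buf.get(timeout=0.1)
            except queue.Empty:
                continue
            update_policy(transition)

    t = threading.Thread(target=learner)
    t.start()
    for _ in range(n_transitions):
        buf.put(gather_experience())  # the robot keeps acting at full rate
    done.set()
    t.join()
```

Whether a benchmark permits this kind of asynchrony, and at what update rate, is precisely the sort of set-up detail worth standardizing.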
- Simulators: Simulators on most accounts have some fundamental differences from reality. Features like zero time elapsing between sensing and acting, infinite start-overs, and no wear and tear on the robot are luxuries that the real world does not possess. To grow robot learning, simulators should not omit these details, as they affect learning big-time. Techniques that can leverage simulators, such as domain randomization and few-shot RL/LfD, will then come in handy. Also, high-fidelity simulators such as Gibson, which account for simulator-reality incompatibility due to the semantic distribution mismatch of objects and the lack of photo-realism, should gain more traction.
- Cheaper hardware: The cost of robots laden with a minimal sensor suite, sufficient on-board memory and processing, and actuators is still not low enough. Although there have been improvements lately, such as ANYmal, a quadruped by ANYbotics, and the MIT Mini Cheetah, such robots remain out of reach for many.
- Improvements in NN-based robot automation software, especially in terms of sample efficiency and safe exploration, can drastically reduce potential accidents involving the robot and its surroundings. Some critical ideas along this line are model-based RL and creating better representations for exploration, like unsupervised skill discovery. See this blog post for more information.
A real robot’s engagement with the real world in a quest for intelligent behavior is painstakingly hard. I assert that we should start accounting for the factors that make a difference to real-robot learning. Most importantly, we must not fail to highlight the conditions under which our methods worked. Projects like DeepRacer, the DeepRacer league, Duckietown, and its AI Driving Olympics should be promoted. Only then can we fully and genuinely start exploiting the benefits of data-driven AI.