Design and Application of Learning-based Tactile Sensing for Soft Finger

Interaction is critically important in robotics research, where the moment of touch, whether by humans or by robots, connects the physical embodiment of the robot hardware, the algorithmic computation behind it, and the human element involved, all within a specific set of environmental scenarios aimed at task completion. While human skin packs multiple dimensions of sensory detail into one place, achieving comparable sensing at the tips of robot limbs remains a challenge. Before I joined the Bionic Design and Learning Lab, Dr. Chaoyang Song and other researchers had developed the DeepClaw system for learning-based benchmarking, a suite of hardware and software for quickly and effectively integrating robotic systems for manipulation learning and experimentally evaluating the results for systematic research and development. Dr. Chaoyang Song had also invented a novel soft structure with exceptional omni-directional adaptation. Building on the DeepClaw system and this soft structure, they focused on the qualitative and quantitative analysis of omni-directional adaptation for grasp learning problems, covering learning efficiency, gripper design, and finger surface optimization. I then joined the lab and worked on the topic of Design and Application of Learning-based Tactile Sensing for Soft Finger, developing in-finger proprioceptive sensing methods to estimate interaction at the tips of robot limbs. We have developed two such sensing methods, one based on optical fibers and one based on vision, described below.

Optical-based Force and Tactile Sensing

To estimate the force and contact position of the soft finger during rigid-soft interaction, we first developed an optical-based sensing method. In 2021, we published a journal article in IEEE Robotics and Automation Letters [1] with a dual-track presentation at the 2021 IEEE International Conference on Robotics and Automation (ICRA 2021), presenting an enhanced, optoelectronically innervated design of the soft robotic finger, integrated into a three-finger gripper, that learns accurate estimates of contact position and force irrespective of environmental lighting conditions. While optical integration can deliver tactile sensing with high accuracy, the design remains significantly limited: the output is not rich enough to cover the whole finger, the optical components are too large for an integrated solution, and waterproofing all the electronics involved is neither elegant in design nor effective in practice. We therefore turned to a vision-based sensing method.
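The mapping from optoelectronic readings to contact position and force can be framed as supervised regression. The following is a minimal sketch in Python using scikit-learn, assuming the finger provides a fixed-length vector of optical channel intensities per sample and that labeled contact positions and forces were collected with a calibration rig; the feature dimensions, labels, and model choice are illustrative, not the exact pipeline of [1].

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

# Hypothetical dataset: each row holds the intensities of N optical channels
# read from the innervated finger; each label holds the contact position
# (x, y in mm) and normal force (N) from a calibration rig.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))   # 8 optical channels (assumed)
y = rng.normal(size=(2000, 3))   # [x, y, force] labels (placeholder)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# A small multilayer perceptron regressor; scaling the raw intensities
# keeps channels with different gains comparable.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
```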

Vision-based Proprioception from On-Land to Underwater

The vision-based sensing method is another solution for proprioceptive learning with the soft, omni-adaptive robotic fingertip. We published the original design in a conference paper at the 2021 Conference on Robot Learning (CoRL 2021) [2]. In this paper, we present a preliminary implementation of our solution in two ways, one with a fiducial marker fixed inside the finger and the other without; both feature a miniature camera at the bottom of the finger that captures visual features of the soft finger's omni-directional deformations. With this simple design, we were able to train a neural network that estimates the forces and torques during physical contact at a high frame rate and accuracy, comparable to a force-torque sensor but with the added value of omni-directional adaptation, effectively transforming any rigid gripper into an adaptive and tactile solution ready for grasping objects of unstructured geometry. Following this work, we have submitted multiple manuscripts to journals and conferences, addressing various aspects of the method with demonstrated validations that showcase its potential in robot learning.
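As a rough illustration of the idea, the sketch below maps a low-dimensional visual feature (for example, the tracked pose of an in-finger fiducial marker) to a 6D force/torque estimate with a small feed-forward network in PyTorch. The feature layout, layer sizes, and training loop are assumptions for illustration, not the architecture reported in [2].

```python
import torch
import torch.nn as nn

class FTEstimator(nn.Module):
    """Small MLP: in-finger visual features -> 6D force/torque (assumed layout)."""

    def __init__(self, feature_dim: int = 7, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 6),        # [Fx, Fy, Fz, Tx, Ty, Tz]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Example: marker pose encoded as position + quaternion (7 values) per frame.
model = FTEstimator(feature_dim=7)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

features = torch.randn(256, 7)   # placeholder features from the in-finger camera
targets = torch.randn(256, 6)    # placeholder force/torque labels

for _ in range(10):               # a few illustrative training steps
    optimizer.zero_grad()
    loss = loss_fn(model(features), targets)
    loss.backward()
    optimizer.step()
```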

We submitted a journal article to the International Journal of Robotics Research [3] in 2023, presenting the soft fingertip as a Soft Polyhedral Network with embedded vision for physical interactions, capable of adaptive kinesthesia and viscoelastic proprioception by learning kinetic features. This design enables passive adaptation to omni-directional interactions, visually captured by a miniature high-speed motion tracking system embedded inside for proprioceptive learning. The results show that the soft network can infer real-time 6D forces and torques with accuracies of 0.25/0.24/0.35 N and 0.025/0.034/0.006 Nm in dynamic interactions. We also incorporate viscoelasticity into proprioception during static adaptation by adding a creep-and-relaxation modifier that refines the predicted results. The proposed soft network combines simplicity in design, omni-adaptation, and proprioceptive sensing with high accuracy, making it a versatile, low-cost solution for robotics, with more than one million use cycles, for tasks such as sensitive and competitive grasping and touch-based geometry reconstruction. This study offers new insights into vision-based proprioception for soft robots in adaptive grasping, soft manipulation, and human-robot interaction.
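The creep-and-relaxation refinement can be pictured as a time-dependent correction applied on top of the network's instantaneous prediction. The snippet below is a minimal sketch assuming a single-exponential stress-relaxation form inspired by a standard linear solid; the constants and the exact functional form used in [3] may differ.

```python
import numpy as np

def relaxation_modifier(f_pred, t_hold, f_inf_ratio=0.8, tau=2.0):
    """Refine a static force prediction with an exponential relaxation term.

    f_pred       : instantaneous force predicted by the network (N)
    t_hold       : time the contact has been held statically (s)
    f_inf_ratio  : assumed ratio of fully relaxed to instantaneous force
    tau          : assumed relaxation time constant (s)
    """
    f_inf = f_inf_ratio * f_pred
    return f_inf + (f_pred - f_inf) * np.exp(-t_hold / tau)

# During a static hold, the refined estimate decays from the instantaneous
# prediction toward its relaxed value.
for t in [0.0, 1.0, 5.0, 20.0]:
    print(t, relaxation_modifier(f_pred=1.0, t_hold=t))
```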

We submitted a journal article to Advanced Intelligent Systems [4] in 2023 to investigate the transferability of grasping knowledge from on-land to underwater via a vision-based soft robotic finger that learns 6D forces and torques (FT) using a Supervised Variational Autoencoder (SVAE). A high-framerate camera captures the whole-body deformations while the soft robotic finger interacts with physical objects on land and underwater. Results show that the trained SVAE model learned a series of latent representations of the soft mechanics transferable from ground to water, showing superior adaptation to changing environments compared with commercial FT sensors. Soft, delicate, and reactive grasping enabled by tactile intelligence enhances the gripper's underwater interaction with improved reliability and robustness at a much-reduced cost, paving the path for learning-based intelligent grasping to support fundamental scientific discoveries in environmental and ocean research.
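A Supervised Variational Autoencoder combines a reconstruction objective with a supervised head on the latent code, so the latent representation is shaped both by the deformation images and by the force/torque labels. The sketch below shows one plausible arrangement in PyTorch over flattened deformation features; the actual encoder/decoder in [4] operates on camera images, and the layer sizes and loss weights here are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVAE(nn.Module):
    """Supervised VAE: encode deformation features, reconstruct them,
    and regress 6D force/torque from the latent code (illustrative sizes)."""

    def __init__(self, input_dim=256, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, latent_dim)
        self.to_logvar = nn.Linear(128, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim))
        self.ft_head = nn.Linear(latent_dim, 6)   # supervised force/torque head

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.decoder(z), self.ft_head(z), mu, logvar

def svae_loss(x, ft_label, recon, ft_pred, mu, logvar, beta=1e-3, alpha=1.0):
    recon_loss = F.mse_loss(recon, x)                               # reconstruction
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())   # KL divergence
    supervised = F.mse_loss(ft_pred, ft_label)                      # FT supervision
    return recon_loss + beta * kl + alpha * supervised

model = SVAE()
x = torch.randn(32, 256)    # placeholder deformation features
ft = torch.randn(32, 6)     # placeholder force/torque labels
recon, ft_pred, mu, logvar = model(x)
loss = svae_loss(x, ft, recon, ft_pred, mu, logvar)
```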

We submitted a journal article to Biomimetics in 2023, presenting a novel soft finger design, inspired by human fingers, that integrates inner vision with kinesthetic sensing to estimate object pose. The soft finger has a flexible skeleton and skin that adapt to different objects, and the skeleton's deformations during interaction provide contact information captured in the images from the inner camera. The proposed framework is an end-to-end method that uses raw images from the soft finger to estimate the in-hand object pose. It consists of an encoder for kinesthetic information processing and an estimator for object pose and category. Tested on seven objects, the framework achieved pose errors of 2.02 mm and 11.34 degrees and a classification accuracy of 99.05%.
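The end-to-end framework can be viewed as a shared image encoder with two heads, one regressing the in-hand object pose and one classifying the object category. The PyTorch sketch below is a hedged illustration of that structure with an arbitrary small CNN; the layer sizes, image resolution, and pose parameterization are assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class PoseAndClassNet(nn.Module):
    """Shared CNN encoder with a pose-regression head and a category head
    (illustrative sizes; pose assumed to be [x, y, z, roll, pitch, yaw])."""

    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.pose_head = nn.Linear(32, 6)              # 6-DoF pose regression
        self.class_head = nn.Linear(32, num_classes)   # object category logits

    def forward(self, img):
        h = self.encoder(img)
        return self.pose_head(h), self.class_head(h)

net = PoseAndClassNet()
pose, logits = net(torch.randn(4, 1, 128, 128))  # placeholder in-finger images
```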

We submitted another conference paper to the 2024 IEEE International Conference on Robotics and Automation (ICRA 2024), presenting a new vision-based proprioceptive soft finger capable of estimating both shape and touch. The finger design, inherited and improved from the Fin Ray Effect, enhances adaptability in bending, twisting, and enveloping during interactions. We developed vision-based proprioceptive sensing to estimate shape deformation and touch position. The shape estimation approach is based on constrained geometric optimization, which treats the poses of ArUco markers, obtained by a monocular camera beneath the finger, as aggregated multi-handles (AMHs) that drive the deformation of the finger mesh. A data-driven learning model estimates the touch position from the markers' pose data, achieving reliable results with R2 scores of 0.9657, 0.9464, and 0.9406 along the x, y, and z directions. A further task on dynamic touch-path sensing also shows the robustness of the proposed method. The soft finger's superior proprioceptive sensing capability makes it well suited for precise and dexterous robotic manipulation tasks.
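Recovering the markers' poses from the in-finger camera is a standard fiducial-tracking step. The sketch below shows one way to do it with OpenCV's aruco module, assuming a calibrated camera and a known physical marker size; the intrinsics and marker length here are placeholders, the aruco API names differ slightly across OpenCV versions (this uses the pre-4.7 interface), and the constrained geometric optimization itself is not shown.

```python
import cv2
import numpy as np

# Assumed calibration: intrinsic matrix and distortion from a prior calibration.
camera_matrix = np.array([[600.0, 0.0, 320.0],
                          [0.0, 600.0, 240.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)
marker_length = 0.004  # marker side length in meters (assumed)

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

def marker_poses(frame):
    """Detect ArUco markers and return {id: (rvec, tvec)} in the camera frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
    poses = {}
    if ids is not None:
        rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
            corners, marker_length, camera_matrix, dist_coeffs)
        for i, marker_id in enumerate(ids.flatten()):
            poses[int(marker_id)] = (rvecs[i], tvecs[i])
    return poses
```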

Currently, we are working on a journal article on real-time shape estimation. The control of soft robots relies on accurate state estimation of complex soft deformations with infinite degrees of freedom. Compared with exteroceptive methods of state estimation, proprioceptive ones use embedded sensors such as optical fibers and strain sensors to reconstruct the three-dimensional shape; they are more robust and transferable to environmental change, but they suffer from complex fabrication and poor durability because the sensor and the delicate soft body are usually inseparably integrated. It remains a challenge to harvest both benefits. In this paper, we propose a novel approach for real-time state estimation of an omni-adaptive soft finger using a single in-finger monocular camera. The modularized design of the sensorized finger is shareable and easy to fabricate, and it is sustainable because the sensing camera is detachable and reusable. We describe a volumetric discretized model of the soft finger and use the geometric constraints captured by the camera to find the optimal estimate of the deformed shape. The approach is benchmarked against a motion tracking system with sparse markers and a haptic device with dense measurements, and both results show state-of-the-art accuracies. More importantly, the state estimation is robust in both on-land and underwater environments, as we demonstrate by using it for underwater object shape sensing.
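At its core, this kind of estimation can be posed as a least-squares problem: find nodal displacements of the discretized finger that stay smooth (or low-energy) while matching the handle displacements observed by the camera. The sketch below is a simplified illustration using a graph-Laplacian smoothness term and penalty-weighted observation constraints; the volumetric model and energy in our manuscript are more detailed, so treat this only as a conceptual example.

```python
import numpy as np

def estimate_displacements(num_nodes, edges, observed, w_obs=1e3):
    """Least-squares shape estimate on a discretized finger.

    num_nodes : number of mesh nodes
    edges     : list of (i, j) node pairs defining the graph Laplacian
    observed  : dict {node_index: (3,) observed displacement} from the camera
    Returns an (N, 3) displacement field minimizing
    sum_edges ||u_i - u_j||^2 + w_obs * sum_obs ||u_k - d_k||^2.
    """
    L = np.zeros((num_nodes, num_nodes))
    for i, j in edges:                       # graph Laplacian (smoothness prior)
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    A = L.copy()
    b = np.zeros((num_nodes, 3))
    for k, d in observed.items():            # camera-observed handle displacements
        A[k, k] += w_obs
        b[k] += w_obs * np.asarray(d)
    return np.linalg.solve(A + 1e-9 * np.eye(num_nodes), b)

# Tiny example: a 4-node chain with the base fixed and the tip pushed 2 mm in z;
# interior nodes interpolate smoothly between the two observations.
chain = [(0, 1), (1, 2), (2, 3)]
u = estimate_displacements(4, chain, {0: (0.0, 0.0, 0.0), 3: (0.0, 0.0, 2.0)})
print(u)
```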

[1] Linhan Yang, Xudong Han, Weijie Guo, Fang Wan, Jia Pan, and Chaoyang Song* (2021). “Learning-based Optoelectronically Innervated Tactile Finger for Rigid-Soft Interactive Grasping.” IEEE Robotics and Automation Letters, 6(2):3817-3824.
[2] Fang Wan, Xiaobo Liu, Ning Guo, Xudong Han, Feng Tian, and Chaoyang Song*. “Visual Learning Towards Soft Robot Force Control using a 3D Metamaterial with Differential Stiffness.” The 5th Conference on Robot Learning (CoRL), London, UK, 8-11 November 2021. PMLR 164:1269-1278.
[3] Xiaobo Liu#, Xudong Han#, Wei Hong, Fang Wan*, and Chaoyang Song*. “Proprioceptive Learning with Soft Polyhedral Networks.” The International Journal of Robotics Research. (Under Review)
[4] Ning Guo, Xudong Han, Xiaobo Liu, Shuqiao Zhong, Zhiyuan Zhou, Jian Lin, Jiansheng Dai, Fang Wan*, and Chaoyang Song*. “Autoencoding a Soft Touch to Learn Grasping from On-land to Underwater.” Advanced Intelligent Systems. (Accepted)