Tao Zhong

Hi! I am a second-year Ph.D. student at Princeton University, where I am supervised by Prof. Christine Allen-Blanchette. I received my B.A.Sc. in Engineering Science from the University of Toronto, under the supervision of Prof. Animesh Garg.

In the past, I have interned at Noah's Ark Lab working with Prof. Yang Wang on computer vision and domain generalization. I have also spent time working with Prof. Huihuan Qian at AIRS and CUHK(SZ), focusing on marine robotics.

Feel free to check out my CV or send me an e-mail if you want to connect.

Email | CV | Google Scholar | GitHub | LinkedIn

News

• Jan 2024: VDPG accepted by ICLR 2024.
• Mar 2023: I will be a Ph.D. student at Princeton University this coming fall. Excited to begin the new journey.
• Jan 2023: Fast-Grasp'D accepted by ICRA 2023.
• Sep 2022: Meta-DMoE accepted by NeurIPS 2022.

Publications

My primary research interests lie at the intersection of robotics and computer vision.

Geometric Algebra Grasp Diffusion for Dexterous Manipulators

Tao Zhong and Christine Allen-Blanchette

In Submission to the IEEE International Conference on Robotics and Automation (ICRA), 2025.

Presented at the Equivariant Robotics Workshop at IROS 2024

We propose a novel framework for dexterous grasp generation that leverages geometric algebra representations to enforce equivariance to SE(3) transformations. By encoding the SE(3) symmetry constraint directly into the architecture, our method improves data and parameter efficiency, while enabling robust grasp generation across diverse object poses. Additionally, we incorporate a differentiable physics-informed refinement layer, which ensures generated grasps are physically plausible and stable. Extensive experiments demonstrate the model's superior performance in generalization, stability, and adaptability compared to existing methods.

/ paper (coming soon) / project page / workshop paper

Adapting to Distribution Shift by Visual Domain Prompt Generation

Zhixiang Chi*, Li Gu*, Tao Zhong, Huan Liu, Yuanhao Yu, Konstantinos N Plataniotis, Yang Wang

Proceedings of the International Conference on Learning Representations (ICLR), 2024.

In this paper, we aim to adapt a model at test time using a small amount of unlabeled data to address distribution shifts. In this setting, extracting domain knowledge from a limited amount of data is challenging. To improve this process, it is crucial to utilize correlated information from pre-trained backbones and source domains. Previous studies fail to utilize recent foundation models with strong out-of-distribution generalization. Additionally, domain-centric designs are not favored in their works. Furthermore, they treat modelling the source domains and learning to adapt as independent processes, split into disjoint training stages. In this work, we propose an approach built on top of the pre-computed features of the foundation model. Specifically, we build a knowledge bank to learn the transferable knowledge from source domains. Conditioned on few-shot target data, we introduce a domain prompt generator to condense the knowledge bank into a domain-specific prompt. The domain prompt then directs the visual features towards a particular domain via a guidance module. Moreover, we propose a domain-aware contrastive loss and employ meta-learning to facilitate domain knowledge extraction. Extensive experiments are conducted to validate the domain knowledge extraction. The proposed method outperforms previous work significantly on 5 large-scale benchmarks including WILDS and DomainNet.

/ paper / arxiv / project page / code

Fast-Grasp'D: Dexterous Multi-finger Grasp Generation Through Differentiable Simulation

Dylan Turpin, Tao Zhong, Shutong Zhang, Guanglei Zhu, Eric Heiden, Miles Macklin, Stavros Tsogkas, Sven Dickinson, Animesh Garg

Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2023.

Multi-finger grasping relies on high-quality training data, which is hard to obtain: human data is hard to transfer, and synthetic data relies on simplifying assumptions that reduce grasp quality. By making grasp simulation differentiable, and contact dynamics amenable to gradient-based optimization, we accelerate the search for high-quality grasps with fewer limiting assumptions. We present Grasp'D-1M: a large-scale dataset for multi-finger robotic grasping, synthesized with Fast-Grasp'D, a novel differentiable grasping simulator. Grasp'D-1M contains one million training examples for three robotic hands (three-, four-, and five-fingered), each with multimodal visual inputs (RGB+depth+segmentation, available in mono and stereo). Grasp synthesis with Fast-Grasp'D is 10x faster than GraspIt! and 20x faster than the prior Grasp'D differentiable simulator. Generated grasps are more stable and contact-rich than GraspIt! grasps, regardless of the distance threshold used for contact generation. We validate the usefulness of our dataset by retraining an existing vision-based grasping pipeline on Grasp'D-1M and showing a dramatic increase in model performance, predicting grasps with 30% more contact, a 33% higher epsilon metric, and 35% lower simulated displacement.

/ paper / arxiv / project page

Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts

Tao Zhong*, Zhixiang Chi*, Li Gu*, Yang Wang, Yuanhao Yu, Jin Tang

Advances in Neural Information Processing Systems (NeurIPS), 2022.

In this paper, we tackle the problem of domain shift. Most existing methods perform training on multiple source domains using a single model, and the same trained model is used on all unseen target domains. Such solutions are sub-optimal as each target domain exhibits its own speciality, to which the model is not adapted. Furthermore, expecting single-model training to learn extensive knowledge from multiple source domains is counterintuitive. The model is more biased toward learning only domain-invariant features and may result in negative knowledge transfer. In this work, we propose a novel framework for unsupervised test-time adaptation, which is formulated as a knowledge distillation process to address domain shift. Specifically, we incorporate Mixture-of-Experts (MoE) as teachers, where each expert is trained separately on a different source domain to maximize its speciality. Given a test-time target domain, a small set of unlabeled data is sampled to query the knowledge from MoE. As the source domains are correlated with the target domains, a transformer-based aggregator then combines the domain knowledge by examining the interconnection among them. The output is treated as a supervision signal to adapt a student prediction network toward the target domain. We further employ meta-learning to enforce the aggregator to distill positive knowledge and the student network to achieve fast adaptation. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art and validate the effectiveness of each proposed component.

/ paper / arxiv / code

Education

Princeton University

Ph.D. in Mechanical and Aerospace Engineering (Robotics Track)

2023.08 - Present

cGPA: 4.0

University of Toronto

B.A.Sc. in Engineering Science (Major in Robotics Engineering) with High Honours

2018.09 - 2023.06

cGPA: 3.81

Dean's List: All terms

Last updated Oct 2024