I am a fifth-year PhD student at UNC Chapel Hill. I currently work in the MURGe-Lab, advised by Mohit Bansal. My research interests are in the areas of Deep Learning, Machine Learning, and Computer Vision. Recently, I have been particularly interested in multi-modal learning and efficient fine-tuning, where my goal is to train large models with limited resources and deploy them to benefit people's daily lives. Before joining MURGe-Lab, I also worked with Colin Raffel and Marc Niethammer.
I have also spent summers working as a research scientist intern at tech companies. In summer 2024, I interned at Google with Otilia Stretcu on VLM reasoning. In summer 2023, I interned at Meta with Abhimanyu Dubey, Filip Radenovic, and Abhishek Kadian on text-to-image generation. In summer 2022, I worked at Microsoft with Linjie Li, Kevin Lin, and Zhe Gan on VL model merging.
Nov 2024: "DAM" is accepted to WACV 2025.
Oct 2024: "SELMA" is accepted to NeurIPS 2024.
Oct 2024: A preprint of "Glider" is online.
Jan 2024: Two papers, "ECoFLaP" and "MC-SMoE", are accepted to ICLR 2024.
Oct 2023: "An Empirical Study of Multimodal Model Merging" is accepted to EMNLP Findings 2023.
Oct 2023: Preprints of "ECoFLaP" and "MC-SMoE" are online.
July 2023: "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" is accepted to ICCV 2023.
May 2023: Started a research internship at Meta.
April 2023: A preprint of "An Empirical Study of Multimodal Model Merging" is online.
Feb 2023: "Vision Transformers are Parameter-Efficient Audio-Visual Learners" is accepted to CVPR 2023.