Publications
Please also check Google Scholar for a comprehensive list.
2025
- [EMNLP] METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding. Conference on Empirical Methods in Natural Language Processing (EMNLP) Main, 2025
- [COLM] Supposedly Equivalent Facts That Aren’t? Entity Frequency in Pre-training Induces Asymmetry in LLMs. COLM, 2025
- [ACL] Multimodal pragmatic jailbreak on text-to-image models. ACL, 2025
- [WACV] Can Multimodal Large Language Models Truly Perform Multimodal In-Context Learning? IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025
2024
- [EMNLP] Visual question decomposition on multimodal large language models. Conference on Empirical Methods in Natural Language Processing (EMNLP) Findings, 2024
- [COLM] Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images. Conference on Language Modeling (COLM), 2024
- [SET LLM @ ICLR] Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? ICLR 2024 Workshop on Secure and Trustworthy Large Language Models, 2024
- [arXiv] PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model. arXiv preprint arXiv:2411.08212, 2024
2023
- [NeurIPS] Benchmarking robustness of adaptation methods on pre-trained vision-language models. Conference on Neural Information Processing Systems (NeurIPS), 2023
- [arXiv] A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models. arXiv preprint arXiv:2307.12980, 2023
2022
- arXiv