Publications
Selected papers are listed first. Equal contribution is marked with *.
Selected Publications
TimeCausality: Evaluating the Causal Ability in Time Dimension for Vision Language Models
Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering
MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition
Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation
Preprints & More
From Pixels to Concepts: Do Segmentation Models Understand What They Segment?
VideoVerse: How Far is Your T2V Generator from a World Model?
PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models
TIIF-Bench: How Does Your T2I Model Follow Your Instructions?
A Novel College Entrance Filling Recommendation Algorithm Based on Score Line Prediction and Multi-feature Fusion
E3ID: An Efficient End to End Person Search Model
A Continual Learning Paradigm for Non-differentiable Visual Programming Frameworks on Visual Reasoning Tasks
