Xiangyu Yue
Email: xyyue [at] ie.cuhk.edu.hk
             
                I am currently an Assistant Professor in the Department of Information Engineering at The Chinese University of Hong Kong, with the Multimedia Lab (MMLab).
                I received my Ph.D. in Electrical Engineering and Computer Science from the University of California, Berkeley, working with Prof. Alberto Sangiovanni-Vincentelli and Prof. Kurt Keutzer at Berkeley AI Research. I am broadly interested in areas including multi-modal learning, multi-modal LLMs, generative models, transfer learning, and AI for Science.
                 
          Prior to Berkeley, I received an M.S. degree from Stanford University and a B.S. degree from Nanjing University. I have spent time at Google Research, Google [x] Robotics, Baidu AI Research, and Tencent AI Lab. I received the prestigious Lotfi A. Zadeh Award for my research.

          *NEW* I have multiple fully funded Ph.D. / MPhil (2026) positions available all year round. Post-doc, RA, visiting student, and intern positions are available as well. Feel free to email me if you are interested. (Please also mention any other funding sources or support you have.)

          Google Scholar | LinkedIn | Twitter | DBLP
          
            
              
             | 
        
                 
               | 
              
                 
                  
                    OneLLM: One Framework to Align All Modalities with Language
                  
                    | 
            
                 
               | 
              
                 
                  
                    Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
                  
                    | 
            
                 
               | 
              
                 
                  
                    UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
                  
                    | 
            
                 
               | 
              
                 
                  
                    Meta-Transformer: A Unified Framework for Multimodal Learning
                  
                    | 
            
                 
               | 
              
                 
                  
                    LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
                  
                    | 
            
               
             | 
            
               
                
                  Space Engage: Collaborative Space Supervision for Contrastive-based Semi-Supervised Semantic Segmentation
                
                  | 
          
               
             | 
            
               
                
                  Beating Backdoor Attack at Its Own Game
                
                  | 
          
             
           | 
          
             
              
                Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
              
                | 
        
               
             | 
            
               
                
                  Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
                
                  | 
          
               
             | 
            
               
                
                  MLSeg: Image and Video Segmentation as Multi-Label Classification and Selected-Label Pixel Classification
                
                  | 
          
               
             | 
            
               
                
                  Conditional Synthetic Data Generation for Robust Machine Learning Applications with Limited Pandemic Data
                
                  | 
          
               
             | 
            
               
                
                  Self-Supervised Pretraining Improves Self-Supervised Pretraining
                
                  | 
          
               
             | 
            
               
                
                  Multi-source Few-shot Domain Adaptation
                
                  | 
          
               
             | 
            
               
                
                  Unsupervised Point Cloud Pre-Training via View-Point Occlusion, Completion
                
                  | 
          
               
             | 
            
               
                
                  On Ensemble Methods for Long-Tailed Recognition
                
                  | 
          
               
             | 
            
               
                
                  AugPrune: Robust Network Pruning via Augmented Data
                
                  | 
          
               
             | 
            
               
                
                  Scene-aware Learning Network for Radar Object Detection
                
                  | 
          
               
             | 
            
               
                
                  Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation
                
                  | 
          
               
             | 
            
               
                
                  Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources
                
                  | 
          
               
             | 
            
               
                
                  Emotional Semantics-preserved and Feature-aligned CycleGAN for Visual Emotion Adaptation
                
                  | 
          
               
             | 
            
               
                
                  A Review of Single-Source Deep Unsupervised Visual Domain Adaptation
                
                  | 
          
               
             | 
            
               
                
                  PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation
                
                  | 
          
               
             | 
            
               
                
                  Scenic: A Language for Scenario Specification and Scene Generation
                
                  | 
          
               
             | 
            
               
                
                  Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization without Accessing Target Domain Data
                
                  | 
          
               
             | 
            
               
                
                  Multi-source Domain Adaptation for Semantic Segmentation
                
                  | 
          
               
             | 
            
               
                
                  SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud
                
                  | 
          
               
             | 
            
               
                
                  Counterexample-guided Data Augmentation
                
                  | 
          
               
             | 
            
               
                
                  Shift: A Zero-flop, Zero-parameter Alternative to Spatial Convolutions
                
                  | 
          
               
             | 
            
               
                
                  A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving
                
                  | 
          
               
             | 
            
               
                
                  Formal Specification for Deep Neural Networks
                
                  | 
          
               
             | 
            
               
                
                  SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-time Road-object Segmentation from 3D LiDAR Point Cloud
                
                  |