2025-04-24

Title: Multimodal Large Language Models for Enhanced Traffic Safety: A Comprehensive Review and Future Trends

Title: Learning Energy-Based Generative Models via Potential Flow: A Variational Principle Approach to Probability Density Homotopy Matching

Title: PixelWeb: The First Web GUI Dataset with Pixel-Wise Labels

Title: Cross Paradigm Representation and Alignment Transformer for Image Deraining

Title: A Comprehensive Survey of Synthetic Tabular Data Generation

Title: Streetscape Analysis with Generative AI (SAGAI): Vision-Language Assessment and Mapping of Urban Scenes

Title: Beyond Anonymization: Object Scrubbing for Privacy-Preserving 2D and 3D Vision Tasks

Title: Unified Molecule Generation and Property Prediction

Title: Hyper-Transforming Latent Diffusion Models

Title: EHGCN: Hierarchical Euclidean-Hyperbolic Fusion via Motion-Aware GCN for Hybrid Event Stream Perception

Title: Dual-Camera All-in-Focus Neural Radiance Fields

Title: RouteWinFormer: A Route-Window Transformer for Middle-range Attention in Image Restoration

Title: Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning

Title: PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning

Title: V$^2$R-Bench: Holistically Evaluating LVLM Robustness to Fundamental Visual Variations

Title: Feature Mixing Approach for Detecting Intraoperative Adverse Events in Laparoscopic Roux-en-Y Gastric Bypass Surgery

Title: Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism

Title: Towards Explainable AI: Multi-Modal Transformer for Video-based Image Description Generation

Title: Process Reward Models That Think

Title: Evaluating Autoencoders for Parametric and Invertible Multidimensional Projections

Title: Exploring How LLMs Capture and Represent Domain-Specific Knowledge

Title: BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation

Title: DreamO: A Unified Framework for Image Customization

Title: Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light

Title: Procedural Dataset Generation for Zero-Shot Stereo Matching