2026-03-16

Title: From Garbage to Gold: A Data-Architectural Theory of Predictive Robustness

Title: Synthetic Data Generation for Brain-Computer Interfaces: Overview, Benchmarking, and Future Directions

Title: VQQA: An Agentic Approach for Video Evaluation and Quality Improvement

Title: Sinkhorn-Drifting Generative Models

Title: Unleashing Video Language Models for Fine-grained HRCT Report Generation

Title: CalliMaster: Mastering Page-level Chinese Calligraphy via Layout-guided Spatial Planning

Title: RAW-Domain Degradation Models for Realistic Smartphone Super-Resolution

Title: Naïve PAINE: Lightweight Text-to-Image Generation Improvement with Prompt Evaluation

Title: MemRoPE: Training-Free Infinite Video Generation via Evolving Memory Tokens

Title: Do You See What I Am Pointing At? Gesture-Based Egocentric Video Question Answering

Title: Spatial Reasoning is Not a Free Lunch: A Controlled Study on LLaVA

Title: Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages

Title: AccelAes: Accelerating Diffusion Transformers for Training-Free Aesthetic-Enhanced Image Generation

Title: DINOLight: Robust Ambient Light Normalization with Self-supervised Visual Prior Integration

Title: Maximizing Incremental Information Entropy for Contrastive Learning

Title: Feynman: Knowledge-Infused Diagramming Agent for Scalable Visual Designs

Title: Prompt-Driven Lightweight Foundation Model for Instance Segmentation-Based Fault Detection in Freight Trains

Title: Adaptive Diffusion Posterior Sampling for Data and Model Fusion of Complex Nonlinear Dynamical Systems

Title: RoboStereo: Dual-Tower 4D Embodied World Models for Unified Policy Optimization

Title: From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space

Title: VGGT-World: Transforming VGGT into an Autoregressive Geometry World Model

Title: Vision Verification Enhanced Fusion of VLMs for Efficient Visual Reasoning

Title: RSONet: Region-guided Selective Optimization Network for RGB-T Salient Object Detection

Title: Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity

Title: SciDesignBench: Benchmarking and Improving Language Models for Scientific Inverse Design

Title: MoKus: Leveraging Cross-Modal Knowledge Transfer for Knowledge-Aware Concept Customization

Title: SLICE: Semantic Latent Injection via Compartmentalized Embedding for Image Watermarking

Title: Empowering Semantic-Sensitive Underwater Image Enhancement with VLM

Title: Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

Title: OARS: Process-Aware Online Alignment for Generative Real-World Image Super-Resolution

Title: coDrawAgents: A Multi-Agent Dialogue Framework for Compositional Image Generation

Title: Composing Driving Worlds through Disentangled Control for Adversarial Scenario Generation

Title: Dependency-Aware Parallel Decoding via Attention for Diffusion LLMs

Title: A Closed-Form Solution for Debiasing Vision-Language Models with Utility Guarantees Across Modalities and Tasks

Title: SAW: Toward a Surgical Action World Model via Controllable and Scalable Video Generation

Title: ESPIRE: A Diagnostic Benchmark for Embodied Spatial Reasoning of Vision-Language Models

Title: 3DTCR: A Physics-Based Generative Framework for Vortex-Following 3D Reconstruction to Improve Tropical Cyclone Intensity Forecasting

Title: Topo-R1: Detecting Topological Anomalies via Vision-Language Models

Title: Reference-Free Image Quality Assessment for Virtual Try-On via Human Feedback

Title: GeoChemAD: Benchmarking Unsupervised Geochemical Anomaly Detection for Mineral Exploration

Title: Rooftop Wind Field Reconstruction Using Sparse Sensors: From Deterministic to Generative Learning Methods

Title: V-Bridge: Bridging Video Generative Priors to Versatile Few-shot Image Restoration

Title: FDeID-Toolbox: Face De-Identification Toolbox

Title: Towards Spatio-Temporal World Scene Graph Generation from Monocular Videos

Title: Visual-ERM: Reward Modeling for Visual Equivalence

Title: PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization