2025-05-23

Title: Generative AI for Autonomous Driving: A Review

Title: SCENIR: Visual Semantic Clarity through Unsupervised Scene Graph Retrieval

Title: Satellites Reveal Mobility: A Commuting Origin-destination Flow Generator for Global Cities

Title: Challenger: Affordable Adversarial Driving Video Generation

Title: Is (Selective) Round-To-Nearest Quantization All You Need?

Title: MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding

Title: VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance

Title: Super-Resolution with Structured Motion

Title: Position: Agentic Systems Constitute a Key Component of Next-Generation Intelligent Image Processing

Title: Toward Theoretical Insights into Diffusion Trajectory Distillation via Operator Merging

Title: CP-LLM: Context and Pixel Aware Large Language Model for Video Quality Assessment

Title: Learning better representations for crowded pedestrians in offboard LiDAR-camera 3D tracking-by-detection

Title: An Exploratory Approach Towards Investigating and Explaining Vision Transformer and Transfer Learning for Brain Disease Detection

Title: Few-Shot Test-Time Optimization Without Retraining for Semiconductor Recipe Generation and Beyond

Title: Bidirectional Variational Autoencoders

Title: A Survey of Large Language Models for Text-Guided Molecular Discovery: from Molecule Generation to Optimization

Title: Tools in the Loop: Quantifying Uncertainty of LLM Question Answering Systems That Use Tools

Title: Scalable Graph Generative Modeling via Substructure Sequences

Title: Breaking Complexity Barriers: High-Resolution Image Restoration with Rank Enhanced Linear Attention

Title: Deep Learning-Driven Ultra-High-Definition Image Restoration: A Survey

Title: Erased or Dormant? Rethinking Concept Erasure Through Reversibility

Title: Understanding Generative AI Capabilities in Everyday Image Editing Tasks

Title: DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution

Title: Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models

Title: Paired and Unpaired Image to Image Translation using Generative Adversarial Networks

Title: NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

Title: SuperPure: Efficient Purification of Localized and Distributed Adversarial Patches via Super-Resolution GAN Models

Title: TensorAR: Refinement is All You Need in Autoregressive Image Generation

Title: ChemMLLM: Chemical Multimodal Large Language Model

Title: FPQVAR: Floating Point Quantization for Visual Autoregressive Model with FPGA Hardware Co-design

Title: A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules

Title: AdvReal: Adversarial Patch Generation Framework with Application to Adversarial Safety Evaluation of Object Detection Systems

Title: Pose-invariant face recognition via feature-space pose frontalization

Title: Joint Flow And Feature Refinement Using Attention For Video Restoration

Title: MAGIC: Motion-Aware Generative Inference via Confidence-Guided LLM

Title: Consistent World Models via Foresight Diffusion

Title: Clear Nights Ahead: Towards Multi-Weather Nighttime Image Restoration

Title: Neighbour-Driven Gaussian Process Variational Autoencoders for Scalable Structured Latent Modelling

Title: InspectionV3: Enhancing Tobacco Quality Assessment with Deep Convolutional Neural Networks for Automated Workshop Management

Title: ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation

Title: Beyond Face Swapping: A Diffusion-Based Digital Human Benchmark for Multimodal Deepfake Detection

Title: Joint Relational Database Generation via Graph-Conditional Diffusion Models

Title: HOFT: Householder Orthogonal Fine-tuning

Title: SHaDe: Compact and Consistent Dynamic 3D Reconstruction via Tri-Plane Deformation and Latent Diffusion

Title: Incremental Sequence Classification with Temporal Consistency

Title: M2SVid: End-to-End Inpainting and Refinement for Monocular-to-Stereo Video Conversion

Title: Temporal Object Captioning for Street Scene Videos from LiDAR Tracks

Title: MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation

Title: CausalDynamics: A large-scale benchmark for structural discovery of dynamical causal models

Title: Grounding Chest X-Ray Visual Question Answering with Generated Radiology Reports

Title: On the Out-of-Distribution Generalization of Self-Supervised Learning

Title: Semantic Compression of 3D Objects for Open and Collaborative Virtual Worlds

Title: One-Step Diffusion-Based Image Compression with Semantic Distillation

Title: KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models

Title: Masked Conditioning for Deep Generative Models

Title: Forward-only Diffusion Probabilistic Models

Title: Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning

Title: Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation

Title: Learning Flexible Forward Trajectories for Masked Molecular Diffusion

Title: Cohort-Based Active Modality Acquisition

Title: REPA Works Until It Doesn't: Early-Stopped, Holistic Alignment Supercharges Diffusion Training

Title: V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation

Title: A modular framework for automated evaluation of procedural content generation in serious games with deep reinforcement learning agents

Title: Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining

Title: Perceptual Quality Assessment for Embodied AI

Title: Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts

Title: LaViDa: A Large Diffusion Language Model for Multimodal Understanding

Title: Redefining Clustered Federated Learning for System Identification: The Path of ClusterCraft

Title: GCAL: Adapting Graph Models to Evolving Domain Shifts

Title: Conditional Panoramic Image Generation via Masked Autoregressive Modeling

Title: Training-Free Efficient Video Generation via Dynamic Token Carving

Title: DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?

Title: Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype

Title: MixAT: Combining Continuous and Discrete Adversarial Training for LLMs

Title: Bigger Isn't Always Memorizing: Early Stopping Overparameterized Diffusion Models

Title: Creatively Upscaling Images with Global-Regional Priors

Title: Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On

Title: Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

Title: Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space

Title: When Are Concepts Erased From Diffusion Models?

Title: Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO

Title: GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning