2025-08-11

Title: UnGuide: Learning to Forget with LoRA-Guided Diffusion Models

Title: MAISI-v2: Accelerated 3D High-Resolution Medical Image Synthesis with Rectified Flow and Region-specific Contrastive Loss

Title: HOLODECK 2.0: Vision-Language-Guided 3D World Generation with Editing

Title: A 3DGS-Diffusion Self-Supervised Framework for Normal Estimation from a Single Image

Title: Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

Title: Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal

Title: Optimizing Prompt Sequences using Monte Carlo Tree Search for LLM-Based Optimization

Title: Improved Sub-Visible Particle Classification in Flow Imaging Microscopy via Generative AI-Based Image Synthesis

Title: InstantEdit: Text-Guided Few-Step Image Editing with Piecewise Rectified Flow

Title: Fourier-VLM: Compressing Vision Tokens in the Frequency Domain for Large Vision-Language Models

Title: NEP: Autoregressive Image Editing via Next Editing Token Prediction

Title: VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning

Title: Towards MR-Based Trochleoplasty Planning

Title: DreamVE: Unified Instruction-based Image and Video Editing

Title: SwiftVideo: A Unified Framework for Few-Step Video Generation through Trajectory-Distribution Alignment

Title: Q-CLIP: Unleashing the Power of Vision-Language Models for Video Quality Assessment through Unified Cross-Modal Adaptation

Title: E-React: Towards Emotionally Controlled Synthesis of Human Reactions

Title: UGD-IML: A Unified Generative Diffusion-based Framework for Constrained and Unconstrained Image Manipulation Localization

Title: SC-Captioner: Improving Image Captioning with Self-Correction by Reinforcement Learning

Title: DiffCap: Diffusion-based Real-time Human Motion Capture using Sparse IMUs and a Monocular Camera

Title: Text-guided Visual Prompt DINO for Generic Segmentation

Title: Improving Diagnostic Accuracy for Oral Cancer with inpainting Synthesis Lesions Generated Using Diffusion Models

Title: An Interpretable Multi-Plane Fusion Framework With Kolmogorov-Arnold Network Guided Attention Enhancement for Alzheimer's Disease Diagnosis

Title: Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment

Title: Synthetic Data-Driven Multi-Architecture Framework for Automated Polyp Segmentation Through Integrated Detection and Mask Generation

Title: MA-CBP: A Criminal Behavior Prediction Framework Based on Multi-Agent Asynchronous Collaboration

Title: PA-HOI: A Physics-Aware Human and Object Interaction Dataset

Title: Towards Unified Image Deblurring using a Mixture-of-Experts Decoder

Title: Synthetic Data Generation and Differential Privacy using Tensor Networks' Matrix Product States (MPS)

Title: SIFThinker: Spatially-Aware Image Focus for Visual Reasoning

Title: OM2P: Offline Multi-Agent Mean-Flow Policy

Title: Can Diffusion Models Bridge the Domain Gap in Cardiac MR Imaging?

Title: Structural Equation-VAE: Disentangled Latent Representations for Tabular Data

Title: Aligning Effective Tokens with Video Anomaly in Large Language Models

Title: ActivityDiff: A diffusion model with Positive and Negative Activity Guidance for De Novo Drug Design

Title: End-to-End Text-to-SQL with Dataset Selection: Leveraging LLMs for Adaptive Query Generation

Title: FVGen: Accelerating Novel-View Synthesis with Adversarial Video Diffusion Distillation

Title: A Classification-Aware Super-Resolution Framework for Ship Targets in SAR Imagery

Title: SPARSE Data, Rich Results: Few-Shot Semi-Supervised Learning via Class-Conditioned Image Translation

Title: WGAST: Weakly-Supervised Generative Network for Daily 10 m Land Surface Temperature Estimation via Spatio-Temporal Fusion

Title: Effective Training Data Synthesis for Improving MLLM Chart Understanding

Title: LightSwitch: Multi-view Relighting with Material-guided Diffusion