2025-06-03

Title: On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning

Title: MOFGPT: Generative Design of Metal-Organic Frameworks using Language Models

Title: Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes

Title: Beyond Semantic Entropy: Boosting LLM Uncertainty Quantification with Pairwise Semantic Similarity

Title: Entropic Risk Optimization in Discounted MDPs: Sample Complexity Bounds with a Generative Model

Title: Improving Protein Sequence Design through Designability Preference Optimization

Title: Inference-Time Alignment of Diffusion Models with Evolutionary Algorithms

Title: Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework

Title: Latent Guidance in Diffusion Models for Perceptual Evaluations

Title: Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation

Title: iDPA: Instance Decoupled Prompt Attention for Incremental Medical Object Detection

Title: Latent Wavelet Diffusion: Enabling 4K Image Synthesis for Free

Title: RLAE: Reinforcement Learning-Assisted Ensemble for LLMs

Title: Comparing Traditional and Reinforcement-Learning Methods for Energy Storage Control

Title: Imputation of Missing Data in Smooth Pursuit Eye Movements Using a Self-Attention-based Deep Learning Approach

Title: ORAN-GUIDE: RAG-Driven Prompt Learning for LLM-Augmented Reinforcement Learning in O-RAN Network Slicing

Title: Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control

Title: SatDreamer360: Geometry Consistent Street-View Video Generation from Satellite Imagery

Title: ABCDEFGH: An Adaptation-Based Convolutional Neural Network-CycleGAN Disease-Courses Evolution Framework Using Generative Models in Health Education

Title: Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining

Title: Video Signature: In-generation Watermarking for Latent Video Diffusion Models

Title: Differential Privacy for Deep Learning in Medicine

Title: SafeTuneBed: A Toolkit for Benchmarking LLM Safety Alignment in Fine-Tuning

Title: Concept-Centric Token Interpretation for Vector-Quantized Generative Models

Title: RelDiff: Relational Data Generative Modeling with Graph-Based Diffusion Models

Title: ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary

Title: Manipulating 3D Molecules in a Fixed-Dimensional SE(3)-Equivariant Latent Space

Title: Aiding Medical Diagnosis through Image Synthesis and Classification

Title: HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models

Title: QuantFace: Low-Bit Post-Training Quantization for One-Step Diffusion Face Restoration

Title: SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers

Title: FourierFlow: Frequency-aware Flow Matching for Generative Turbulence Modeling

Title: Local Manifold Approximation and Projection for Manifold-Aware Diffusion Planning

Title: Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection

Title: State-Covering Trajectory Stitching for Diffusion Planners

Title: DS-VTON: High-Quality Virtual Try-on via Disentangled Dual-Scale Generation

Title: 3D Skeleton-Based Action Recognition: A Review

Title: Deformable registration and generative modelling of aortic anatomies by auto-decoders and neural ODEs

Title: TIGeR: Text-Instructed Generation and Refinement for Template-Free Hand-Object Interaction

Title: Camera Trajectory Generation: A Comprehensive Survey of Methods, Metrics, and Future Directions

Title: Quantization-based Bounds on the Wasserstein Metric

Title: IVY-FAKE: A Unified Explainable Framework and Benchmark for Image and Video AIGC Detection

Title: GOBench: Benchmarking Geometric Optics Generation and Understanding of MLLMs

Title: Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models

Title: Pseudo-Labeling Driven Refinement of Benchmark Object Detection Datasets via Analysis of Learning Patterns

Title: Self-supervised ControlNet with Spatio-Temporal Mamba for Real-world Video Super-resolution

Title: DeepVerse: 4D Autoregressive Video Generation as a World Model

Title: Reconsidering LLM Uncertainty Estimation Methods in the Wild

Title: Revolutionizing Radiology Workflow with Factual and Efficient CXR Report Generation

Title: Neuro-Symbolic Generative Diffusion Models for Physically Grounded, Robust, and Safe Generation

Title: FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation

Title: Earley-Driven Dynamic Pruning for Efficient Structured Decoding

Title: FORT: Forward-Only Regression Training of Normalizing Flows

Title: Bridging Quantum and Classical Computing in Drug Design: Architecture Principles for Improved Molecule Generation

Title: ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding

Title: Recent Developments in GNNs for Drug Discovery

Title: $Ψ$-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models

Title: Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation

Title: NoiseAR: AutoRegressing Initial Noise Prior for Diffusion Models

Title: A 2-Stage Model for Vehicle Class and Orientation Detection with Photo-Realistic Image Generation

Title: TimeGraph: Synthetic Benchmark Datasets for Robust Time-Series Causal Discovery

Title: Synthetic Data Augmentation using Pre-trained Diffusion Models for Long-tailed Food Image Classification

Title: Incentivizing LLMs to Self-Verify Their Answers

Title: PointT2I: LLM-based text-to-image generation via keypoints

Title: RadarSplat: Radar Gaussian Splatting for High-Fidelity Data Synthesis and 3D Reconstruction of Autonomous Driving Scenes

Title: Playing with Transformer at 30+ FPS via Next-Frame Diffusion

Title: NTIRE 2025 the 2nd Restore Any Image Model (RAIM) in the Wild Challenge

Title: Self-supervised Latent Space Optimization with Nebula Variational Coding

Title: DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing

Title: DiffuseSlide: Training-Free High Frame Rate Video Generation Diffusion

Title: Towards Scalable Video Anomaly Retrieval: A Synthetic Video-Text Benchmark

Title: Feature-aware Hypergraph Generation via Next-Scale Prediction

Title: Unlocking Aha Moments via Reinforcement Learning: Advancing Collaborative Visual Comprehension and Generation

Title: Automatic Stage Lighting Control: Is it a Rule-Driven Process or Generative Task?

Title: FDSG: Forecasting Dynamic Scene Graphs

Title: Efficiency without Compromise: CLIP-aided Text-to-Image GANs with Increased Diversity

Title: Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment

Title: Beyond Diagonal Covariance: Flexible Posterior VAEs via Free-Form Injective Flows

Title: G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models

Title: Adaptive Destruction Processes for Diffusion Samplers

Title: LongDWM: Cross-Granularity Distillation for Building a Long-Term Driving World Model

Title: HOSIG: Full-Body Human-Object-Scene Interaction Generation with Hierarchical Scene Perception

Title: EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models

Title: Minimal Impact ControlNet: Advancing Multi-ControlNet Integration

Title: VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking

Title: STORM: Benchmarking Visual Rating of MLLMs with a Comprehensive Ordinal Regression Dataset

Title: Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks

Title: Federated Gaussian Mixture Models

Title: Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability

Title: WorldExplorer: Towards Generating Fully Navigable 3D Scenes

Title: OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation

Title: GSCodec Studio: A Modular Framework for Gaussian Splat Compression

Title: SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Title: MoDA: Modulation Adapter for Fine-Grained Visual Grounding in Instructional MLLMs

Title: ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding

Title: SMOTE-DP: Improving Privacy-Utility Tradeoff with Synthetic Data

Title: Elucidating the representation of images within an unconditional diffusion model denoiser

Title: TaxaDiffusion: Progressively Trained Diffusion Model for Fine-Grained Species Generation

Title: Low-Rank Head Avatar Personalization with Registers

Title: Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control

Title: IMAGHarmony: Controllable Image Editing with Consistent Object Quantity and Layout

Title: Dual-Process Image Generation