2025-03-18

Title: A Survey of Direct Preference Optimization

Title: Fine-Tuning Diffusion Generative Models via Rich Preference Optimization

Title: SPECTra: Scalable Multi-Agent Reinforcement Learning with Permutation-Free Networks

Title: BACE-RUL: A Bi-directional Adversarial Network with Covariate Encoding for Machine Remaining Useful Life Prediction

Title: ECLARE: Efficient cross-planar learning for anisotropic resolution enhancement

Title: StyleMorpheus: A Style-Based 3D-Aware Morphable Face Model

Title: Towards a Unified Copernicus Foundation Model for Earth Vision

Title: Spatio-temporal Fourier Transformer (StFT) for Long-term Dynamics Prediction

Title: Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities

Title: Generating a Biometrically Unique and Realistic Iris Database

Title: CHOrD: Generation of Collision-Free, House-Scale, and Organized Digital Twins for 3D Indoor Scenes with Controllable Floor Plans and Optimal Layouts

Title: DecompDreamer: Advancing Structured 3D Asset Generation with Multi-Object Decomposition and Gaussian Splatting

Title: QDM: Quadtree-Based Region-Adaptive Sparse Diffusion Models for Efficient Image Super-Resolution

Title: Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art

Title: SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering

Title: Tailor: An Integrated Text-Driven CG-Ready Human and Garment Generation System

Title: Robust Dataset Distillation by Matching Adversarial Trajectories

Title: E-SAM: Training-Free Segment Every Entity Model

Title: A Speech-to-Video Synthesis Approach Using Spatio-Temporal Diffusion for Vocal Tract MRI

Title: Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Title: DiffGAP: A Lightweight Diffusion Module in Contrastive Space for Bridging Cross-Model Gap

Title: Probabilistic Graph Circuits: Deep Generative Models for Tractable Probabilistic Inference over Graphs

Title: SEAL: Semantic Aware Image Watermarking

Title: LAPIG: Language Guided Projector Image Generation with Surface Adaptation and Stylization

Title: STAY Diffusion: Styled Layout Diffusion Model for Diverse Layout-to-Image Generation

Title: Cross-Modal Diffusion for Biomechanical Dynamical Systems Through Local Manifold Alignment

Title: Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection

Title: Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs

Title: ResLPR: A LiDAR Data Restoration Network and Benchmark for Robust Place Recognition Against Weather Corruptions

Title: Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation

Title: VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting

Title: Pathology Image Restoration via Mixture of Prompts

Title: MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification

Title: LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

Title: DPF-Net: Physical Imaging Model Embedded Data-Driven Underwater Image Enhancement

Title: Diffusion-based Synthetic Data Generation for Visible-Infrared Person Re-Identification

Title: BS-Mamba for Black-Soil Area Detection On the Qinghai-Tibetan Plateau

Title: Segment Any-Quality Images with Generative Latent Space Enhancement

Title: EditID: Training-Free Editable ID Customization for Text-to-Image Generation

Title: Towards Suturing World Models: Learning Predictive Models for Robotic Surgical Tasks

Title: SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs

Title: Debiasing Diffusion Model: Enhancing Fairness through Latent Representation Learning in Stable Diffusion Model

Title: Diffusion on Graph: Augmentation of Graph Structure for Node Classification

Title: GAN-Based Single-Stage Defense for Traffic Sign Classification Under Adversarial Patch Attack

Title: Personalize Anything for Free with Diffusion Transformer

Title: SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models

Title: Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Title: LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization

Title: UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing

Title: Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding

Title: Can LLMs Formally Reason as Abstract Interpreters for Program Analysis?

Title: MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization

Title: GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching

Title: TinySQL: A Progressive Text-to-SQL Dataset for Mechanistic Interpretability Research

Title: VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis

Title: A Survey on Human Interaction Motion Generation

Title: Decouple to Reconstruct: High Quality UHD Restoration via Active Feature Disentanglement and Reversible Fusion

Title: TransDiff: Diffusion-Based Method for Manipulating Transparent Objects Using a Single RGB-D Image

Title: Improving Generalization of Universal Adversarial Perturbation via Dynamic Maximin Optimization

Title: A Reinforcement Learning-Driven Transformer GAN for Molecular Generation

Title: From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration

Title: PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior

Title: DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Mode

Title: GuideDog: A Real-World Egocentric Multimodal Dataset for Blind and Low-Vision Accessibility-Aware Guidance

Title: UniReg: Foundation Model for Controllable Medical Image Registration

Title: DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models

Title: MMLNB: Multi-Modal Learning for Neuroblastoma Subtyping Classification Assisted with Textual Description Generation

Title: AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction

Title: Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs

Title: L2HCount: Generalizing Crowd Counting from Low to High Crowd Density via Density Simulation

Title: Frame-wise Conditioning Adaptation for Fine-Tuning Diffusion Models in Text-to-Video Prediction

Title: Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait

Title: Optimal Denoising in Score-Based Generative Models: The Role of Data Regularity

Title: Action tube generation by person query matching for spatio-temporal action detection

Title: Exploring 3D Activity Reasoning and Planning: From Implicit Human Intentions to Route-Aware Planning

Title: Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization

Title: TFDM: Time-Variant Frequency-Based Point Cloud Diffusion with Mamba

Title: Do Vision Models Develop Human-Like Progressive Difficulty Understanding?

Title: Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation

Title: DehazeMamba: SAR-guided Optical Remote Sensing Image Dehazing with Adaptive State Space Model

Title: Rethinking Image Evaluation in Super-Resolution

Title: DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Title: 3D Human Interaction Generation: A Survey

Title: ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation

Title: Patient-specific radiomic feature selection with reconstructed healthy persona of knee MR images

Title: From Zero to Detail: Deconstructing Ultra-High-Definition Image Restoration from Progressive Spectral Perspective

Title: A super-resolution reconstruction method for lightweight building images based on an expanding feature modulation network

Title: Triad: Empowering LMM-based Anomaly Detection with Vision Expert-guided Visual Tokenizer and Manufacturing Process

Title: MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis

Title: MAME: Multidimensional Adaptive Metamer Exploration with Human Perceptual Feedback

Title: HoloGest: Decoupled Diffusion and Motion Priors for Generating Holisticly Expressive Co-speech Gestures

Title: Don't Judge Before You CLIP: A Unified Approach for Perceptual Tasks

Title: Graph Generative Models Evaluation with Masked Autoencoder

Title: Generative Gaussian Splatting: Generating 3D Scenes with Video Diffusion Priors

Title: $ϕ$-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation

Title: Progressive Human Motion Generation Based on Text and Few Motion Frames

Title: Agents Play Thousands of 3D Video Games

Title: One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation

Title: MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

Title: Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation

Title: BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing

Title: WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes

Title: Unified Autoregressive Visual Generation and Understanding with Continuous Tokens

Title: Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images