2025-03-11

Title: What I cannot execute, I do not understand: Training and Evaluating LLMs on Program Execution Traces

Title: Evaluation of Missing Data Imputation for Time Series Without Ground Truth

Title: Slim attention: cut your context memory in half without loss of accuracy -- K-cache is all you need for MHA

Title: Zero-shot Medical Event Prediction Using a Generative Pre-trained Transformer on Electronic Health Records

Title: A Survey on Tabular Data Generation: Utility, Alignment, Fidelity, Privacy, and Beyond

Title: Validating LLM-as-a-Judge Systems in the Absence of Gold Labels

Title: Generative Multi-Agent Q-Learning for Policy Optimization: Decentralized Wireless Networks

Title: MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

Title: Learning-Order Autoregressive Models with Application to Molecular Graph Generation

Title: Integrating Frequency-Domain Representations with Low-Rank Adaptation in Vision-Language Models

Title: Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity

Title: DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation

Title: Attention-Based Synthetic Data Generation for Calibration-Enhanced Survival Analysis: A Case Study for Chronic Kidney Disease Using Electronic Health Records

Title: Unlocking Pretrained LLMs for Motion-Related Multimodal Generation: A Fine-Tuning Approach to Unify Diffusion and Next-Token Prediction

Title: Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Flexible and Effective Paradigm

Title: USP: Unified Self-Supervised Pretraining for Image Generation and Understanding

Title: X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation

Title: GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation

Title: Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model

Title: VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models

Title: BioMoDiffuse: Physics-Guided Biomechanical Diffusion for Controllable and Authentic Human Motion Synthesis

Title: ROCM: RLHF on consistency models

Title: Removing Multiple Hybrid Adverse Weather in Video via a Unified Model

Title: Explainable Synthetic Image Detection through Diffusion Timestep Ensembling

Title: GraphGen+: Advancing Distributed Subgraph Generation and Graph Learning On Industrial Graphs

Title: WaveStitch: Flexible and Fast Conditional Time Series Generation with Diffusion Models

Title: Single Domain Generalization with Adversarial Memory

Title: Text2Story: Advancing Video Storytelling with Text Guidance

Title: Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation

Title: Learning to Unlearn while Retaining: Combating Gradient Conflicts in Machine Unlearning

Title: Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning

Title: GIN-Graph: A Generative Interpretation Network for Model-Level Explanation of Graph Neural Networks

Title: Generative Video Bi-flow

Title: EPR-GAIL: An EPR-Enhanced Hierarchical Imitation Learning Framework to Simulate Complex User Consumption Behaviors

Title: Removing Averaging: Personalized Lip-Sync Driven Characters Based on Identity Adapter

Title: Consistent Image Layout Editing with Diffusion Models

Title: Federated Learning for Diffusion Models

Title: Pre-Training Meta-Rule Selection Policy for Visual Generative Abductive Learning

Title: CtrTab: Tabular Data Synthesis with High-Dimensional and Limited Data

Title: Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning

Title: SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts

Title: A Mesh Is Worth 512 Numbers: Spectral-domain Diffusion Modeling for High-dimension Shape Generation

Title: ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis

Title: DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability

Title: Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation

Title: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation

Title: One-Step Diffusion Model for Image Motion-Deblurring

Title: ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy

Title: QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation

Title: Generative modelling with jump-diffusions

Title: TR-DQ: Time-Rotation Diffusion Quantization

Title: Future-Aware Interaction Network For Motion Forecasting

Title: Human Cognition Inspired RAG with Knowledge Graph for Complex Problem Solving

Title: Conceptrol: Concept Control of Zero-shot Personalized Image Generation

Title: Pixel to Gaussian: Ultra-Fast Continuous Super-Resolution with 2D Gaussian Modeling

Title: Synthetic Data Generation for Minimum-Exposure Navigation in a Time-Varying Environment using Generative AI Models

Title: Dynamic Updates for Language Adaptation in Visual-Language Tracking

Title: Chameleon: On the Scene Diversity and Domain Variety of AI-Generated Videos Detection

Title: Towards More Accurate Personalized Image Generation: Addressing Overfitting and Evaluation Bias

Title: Adding Additional Control to One-Step Diffusion with Joint Distribution Matching

Title: AxisPose: Model-Free Matching-Free Single-Shot 6D Object Pose Estimation via Axis Generation

Title: Emulating Self-attention with Convolution for Efficient Image Super-Resolution

Title: Learning Few-Step Diffusion Models by Trajectory Distribution Matching

Title: REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints

Title: PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation

Title: UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion

Title: Unsupervised Multi-Clustering and Decision-Making Strategies for 4D-STEM Orientation Mapping

Title: Color Alignment in Diffusion

Title: DiffAtlas: GenAI-fying Atlas Segmentation via Image-Mask Diffusion

Title: Primal-Dual Sample Complexity Bounds for Constrained Markov Decision Processes with Multiple Constraints

Title: SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation

Title: GenDR: Lightning Generative Detail Restorator

Title: VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation

Title: GUIDE-CoT: Goal-driven and User-Informed Dynamic Estimation for Pedestrian Trajectory using Chain-of-Thought

Title: AttFC: Attention Fully-Connected Layer for Large-Scale Face Recognition with One GPU

Title: From Image- to Pixel-level: Label-efficient Hyperspectral Image Reconstruction

Title: Towards Generalization of Tactile Image Generation: Reference-Free Evaluation in a Leakage-Free Setting

Title: ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration

Title: Text-to-Image Diffusion Models Cannot Count, and Prompt Refinement Cannot Help

Title: CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution

Title: HiSTF Mamba: Hierarchical Spatiotemporal Fusion with Multi-Granular Body-Spatial Modeling for High-Fidelity Text-to-Motion Generation

Title: DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation

Title: From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Title: Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping

Title: Motion Anything: Any to Motion Generation

Title: LatexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending

Title: A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis

Title: Learning Decision Trees as Amortized Structure Inference

Title: ConcreTizer: Model Inversion Attack via Occupancy Classification and Dispersion Control for 3D Point Cloud Restoration

Title: NukesFormers: Unpaired Hyperspectral Image Generation with Non-Uniform Domain Alignment

Title: EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer

Title: Learning a Unified Degradation-aware Representation Model for Multi-modal Image Fusion

Title: Recovering Partially Corrupted Major Objects through Tri-modality Based Image Completion

Title: TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation

Title: Generative method for aerodynamic optimization based on classifier-free guided denoising diffusion probabilistic model

Title: Breaking the Limits of Quantization-Aware Defenses: QADT-R for Robustness Against Patch-Based Adversarial Attacks in QNNs

Title: NFIG: Autoregressive Image Generation with Next-Frequency Prediction

Title: Exposure Bias Reduction for Enhancing Diffusion Transformer Feature Caching

Title: Controllable 3D Outdoor Scene Generation via Scene Graphs

Title: Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms

Title: Effective and Efficient Masked Image Generation Models

Title: Synthetic Lung X-ray Generation through Cross-Attention and Affinity Transformation

Title: Boosting Diffusion-Based Text Image Super-Resolution Model Towards Generalized Real-World Scenarios

Title: WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Title: Automated Movie Generation via Multi-Agent CoT Planning

Title: Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment

Title: TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models

Title: PersonaBooth: Personalized Text-to-Motion Generation

Title: SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models

Title: TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision

Title: AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion

Title: Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration

Title: VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models

Title: V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation

Title: LBM: Latent Bridge Matching for Fast Image-to-Image Translation

Title: Inductive Moment Matching

Title: Denoising Score Distillation: From Noisy Diffusion Pretraining to One-Step High-Quality Generation

Title: Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning

Title: HumanMM: Global Human Motion Recovery from Multi-shot Videos

Title: VACE: All-in-One Video Creation and Editing

Title: DreamRelation: Relation-Centric Video Customization

Title: VoD: Learning Volume of Differences for Video-Based Deepfake Detection