2025-03-25

Title: State Fourier Diffusion Language Model (SFDLM): A Scalable, Novel Iterative Approach to Language Modeling

Title: IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes

Title: Generative Modeling of Class Probability for Multi-Modal Representation Learning

Title: V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms

Title: LEMMA: Learning from Errors for MatheMatical Advancement in LLMs

Title: Bayesian generative models can flag performance loss, bias, and out-of-distribution image content

Title: What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models

Title: Towards Understanding the Benefits of Neural Network Parameterizations in Geophysical Inversions: A Study With Neural Fields

Title: DermDiff: Generative Diffusion Model for Mitigating Racial Biases in Dermatology Diagnosis

Title: Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks

Title: PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning

Title: Large Language Models Can Verbatim Reproduce Long Malicious Sequences

Title: Generating Realistic, Diverse, and Fault-Revealing Inputs with Latent Space Interpolation for Testing Deep Neural Networks

Title: On The Sample Complexity Bounds In Bilevel Reinforcement Learning

Title: Efficient Diffusion Training through Parallelization with Truncated Karhunen-Loève Expansion

Title: OMR-Diffusion: Optimizing Multi-Round Enhanced Training in Diffusion Models for Improved Intent Understanding

Title: 3D Modeling: Camera Movement Estimation and path Correction for SFM Model using the Combination of Modified A-SIFT and Stereo System

Title: TDRI: Two-Phase Dialogue Refinement and Co-Adaptation for Interactive Image Generation

Title: MultiScale Contextual Bandits for Long Term Objectives

Title: Towards Transformer-Based Aligned Generation with Self-Coherence Guidance

Title: Safe RLHF-V: Safe Reinforcement Learning from Human Feedback in Multimodal Large Language Models

Title: MotionDiff: Training-free Zero-shot Interactive Motion Editing via Flow-assisted Multi-view Diffusion

Title: MAMAT: 3D Mamba-Based Atmospheric Turbulence Removal and its Object Detection Capability

Title: CODA: Repurposing Continuous VAEs for Discrete Tokenization

Title: Renewable Energy Transition in South America: Predictive Analysis of Generation Capacity by 2050

Title: Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM

Title: Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models

Title: Fractal-IR: A Unified Framework for Efficient and Scalable Image Restoration

Title: A Causal Adjustment Module for Debiasing Scene Graph Generation

Title: Guided Diffusion for the Extension of Machine Vision to Human Visual Perception

Title: TransAnimate: Taming Layer Diffusion to Generate RGBA Video

Title: Cross-Domain Underwater Image Enhancement Guided by No-Reference Image Quality Assessment: A Transfer Learning Approach

Title: PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos

Title: Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook

Title: Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation

Title: Vehicular Road Crack Detection with Deep Learning: A New Online Benchmark for Comprehensive Evaluation of Existing Algorithms

Title: Unified Geometry and Color Compression Framework for Point Clouds via Generative Diffusion Priors

Title: Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization

Title: An Image-like Diffusion Method for Human-Object Interaction Detection

Title: TCFG: Tangential Damping Classifier-free Guidance

Title: AGIR: Assessing 3D Gait Impairment with Reasoning based on LLMs

Title: LocDiffusion: Identifying Locations on Earth by Diffusing in the Hilbert Space

Title: LongDiff: Training-Free Long Video Generation in One Go

Title: Decorum: A Language-Based Approach For Style-Conditioned Synthesis of Indoor 3D Scenes

Title: DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation

Title: Self-Attention Diffusion Models for Zero-Shot Biomedical Image Segmentation: Unlocking New Frontiers in Medical Imaging

Title: A Framework for Finding Local Saddle Points in Two-Player Zero-Sum Black-Box Games

Title: Decoupling Angles and Strength in Low-rank Adaptation

Title: DiffGED: Computing Graph Edit Distance via Diffusion-based Graph Matching

Title: CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI

Title: Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module

Title: Diff-Palm: Realistic Palmprint Generation with Polynomial Creases and Intra-Class Variation Controllable Diffusion Models

Title: Improved Rates of Differentially Private Nonconvex-Strongly-Concave Minimax Optimization

Title: Plug-and-Play Interpretable Responsible Text-to-Image Generation via Dual-Space Multi-facet Concept Control

Title: GranQ: Granular Zero-Shot Quantization with Unified Layer-Channel Awareness

Title: Human-Object Interaction with Vision-Language Model Guided Relative Movement Dynamics

Title: Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models

Title: Context-Enhanced Memory-Refined Transformer for Online Action Detection

Title: DiffusedWrinkles: A Diffusion-Based Model for Data-Driven Garment Animation

Title: Resource-Efficient Motion Control for Video Generation via Dynamic Mask Guidance

Title: Knowledge Graph Enhanced Generative Multi-modal Models for Class-Incremental Learning

Title: Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

Title: U-REPA: Aligning Diffusion U-Nets to ViTs

Title: Panorama Generation From NFoV Image Done Right

Title: Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation

Title: ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation

Title: Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

Title: InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Title: Hiding Images in Diffusion Models by Editing Learned Score Functions

Title: MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing

Title: PALATE: Peculiar Application of the Law of Total Expectation to Enhance the Evaluation of Deep Generative Models

Title: MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse

Title: Global-Local Tree Search for Language Guided 3D Scene Generation

Title: Can Text-to-Video Generation help Video-Language Alignment?

Title: Uncertainty-guided Perturbation for Image Super-Resolution Diffusion Model

Title: RLCAD: Reinforcement Learning Training Gym for Revolution Involved CAD Command Sequence Generation

Title: EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation

Title: AMD-Hummingbird: Towards an Efficient Text-to-Video Model

Title: Anchor-based oversampling for imbalanced tabular data via contrastive and adversarial learning

Title: Adapting Video Diffusion Models for Time-Lapse Microscopy

Title: Adventurer: Exploration with BiGAN for Deep Reinforcement Learning

Title: Generative Dataset Distillation using Min-Max Diffusion Model

Title: Dig2DIG: Dig into Diffusion Information Gains for Image Fusion

Title: Leveraging Land Cover Priors for Isoprene Emission Super-Resolution

Title: Human Motion Unlearning

Title: NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

Title: Benchmarking Burst Super-Resolution for Polarization Images: Noise Dataset and Analysis

Title: GS-Marker: Generalizable and Robust Watermarking for 3D Gaussian Splatting

Title: Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings

Title: Simulation-Driven Balancing of Competitive Game Levels with Reinforcement Learning

Title: 3DSwapping: Texture Swapping For 3D Object From Single Reference Image

Title: Reasoning to Learn from Latent Thoughts

Title: A semantic communication-based workload-adjustable transceiver for wireless AI-generated content (AIGC) delivery

Title: CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models

Title: Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models

Title: Training-free Diffusion Acceleration with Bottleneck Sampling

Title: Video-T1: Test-Time Scaling for Video Generation

Title: Aether: Geometric-Aware Unified World Modeling

Title: Equivariant Image Modeling