2025-05-30

Title: HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer

Title: Rhetorical Text-to-Image Generation via Two-layer Diffusion Policy Optimization

Title: Preference Learning with Response Time

Title: PGLearn -- An Open-Source Learning Toolkit for Optimal Power Flow

Title: Kernel-Smoothed Scores for Denoising Diffusion: A Bias-Variance Study

Title: RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation

Title: CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting

Title: Scaling Offline RL via Efficient and Expressive Shortcut Models

Title: CFP-Gen: Combinatorial Functional Protein Generation via Diffusion Language Models

Title: Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape

Title: Leveraging Diffusion Models for Synthetic Data Augmentation in Protein Subcellular Localization Classification

Title: ATI: Any Trajectory Instruction for Controllable Video Generation

Title: Directed Graph Grammars for Sequence-based Learning

Title: MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming

Title: EquiReg: Equivariance Regularized Diffusion for Inverse Problems

Title: Toward Memory-Aided World Models: Benchmarking via Spatial Consistency

Title: HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions

Title: MOVi: Training-free Text-conditioned Multi-Object Video Generation

Title: EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

Title: SeG-SR: Integrating Semantic Knowledge into Remote Sensing Image Super-Resolution via Vision-Language Model

Title: $K^2$VAE: A Koopman-Kalman Enhanced Variational AutoEncoder for Probabilistic Time Series Forecasting

Title: Are Unified Vision-Language Models Necessary: Generalization Across Understanding and Generation

Title: Zero-P-to-3: Zero-Shot Partial-View Images to 3D Object

Title: DINGO: Constrained Inference for Diffusion LLMs

Title: URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration

Title: GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion

Title: Diffusion-Based Generative Models for 3D Occupancy Prediction in Autonomous Driving

Title: TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance

Title: MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation

Title: HMAD: Advancing E2E Driving with Anchored Offset Proposals and Simulation-Supervised Multi-target Scoring

Title: Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing

Title: VERINA: Benchmarking Verifiable Code Generation

Title: Implicit Inversion turns CLIP into a Decoder

Title: RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer

Title: Proximal Algorithm Unrolling: Flexible and Efficient Reconstruction Networks for Single-Pixel Imaging

Title: HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image

Title: Fooling the Watchers: Breaking AIGC Detectors via Semantic Prompt Attacks

Title: HyperPointFormer: Multimodal Fusion in 3D Space with Dual-Branch Cross-Attention Transformers

Title: Generalizability vs. Counterfactual Explainability Trade-Off

Title: Advancing Image Super-resolution Techniques in Remote Sensing: A Comprehensive Survey

Title: UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes

Title: Image Aesthetic Reasoning: A New Benchmark for Medical Image Screening with MLLMs

Title: Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs

Title: LADA: Scalable Label-Specific CLIP Adapter for Continual Learning

Title: RSFAKE-1M: A Large-Scale Dataset for Detecting Diffusion-Generated Remote Sensing Forgeries

Title: GenCAD-Self-Repairing: Feasibility Enhancement for 3D CAD Generation

Title: Score-based Generative Modeling for Conditional Independence Testing

Title: TRACE: Trajectory-Constrained Concept Erasure in Diffusion Models

Title: Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis

Title: Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization

Title: Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering

Title: Beyond Optimal Transport: Model-Aligned Coupling for Flow Matching

Title: UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning

Title: Automated Modeling Method for Pathloss Model Discovery

Title: Video Editing for Audio-Visual Dubbing

Title: Bidirectional predictive coding

Title: CryoCCD: Conditional Cycle-consistent Diffusion with Biophysical Modeling for Cryo-EM Synthesis

Title: A Reverse Causal Framework to Mitigate Spurious Correlations for Debiasing Scene Graph Generation

Title: Diffusion Guidance Is a Controllable Policy Improvement Operator

Title: LAFR: Efficient Diffusion-based Blind Face Restoration via Latent Codebook Alignment Adapter

Title: VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation

Title: R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation

Title: Maximum Likelihood Learning of Latent Dynamics Without Reconstruction

Title: BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model

Title: LLM Performance for Code Generation on Noisy Tasks

Title: Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

Title: Inference-time Scaling of Diffusion Models through Classical Search

Title: MCP Safety Training: Learning to Refuse Falsely Benign MCP Exploits using Improved Preference Alignment

Title: VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models

Title: D-AR: Diffusion via Autoregressive Models

Title: OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation

Title: AMBER: Adaptive Mesh Generation by Iterative Mesh Resolution Prediction

Title: ImmunoDiff: A Diffusion Model for Immunotherapy Response Prediction in Lung Cancer

Title: VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos

Title: DiCoFlex: Model-agnostic diverse counterfactuals with flexible control

Title: PixelThink: Towards Efficient Chain-of-Pixel Reasoning

Title: How Animals Dance (When You're Not Looking)

Title: MAGREF: Masked Guidance for Any-Reference Video Generation

Title: DarkDiff: Advancing Low-Light Raw Enhancement by Retasking Diffusion Models for Camera ISP

Title: To Trust Or Not To Trust Your Vision-Language Model's Prediction

Title: LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers