2025-12-05

Title: ASCIIBench: Evaluating Language-Model-Based Understanding of Visually-Oriented Text

Title: Decoding Large Language Diffusion Models with Foreseeing Movement

Title: MechDetect: Detecting Data-Dependent Errors

Title: Beyond Flicker: Detecting Kinematic Inconsistencies for Generalizable Deepfake Video Detection

Title: MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis

Title: ReasonX: MLLM-Guided Intrinsic Image Decomposition

Title: ActVAE: Modelling human activity schedules with a deep conditional generative approach

Title: MVRoom: Controllable 3D Indoor Scene Generation with Multi-View Diffusion Models

Title: UniLight: A Unified Representation for Lighting

Title: Inference-time Stochastic Refinement of GRU-Normalizing Flow for Real-time Video Motion Transfer

Title: Plug-and-Play Image Restoration with Flow Matching: A Continuous Viewpoint

Title: Learning Single-Image Super-Resolution in the JPEG Compressed Domain

Title: Text-Only Training for Image Captioning with Retrieval Augmentation and Modality Gap Correction

Title: Bayes-DIC Net: Estimating Digital Image Correlation Uncertainty with Bayesian Neural Networks

Title: A Retrieval-Augmented Generation Approach to Extracting Algorithmic Logic from Neural Networks

Title: Open Set Face Forgery Detection via Dual-Level Evidence Collection

Title: Data-regularized Reinforcement Learning for Diffusion Models at Scale

Title: Distance Is All You Need: Radial Dispersion for Uncertainty Estimation in Large Language Models

Title: FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring

Title: Self-Paced and Self-Corrective Masked Prediction for Movie Trailer Generation

Title: MindDrive: An All-in-One Framework Bridging World Models and Vision-Language Model for End-to-End Autonomous Driving

Title: StreamEQA: Towards Streaming Video Understanding for Embodied Scenarios

Title: GuidNoise: Single-Pair Guided Diffusion for Generalized Noise Synthesis

Title: dVLM-AD: Enhance Diffusion Vision-Language-Model for Driving via Controllable Reasoning

Title: UniTS: Unified Time Series Generative Model for Remote Sensing

Title: GraphBench: Next-generation graph learning benchmarking

Title: DeRA: Decoupled Representation Alignment for Video Tokenization

Title: Not All Birds Look The Same: Identity-Preserving Generation For Birds

Title: Controllable Long-term Motion Generation with Extended Joint Targets

Title: Back to Basics: Motion Representation Matters for Human Motion Generation Using Diffusion Model

Title: UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers

Title: EgoLCD: Egocentric Video Generation with Long Context Diffusion

Title: VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory

Title: X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale

Title: VideoMem: Enhancing Ultra-Long Video Understanding via Adaptive Memory Management

Title: On the Limits of Test-Time Compute: Sequential Reward Filtering for Better Inference

Title: LeMat-GenBench: A Unified Evaluation Framework for Crystal Generative Models

Title: COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence

Title: Natural Language Actor-Critic: Scalable Off-Policy Learning in Language Space

Title: Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Title: Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

Title: TimesNet-Gen: Deep Learning-based Site Specific Strong Motion Generation

Title: OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution

Title: Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild

Title: RLHFSpec: Breaking the Efficiency Bottleneck in RLHF Training via Adaptive Drafting

Title: Order Matters: 3D Shape Generation from Sequential VR Sketches

Title: MemLoRA: Distilling Expert Adapters for On-Device Memory Systems

Title: PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

Title: LaFiTe: A Generative Latent Field for 3D Native Texturing

Title: EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

Title: LatentFM: A Latent Flow Matching Approach for Generative Medical Image Segmentation

Title: FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis

Title: Tokenizing Buildings: A Transformer for Layout Synthesis

Title: Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens

Title: Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing

Title: ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching

Title: Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion

Title: Balanced Few-Shot Episodic Learning for Accurate Retinal Disease Diagnosis

Title: Rethinking the Use of Vision Transformers for AI-Generated Image Detection

Title: Efficient Generative Transformer Operators For Million-Point PDEs

Title: Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models

Title: Reflection Removal through Efficient Adaptation of Diffusion Transformers

Title: Generative Neural Video Compression via Video Diffusion Prior

Title: Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual Encoding

Title: Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

Title: BulletTime: Decoupled Control of Time and Camera Pose for Video Generation

Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints

Title: OMTRA: A Multi-Task Generative Model for Structure-Based Drug Design

Title: Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression

Title: SA-IQA: Redefining Image Quality Assessment for Spatial Aesthetics with Multi-Dimensional Rewards

Title: TV2TV: A Unified Framework for Interleaved Language and Video Generation

Title: EvoIR: Towards All-in-One Image Restoration via Evolutionary Frequency Modulation

Title: NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation

Title: ShadowDraw: From Any Object to Shadow-Drawing Compositional Art

Title: ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Title: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation

Title: Light-X: Generative 4D Video Rendering with Camera and Illumination Control

Title: Value Gradient Guidance for Flow Matching Alignment