2025-12-16

Title: Active Inference with Reusable State-Dependent Value Profiles

Title: CR3G: Causal Reasoning for Patient-Centric Explanations in Radiology Report Generation

Title: Generative Stochastic Optimal Transport: Guided Harmonic Path-Integral Diffusion

Title: Hierarchical Task Offloading and Trajectory Optimization in Low-Altitude Intelligent Networks Via Auction and Diffusion-based MARL

Title: On the Dangers of Bootstrapping Generation for Continual Learning and Beyond

Title: mmWEAVER: Environment-Specific mmWave Signal Synthesis from a Photo and Activity Description

Title: MPath: Multimodal Pathology Report Generation from Whole Slide Images

Title: FloraForge: LLM-Assisted Procedural Generation of Editable and Analysis-Ready 3D Plant Geometric Models For Agricultural Applications

Title: Learning to Extract Context for Context-Aware LLM Inference

Title: CreativeVR: Diffusion-Prior-Guided Approach for Structure and Motion Restoration in Generative and Real Videos

Title: SigTime: Learning and Visually Explaining Time Series Signatures

Title: BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models

Title: RePack: Representation Packing of Vision Foundation Model Features Enhances Diffusion Transformer

Title: CLOAK: Contrastive Guidance for Latent Diffusion-Based Data Obfuscation

Title: SPDMark: Selective Parameter Displacement for Robust Video Watermarking

Title: High-Dimensional Tensor Discriminant Analysis: Low-Rank Discriminant Structure, Representation Synergy, and Theoretical Guarantees

Title: HydroDiffusion: Diffusion-Based Probabilistic Streamflow Forecasting with a State Space Backbone

Title: SMRABooth: Subject and Motion Representation Alignment for Customized Video Generation

Title: MolGuidance: Advanced Guidance Strategies for Conditional Molecular Generation with Flow Matching

Title: A Hybrid Deep Learning Framework for Emotion Recognition in Children with Autism During NAO Robot-Mediated Interaction

Title: CineLOG: A Training Free Approach for Cinematic Long Video Generation

Title: ProImage-Bench: Rubric-Based Evaluation for Professional Image Generation

Title: Ultra-Low Bitrate Perceptual Image Compression with Shallow Encoder

Title: Moment and Highlight Detection via MLLM Frame Segmentation

Title: Cognitive-YOLO: LLM-Driven Architecture Synthesis from First Principles of Data for Object Detection

Title: MRD: Using Physically Based Differentiable Rendering to Probe Vision Models for 3D Scene Understanding

Title: WeDetect: Fast Open-Vocabulary Object Detection as Retrieval

Title: Unified Control for Inference-Time Guidance of Denoising Diffusion Models

Title: Synthetic Swarm Mosquito Dataset for Acoustic Classification: A Proof of Concept

Title: STAGE: Storyboard-Anchored Generation for Cinematic Multi-shot Narrative

Title: V-Warper: Appearance-Consistent Video Diffusion Personalization via Value Warping

Title: Anchoring Values in Temporal and Group Dimensions for Flow Matching Model Alignment

Title: ArtGen: Conditional Generative Modeling of Articulated Objects in Arbitrary Part-Level States

Title: BokehDepth: Enhancing Monocular Depth Estimation through Bokeh Generation

Title: Endless World: Real-Time 3D-Aware Long Video Generation

Title: Exploring the Design Space of Transition Matching

Title: Generative Spatiotemporal Data Augmentation

Title: Differentiable Energy-Based Regularization in GANs: A Simulator-Based Exploration of VQE-Inspired Auxiliary Losses

Title: Vision-Enhanced Large Language Models for High-Resolution Image Synthesis and Multimodal Data Interpretation

Title: Content-Aware Ad Banner Layout Generation with Two-Stage Chain-of-Thought in Vision Language Models

Title: Geometry-Aware Scene-Consistent Image Generation

Title: No Cache Left Idle: Accelerating diffusion model via Extreme-slimming Caching

Title: Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space

Title: DiG: Differential Grounding for Enhancing Fine-Grained Perception in Multimodal Large Language Model

Title: InteracTalker: Prompt-Based Human-Object Interaction with Co-Speech Gesture Generation

Title: DynaGen: Unifying Temporal Knowledge Graph Reasoning with Dynamic Subgraphs and Generative Regularization

Title: On Approaches to Building Surrogate ODE Models for Diffusion Bridges

Title: Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Title: Robust Motion Generation using Part-level Reliable Data from Videos

Title: Spinal Line Detection for Posture Evaluation through Training-free 3D Human Body Reconstruction with 2D Depth Images

Title: GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation

Title: FysicsWorld: A Unified Full-Modality Benchmark for Any-to-Any Understanding, Generation, and Reasoning

Title: CoRe3D: Collaborative Reasoning as a Foundation for 3D Intelligence

Title: Fast 2DGS: Efficient Image Representation with Deep Gaussian Prior

Title: Credit Risk Estimation with Non-Financial Features: Evidence from a Synthetic Istanbul Dataset

Title: Learning Common and Salient Generative Factors Between Two Image Datasets

Title: On the continuity of flows

Title: Adapting Multimodal Foundation Models for Few-Shot Learning: A Comprehensive Study on Contrastive Captioners

Title: Network Level Evaluation of Hangup Susceptibility of HRGCs using Deep Learning and Sensing Techniques: A Goal Towards Safer Future

Title: Information-Consistent Language Model Recommendations through Group Relative Policy Optimization

Title: SignRAG: A Retrieval-Augmented System for Scalable Zero-Shot Road Sign Recognition

Title: Revisiting 2D Foundation Models for Scalable 3D Medical Image Classification

Title: Distillation of Discrete Diffusion by Exact Conditional Distribution Matching

Title: Wait, Wait, Wait... Why Do Reasoning Models Loop?

Title: Qonvolution: Towards Learning High-Frequency Signals with Queried Convolution

Title: Next-generation reservoir computing validated by classification task

Title: Few-Step Distillation for Text-to-Image Generation: A Practical Guide

Title: JoDiffusion: Jointly Diffusing Image with Pixel-Level Annotations for Semantic Segmentation Promotion

Title: What Happens Next? Next Scene Prediction with a Unified Video Model

Title: SneakPeek: Future-Guided Instructional Streaming Video Generation

Title: Motus: A Unified Latent Action World Model

Title: Bi-Erasing: A Bidirectional Framework for Concept Removal in Diffusion Models

Title: Forging a Dynamic Memory: Retrieval-Guided Continual Learning for Generalist Medical Foundation Models

Title: Diffusion-Based Restoration for Multi-Modal 3D Object Detection in Adverse Weather

Title: A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis

Title: POLAR: A Portrait OLAT Dataset and Generative Framework for Illumination-Aware Face Modeling

Title: Evaluating Adversarial Attacks on Federated Learning for Temperature Forecasting

Title: CORE: Contrastive Masked Feature Reconstruction on Graphs

Title: Learning to Retrieve with Weakened Labels: Robust Training under Label Noise

Title: BézierFlow: Bézier Stochastic Interpolant Schedulers for Few-Step Generation

Title: Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans?

Title: CausalCLIP: Causally-Informed Feature Disentanglement and Filtering for Generalizable Detection of Generated Images

Title: LINA: Learning INterventions Adaptively for Physical Alignment and Generalization in Diffusion Models

Title: ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement

Title: KlingAvatar 2.0 Technical Report

Title: ALIGN-FL: Architecture-independent Learning through Invariant Generative component sharing in Federated Learning

Title: Beyond the Visible: Disocclusion-Aware Editing via Proxy Dynamic Graphs

Title: Computer vision training dataset generation for robotic environments using Gaussian splatting

Title: Learning to Generate Cross-Task Unexploitable Examples

Title: RecTok: Reconstruction Distillation along Rectified Flow

Title: Test-Time Modification: Inverse Domain Transformation for Robust Perception

Title: PoseAnything: Universal Pose-guided Video Generation with Part-aware Temporal Coherence

Title: Transform Trained Transformer: Accelerating Naive 4K Video Generation Over 10$\times$

Title: Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation

Title: Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model

Title: MMhops-R1: Multimodal Multi-hop Reasoning

Title: Image Diffusion Preview with Consistency Solver

Title: LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Title: Do-Undo: Generating and Reversing Physical Actions in Vision-Language Models

Title: StutterFuse: Mitigating Modality Collapse in Stuttering Detection with Jaccard-Weighted Metric Learning and Gated Fusion

Title: Charge: A Comprehensive Novel View Synthesis Benchmark and Dataset to Bind Them All

Title: Grab-3D: Detecting AI-Generated Videos from 3D Geometric Temporal Consistency

Title: A Scientific Reasoning Model for Organic Synthesis Procedure Generation

Title: Directional Textual Inversion for Personalized Text-to-Image Generation

Title: JoVA: Unified Multimodal Learning for Joint Video-Audio Generation

Title: Feedforward 3D Editing via Text-Steerable Image-to-3D

Title: I-Scene: 3D Instance Models are Implicit Generalizable Spatial Learners

Title: Towards Scalable Pre-training of Visual Tokenizers for Generation

Title: DiffusionBrowser: Interactive Diffusion Previews via Multi-Branch Decoders