2025-12-09

Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven's Matrices

Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head

Title: Domain-Specific Foundation Model Improves AI-Based Analysis of Neuropathology

Title: PrunedCaps: A Case For Primary Capsules Discrimination

Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViT

Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation

Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation

Title: Deep learning recognition and analysis of Volatile Organic Compounds based on experimental and synthetic infrared absorption spectra

Title: When Privacy Isn't Synthetic: Hidden Data Leakage in Generative AI Models

Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report Generation

Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation

Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction

Title: How Should We Evaluate Data Deletion in Graph-Based ANN Indexes?

Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension

Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models

Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate

Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search

Title: DDFI: Diverse and Distribution-aware Missing Feature Imputation via Two-step Reconstruction

Title: Rectifying Latent Space for Generative Single-Image Reflection Removal

Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection

Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework

Title: Rethinking Training Dynamics in Scale-wise Autoregressive Generation

Title: DragMesh: Interactive 3D Generation Made Easy

Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars

Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based Interaction

Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion

Title: Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning

Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities

Title: A Fast and Effective Solution to the Problem of Look-ahead Bias in LLMs

Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution

Title: GSAE: Graph-Regularized Sparse Autoencoders for Robust LLM Safety Steering

Title: Personalized Image Descriptions from Attention Sequences

Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods

Title: RunawayEvil: Jailbreaking the Image-to-Video Generative Models

Title: GradientSpace: Unsupervised Data Clustering for Improved Instruction Tuning

Title: Mitigating Barren plateaus in quantum denoising diffusion probabilistic models

Title: Pathway to $O(\sqrt{d})$ Complexity bound under Wasserstein metric of flow-based models

Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement

Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors

Title: VDOT: Efficient Unified Video Creation via Optimal Transport Distillation

Title: Partial Inverse Design of High-Performance Concrete Using Cooperative Neural Networks for Constraint-Aware Mix Generation

Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly Detection

Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT

Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training

Title: Spatial Retrieval Augmented Autonomous Driving

Title: JoPano: Unified Panorama Generation via Joint Modeling

Title: Scaling Zero-Shot Reference-to-Video Generation

Title: Evaluating the Sensitivity of BiLSTM Forecasting Models to Sequence Length and Input Noise

Title: Hidden Leaks in Time Series Forecasting: How Data Leakage Affects LSTM Evaluation Across Configurations and Validation Strategies

Title: Evaluating and Preserving High-level Fidelity in Super-Resolution

Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection

Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On

Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models

Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers

Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics

Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing

Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach

Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration

Title: Improving the Throughput of Diffusion-based Large Language Models via a Training-Free Confidence-Aware Calibration

Title: UniDiff: A Unified Diffusion Framework for Multimodal Time Series Forecasting

Title: START: Spatial and Textual Learning for Chart Understanding

Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression

Title: Generating Storytelling Images with Rich Chains-of-Reasoning

Title: Understanding Diffusion Models via Code Execution

Title: Unified Camera Positional Encoding for Controlled Video Generation

Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement

Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data

Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation

Title: Generalized Referring Expression Segmentation on Aerial Photos

Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting

Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition

Title: Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance

Title: Materium: An Autoregressive Approach for Material Generation

Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior

Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer

Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation

Title: MeshRipple: Structured Autoregressive Generation of Artist-Meshes

Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images

Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning Models

Title: LongCat-Image Technical Report

Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation

Title: Optimization-Guided Diffusion for Interactive Scene Generation

Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment

Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation

Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination

Title: HLTCOE Evaluation Team at TREC 2025: VQA Track

Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and Generation

Title: Distribution Matching Variational AutoEncoder

Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling

Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation

Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

Title: Voxify3D: Pixel Art Meets Volumetric Rendering