2026-01-16

Title: NanoSD: Edge Efficient Foundation Model for Real Time Image Restoration

Title: ViSIL: Unified Evaluation of Information Loss in Multimodal Video Captioning

Title: VibrantSR: Sub-Meter Canopy Height Models from Sentinel-2 Using Generative Flow Matching

Title: MedVL-SAM2: A unified 3D medical vision-language model for multimodal reasoning and prompt-driven segmentation

Title: Transition Matching Distillation for Fast Video Generation

Title: In-Context Operator Learning on the Space of Probability Measures

Title: FaTRQ: Tiered Residual Quantization for LLM Vector Search in Far-Memory-Aware ANNS Systems

Title: DW-DGAT: Dynamically Weighted Dual Graph Attention Network for Neurodegenerative Disease Diagnosis

Title: Continuous-Depth Transformers with Learned Control Dynamics

Title: CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation

Title: Difficulty-guided Sampling: Bridging the Target Gap between Dataset Distillation and Downstream Tasks

Title: Multilingual-To-Multimodal (M2M): Unlocking New Languages with Monolingual Text

Title: FlowAct-R1: Towards Interactive Humanoid Video Generation

Title: LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning

Title: RAG-3DSG: Enhancing 3D Scene Graphs with Re-Shot Guided Retrieval-Augmented Generation

Title: From Physical Degradation Models to Task-Aware All-in-One Image Restoration

Title: ELITE: Efficient Gaussian Head Avatar from a Monocular Video via Learned Initialization and TEst-time Generative Adaptation

Title: Beyond Inpainting: Unleash 3D Understanding for Precise Camera-Controlled Video Generation

Title: ROMA: Real-time Omni-Multimodal Assistant with Interactive Streaming Understanding

Title: Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders

Title: EvoMorph: Counterfactual Explanations for Continuous Time-Series Extrinsic Regression Applied to Photoplethysmography

Title: Fine-Grained Human Pose Editing Assessment via Layer-Selective MLLMs

Title: Towards Efficient Low-rate Image Compression with Frequency-aware Diffusion Prior Refinement

Title: Global Context Compression with Interleaved Vision-Text Transformation

Title: Discrete Feynman-Kac Correctors

Title: CS-GBA: A Critical Sample-based Gradient-guided Backdoor Attack for Offline Reinforcement Learning

Title: DeFlow: Decoupling Manifold Modeling and Value Maximization for Offline Policy Extraction

Title: Unleashing the Capabilities of Large Vision-Language Models for Intelligent Perception of Roadside Infrastructure

Title: Inference-time Physics Alignment of Video Generative Models with Latent World Models

Title: RSATalker: Realistic Socially-Aware Talking Head Generation for Multi-Turn Conversation

Title: CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos

Title: Single-Stage Huffman Encoder for ML Compression

Title: On the origin of neural scaling laws: from random graphs to natural language