2026-01-09

Title: STDD: Spatio-Temporal Dynamics-Driven Token Refinement in Diffusion Language Models

Title: Leveraging Language Models and RAG for Efficient Knowledge Discovery in Clinical Environments

Title: LEGATO: Good Identity Unlearning Is Continuous

Title: Beyond Binary Preference: Aligning Diffusion Models to Fine-grained Criteria by Decoupling Attributes

Title: Quantifying the Effect of Test Set Contamination on Generative Evaluations

Title: Embedding Textual Information in Images Using Quinary Pixel Combinations

Title: ReHyAt: Recurrent Hybrid Attention for Video Diffusion Transformers

Title: PackCache: A Training-Free Acceleration Method for Unified Autoregressive Video Generation via Compact KV-Cache

Title: Few-Shot LoRA Adaptation of a Flow-Matching Foundation Model for Cross-Spectral Object Detection

Title: From Preoperative CT to Postmastoidectomy Mesh Construction: Mastoidectomy Shape Prediction for Cochlear Implant Surgery

Title: CRUNet-MR-Univ: A Foundation Model for Diverse Cardiac MRI Reconstruction

Title: UniDrive-WM: Unified Understanding, Planning and Generation World Model For Autonomous Driving

Title: Meta-probabilistic Modeling

Title: Concept Tokens: Learning Behavioral Embeddings Through Concept Definitions

Title: Surface-based Molecular Design with Multi-modal Flow Matching

Title: Spatial-Temporal Feedback Diffusion Guidance for Controlled Traffic Imputation

Title: 3D Conditional Image Synthesis of Left Atrial LGE MRI from Composite Semantic Masks

Title: On the Limitations of Rank-One Model Editing in Answering Multi-hop Questions

Title: HATIR: Heat-Aware Diffusion for Turbulent Infrared Video Super-Resolution

Title: Do LLMs Benefit from User and Item Embeddings in Recommendation Tasks?

Title: See, Explain, and Intervene: A Few-Shot Multimodal Agent Framework for Hateful Meme Moderation

Title: Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

Title: Skeletonization-Based Adversarial Perturbations on Large Vision Language Model's Mathematical Text Recognition

Title: CounterVid: Counterfactual Video Generation for Mitigating Action and Temporal Hallucinations in Video-Language Models

Title: Measurement-Consistent Langevin Corrector: A Remedy for Latent Diffusion Inverse Solvers

Title: PyramidalWan: On Making Pretrained Video Model Pyramidal for Efficient Inference

Title: Detector-Augmented SAMURAI for Long-Duration Drone Tracking

Title: Token Maturation: Autoregressive Language Generation via Continuous Token Dynamics

Title: DivAS: Interactive 3D Segmentation of NeRFs via Depth-Weighted Voxel Aggregation

Title: OceanSplat: Object-aware Gaussian Splatting with Trinocular View Consistency for Underwater Scene Reconstruction

Title: Patch-based Representation and Learning for Efficient Deformation Modeling

Title: DeepWeightFlow: Re-Basined Flow Matching for Generating Neural Network Weights

Title: SemPA: Improving Sentence Embeddings of Large Language Models through Semantic Preference Alignment

Title: UniLiPs: Unified LiDAR Pseudo-Labeling with Geometry-Grounded Dynamic Scene Decomposition

Title: Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing

Title: VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

Title: Atlas 2 - Foundation models for clinical deployment

Title: FlowLet: Conditional 3D Brain MRI Synthesis using Wavelet Flow Matching

Title: Plenoptic Video Generation

Title: RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation

Title: Pixel-Perfect Visual Geometry Estimation

Title: Mesh4D: 4D Mesh Reconstruction and Tracking from Monocular Video