2026-01-21

Title: GRADE: Replacing Policy Gradients with Backpropagation for LLM Alignment

Title: Multi-modal MRI-Based Alzheimer's Disease Diagnosis with Transformer-based Image Synthesis and Transfer Learning

Title: A one-step generation model with a Single-Layer Transformer: Layer number re-distillation of FreeFlow

Title: Now You See Me, Now You Don't: A Unified Framework for Expression Consistent Anonymization in Talking Head Videos

Title: Global Optimization By Gradient from Hierarchical Score-Matching Spaces

Title: Mixture of Distributions Matters: Dynamic Sparse Attention for Efficient Video Diffusion Transformers

Title: Predicting When to Trust Vision-Language Models for Spatial Reasoning

Title: Aesthetics as Structural Harm: Algorithmic Lookism Across Text-to-Image Generation and Classification

Title: Generating metamers of human scene understanding

Title: Telling Human and Machine Handwriting Apart

Title: MixFlow: Mixture-Conditioned Flow Matching for Out-of-Distribution Generalization

Title: TF-CoDiT: Conditional Time Series Synthesis with Diffusion Transformers for Treasury Futures

Title: DevBench: A Realistic, Developer-Informed Benchmark for Code Generation Models

Title: RemoteVAR: Autoregressive Visual Modeling for Remote Sensing Change Detection

Title: Decoder Gradient Shields: A Family of Provable and High-Fidelity Methods Against Gradient-Based Box-Free Watermark Removal

Title: R$^2$PO: Decoupling Training Trajectories from Inference Responses for LLM Reasoning

Title: AVIR: Adaptive Visual In-Document Retrieval for Efficient Multi-Page Document Question Answering

Title: Task-Driven Prompt Learning: A Joint Framework for Multi-modal Cloud Removal and Segmentation

Title: Learning Stochastic Bridges for Video Object Removal via Video-to-Video Translation

Title: ARMARecon: An ARMA Convolutional Filter based Graph Neural Network for Neurodegenerative Dementias Classification

Title: CroBIM-V: Memory-Quality Controlled Remote Sensing Referring Video Object Segmentation

Title: RCDN: Real-Centered Detection Network for Robust Face Forgery Identification

Title: SynQP: A Framework and Metrics for Evaluating the Quality and Privacy Risk of Synthetic Data

Title: Speculative Sampling with Reinforcement Learning

Title: S^2F-Net: A Robust Spatial-Spectral Fusion Framework for Cross-Model AIGC Detection

Title: MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

Title: From Prompts to Pavement: LMMs-based Agentic Behavior-Tree Generation Framework for Autonomous Vehicles

Title: Utilizing the Score of Data Distribution for Hyperspectral Anomaly Detection

Title: Class-Partitioned VQ-VAE and Latent Flow Matching for Point Cloud Scene Generation

Title: Beyond the Dirac Delta: Mitigating Diversity Collapse in Reinforcement Fine-Tuning for Versatile Image Generation

Title: SDCoNet: Saliency-Driven Multi-Task Collaborative Network for Remote Sensing Object Detection

Title: Towards Robust Universal Perturbation Attacks: A Float-Coded, Penalty-Driven Evolutionary Approach

Title: VILTA: A VLM-in-the-Loop Adversary for Enhancing Driving Policy Robustness

Title: Fusion-Restoration Image Processing Algorithm to Improve the High-Temperature Deformation Measurement

Title: S2DiT: Sandwich Diffusion Transformer for Mobile Streaming Video Generation

Title: SSPFormer: Self-Supervised Pretrained Transformer for MRI Images

Title: Moaw: Unleashing Motion Awareness for Video Diffusion Models

Title: Generalizable and Animatable 3D Full-Head Gaussian Avatar from a Single Image

Title: A Generalist Foundation Model for Total-body PET/CT Enables Diagnostic Reporting and System-wide Metabolic Profiling

Title: Generating Cyclic Conformers with Flow Matching in Cremer-Pople Coordinates

Title: Exploring Talking Head Models With Adjacent Frame Prior for Speech-Preserving Facial Expression Manipulation

Title: TwoHead-SwinFPN: A Unified DL Architecture for Synthetic Manipulation, Detection and Localization in Identity Documents

Title: Dual-Stream Collaborative Transformer for Image Captioning

Title: StyMam: A Mamba-Based Generator for Artistic Style Transfer

Title: Early Prediction of Type 2 Diabetes Using Multimodal Data and Tabular Transformers

Title: Prototype Learning-Based Few-Shot Segmentation for Low-Light Crack on Concrete Structures

Title: Recursive Meta-Distillation: An Axiomatic Framework for Iterative Knowledge Refinement

Title: PhaseMark: A Post-hoc, Optimization-Free Watermarking of AI-generated Images in the Latent Frequency Domain

Title: FastAV: Efficient Token Pruning for Audio-Visual Large Language Model Inference

Title: LAViG-FLOW: Latent Autoregressive Video Generation for Fluid Flow Simulations

Title: A Comprehensive Evaluation of LLM Reasoning: From Single-Model to Multi-Agent Paradigms

Title: Enginuity: Building an Open Multi-Domain Dataset of Complex Engineering Diagrams

Title: Spherical Geometry Diffusion: Generating High-quality 3D Face Geometry via Sphere-anchored Representations

Title: Leveraging Transformer Decoder for Automotive Radar Object Detection

Title: Reasoning with Pixel-level Precision: QVLM Architecture and SQuID Dataset for Quantitative Geospatial Analytics

Title: Diffusion Representations for Fine-Grained Image Classification: A Marine Plankton Case Study

Title: BladeSDF: Unconditional and Conditional Generative Modeling of Representative Blade Geometries Using Signed Distance Functions

Title: MN-TSG: Continuous Time Series Generation with Irregular Observations

Title: DiffFace-Edit: A Diffusion-Based Facial Dataset for Forgery-Semantic Driven Deepfake Detection Analysis

Title: Multi-objective fluorescent molecule design with a data-physics dual-driven generative framework

Title: Diffusion In Diffusion: Breaking the Autoregressive Bottleneck in Block Diffusion Models

Title: ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch

Title: Dynamic Differential Linear Attention: Enhancing Linear Diffusion Transformer for High-Quality Image Generation

Title: Attention-space Contrastive Guidance for Efficient Hallucination Mitigation in LVLMs

Title: Who Should Have Surgery? A Comparative Study of GenAI vs Supervised ML for CRS Surgical Outcome Prediction

Title: Hierarchical Long Video Understanding with Audiovisual Entity Cohesion and Agentic Search

Title: Orthogonium: A Unified, Efficient Library of Orthogonal and 1-Lipschitz Building Blocks

Title: Principled Latent Diffusion for Graphs via Laplacian Autoencoders

Title: PREGEN: Uncovering Latent Thoughts in Composed Video Retrieval

Title: Inverting Self-Organizing Maps: A Unified Activation-Based Framework

Title: Multi-Objective Hierarchical Optimization with Large Language Models

Title: VTONGuard: Automatic Detection and Authentication of AI-Generated Virtual Try-On Content

Title: Likelihood-Separable Diffusion Inference for Multi-Image MRI Super-Resolution

Title: Human detectors are surprisingly powerful reward models

Title: Federated Balanced Learning

Title: LLMOrbit: A Circular Taxonomy of Large Language Models - From Scaling Walls to Agentic AI Systems

Title: POCI-Diff: Position Objects Consistently and Interactively with 3D-Layout Guided Diffusion

Title: Fine-Grained Zero-Shot Composed Image Retrieval with Complementary Visual-Semantic Integration

Title: Interp3D: Correspondence-aware Interpolation for Generative Textured 3D Morphing

Title: The Side Effects of Being Smart: Safety Risks in MLLMs' Multi-Image Reasoning

Title: One-Shot Refiner: Boosting Feed-forward Novel View Synthesis via One-Step Diffusion

Title: Attention-Based Offline Reinforcement Learning and Clustering for Interpretable Sepsis Treatment

Title: Q-learning with Adjoint Matching

Title: Soft Tail-dropping for Adaptive Visual Tokenization

Title: OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Title: Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis

Title: VideoMaMa: Mask-Guided Video Matting via Generative Prior

Title: Implicit Neural Representation Facilitates Unified Universal Vision Encoding