2026-01-21

Title: Domain-Specific Self-Supervised Pre-training for Agricultural Disease Classification: A Hierarchical Vision Transformer Study

Title: Multi-modal MRI-Based Alzheimer's Disease Diagnosis with Transformer-based Image Synthesis and Transfer Learning

Title: A one-step generation model with a Single-Layer Transformer: Layer number re-distillation of FreeFlow

Title: Now You See Me, Now You Don't: A Unified Framework for Expression Consistent Anonymization in Talking Head Videos

Title: Evaluating Self-Correcting Vision Agents Through Quantitative and Qualitative Metrics

Title: Global Optimization By Gradient from Hierarchical Score-Matching Spaces

Title: Mixture of Distributions Matters: Dynamic Sparse Attention for Efficient Video Diffusion Transformers

Title: Predicting When to Trust Vision-Language Models for Spatial Reasoning

Title: Aesthetics as Structural Harm: Algorithmic Lookism Across Text-to-Image Generation and Classification

Title: UAV-Based Infrastructure Inspections: A Literature Review and Proposed Framework for AEC+FM

Title: Generating metamers of human scene understanding

Title: Attesting Model Lineage by Consisted Knowledge Evolution with Fine-Tuning Trajectory

Title: Telling Human and Machine Handwriting Apart

Title: jBOT: Semantic Jet Representation Clustering Emerges from Self-Distillation

Title: SpaRRTa: A Synthetic Benchmark for Evaluating Spatial Intelligence in Visual Foundation Models

Title: LIME-LLM: Probing Models with Fluent Counterfactuals, Not Broken Text

Title: studentSplat: Your Student Model Learns Single-view 3D Gaussian Splatting

Title: Cleansing the Artificial Mind: A Self-Reflective Detoxification Framework for Large Language Models

Title: Shapelets-Enriched Selective Forecasting using Time Series Foundation Models

Title: MixFlow: Mixture-Conditioned Flow Matching for Out-of-Distribution Generalization

Title: TF-CoDiT: Conditional Time Series Synthesis with Diffusion Transformers for Treasury Futures

Title: RemoteVAR: Autoregressive Visual Modeling for Remote Sensing Change Detection

Title: A Training-Free Guess What Vision Language Model from Snippets to Open-Vocabulary Object Detection

Title: Decoder Gradient Shields: A Family of Provable and High-Fidelity Methods Against Gradient-Based Box-Free Watermark Removal

Title: Hybrid IDS Using Signature-Based and Anomaly-Based Detection

Title: DIAMOND-SSS: Diffusion-Augmented Multi-View Optimization for Data-efficient SubSurface Scattering

Title: Don't Start Over: A Cost-Effective Framework for Migrating Personalized Prompts Between LLMs

Title: Learning Language-Driven Sequence-Level Modal-Invariant Representations for Video-Based Visible-Infrared Person Re-Identification

Title: Learning Stochastic Bridges for Video Object Removal via Video-to-Video Translation

Title: ARMARecon: An ARMA Convolutional Filter based Graph Neural Network for Neurodegenerative Dementias Classification

Title: Conditional Random Fields for Interactive Refinement of Histopathological Predictions

Title: Learning to Factorize and Adapt: A Versatile Approach Toward Universal Spatio-Temporal Foundation Models

Title: Principal Component Analysis-Based Terahertz Self-Supervised Denoising and Deblurring Deep Neural Networks

Title: Enhanced Diagnostic Performance via Large-Resolution Inference Optimization for Pathology Foundation Models

Title: Wavelet-Driven Masked Multiscale Reconstruction for PPG Foundation Models

Title: Learning Longitudinal Health Representations from EHR and Wearable Data

Title: Wavelet-Aware Anomaly Detection in Multi-Channel User Logs via Deviation Modulation and Resolution-Adaptive Attention

Title: DiffusionQC: Artifact Detection in Histopathology via Diffusion Model

Title: Plan, Verify and Fill: A Structured Parallel Decoding Approach for Diffusion Language Models

Title: Soft Shadow Diffusion (SSD): Physics-inspired Learning for 3D Computational Periscopy

Title: Multimodal Generative Engine Optimization: Rank Manipulation for Vision-Language Model Rankers

Title: AgenticPruner: MAC-Constrained Neural Network Compression via LLM-Driven Strategy Search

Title: SDiT: Semantic Region-Adaptive for Diffusion Transformers

Title: Conversational Context Classification: A Representation Engineering Approach

Title: S^2F-Net: A Robust Spatial-Spectral Fusion Framework for Cross-Model AIGC Detection

Title: Turbo-GoDec: Exploiting the Cluster Sparsity Prior for Hyperspectral Anomaly Detection

Title: Time-Continuous Modeling for Temporal Affective Pattern Recognition in LLMs

Title: From Prompts to Pavement: LMMs-based Agentic Behavior-Tree Generation Framework for Autonomous Vehicles

Title: DepthCropSeg++: Scaling a Crop Segmentation Foundation Model With Depth-Labeled Data

Title: LR-DWM: Efficient Watermarking for Diffusion Language Models

Title: Utilizing the Score of Data Distribution for Hyperspectral Anomaly Detection

Title: A Hierarchical Benchmark of Foundation Models for Dermatology

Title: Class-Partitioned VQ-VAE and Latent Flow Matching for Point Cloud Scene Generation

Title: Beyond the Dirac Delta: Mitigating Diversity Collapse in Reinforcement Fine-Tuning for Versatile Image Generation

Title: Graph Attention Networks with Physical Constraints for Anomaly Detection

Title: Encoding Emotion Through Self-Supervised Eye Movement Reconstruction

Title: Improving Low-Resource Machine Translation via Round-Trip Reinforcement Learning

Title: Disagreement as Data: Reasoning Trace Analytics in Multi-Agent Systems

Title: VILTA: A VLM-in-the-Loop Adversary for Enhancing Driving Policy Robustness

Title: S2DiT: Sandwich Diffusion Transformer for Mobile Streaming Video Generation

Title: DC-VLAQ: Query-Residual Aggregation for Robust Visual Place Recognition

Title: A Graph Prompt Fine-Tuning Method for WSN Spatio-Temporal Correlation Anomaly Detection

Title: SSPFormer: Self-Supervised Pretrained Transformer for MRI Images

Title: Moaw: Unleashing Motion Awareness for Video Diffusion Models

Title: Towards Unbiased Source-Free Object Detection via Vision Foundation Models

Title: Generalizable and Animatable 3D Full-Head Gaussian Avatar from a Single Image

Title: Distilling Time Series Foundation Models for Efficient Forecasting

Title: A Generalist Foundation Model for Total-body PET/CT Enables Diagnostic Reporting and System-wide Metabolic Profiling

Title: Knowledge-Integrated Representation Learning for Crypto Anomaly Detection under Extreme Label Scarcity; Relational Domain-Logic Integration with Retrieval-Grounded Context and Path-Level Explanations

Title: Generating Cyclic Conformers with Flow Matching in Cremer-Pople Coordinates

Title: PDFInspect: A Unified Feature Extraction Framework for Malicious Document Detection

Title: TwoHead-SwinFPN: A Unified DL Architecture for Synthetic Manipulation, Detection and Localization in Identity Documents

Title: GazeD: Context-Aware Diffusion for Accurate 3D Gaze Estimation

Title: StyMam: A Mamba-Based Generator for Artistic Style Transfer

Title: Cross-Scale Pretraining: Enhancing Self-Supervised Learning for Low-Resolution Satellite Imagery for Semantic Segmentation

Title: Deterministic Dynamics of Sampling Processes in Score-Based Diffusion Models with Multiplicative Noise Conditioning

Title: The Bitter Lesson of Diffusion Language Models for Agentic Workflows: A Comprehensive Reality Check

Title: Early Prediction of Type 2 Diabetes Using Multimodal data and Tabular Transformers

Title: PrivFly: A Privacy-Preserving Self-Supervised Framework for Rare Attack Detection in IoFT

Title: PhaseMark: A Post-hoc, Optimization-Free Watermarking of AI-generated Images in the Latent Frequency Domain

Title: CLIP-Guided Adaptable Self-Supervised Learning for Human-Centric Visual Tasks

Title: TVWorld: Foundations for Remote-Control TV Agents

Title: From 100,000+ images to winning the first brain MRI foundation model challenges: Sharing lessons and models

Title: LAViG-FLOW: Latent Autoregressive Video Generation for Fluid Flow Simulations

Title: Diffusion-Driven Synthetic Tabular Data Generation for Enhanced DoS/DDoS Attack Classification

Title: Autoregressive Models Rival Diffusion Models at ANY-ORDER Generation

Title: Spherical Geometry Diffusion: Generating High-quality 3D Face Geometry via Sphere-anchored Representations

Title: Diffusion Representations for Fine-Grained Image Classification: A Marine Plankton Case Study

Title: SGW-GAN: Sliced Gromov-Wasserstein Guided GANs for Retinal Fundus Image Enhancement

Title: Analyzing VLM-Based Approaches for Anomaly Classification and Segmentation

Title: BladeSDF: Unconditional and Conditional Generative Modeling of Representative Blade Geometries Using Signed Distance Functions

Title: Preconditioning Benefits of Spectral Orthogonalization in Muon

Title: GO-MLVTON: Garment Occlusion-Aware Multi-Layer Virtual Try-On with Diffusion Models

Title: DiffFace-Edit: A Diffusion-Based Facial Dataset for Forgery-Semantic Driven Deepfake Detection Analysis

Title: Multi-objective fluorescent molecule design with a data-physics dual-driven generative framework

Title: Diffusion In Diffusion: Breaking the Autoregressive Bottleneck in Block Diffusion Models

Title: Towards Token-Level Text Anomaly Detection

Title: VIAFormer: Voxel-Image Alignment Transformer for High-Fidelity Voxel Refinement

Title: CommunityBench: Benchmarking Community-Level Alignment across Diverse Groups and Tasks

Title: Dynamic Differential Linear Attention: Enhancing Linear Diffusion Transformer for High-Quality Image Generation

Title: Who Should Have Surgery? A Comparative Study of GenAI vs Supervised ML for CRS Surgical Outcome Prediction

Title: HiT: History-Injection Transformers for Onboard Continuous Flood Change Detection

Title: Orthogonium: A Unified, Efficient Library of Orthogonal and 1-Lipschitz Building Blocks

Title: Principled Latent Diffusion for Graphs via Laplacian Autoencoders

Title: Insight: Interpretable Semantic Hierarchies in Vision-Language Encoders

Title: Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis

Title: The Role of Prosodic and Lexical Cues in Turn-Taking with Self-Supervised Speech Representations

Title: FastGHA: Generalized Few-Shot 3D Gaussian Head Avatars with Real-Time Animation

Title: Inverting Self-Organizing Maps: A Unified Activation-Based Framework

Title: OCCAM: Class-Agnostic, Training-Free, Prior-Free and Multi-Class Object Counting

Title: Revisiting Multi-Task Visual Representation Learning

Title: Multi-Objective Hierarchical Optimization with Large Language Models

Title: VTONGuard: Automatic Detection and Authentication of AI-Generated Virtual Try-On Content

Title: RL-BioAug: Label-Efficient Reinforcement Learning for Self-Supervised EEG Representation Learning

Title: Likelihood-Separable Diffusion Inference for Multi-Image MRI Super-Resolution

Title: RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Title: Top 10 Open Challenges Steering the Future of Diffusion Language Model and Its Variants

Title: LLMOrbit: A Circular Taxonomy of Large Language Models - From Scaling Walls to Agentic AI Systems

Title: POCI-Diff: Position Objects Consistently and Interactively with 3D-Layout Guided Diffusion

Title: VERIDAH: Solving Enumeration Anomaly Aware Vertebra Labeling across Imaging Sequences

Title: Interp3D: Correspondence-aware Interpolation for Generative Textured 3D Morphing

Title: Style Transfer as Bias Mitigation: Diffusion Models for Synthetic Mental Health Text for Arabic

Title: One-Shot Refiner: Boosting Feed-forward Novel View Synthesis via One-Step Diffusion

Title: Progressive self-supervised blind-spot denoising method for LDCT denoising

Title: IIR-VLM: In-Context Instance-level Recognition for Large Vision-Language Models

Title: Attention-Based Offline Reinforcement Learning and Clustering for Interpretable Sepsis Treatment

Title: Q-learning with Adjoint Matching

Title: Soft Tail-dropping for Adaptive Visual Tokenization

Title: VideoMaMa: Mask-Guided Video Matting via Generative Prior

Title: Implicit Neural Representation Facilitates Unified Universal Vision Encoding