2026-03-24

Title: Remote Sensing Image Dehazing: A Systematic Review of Progress, Challenges, and Prospects

Title: Transparent Fragments Contour Estimation via Visual-Tactile Fusion for Autonomous Reassembly

Title: MARLIN: Multi-Agent Reinforcement Learning for Incremental DAG Discovery

Title: InjectFlow: Weak Guides Strong via Orthogonal Injection for Flow Matching

Title: Transferable Multi-Bit Watermarking Across Frozen Diffusion Models via Latent Consistency Bridges

Title: EARTalking: End-to-end GPT-style Autoregressive Talking Head Synthesis with Frame-wise Control

Title: Probing the Latent World: Emergent Discrete Symbols and Physical Structure in Latent Representations

Title: Uni-Classifier: Leveraging Video Diffusion Priors for Universal Guidance Classifier

Title: SymCircuit: Bayesian Structure Inference for Tractable Probabilistic Circuits via Entropy-Regularized Reinforcement Learning

Title: KV Cache Optimization Strategies for Scalable and Efficient LLM Inference

Title: Thinking in Different Spaces: Domain-Specific Latent Geometry Survives Cross-Architecture Translation

Title: PEARL: Personalized Streaming Video Understanding Model

Title: Understanding Behavior Cloning with Action Quantization

Title: Generating from Discrete Distributions Using Diffusions: Insights from Random Constraint Satisfaction Problems

Title: ScaleEdit-12M: Scaling Open-Source Image Editing Data Generation via Multi-Agent Framework

Title: Diffusion Model for Manifold Data: Score Decomposition, Curvature, and Statistical Complexity

Title: Exponential Family Discriminant Analysis: Generalizing LDA-Style Generative Classification to Non-Gaussian Models

Title: MFSR: MeanFlow Distillation for One Step Real-World Image Super Resolution

Title: Satellite-to-Street: Synthesizing Post-Disaster Views from Satellite Imagery via Generative Vision Models

Title: High-Quality and Efficient Turbulence Mitigation with Events

Title: Cross-modal Fuzzy Alignment Network for Text-Aerial Person Retrieval and A Large-scale Benchmark

Title: Premier: Personalized Preference Modulation with Learnable User Embedding in Text-to-Image Generation

Title: VSD-MOT: End-to-End Multi-Object Tracking in Low-Quality Video Scenes Guided by Visual Semantic Distillation

Title: Memory-Efficient Fine-Tuning Diffusion Transformers via Dynamic Patch Sampling and Block Skipping

Title: ME-IQA: Memory-Enhanced Image Quality Assessment via Re-Ranking

Title: Predictive Regularization Against Visual Representation Degradation in Multimodal Large Language Models

Title: TAFG-MAN: Timestep-Adaptive Frequency-Gated Latent Diffusion for Efficient and High-Quality Low-Dose CT Image Denoising

Title: Beyond the Birkhoff Polytope: Spectral-Sphere-Constrained Hyper-Connections

Title: LLM-ODE: Data-driven Discovery of Dynamical Systems with Large Language Models

Title: Discriminative Representation Learning for Clinical Prediction

Title: GraPHFormer: A Multimodal Graph Persistent Homology Transformer for the Analysis of Neuroscience Morphologies

Title: Interpreting the Synchronization Gap: The Hidden Mechanism Inside Diffusion Transformers

Title: LPNSR: Prior-Enhanced Diffusion Image Super-Resolution via LR-Guided Noise Prediction

Title: Two Experts Are Better Than One Generalist: Decoupling Geometry and Appearance for Feed-Forward 3D Gaussian Splatting

Title: Taming Sampling Perturbations with Variance Expansion Loss for Latent Diffusion Models

Title: MS-CustomNet: Controllable Multi-Subject Customization with Hierarchical Relational Semantics

Title: Incentivizing Generative Zero-Shot Learning via Outcome-Reward Reinforcement Learning with Visual Cues

Title: Training-Free Instance-Aware 3D Scene Reconstruction and Diffusion-Based View Synthesis from Sparse Images

Title: GIDE: Unlocking Diffusion LLMs for Precise Training-Free Image Editing

Title: Positional Segmentor-Guided Counterfactual Fine-Tuning for Spatially Localized Image Synthesis

Title: Does Mechanistic Interpretability Transfer Across Data Modalities? A Cross-Domain Causal Circuit Analysis of Variational Autoencoders

Title: Amortized Variational Inference for Logistic Regression with Missing Covariates

Title: Fusing Memory and Attention: A study on LSTM, Transformer and Hybrid Architectures for Symbolic Music Generation

Title: Focus on Background: Exploring SAM's Potential in Few-shot Medical Image Segmentation with Background-centric Prompting

Title: Text-Image Conditioned 3D Generation

Title: Identity-Consistent Video Generation under Large Facial-Angle Variations

Title: KHMP: Frequency-Domain Kalman Refinement for High-Fidelity Human Motion Prediction

Title: EmoTaG: Emotion-Aware Talking Head Synthesis on Gaussian Splatting with Few-Shot Personalization

Title: Efficient Coarse-to-Fine Diffusion Models with Time Step Sequence Redistribution

Title: Relax Forcing: Relaxed KV-Memory for Consistent Long Video Generation

Title: DSPA: Dynamic SAE Steering for Data-Efficient Preference Alignment

Title: Which Concepts to Forget and How to Refuse? Decomposing Concepts for Continual Unlearning in Large Vision-Language Models

Title: Learning Trajectory-Aware Multimodal Large Language Models for Video Reasoning Segmentation

Title: VIGIL: Part-Grounded Structured Reasoning for Generalizable Deepfake Detection

Title: From Part to Whole: 3D Generative World Model with an Adaptive Structural Hierarchy

Title: Revisiting Weakly-Supervised Video Scene Graph Generation via Pair Affinity Learning

Title: Riemannian Geometry Speaks Louder Than Words: From Graph Foundation Model to Next-Generation Graph Intelligence

Title: SARe: Structure-Aware Large-Scale 3D Fragment Reassembly

Title: AdaEdit: Adaptive Temporal and Channel Modulation for Flow-Based Image Editing

Title: Efficient Zero-Shot AI-Generated Image Detection

Title: Proximal Policy Optimization in Path Space: A Schrödinger Bridge Perspective

Title: TrustFed: Enabling Trustworthy Medical AI under Data Privacy Constraints

Title: OmniFM: Toward Modality-Robust and Task-Agnostic Federated Learning for Heterogeneous Medical Imaging

Title: Cross-Scenario Deraining Adaptation with Unpaired Data: Superpixel Structural Priors and Multi-Stage Pseudo-Rain Synthesis

Title: Thinking Deeper, Not Longer: Depth-Recurrent Transformers for Compositional Generalization

Title: When Exploration Comes for Free with Mixture-Greedy: Do we need UCB in Diversity-Aware Multi-Armed Bandits?

Title: Uncertainty Quantification for Distribution-to-Distribution Flow Matching in Scientific Imaging

Title: CellFluxRL: Biologically-Constrained Virtual Cell Modeling via Reinforcement Learning

Title: Show Me What You Don't Know: Efficient Sampling from Invariant Sets for Model Validation

Title: SHARP: Spectrum-aware Highly-dynamic Adaptation for Resolution Promotion in Remote Sensing Synthesis

Title: Dynamic Exposure Burst Image Restoration

Title: The Universal Normal Embedding

Title: Climate Prompting: Generating the Madden-Julian Oscillation using Video Diffusion and Low-Dimensional Conditioning

Title: Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation

Title: Manifold-Aware Exploration for Reinforcement Learning in Video Generation

Title: Deep S2P: Integrating Learning Based Stereo Matching Into the Satellite Stereo Pipeline

Title: Not All Layers Are Created Equal: Adaptive LoRA Ranks for Personalized Image Generation

Title: CLEAR: Context-Aware Learning with End-to-End Mask-Free Inference for Adaptive Video Subtitle Removal

Title: A Latent Representation Learning Framework for Hyperspectral Image Emulation in Remote Sensing

Title: MultiBind: A Benchmark for Attribute Misbinding in Multi-Subject Generation

Title: GeoFusion-CAD: Structure-Aware Diffusion with Geometric State Space for Parametric 3D Design

Title: Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model

Title: STENet: Superpixel Token Enhancing Network for RGB-D Salient Object Detection

Title: Tuning Real-World Image Restoration at Inference: A Test-Time Scaling Paradigm for Flow Matching Models

Title: DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation

Title: FontCrafter: High-Fidelity Element-Driven Artistic Font Creation with Visual In-Context Generation

Title: P-Flow: Prompting Visual Effects Generation

Title: FreeArtGS: Articulated Gaussian Splatting Under Free-moving Scenario

Title: DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment

Title: Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?

Title: Seeing is Improving: Visual Feedback for Iterative Text Layout Refinement

Title: PAM: A Pose-Appearance-Motion Engine for Sim-to-Real HOI Video Generation

Title: Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Title: Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models

Title: SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection

Title: Noise Titration: Exact Distributional Benchmarking for Probabilistic Time Series Forecasting

Title: SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation

Title: Confidence-Based Decoding is Provably Efficient for Diffusion Language Models

Title: GenOpticalFlow: A Generative Approach to Unsupervised Optical Flow Learning

Title: DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution

Title: Repurposing Geometric Foundation Models for Multi-view Diffusion

Title: Scaling DoRA: High-Rank Adaptation via Factored Norms and Fused Kernels

Title: UniMotion: A Unified Framework for Motion-Text-Vision Understanding and Generation

Title: End-to-End Training for Unified Tokenization and Latent Denoising