2026-03-27

Title: UniICL: Systematizing Unified Multimodal In-context Learning through a Capability-Oriented Taxonomy

Title: Training LLMs for Multi-Step Tool Orchestration with Constrained Data Synthesis and Graduated Rewards

Title: Lookalike3D: Seeing Double in 3D

Title: A Framework for Generating Semantically Ambiguous Images to Probe Human and Machine Perception

Title: Contrastive Learning Boosts Deterministic and Generative Models for Weather Data

Title: Synthetic Cardiac MRI Image Generation using Deep Generative Models

Title: AVControl: Efficient Framework for Training Audio-Visual Controls

Title: Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration

Title: Generative Adversarial Perturbations with Cross-paradigm Transferability on Localized Crowd Counting

Title: DCARL: A Divide-and-Conquer Framework for Autoregressive Long-Trajectory Video Generation

Title: Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models

Title: OptiSAR-Net++: A Large-Scale Benchmark and Transformer-Free Framework for Cross-Domain Remote Sensing Visual Grounding

Title: Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML

Title: GraphER: An Efficient Graph-Based Enrichment and Reranking Method for Retrieval-Augmented Generation

Title: TIGFlow-GRPO: Trajectory Forecasting via Interaction-Aware Flow Matching and Reward-Driven Optimization

Title: Infinite Gaze Generation for Videos with Autoregressive Diffusion

Title: BiFM: Bidirectional Flow Matching for Few-Step Image Editing and Generation

Title: Self-Corrected Image Generation with Explainable Latent Rewards

Title: PASDiff: Physics-Aware Semantic Guidance for Joint Real-world Low-Light Face Enhancement and Restoration

Title: GDPO-Listener: Expressive Interactive Head Generation via Auto-Regressive Flow Matching and Group reward-Decoupled Policy Optimization

Title: CARE: Training-Free Controllable Restoration for Medical Images via Dual-Latent Steering

Title: GaussFusion: Improving 3D Reconstruction in the Wild with A Geometry-Informed Video Generator

Title: SIGMA: Structure-Invariant Generative Molecular Alignment for Chemical Language Models via Autoregressive Contrastive Learning

Title: Z-Erase: Enabling Concept Erasure in Single-Stream Diffusion Transformers

Title: MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning

Title: MoireMix: A Formula-Based Data Augmentation for Improving Image Classification Robustness

Title: SEVerA: Verified Synthesis of Self-Evolving Agents

Title: AnyDoc: Enhancing Document Generation via Large-Scale HTML/CSS Data Synthesis and Height-Aware Reinforcement Optimization

Title: EgoXtreme: A Dataset for Robust Object Pose Estimation in Egocentric Views under Extreme Conditions

Title: FD$^2$: A Dedicated Framework for Fine-Grained Dataset Distillation

Title: Learning to Rank Caption Chains for Video-Text Alignment

Title: Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Models

Title: Vision Hopfield Memory Networks

Title: SportSkills: Physical Skill Learning from Sports Instructional Videos

Title: Bilingual Text-to-Motion Generation: A New Benchmark and Baselines

Title: VolDiT: Controllable Volumetric Medical Image Synthesis with Diffusion Transformers

Title: Knowledge-Guided Retrieval-Augmented Generation for Zero-Shot Psychiatric Data: Privacy Preserving Synthetic Data Generation

Title: AnyID: Ultra-Fidelity Universal Identity-Preserving Video Generation from Any Visual References

Title: CardioDiT: Latent Diffusion Transformers for 4D Cardiac MRI Synthesis

Title: Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction

Title: Efficient Preemptive Robustification with Image Sharpening

Title: Semantic-Aware Prefix Learning for Token-Efficient Image Generation

Title: Towards Controllable Low-Light Image Enhancement: A Continuous Multi-illumination Dataset and Efficient State Space Framework

Title: MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data

Title: CIAR: Interval-based Collaborative Decoding for Image Generation Acceleration

Title: RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models

Title: Beyond the Golden Data: Resolving the Motion-Vision Quality Dilemma via Timestep Selective Training

Title: BFMD: A Full-Match Badminton Dense Dataset for Dense Shot Captioning

Title: An Integrative Genome-Scale Metabolic Modeling and Machine Learning Framework for Predicting and Optimizing Biofuel-Relevant Biomass Production in Saccharomyces cerevisiae

Title: GeoHeight-Bench: Towards Height-Aware Multimodal Reasoning in Remote Sensing

Title: Wan-Weaver: Interleaved Multi-modal Generation via Decoupled Training

Title: Seeing to Ground: Visual Attention for Hallucination-Resilient MDLLMs

Title: Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Title: PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Title: BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation

Title: Unleashing Guidance Without Classifiers for Human-Object Interaction Animation

Title: How good was my shot? Quantifying Player Skill Level in Table Tennis

Title: Vega: Learning to Drive with Natural Language Instructions

Title: RefAlign: Representation Alignment for Reference-to-Video Generation

Title: ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling