2025-12-01

Title: SO-Bench: A Structural Output Evaluation of Multimodal LLMs

Title: Physics-Informed Spiking Neural Networks via Conservative Flux Quantization

Title: Saddle-Free Guidance: Improved On-Manifold Sampling without Labels or Additional Training

Title: Breaking the Illusion: Consensus-Based Generative Mitigation of Adversarial Illusions in Multi-Modal Embeddings

Title: PathReasoning: A multimodal reasoning agent for query-based ROI navigation on whole-slide images

Title: Adaptive Parameter Optimization for Robust Remote Photoplethysmography

Title: AmodalGen3D: Generative Amodal 3D Object Reconstruction from Sparse Unposed Views

Title: PAT3D: Physics-Augmented Text-to-3D Scene Generation

Title: StreamFlow: Theory, Algorithm, and Implementation for High-Efficiency Rectified Flow Generation

Title: PAGen: Phase-guided Amplitude Generation for Domain-adaptive Object Detection

Title: Predicting Public Health Impacts of Electricity Usage

Title: ICM-SR: Image-Conditioned Manifold Regularization for Image Super-Resolution

Title: DNA: Dual-branch Network with Adaptation for Open-Set Online Handwriting Generation

Title: Convergence Dynamics of Over-Parameterized Score Matching for a Single Gaussian

Title: WorldWander: Bridging Egocentric and Exocentric Worlds in Video Generation

Title: MRI-Based Brain Age Estimation with Supervised Contrastive Learning of Continuous Representation

Title: PROMPTMINER: Black-Box Prompt Stealing against Text-to-Image Generative Models via Reinforcement Learning and Fuzz Optimization

Title: Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation

Title: DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action

Title: EASL: Multi-Emotion Guided Semantic Disentanglement for Expressive Sign Language Generation

Title: IMTalker: Efficient Audio-driven Talking Face Generation with Implicit Motion Transfer

Title: Partially Shared Concept Bottleneck Models

Title: BrepGPT: Autoregressive B-rep Generation with Voronoi Half-Patch

Title: Designing Instance-Level Sampling Schedules via REINFORCE with James-Stein Shrinkage

Title: HybridWorldSim: A Scalable and Controllable High-fidelity Simulator for Autonomous Driving

Title: Controllable 3D Object Generation with Single Image Prompt

Title: From Compound Figures to Composite Understanding: Developing a Multi-Modal LLM from Biomedical Literature with Medical Multiple-Image Benchmarking and Validation

Title: IE-SRGS: An Internal-External Knowledge Fusion Framework for High-Fidelity 3D Gaussian Splatting Super-Resolution

Title: Toward Diffusible High-Dimensional Latent Spaces: A Frequency Perspective

Title: TreeCoder: Systematic Exploration and Optimisation of Decoding and Constraints for LLM Code Generation

Title: The Collapse of Patches

Title: Match-and-Fuse: Consistent Generation from Unstructured Image Sets

Title: Flowing Backwards: Improving Normalizing Flows via Reverse Representation Alignment

Title: INSIGHT: An Interpretable Neural Vision-Language Framework for Reasoning of Generative Artifacts

Title: DiffStyle360: Diffusion-Based 360° Head Stylization via Style Fusion Attention

Title: Wukong's 72 Transformations: High-fidelity Textured 3D Morphing via Flow Models

Title: Do You See What I Say? Generalizable Deepfake Detection based on Visual Speech Recognition

Title: Beyond Real versus Fake Towards Intent-Aware Video Analysis

Title: ITS3D: Inference-Time Scaling for Text-Guided 3D Diffusion Models

Title: Rethinking Cross-Generator Image Forgery Detection through DINOv3

Title: Adversarial Flow Models

Title: Enhancing Trustworthiness with Mixed Precision: Benchmarks, Opportunities, and Challenges

Title: AI killed the video star. Audio-driven diffusion model for expressive talking head generation

Title: SciPostGen: Bridging the Gap between Scientific Papers and Poster Layouts

Title: Space Explanations of Neural Network Classification

Title: What Shape Is Optimal for Masks in Text Removal?

Title: Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration

Title: Diff-ICMH: Harmonizing Machine and Human Vision in Image Compression with Generative Prior

Title: Bringing Your Portrait to 3D Presence

Title: GazeTrack: High-Precision Eye Tracking Based on Regularization and Spatial Computing

Title: Flow Density Control: Generative Optimization Beyond Entropy-Regularized Fine-Tuning

Title: Architecture Decoupling Is Not All You Need For Unified Multimodal Model

Title: Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Title: Test-time scaling of diffusions with flow maps

Title: Ar2Can: An Architect and an Artist Leveraging a Canvas for Multi-Human Generation

Title: Generative Anchored Fields: Controlled Data Generation via Emergent Velocity Fields and Transport Algebra

Title: Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Title: ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering

Title: VeriDispatcher: Multi-Model Dispatching through Pre-Inference Difficulty Prediction for RTL Generation Optimization

Title: Exact Learning of Arithmetic with Differentiable Agents

Title: Alzheimer's Disease Prediction Using EffNetViTLoRA and BiLSTM with Multimodal Longitudinal MRI Data

Title: From Pixels to Feelings: Aligning MLLMs with Human Cognitive Perception of Images

Title: LC4-DViT: Land-cover Creation for Land-cover Classification with Deformable Vision Transformer

Title: Captain Safari: A World Engine

Title: Resolving Evidence Sparsity: Agentic Context Engineering for Long-Document Understanding

Title: TARFVAE: Efficient One-Step Generative Time Series Forecasting via TARFLOW based VAE

Title: CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation

Title: Scalable Diffusion Transformer for Conditional 4D fMRI Synthesis

Title: Robust Image Self-Recovery against Tampering using Watermark Generation with Pixel Shuffling

Title: One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer

Title: Do We Need Perfect Data? Leveraging Noise for Domain Generalized Segmentation

Title: BlockVid: Block Diffusion for High-Quality and Consistent Minute-Long Video Generation

Title: McSc: Motion-Corrective Preference Alignment for Video Generation with Self-Critic Hierarchical Reasoning

Title: MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation

Title: Guiding Visual Autoregressive Models through Spectrum Weakening

Title: Masked Diffusion for Generative Recommendation

Title: Evaluating the Clinical Impact of Generative Inpainting on Bone Age Estimation

Title: db-SP: Accelerating Sparse Attention for Visual Generative Models with Dual-Balanced Sequence Parallelism

Title: DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation

Title: InstanceV: Instance-Level Video Generation

Title: REVEAL: Reasoning-enhanced Forensic Evidence Analysis for Explainable AI-generated Image Detection

Title: Fast Multi-view Consistent 3D Editing with Video Priors

Title: GeoWorld: Unlocking the Potential of Geometry Models to Facilitate High-Fidelity 3D Scene Generation

Title: Vision Bridge Transformer at Scale

Title: Instruction Tuning of Large Language Models for Tabular Data Generation-in One Day

Title: Language-guided 3D scene synthesis for fine-grained functionality understanding

Title: Synthetic Industrial Object Detection: GenAI vs. Feature-Based Methods

Title: UniGeoSeg: Towards Unified Open-World Segmentation for Geospatial Scenes

Title: Markovian Scale Prediction: A New Era of Visual Autoregressive Generation

Title: Flow Straighter and Faster: Efficient One-Step Generative Modeling via MeanFlow on Rectified Trajectories

Title: SimScale: Learning to Drive via Real-World Simulation at Scale

Title: VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction

Title: Hunyuan-GameCraft-2: Instruction-following Interactive Game World Model

Title: Accelerated Execution of Bayesian Neural Networks using a Single Probabilistic Forward Pass and Code Generation

Title: ASTRO: Adaptive Stitching via Dynamics-Guided Trajectory Rollouts

Title: Visual Generation Tuning

Title: AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement