2025-03-14

Title: Inductive Spatio-Temporal Kriging with Physics-Guided Increment Training Strategy for Air Quality Inference

Title: CoRe^2: Collect, Reflect and Refine to Generate Better and Faster

Title: Accelerating Diffusion Sampling via Exploiting Local Transition Coherence

Title: Revisiting Backdoor Attacks on Time Series Classification in the Frequency Domain

Title: I2V3D: Controllable image-to-video generation with 3D guidance

Title: SASNet: Spatially-Adaptive Sinusoidal Neural Networks

Title: A PyTorch-Enabled Tool for Synthetic Event Camera Data Generation and Algorithm Development

Title: BiasConnect: Investigating Bias Interactions in Text-to-Image Models

Title: Temporal Difference Flows

Title: Resolution Invariant Autoencoder

Title: Exploring Position Encoding in Diffusion U-Net for Training-free High-resolution Image Generation

Title: On the Limitations of Vision-Language Models in Understanding Image Transforms

Title: LuciBot: Automated Robot Policy Learning from Generated Videos

Title: Inter-environmental world modeling for continuous and compositional dynamics

Title: Type Information-Assisted Self-Supervised Knowledge Graph Denoising

Title: VideoMerge: Towards Training-free Long Video Generation

Title: PanoGen++: Domain-Adapted Text-Guided Panoramic Environment Generation for Vision-and-Language Navigation

Title: Identifying Trustworthiness Challenges in Deep Learning Models for Continental-Scale Water Quality Prediction

Title: UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?

Title: Exploring Mutual Empowerment Between Wireless Networks and RL-based LLMs: A Survey

Title: Channel-wise Noise Scheduled Diffusion for Inverse Rendering in Indoor Scenes

Title: Investigating and Improving Counter-Stereotypical Action Relation in Text-to-Image Diffusion Models

Title: FourierSR: A Fourier Token-based Plugin for Efficient Image Super-Resolution

Title: Compute Optimal Scaling of Skills: Knowledge vs Reasoning

Title: VMBench: A Benchmark for Perception-Aligned Video Motion Generation

Title: Image Quality Assessment: From Human to Machine Preference

Title: AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption

Title: Semantic Latent Motion for Portrait Video Generation

Title: Dream-IF: Dynamic Relative EnhAnceMent for Image Fusion

Title: MoEdit: On Learning Quantity Perception for Multi-object Image Editing

Title: Hybrid Agents for Image Restoration

Title: Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation

Title: PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models

Title: Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding

Title: Probability-Flow ODE in Infinite-Dimensional Function Spaces

Title: Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA

Title: ROODI: Reconstructing Occluded Objects with Denoising Inpainters

Title: KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception

Title: MaterialMVP: Illumination-Invariant Material Generation via Multi-view PBR Diffusion

Title: Towards Fast, Memory-based and Data-Efficient Vision-Language Policy

Title: IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Title: Generative Binary Memory: Pseudo-Replay Class-Incremental Learning on Binarized Embeddings

Title: DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image

Title: Enhancing Facial Privacy Protection via Weakening Diffusion Purification

Title: ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation

Title: Piece it Together: Part-Based Concepting with IP-Priors

Title: Probabilistic Forecasting via Autoregressive Flow Matching

Title: CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance

Title: Hyper3D: Efficient 3D Representation via Hybrid Triplane and Octree Feature for Enhanced 3D Shape Variational Auto-Encoders

Title: RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models

Title: Learning Disease State from Noisy Ordinal Disease Progression Labels

Title: Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion

Title: Hoi2Anomaly: An Explainable Anomaly Detection Approach Guided by Human-Object Interaction

Title: Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression

Title: PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models

Title: MASQUE: A Text-Guided Diffusion-Based Framework for Localized and Customized Adversarial Makeup

Title: Autoregressive Image Generation with Randomized Parallel Decoding

Title: Long Context Tuning for Video Generation

Title: CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models

Title: MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction

Title: DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation

Title: DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding

Title: Transformers without Normalization

Title: NIL: No-data Imitation Learning by Leveraging Pre-trained Video Diffusion Models

Title: HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

Title: The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

Title: Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective

Title: GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing