2025-03-14

Title: Inductive Spatio-Temporal Kriging with Physics-Guided Increment Training Strategy for Air Quality Inference

Title: LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics

Title: CoRe^2: Collect, Reflect and Refine to Generate Better and Faster

Title: Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models

Title: Accelerating Diffusion Sampling via Exploiting Local Transition Coherence

Title: DRESS: Disentangled Representation-based Self-Supervised Meta-Learning for Diverse Tasks

Title: Revisiting semi-supervised learning in the era of foundation models

Title: Revisiting Backdoor Attacks on Time Series Classification in the Frequency Domain

Title: The Pitfalls of Imitation Learning when Actions are Continuous

Title: I2V3D: Controllable image-to-video generation with 3D guidance

Title: Solving Bayesian inverse problems with diffusion priors and off-policy RL

Title: BiasConnect: Investigating Bias Interactions in Text-to-Image Models

Title: Constrained Language Generation with Discrete Diffusion Models

Title: Temporal Difference Flows

Title: Generative AI for Named Entity Recognition in Low-Resource Language Nepali

Title: Isolated Channel Vision Transformers: From Single-Channel Pretraining to Multi-Channel Finetuning

Title: Resolution Invariant Autoencoder

Title: Exploring Position Encoding in Diffusion U-Net for Training-free High-resolution Image Generation

Title: Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis

Title: Object-Aware DINO (Oh-A-Dino): Enhancing Self-Supervised Representations for Multi-Object Instance Retrieval

Title: CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation

Title: Inter-environmental world modeling for continuous and compositional dynamics

Title: Type Information-Assisted Self-Supervised Knowledge Graph Denoising

Title: VideoMerge: Towards Training-free Long Video Generation

Title: PanoGen++: Domain-Adapted Text-Guided Panoramic Environment Generation for Vision-and-Language Navigation

Title: A Chaotic Image Encryption Scheme Using Novel Geometric Block Permutation and Dynamic Substitution

Title: Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers

Title: UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?

Title: X-Cross: Image Encryption Featuring Novel Dual-Layer Block Permutation and Dynamic Substitution Techniques

Title: Take Off the Training Wheels! Progressive In-Context Learning for Effective Alignment

Title: Channel-wise Noise Scheduled Diffusion for Inverse Rendering in Indoor Scenes

Title: Investigating and Improving Counter-Stereotypical Action Relation in Text-to-Image Diffusion Models

Title: Multi-Modal Mamba Modeling for Survival Prediction (M4Survive): Adapting Joint Foundation Model Representations

Title: Provably Secure Covert Messaging Using Image-based Diffusion Processes

Title: Bayesian Prompt Flow Learning for Zero-Shot Anomaly Detection

Title: AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption

Title: Semantic Latent Motion for Portrait Video Generation

Title: Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Learnable Linear Extrapolation

Title: MoEdit: On Learning Quantity Perception for Multi-object Image Editing

Title: Hybrid Agents for Image Restoration

Title: Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation

Title: PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models

Title: Robustness Tokens: Towards Adversarial Robustness of Transformers

Title: Singular Value Fine-tuning for Few-Shot Class-Incremental Learning

Title: CoStoDet-DDPM: Collaborative Training of Stochastic and Deterministic Models Improves Surgical Workflow Anticipation and Recognition

Title: Probability-Flow ODE in Infinite-Dimensional Function Spaces

Title: R.U.Psycho? Robust Unified Psychometric Testing of Language Models

Title: SVIP: Semantically Contextualized Visual Patches for Zero-Shot Learning

Title: An Open-RAN Testbed for Detecting and Mitigating Radio-Access Anomalies

Title: ROODI: Reconstructing Occluded Objects with Denoising Inpainters

Title: MaterialMVP: Illumination-Invariant Material Generation via Multi-view PBR Diffusion

Title: Generative Binary Memory: Pseudo-Replay Class-Incremental Learning on Binarized Embeddings

Title: DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image

Title: Enhancing Facial Privacy Protection via Weakening Diffusion Purification

Title: ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation

Title: Piece it Together: Part-Based Concepting with IP-Priors

Title: Probabilistic Forecasting via Autoregressive Flow Matching

Title: CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance

Title: RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

Title: Hyper3D: Efficient 3D Representation via Hybrid Triplane and Octree Feature for Enhanced 3D Shape Variational Auto-Encoders

Title: RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models

Title: Understanding the Logical Capabilities of Large Language Models via Out-of-Context Representation Learning

Title: Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion

Title: Hoi2Anomaly: An Explainable Anomaly Detection Approach Guided by Human-Object Interaction

Title: Conformal Prediction Sets for Deep Generative Models via Reduction to Conformal Regression

Title: PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models

Title: DP-GPL: Differentially Private Graph Prompt Learning

Title: MASQUE: A Text-Guided Diffusion-Based Framework for Localized and Customized Adversarial Makeup

Title: Long Context Tuning for Video Generation

Title: CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models

Title: MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction

Title: CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing

Title: ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer

Title: DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation

Title: Transformers without Normalization

Title: NIL: No-data Imitation Learning by Leveraging Pre-trained Video Diffusion Models

Title: Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology

Title: HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

Title: V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes

Title: Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective

Title: GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing