2025-06-05

Title: Modular Diffusion Policy Training: Decoupling and Recombining Guidance and Diffusion for Offline RL

Title: Test-Time Scaling of Diffusion Models via Noise Trajectory Search

Title: PALADIN: Robust Neural Fingerprinting for Text-to-Image Diffusion Models

Title: Multimodal Foundation Model for Cross-Modal Retrieval and Activity Recognition Tasks

Title: Multimodal Generative AI with Autoregressive LLMs for Human Motion Understanding and Generation: A Way Forward

Title: Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs

Title: Channel-adaptive Cross-modal Generative Semantic Communication for Point Cloud Transmission

Title: ConMamba: Contrastive Vision Mamba for Plant Disease Detection

Title: Chipmunk: Training-Free Acceleration of Diffusion Transformers with Dynamic Column-Sparse Deltas

Title: The Future of Continual Learning in the Era of Foundation Models: Three Key Directions

Title: Robustness in Both Domains: CLIP Needs a Robust Text Encoder

Title: A Multimodal, Multilingual, and Multidimensional Pipeline for Fine-grained Crowdsourcing Earthquake Damage Evaluation

Title: A Foundation Model for Spatial Proteomics

Title: Adaptive Task Vectors for Large Language Models

Title: ViT-Split: Unleashing the Power of Vision Foundation Models via Efficient Splitting Heads

Title: Delta-KNN: Improving Demonstration Selection in In-Context Learning for Alzheimer's Disease Detection

Title: Measuring Human Involvement in AI-Generated Text: A Case Study on Academic Writing

Title: CHIME: Conditional Hallucination and Integrated Multi-scale Enhancement for Time Series Diffusion Model

Title: DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

Title: Path Generation and Evaluation in Video Games: A Nonparametric Statistical Approach

Title: Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting

Title: Learning Monotonic Probabilities with a Generative Cost Model

Title: KG-BiLM: Knowledge Graph Embedding via Bidirectional Language Models

Title: Automatically Suggesting Diverse Example Sentences for L2 Japanese Learners Using Pre-Trained Language Models

Title: From Understanding to Generation: An Efficient Shortcut for Evaluating Language Models

Title: Auto Prompt SQL: A Resource-Efficient Architecture for Text-to-SQL Translation in Constrained Environments

Title: Negative-Guided Subject Fidelity Optimization for Zero-Shot Subject-Driven Generation

Title: EmoArt: A Multidimensional Dataset for Emotion-Aware Artistic Generation

Title: INP-Former++: Advancing Universal Anomaly Detection via Intrinsic Normal Prototypes and Residual Learning

Title: How PARTs assemble into wholes: Learning the relative composition of images

Title: PRJ: Perception-Retrieval-Judgement for Generated Images

Title: Advancements in Artificial Intelligence Applications for Cardiovascular Disease Research

Title: On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity

Title: ConText: Driving In-context Learning for Text Removal and Segmentation

Title: Brain-tuned Speech Models Better Reflect Speech Processing Stages in the Brain

Title: DiffCAP: Diffusion-based Cumulative Adversarial Purification for Vision Language Models

Title: Lower Ricci Curvature for Hypergraphs

Title: Causality-Aware Contrastive Learning for Robust Multivariate Time-Series Anomaly Detection

Title: Solving Inverse Problems via Diffusion-Based Priors: An Approximation-Free Ensemble Sampling Approach

Title: Seeing What Tastes Good: Revisiting Multimodal Distributional Semantics in the Billion Parameter Era

Title: Explainability-Based Token Replacement on LLM-Generated Text

Title: Image Editing As Programs with Diffusion Models

Title: Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints

Title: Does Prompt Design Impact Quality of Data Imputation by LLMs?

Title: How to Use Graph Data in the Wild to Help Graph Anomaly Detection?

Title: Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector

Title: FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers

Title: Sounding that Object: Interactive Object-Aware Image to Audio Generation

Title: UNIC: Unified In-Context Video Editing

Title: Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation

Title: LayerFlow: A Unified Model for Layer-aware Video Generation