2025-06-12

Title: Enhancing the Safety of Medical Vision-Language Models by Synthetic Demonstrations

Title: BG-HOP: A Bimanual Generative Hand-Object Prior

Title: AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models

Title: LLM-ML Teaming: Integrated Symbolic Decoding and Gradient Search for Valid and Stable Generative Feature Transformation

Title: Spiking Neural Models for Decision-Making Tasks with Learning

Title: Integrating Asynchronous AdaBoost into Federated Learning: Five Real World Applications

Title: Bias Analysis in Unconditional Image Generative Models

Title: SensorLM: Learning the Language of Wearable Sensors

Title: CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model

Title: Seedance 1.0: Exploring the Boundaries of Video Generation Models

Title: TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval

Title: Improving LLM Agent Planning with In-Context Learning via Atomic Fact Augmentation and Lookahead Search

Title: LaDCast: A Latent Diffusion Model for Medium-Range Ensemble Weather Forecasting

Title: A Technique for Isolating Lexically-Independent Phonetic Dependencies in Generative CNNs

Title: SoK: Machine Unlearning for Large Language Models

Title: Cross-Frame Representation Alignment for Fine-Tuning Video Diffusion Models

Title: PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers and Pseudo Anomalies

Title: CFMI: Flow Matching for Missing Data Imputation

Title: G-Sim: Generative Simulations with Large Language Models and Gradient-Free Calibration

Title: Learning The Minimum Action Distance

Title: What is the Cost of Differential Privacy for Deep Learning-Based Trajectory Generation?

Title: Alzheimer's Dementia Detection Using Perplexity from Paired Large Language Models

Title: MSSDF: Modality-Shared Self-supervised Distillation for High-Resolution Multi-modal Remote Sensing Image Learning

Title: Natural Language Guided Ligand-Binding Protein Design

Title: OmniDRCA: Parallel Speech-Text Foundation Model via Dual-Resolution Speech Representations and Contrastive Alignment

Title: Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation

Title: SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing

Title: COGENT: A Curriculum-oriented Framework for Generating Grade-appropriate Educational Content

Title: Anomaly Detection and Generation with Diffusion Models: A Survey

Title: ScaleLSD: Scalable Deep Line Segment Detection Streamlined

Title: Revisiting Diffusion Models: From Generative Pre-training to One-Step Generation

Title: Improving Out-of-Distribution Detection via Dynamic Covariance Calibration

Title: PGDA-KGQA: A Prompt-Guided Generative Framework with Multiple Data Augmentation Strategies for Knowledge Graph Question Answering

Title: Noise Conditional Variational Score Distillation

Title: Securing Open RAN: A Survey of Cryptographic Challenges and Emerging Solutions for 5G

Title: Hidden in Plain Sight: Evaluation of the Deception Detection Capabilities of LLMs in Multimodal Settings

Title: GigaChat Family: Efficient Russian Language Modeling Through Mixture of Experts Architecture

Title: Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning

Title: Urban1960SatSeg: Unsupervised Semantic Segmentation of Mid-20th century Urban Landscapes with Satellite Imageries

Title: Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression

Title: HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene

Title: AngleRoCL: Angle-Robust Concept Learning for Physically View-Invariant T2I Adversarial Patches

Title: MEDUSA: A Multimodal Deep Fusion Multi-Stage Training Framework for Speech Emotion Recognition in Naturalistic Conditions

Title: Beyond Overconfidence: Foundation Models Redefine Calibration in Deep Neural Networks

Title: In-Context Bias Propagation in LLM-Based Tabular Data Generation

Title: FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models

Title: Using Sign Language Production as Data Augmentation to enhance Sign Language Translation

Title: DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning

Title: HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios

Title: Self-Supervised Multi-Part Articulated Objects Modeling via Deformable Gaussian Splatting and Progressive Primitive Segmentation

Title: CINeMA: Conditional Implicit Neural Multi-Modal Atlas for a Spatio-Temporal Representation of the Perinatal Brain

Title: Wavelet Scattering Transform and Fourier Representation for Offline Detection of Malicious Clients in Federated Learning

Title: TRIDENT: Temporally Restricted Inference via DFA-Enhanced Neural Traversal

Title: Towards Multi-modal Graph Large Language Model

Title: ELBO-T2IAlign: A Generic ELBO-Based Method for Calibrating Pixel-level Text-Image Alignment in Diffusion Models

Title: Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints

Title: Accurate and efficient zero-shot 6D pose estimation with frozen foundation models

Title: A theoretical framework for self-supervised contrastive learning for continuous dependent data

Title: Generalizing Supervised Contrastive learning: A Projection Perspective

Title: Leveraging Depth and Language for Open-Vocabulary Domain-Generalized Semantic Segmentation

Title: 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation

Title: EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks

Title: HadaNorm: Diffusion Transformer Quantization through Mean-Centered Transformations

Title: Canonical Latent Representations in Conditional Diffusion Models

Title: ReSim: Reliable World Simulation for Autonomous Driving

Title: AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation

Title: EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits

Title: Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Title: Text-Aware Image Restoration with Diffusion Models