2025-06-12

Title: Enhancing the Safety of Medical Vision-Language Models by Synthetic Demonstrations

Title: BG-HOP: A Bimanual Generative Hand-Object Prior

Title: FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation

Title: AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models

Title: BakuFlow: A Streamlining Semi-Automatic Label Generation Tool

Title: LLM-ML Teaming: Integrated Symbolic Decoding and Gradient Search for Valid and Stable Generative Feature Transformation

Title: CUDA-LLM: LLMs Can Write Efficient CUDA Kernels

Title: Intra-Trajectory Consistency for Reward Modeling

Title: Bias Analysis in Unconditional Image Generative Models

Title: SensorLM: Learning the Language of Wearable Sensors

Title: Seedance 1.0: Exploring the Boundaries of Video Generation Models

Title: TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval

Title: MultiNet: An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models

Title: FedRAG: A Framework for Fine-Tuning Retrieval-Augmented Generation Systems

Title: Policy-Based Trajectory Clustering in Offline Reinforcement Learning

Title: SoK: Machine Unlearning for Large Language Models

Title: Agent-based Condition Monitoring Assistance with Multimodal Industrial Database Retrieval Augmented Generation

Title: G-Sim: Generative Simulations with Large Language Models and Gradient-Free Calibration

Title: Natural Language Guided Ligand-Binding Protein Design

Title: CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation

Title: Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation

Title: SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing

Title: Anomaly Detection and Generation with Diffusion Models: A Survey

Title: Revisiting Diffusion Models: From Generative Pre-training to One-Step Generation

Title: Synthetic Human Action Video Data Generation with Pose Transfer

Title: Noise Conditional Variational Score Distillation

Title: A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation

Title: Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression

Title: Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs

Title: Consistent Story Generation with Asymmetry Zigzag Sampling

Title: In-Context Bias Propagation in LLM-Based Tabular Data Generation

Title: HSENet: Hybrid Spatial Encoding Network for 3D Medical Vision-Language Understanding

Title: FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models

Title: DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning

Title: HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios

Title: CINeMA: Conditional Implicit Neural Multi-Modal Atlas for a Spatio-Temporal Representation of the Perinatal Brain

Title: Towards Practical Alzheimer's Disease Diagnosis: A Lightweight and Interpretable Spiking Neural Model

Title: TRIDENT: Temporally Restricted Inference via DFA-Enhanced Neural Traversal

Title: ELBO-T2IAlign: A Generic ELBO-Based Method for Calibrating Pixel-level Text-Image Alignment in Diffusion Models

Title: Accurate and efficient zero-shot 6D pose estimation with frozen foundation models

Title: DreamCS: Geometry-Aware Text-to-3D Generation with Unpaired 3D Reward Supervision

Title: Only-Style: Stylistic Consistency in Image Generation without Content Leakage

Title: HadaNorm: Diffusion Transformer Quantization through Mean-Centered Transformations

Title: Canonical Latent Representations in Conditional Diffusion Models

Title: Efficient Part-level 3D Object Generation via Dual Volume Packing

Title: AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation

Title: InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions

Title: EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits

Title: Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Title: Text-Aware Image Restoration with Diffusion Models

Title: PlayerOne: Egocentric World Simulator